SE Radio Guest Dan Lorenc

SE Radio 712: Dan Lorenc on Sigstore

Dan Lorenc, co-founder and CEO of Chainguard, joins host Priyanka Raghavan to explore Sigstore and its role in securing the software supply chain. They unpack the challenges of supply chain security, including verifying the origin and integrity of software artifacts, and explain the problems Sigstore is designed to solve. The conversation goes under the hood to examine how Sigstore works, covering key components such as code signing, verification, the certificate authority model, and transparency logs—often compared conceptually to blockchain for their auditability. The episode also highlights real-world adoption, community resources for getting started, and closes with a discussion of Chainguard Images and how development teams can use them to build with more secure base images.

Brought to you by IEEE Computer Society and IEEE Software magazine.



Show Notes

Related Episodes

References

  1. Overview
  2. sigstore
  3. Sigstore – Open Source Security Foundation
  4. Sigstore
  5. Sigstore Proves That Effective Supply Chain Security Doesn’t Have to Hurt – Sigstore Blog
  6. https://www.linkedin.com/in/danlorenc/

Transcript

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Priyanka Raghavan 00:00:18 Welcome to Software Engineering Radio. I’m Priyanka Raghavan, and today we’ll be discussing the topic of Sigstore. Our guest today is Dan Lorenc, co-founder and CEO of the software supply chain security company Chainguard. Dan has been working on and worrying about containers for a very long time. I’ve listened to a lot of his podcasts, and something he has mentioned in many of them is that he has been passionate about securing supply chains since 2015, when he was an engineer and manager. He was on SE Radio discussing Software Supply Chain Attacks in Episode 535, and at SE Radio we’ve also done other episodes on software supply chain security, such as Episode 541 on Securing the Supply Chain and Episode 606 on Third Party Supply Chain Risks. So great to have you back on the show, Dan, and welcome.

Dan Lorenc 00:01:17 Thanks for having me back.

Priyanka Raghavan 00:01:19 I just wanted to set the stage. It’s 2026 and why do software supply chain attacks matter even today?

Dan Lorenc 00:01:26 I’ll start by explaining what a software supply chain attack is and then give my take on why they matter and why they’re happening today. A software supply chain attack is an attack on the software supply chain. That’s probably basic, but the software supply chain is all of the tools and steps and libraries and components and ingredients that go into the end software you’re using. When you use any piece of software, whether it’s a cell phone app, swiping your debit card at an ATM, or checking in for a flight, there’s an end application that you’re interacting with, but that’s not written a hundred percent from scratch. Every time you use a piece of software, that software is made up of tons of reusable components. That’s how teams are able to ship so much amazing software today. You’re not starting from scratch every single time, and that software supply chain is very large today.

Dan Lorenc 00:02:11 There are tons of different programming languages, and the libraries and tools and build systems that go into turning those raw ingredients, those things at the very, very start, into that finished piece of software that you get to use. It’s a complicated system that goes across companies, across organizations, across countries, especially when you take open source into account. And so there are a lot of links in that supply chain, and like any chain, it’s only as strong as its weakest one. So an attack is when some kind of attacker, a nation state, a criminal, exploits some vulnerability somewhere in that supply chain to get access to the end result. So instead of hacking into a system by coming in through the front door, they’re getting into that system through the back door, the supply chain itself. And once you get code into that finished product, it’s game over: code execution.

Dan Lorenc 00:03:01 And attackers in recent years have started to pay more and more attention to this space. It’s not necessarily easier to do than other ways of hacking systems, but over the last 10, 15 years, the industry has gotten a lot better at securing all the other ways in. And if you log into your bank today, you have to use two-factor authentication and things like that, that weren’t around six or seven years ago or were around but only used by really security critical organizations. Now it’s everywhere. And those text messages you get when you log into any of your accounts online, that’s all done to make it harder to steal people’s passwords to break into systems. So, because we’ve done such a good job keeping attackers out in all of these other ways, they pivot to the next easiest way, which is the supply chain. And at the same time, open source has grown in ubiquity, grown in usage. It’s made its way into some of these most sensitive environments. And so, there’s a lot more of a surface area for these attacks to take place on.

Priyanka Raghavan 00:03:53 And I think when we talk about securing the software supply chain, last year there was this huge attack towards the end of the year on the npm registry, called “Shai-Hulud,” where I think some 500 repositories were affected, and it was basically a replicating worm. I wanted to ask you about that. Could you shed some light on that before we move into Sigstore?

Dan Lorenc 00:04:19 So, Shai-Hulud was the name of the attack. It was the name the attackers chose; a way to detect it was that it created this public repository called Shai-Hulud. This name was chosen by them, and it was an attack on the software supply chain in npm, in the Node.js package manager. They compromised a couple of packages at the start by doing basic account takeovers of the maintainers of those packages, and instead of just attacking those packages, it was a self-replicating worm in the npm registry. One of the things it did was steal credentials. So, if you used one of the affected libraries while you were publishing other npm packages, which is very common, it stole the credentials used to publish packages to npm, and it inserted the malware into all of the affected packages. That’s why it spread very, very quickly. I don’t remember the final stats, but I saw that double-digit percentages of the internet were basically affected very quickly because of that exponential nature.

Dan Lorenc 00:05:11 They had one, and then it spread that way. It was taken down by npm, but since then the attackers have done it two or three more times; there were Shai-Hulud part one, part two, and part three. It was a big wakeup call, one because of how fast it spread, and two, because it was one of the first ones that stole credentials in that way. These attacks have been happening weekly for years, but for the most part, the attackers have kind of settled on installing crypto miners on people’s laptops or in systems like that as a way of making money off of the attack. That’s not great, but it’s also not the end of the world. There are a lot worse things attackers can do, like ransomware or stealing sensitive data, once they have that code execution in your environment. And so this is one of the first times the attack actually went after more than just free crypto mining, where your electric bill goes up a little bit once it gets installed. So it was kind of a big wakeup moment for the industry.

Priyanka Raghavan 00:06:00 And let’s now come into Sigstore. What is it? Can you explain that to our listeners?

Dan Lorenc 00:06:05 Yeah, I’ll start with the problem that Shai-Hulud exploited and then connect that back to how Sigstore is designed to help prevent some of these, and then we can talk about the architecture and how it all works. When we first talk about open source, the definition of open source is a legal definition. It’s not anything related to programming or GitHub or access or anything like that. There’s a group called the OSI, the Open Source Initiative, and they maintain a bunch of definitions about what type of license software has to be released under to be considered open source. There are a bunch of principles in there that licenses have to follow, and they review licenses and have a list of ones that they’ve reviewed and will attest meet the standard. But that applies to the source code itself. That applies to the JavaScript files themselves.

Dan Lorenc 00:06:46 If you’re talking about Node.js, or the .py files for Python, unfortunately that’s not actually how most organizations consume open source. They don’t get it in that source code form. They get these packages which have been transformed from source code into something that’s a lot easier to run. In the JavaScript and Python cases, they include things like native builds depending on what platform you’re running on. So, the source code for a package and the package you install might be two separate things. There’s no direct link from the source code that is up on GitHub to the open-source package that you have downloaded and installed on your computer. In fact, most people, when they publish a package, download that source code, compile it on a laptop, and then upload it to a different system.

Dan Lorenc 00:07:30 That’s where npm comes in for the Node ecosystem. People aren’t npm-installing packages from GitHub, they’re installing from npm. So, two separate systems. And there’s no real evidence that the npm package you’ve installed has any correlation to the source code for that package over on GitHub. And that’s the link that a lot of these attackers have been exploiting. You might review the source and say, this is great, there’s no malware in here, there are no vulnerabilities, I’ve checked it all. But then when you install it, boom, there’s malware in there because somebody slipped it in at that step. There are a lot of steps in the supply chain, and a lot of places this stuff could be inserted, but that one is the most common. Sigstore is designed to provide a cryptographic link between any of these steps in the supply chain: signatures signed by people or by systems that say, this npm package actually came from this piece of source code.

Dan Lorenc 00:08:17 It’s an irreversible step; you can’t actually go back and check this kind of thing. But with signatures, you get this promise from some person with an identity in the signature that says, yes, I’m this person, I took this software from here and I put it over here. Sigstore isn’t the first system to do cryptographic signing in this way, but it was one of the first designed to do it at scale for open source. There have been systems around before, like PGP, where people do key signing parties and things like that to verify identities, but Sigstore can get you this verifiable proof that the source code hasn’t been tampered with along the way. If you go to analogies in the food safety industry, Sigstore is like that tamper-proof seal on the jar of pasta sauce you buy. It doesn’t tell you if the ingredients are good. It doesn’t tell you if it’s expired, or if something happened in manufacturing and it has to be recalled; it can’t really prevent that. But it can tell you that the tamper-proof seal hasn’t been removed in shipping that jar to you. And so, it’s one piece in an important ecosystem of trying to secure and harden all of those links of the supply chain.
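The sign-then-verify idea Dan describes can be sketched in a few lines. This is a toy model, not how Sigstore actually signs: real Sigstore uses short-lived asymmetric keys and X.509 certificates, while this sketch uses an HMAC over the artifact’s SHA-256 digest as a stand-in so it runs with only the standard library; the key and email address are made up for illustration.

```python
import hashlib
import hmac

# Stand-in for the maintainer's signing key. Real Sigstore uses
# ephemeral asymmetric keys, never a shared secret like this.
SIGNING_KEY = b"maintainer-demo-key"

def sign_artifact(artifact: bytes) -> dict:
    """Hash the artifact and 'sign' the digest, recording who signed it."""
    digest = hashlib.sha256(artifact).hexdigest()
    signature = hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return {"digest": digest, "signature": signature, "identity": "dan@example.com"}

def verify_artifact(artifact: bytes, record: dict) -> bool:
    """Recompute the digest of the downloaded bytes and check the signature."""
    digest = hashlib.sha256(artifact).hexdigest()
    if digest != record["digest"]:
        return False  # the bytes were altered after signing
    expected = hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])

package = b"console.log('hello');"
record = sign_artifact(package)
print(verify_artifact(package, record))            # True: seal intact
print(verify_artifact(package + b"evil", record))  # False: tampered
```

Note how the verification catches only tampering, matching the pasta-jar analogy: a malicious package signed by its own author still verifies cleanly.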

Priyanka Raghavan 00:09:26 Okay. that’s a great analogy. I wanted to ask you when you sign this or make it tamper proof, and you talked mainly about open source, but can this also be used for internal libraries and containers?

Dan Lorenc 00:09:40 Lots of organizations run this same technology internally. That’s an important part of securing internal supply chains. Insider risk is a pretty important problem inside of large companies: making sure that individuals, whether they’re doing it on purpose or unintentionally if, say, their laptop is compromised, don’t have unilateral permissions to change the running code in production; these things have to be reviewed by multiple people. And Sigstore is used by a lot of companies internally to do all of that. If you have a whole bunch of controls and policies in place saying that things have to be built on this build system to make it to production, Sigstore can be used in that way too.

Priyanka Raghavan 00:10:13 Would you have, like, a risk profile to say for the thing that’s signed? I know you said you can’t really check what is inside, but is there some kind of a risk profile maintained?

Dan Lorenc 00:10:25 No. So, you can build those systems on top of Sigstore, but at the end of the day, it comes down to trust. You have to trust someone. If a bunch of organizations have tested something and verified that it works correctly, they can publish that on the internet too. You can build higher-level systems on top of Sigstore. You could say, I’m only going to run something that three other people have said is good and safe and trustworthy. And the ledger, which is a core component of Sigstore, is a place where all of those pieces of evidence can be cryptographically signed and verified, so that if Dan said this is good software and he’s fine with it, then I know that Dan actually said that.

Priyanka Raghavan 00:10:58 So, one of the things I was also listening to and reading in a couple of posts when I was researching Sigstore was that what Sigstore wants to achieve is like Let’s Encrypt, where they drove the adoption of HTTPS. So, can you talk a little bit about that?

Dan Lorenc 00:11:15 I talked before about how we’ve done a good job hardening other systems, and so now attackers are pivoting to the supply chain; Let’s Encrypt is probably one of the reasons software supply chain attacks are happening, because they made it really easy to get the signed certificates that are used to secure the web. That little green lock icon at the top of your browser showing that it’s a secure connection has been around for decades. But to get one of these certificates in the past, you had to fill out a lot of paperwork, you had to find a domain registrar, verify your company identity, and then they would give you a certificate. And a certificate is really just a digital document, a text file, that a domain registrar or certificate authority has signed saying, okay, Chainguard.net, that’s my company’s website, that’s actually owned by Chainguard the company. And then when you log onto the website, you can make sure that you haven’t been man-in-the-middle attacked, right?

Dan Lorenc 00:12:04 Someone on the internet in between you and that company isn’t tampering with the website anymore. The bits you see on your screen are exactly the ones that that company wants you to be seeing. And if you send data to them, it’s going to that company, and no one in the middle can read it. It’s been around for a very long time, but it was annoying to set up, so most people didn’t, unless you were a bank or something like that where people were typing their passwords and you wanted to make sure that was encrypted and going to the right person. Let’s Encrypt automated all of this: instead of having to do a manual process with writing checks and logging credit cards and that kind of thing, you would just set up an automated system, and they would automatically prove that you controlled that domain name and that nothing had been tampered with along the way. As a result of making it free and easy to use

Dan Lorenc 00:12:44 and automatic, encryption on the internet went from low double digits to close to a hundred percent, and now browsers require it. And if you go to a domain that doesn’t have one of those, they make you type in something like, I’m okay accepting this risk, before they’ll even show you the page. That only happens when you get to that level of ubiquity. And doing that at internet scale is very hard, right? We’re talking millions of websites, tons of people, tons of different web browsers involved. It’s an ecosystem-level change that they were able to roll out by taking something that was hard and making it free and easy for anyone to adopt. Code signing was in a similar place before Sigstore: it was really hard to get code signing certificates, no one was really doing it, and nothing in open source was really signed. Even if it was, no one was checking the signatures, because they didn’t expect it to be signed and they didn’t know how.

Dan Lorenc 00:13:29 So as we designed and built Sigstore, we were heavily inspired by the Let’s Encrypt system, how they operate, and how they prove that they haven’t been attacked. Make it free, make it easy, and make it automatic for developers to do that signing, so we can eventually get to a world where enough stuff is signed that people can start requiring it, and it looks weird if you download something that hasn’t been signed, the same way your browser won’t even show you a website that doesn’t have a certificate today.

Priyanka Raghavan 00:13:53 Exactly. So that’s the vision to go to that kind of an adoption.

Dan Lorenc 00:13:57 I don’t know what attackers are going to go to next when open-source gets to that point, but that’s how this game is played.

Priyanka Raghavan 00:14:03 Okay, so let’s move on to the next part of the podcast where we want to talk a little bit about the main components and what makes Sigstore work. So, could you tell us a little bit about the main components of the Sigstore? How does the signing work?

Dan Lorenc 00:14:16 Signing and certificates and all that kind of stuff, at the end of the day, boil down to identity. Certificates themselves are just a bunch of random numbers. You can’t really read them; I’m not going to remember your massive random number if you told it to me. They’re way too long to remember. But a certificate is tying that to some kind of identity, whether it’s a person’s name, a pseudonym, a GitHub account, an email address, or the name of a build system running somewhere. They’re a way of stepping up a level from private keys, which are a bunch of random numbers, to identities. And that’s the first problem to solve in open source, because identity management is very hard. Some people, like Linus Torvalds, everyone knows who they are: he’s the maintainer of Linux. But for any given package on PyPI, you might not know who the maintainers are or what their names are. They might not want you to know all of that, but you need some kind of root for that identity, some kind of shared system.

Dan Lorenc 00:15:05 Well, it turns out that there’s already a protocol for that called OpenID Connect. When you go to log into a website and they don’t make you create an account, they have Login with Google, Login with Apple, those kinds of buttons: that’s using that protocol. So, it’s a way to tie someone’s identity to an email address, basically, at the end of the day. There are other forms of that identity for servers and different naming schemes, but you can think of it as that Login with Google button that appears on most websites. For most people, that is their digital identity at the end of the day: their email address. And so, when you start with Sigstore, that’s the root; it’s the email addresses of the people that are maintaining these projects. So, everything is built up from there.

Dan Lorenc 00:15:43 There’s a system that does that Login with Google flow, so you don’t need to manage long-lived keys anymore. Every time you want to sign something, you get this temporary certificate that’s just used for signing that one piece of code and that’s tied to an email address. So, you can’t really say which private key signed something at any given time; that doesn’t make sense, because they’re just random numbers. But you can say, this email address signed this thing at this time. At its root, that’s what Sigstore is. There’s a whole bunch of components that make that work and make that secure, but that’s how it’s able to be accessible to developers. They don’t have to worry about all this key management, or what happens if a key gets leaked or if I lose it.
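The “keyless” flow described here can be modeled step by step. Again a toy sketch with stdlib stand-ins: the OIDC login, the certificate authority (Fulcio in real Sigstore), and the asymmetric signature are all simulated, and the email and issuer URL are made up.

```python
import hashlib
import hmac
import secrets

def keyless_sign(artifact: bytes, email: str) -> dict:
    """Sketch of Sigstore's keyless flow: a fresh key is created per
    signature, bound to an OIDC identity, used once, then discarded."""
    # 1. OIDC login proves control of the email (the Login with Google step).
    identity = {"email": email, "issuer": "https://accounts.example.com"}
    # 2. Generate a throwaway signing key just for this one signature.
    ephemeral_key = secrets.token_bytes(32)
    # 3. A certificate authority (Fulcio, in real Sigstore) binds the key
    #    to the identity; modeled here as a plain record.
    certificate = {"identity": identity,
                   "key_id": hashlib.sha256(ephemeral_key).hexdigest()}
    # 4. Sign the artifact's digest with the ephemeral key
    #    (HMAC as a stand-in for a real asymmetric signature).
    digest = hashlib.sha256(artifact).hexdigest()
    signature = hmac.new(ephemeral_key, digest.encode(), hashlib.sha256).hexdigest()
    # 5. The key is never stored; only the certificate and signature remain,
    #    so verifiers reason about identities, not key material.
    return {"digest": digest, "signature": signature, "certificate": certificate}

bundle = keyless_sign(b"some package bytes", "dan@example.com")
print(bundle["certificate"]["identity"]["email"])  # dan@example.com
```

The design point is step 5: because the key is discarded, “which key signed this?” stops being the question; “which identity signed this, and when?” is what survives.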

Priyanka Raghavan 00:16:20 Great. So, the key is basically tied to your email address; that’s the main thing, basically.

Dan Lorenc 00:16:25 And then there’s a bunch of cryptography behind the scenes to make that work.

Priyanka Raghavan 00:16:29 I was just thinking when I was asking this question to you, why not GitHub IDs? Is that because you don’t want to be tied to GitHub?

Dan Lorenc 00:16:37 Yeah, they’re one of the possible providers. I mentioned mostly email addresses; GitHub is also supported. There are a bunch of different forms of identity that you can use here, but either way, the GitHub account is usually tied to an email address too.

Priyanka Raghavan 00:16:48 Right, right. That makes sense. And also, maybe there’s somebody using some other source control. So, by having the,

Dan Lorenc 00:16:54 Oh, and enterprises can configure this internally; they can set it up with their Active Directory or their Okta or whatever identity provider they use inside of their companies.

Priyanka Raghavan 00:17:02 So the protocol is OpenID Connect, but then behind the scenes it could be any kind of identity provider.

Dan Lorenc 00:17:09 Any identity provider that speaks the protocol.

Priyanka Raghavan 00:17:10 And I guess the artifacts that are signed, are they just containers or binaries?

Dan Lorenc 00:17:17 It can be anything. There’s a bunch of things that are supported as first class in Sigstore, but at the end of the day, you’re just signing a digest. You’re not really signing the container, you’re signing the digest of the container, you’re signing the digest of the Python package, you’re signing the digest of the Java package, something like that.

Priyanka Raghavan 00:17:31 Okay.

Dan Lorenc 00:17:31 There’s a bunch of tooling to make it easy to work with containers or Python packages or node packages. But you can sign a note, you can sign a tweet, you can sign an email. Yeah.
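Dan’s point that you are “just signing a digest” is easy to show: anything reducible to bytes gets the same treatment, whether it is a container layer, a wheel file, or a note. A minimal illustration using SHA-256, the digest algorithm OCI registries and Sigstore commonly use:

```python
import hashlib

def digest_of(data: bytes) -> str:
    """Return the content address of any blob of bytes, in the
    'sha256:<hex>' form used for container images."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

# The signing machinery never cares what the bytes mean:
print(digest_of(b"FROM alpine:3.19"))   # a Dockerfile-ish blob
print(digest_of(b"just a short note"))  # arbitrary text works identically
```

Whatever tooling wraps this (for containers, Python packages, or npm packages) is just convenience for producing the bytes and attaching the signature; the signed object is always the digest.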

Priyanka Raghavan 00:17:41 And I think the next thing is of course, you sign it, and the big thing is also to verify it?

Dan Lorenc 00:17:47 Yeah, that’s the hard part. Anyone can sign something, but it only works if people are checking those signatures. So, when you go to check one of these signatures, the cryptography again just works with random numbers, basically. But because of the way Sigstore is designed, if you want to check a signature, you can check which email address signed it. And that’s the flow: you sign it with your email address, someone can check the signature, and it’ll tell them which email addresses have signed something. Again, Sigstore can’t tell you if something is good; you have to know if you trust that email address. But it can tell you that that email address signed something, and you have to decide if that’s meaningful to you and if you trust that.
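The separation Dan draws, “signature valid” versus “signer trusted”, is the consumer’s policy decision. A hypothetical sketch of that last step, with a made-up allowlist of identities:

```python
# Hypothetical allowlist; in practice this is your organization's policy.
TRUSTED_SIGNERS = {"dan@example.com", "release-bot@example.org"}

def trust_decision(signature_valid: bool, signer: str) -> str:
    """Sigstore answers 'who signed this?'; whether to trust that
    identity is left to the consumer, as this policy check shows."""
    if not signature_valid:
        return "reject: signature invalid"
    if signer not in TRUSTED_SIGNERS:
        return f"reject: {signer} not in trusted signers"
    return "accept"

print(trust_decision(True, "dan@example.com"))       # accept
print(trust_decision(True, "stranger@example.net"))  # rejected by policy
print(trust_decision(False, "dan@example.com"))      # rejected: bad signature
```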

Priyanka Raghavan 00:18:21 What about the audit logs, which they call transparency logs? I saw a lot about that.

Dan Lorenc 00:18:26 Yeah, that’s how all this works under the covers. If you’re someone signing things with your email address, you’re trusting this other system, you’re trusting Sigstore at the end of the day, not to lie, not to be compromised, and not to publish something that wasn’t really from your email address. If Sigstore were to get compromised, or if your email address were to get compromised, then people could be publishing signatures on your behalf. That’s bad, because if people trust you and all of a sudden those signatures end up out there, then they’re not going to trust you anymore. And there’s this other cryptographic primitive; again, it’s used by Let’s Encrypt and by all HTTPS certificates in the world, and it’s called a transparency log. There’s this append-only ledger that you can prove hasn’t been tampered with over time, and everyone is seeing that same ledger.

Dan Lorenc 00:19:11 So it’s trust through openness. At any given time, I can go check that Sigstore ledger for everything that I have signed as a person, and I know that the signatures won’t validate unless they’re in that transparency log. So, it’s a couple of steps: did this email address sign it, and is it in that log? Again, that’s dependent on people checking that log to make sure that things aren’t showing up in there that shouldn’t be, but that’s the only way all these moving parts fit together in a place where developers don’t really have to trust Sigstore. They have to trust that everything will be in the open. And then if something does end up in there, you can trace it back to figure out what happened. Was it that someone compromised my email address? Was it that the infrastructure got compromised?

Dan Lorenc 00:19:48 And you have these timestamps, and you can see exactly what happened and when; you don’t have to invalidate everything. You can say, oh, out of these 100 signatures, 99 of them were valid and actually me, but this one isn’t, and everyone can then be notified that that piece had been tampered with. It’s similar in some ways to blockchain, but a different use case. Every time you go to Google.com, you get a certificate for Google, and Google is trusting a certificate authority not to lie, not to be compromised, and not to issue certificates to attackers. The way they’re able to trust that other party is through transparency logs there too. There are people at Google monitoring that log to make sure no certificates end up in there that don’t belong to them. This is important at internet scale because, again, internet scale is global; different countries control different certificate authorities, and there’s no real universal law that applies here. And we’re operating in a world where nation states are funding large cyber-attacks; it’s hard to really trust any of these components in that global world. So, trust through transparency is what this whole ecosystem is built on.
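The tamper-evidence property of an append-only log can be demonstrated with a toy hash chain. Real Sigstore’s Rekor uses a Merkle tree (which additionally allows efficient inclusion proofs), but a chain, where each entry’s hash covers the previous entry, shows the same core idea: rewriting history changes every later hash and is detectable. The signer and artifact values below are invented.

```python
import hashlib
import json

class TransparencyLog:
    """Toy append-only log. Each entry hash covers the previous entry,
    so any edit to history breaks the chain from that point on."""
    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["entry_hash"] if self.entries else ""
        payload = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"record": record, "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Replay the chain; any rewritten entry makes a hash mismatch."""
        prev = ""
        for e in self.entries:
            payload = json.dumps(e["record"], sort_keys=True)
            if hashlib.sha256((prev + payload).encode()).hexdigest() != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True

log = TransparencyLog()
log.append({"signer": "dan@example.com", "artifact": "sha256:abc"})
log.append({"signer": "dan@example.com", "artifact": "sha256:def"})
print(log.verify())                                        # True
log.entries[0]["record"]["signer"] = "attacker@example.net"  # rewrite history
print(log.verify())                                        # False
```

This is why monitors, the people watching the log for entries that shouldn’t be there, can trust what they are watching: the log operator cannot quietly remove or alter an entry after the fact.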

Priyanka Raghavan 00:20:48 So that means I, as a maintainer of an open-source package, as soon as I sign something, I should go and check the transparency log. I mean, that happens automatically, I guess, as part of the infrastructure. But then someone who’s picking up a package that I have built would go and check the transparency log. I mean, how does it work?

Dan Lorenc 00:21:07 So that’s all automatic and built in. You go to check that signature, you see it was signed by Dan, and it goes to make sure that signature is in the transparency log. There are systems people have set up to send you an email every time a signature for your identity ends up in that transparency log, and things like that. So, it’s kind of on you to be checking these systems, but it makes it easy to spot: every time my signature appears in there, I get an email to my email address, and if I didn’t publish anything that day, that’s a big red flag.

Priyanka Raghavan 00:21:34 So I think the other thing I wanted to ask you about is, I saw a little bit about the key tools, which are Cosign, and then Fulcio, which is like the certificate manager, and I think Rekor, which is for the audit logs. Now, this is the infrastructure that’s provided, from what I see; you could just use these kinds of command-line tools, right?

Dan Lorenc 00:21:56 Yeah. So, most people in the open-source world, when they interact with it, will just use one of the tools, like Cosign. Cosign is for container signing; there are other versions for Python, other versions for Node, and those interact with those systems. Fulcio is the piece that does that identity check. When you run cosign sign, your browser pops up and you click yes, sign in with Google; that’s Fulcio. Then there’s that ledger we talked about, that transparency log; that’s Rekor. So, when Fulcio issues those certificates and when you do those signatures, all of that ends up in the transparency log. That’s all wrapped up into one of those interfaces, like Cosign or the Python version or the Node.js version, depending on what you’re signing. Sigstore is a pretty interesting product because it’s an open-source project.

Dan Lorenc 00:22:39 But those two components, Fulcio and Rekor, are actually operated as shared infrastructure. The OpenSSF, the Open Source Security Foundation, which is part of the Linux Foundation, runs what’s called the Public Good instance. There’s a team of people that carry pagers, volunteers from different companies; a few folks from my company Chainguard run that rotation, and they keep the system running. There are databases involved, there are servers, things must be upgraded. But those are run for the benefit of everyone in open source who’s signing things and verifying things. If you’re a company running this stuff internally, you could publish things in there, but you might not want everyone to know how many deployments you did that week or something like that; some people view that as sensitive information. So, companies can run their own instances, and they’ll have a team running those, and they publish their internal signatures and keys and logs there too.

Priyanka Raghavan 00:23:26 If I were to integrate this with a CI/CD pipeline, how does it work? Are there tools for that too?

Dan Lorenc 00:23:32 Yeah. So, we talked about the email address as being the main form of identity. It’s not the only one; OpenID Connect supports a bunch of other ones. Another common one is called workload identity. If you’re doing this from a CI/CD system, it doesn’t have an email address; what’s the email address of your server? There are other ways to do identity at that level. If you’re on a cloud provider, they have a lot of this stuff baked in, so you can say only this CI/CD system running in Amazon in this zone on this instance is trusted, and you can get a different form of identity tied to that exact virtual machine or that exact Kubernetes pod, wherever these things are running. It’s still just a string; it kind of looks like an email address, but it’s not an email address. And so those systems can get certificates tied to the workload, in whatever way you want to define workload. All the cloud providers support different ways of doing that. GitHub Actions has a big one: you can say this GitHub Action running from this repository on this branch is allowed to publish into this registry or into this cluster. It’s just a different way of formatting that piece of identity.
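A workload identity arrives as signed claims in an OIDC token, and the policy Dan describes (“this repository, on this branch, may publish”) is a check over those claims. A sketch with a self-constructed, unsigned token; the claim names are modeled loosely on GitHub Actions OIDC tokens and the repository name is invented. Real verification must also validate the issuer’s signature, which is skipped here.

```python
import base64
import json

def decode_claims(token: str) -> dict:
    """Decode the payload segment of a JWT-shaped OIDC token.
    NOTE: no signature check; a real verifier must validate the
    token against the issuer's keys before trusting any claim."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

def allowed_to_publish(claims: dict) -> bool:
    # Hypothetical policy: only the main branch of one repo may publish.
    return (claims.get("repository") == "example-org/example-app"
            and claims.get("ref") == "refs/heads/main")

# Build a fake token purely for illustration.
claims = {"repository": "example-org/example-app", "ref": "refs/heads/main"}
body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
token = f"header.{body}.signature"
print(allowed_to_publish(decode_claims(token)))  # True
```

The identity really is “just a string” as Dan says: swap the `ref` claim to a feature branch and the same policy rejects it.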

Priyanka Raghavan 00:24:36 While you were answering this, my mind also went to AI agents, autonomous agents. It would also be the same type of workload identity, right? Or…

Dan Lorenc 00:24:45 Yeah, there’s been a lot of work in that space, both around signing models themselves, making sure that the model you get, say, from Hugging Face is the real model (and that’s similar to a container, right? It’s just a bunch of bytes at the end of the day), as well as saying this AI agent running here is allowed to write commits to this repository; that’s a similar identity system.

Priyanka Raghavan 00:25:06 Probably it also has a specific identity tied to it, not an email address but a workload identity. Okay. Exactly. Let’s talk a little bit about the real-world impact. I wanted to ask you to give me some examples of projects using Sigstore.

Dan Lorenc 00:25:22 The Kubernetes project signs all of their release artifacts with Sigstore. Python itself: if you go download a build of Python, that’s signed with Sigstore. And then it’s also baked into PyPI. Right now it’s opt-in, you’re not required to use it, but there’s a whole flow called trusted publishing inside of PyPI and npm where you can turn it on and say only this GitHub Action is allowed to publish these wheel files to PyPI or these npm packages to npm. It’s kind of starting to build up that lock icon at the top of your browser, being used all over open source. And Maven Central supports it for publishing Java artifacts now too. So, it’s been fast growth over the last couple of years.
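For the PyPI side of this, a trusted-publishing workflow can be sketched roughly as follows. This is a hedged example, not an official recipe: the project and trigger are placeholders, and it assumes a trusted publisher has already been configured on the PyPI project so that the job’s OIDC identity, rather than a long-lived API token, authorizes the upload.

```yaml
# Hypothetical trusted-publishing workflow: no stored PyPI credentials.
# PyPI accepts the upload because the job's OIDC identity matches a
# publisher configured on the project.
name: publish
on:
  release:
    types: [published]
permissions:
  id-token: write   # required for trusted publishing
jobs:
  pypi:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: python -m pip install build && python -m build
      - uses: pypa/gh-action-pypi-publish@release/v1
```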

Priyanka Raghavan 00:26:04 And when you talk about npm, I was just thinking, because I use it a lot in my day job right now. I usually do this npm install and then get the packages, right? So, can I also give some instructions to say that I should only install packages that are signed with Sigstore?

Dan Lorenc 00:26:19 I forget exactly how they baked verification into the client, but that would be the flow, something like that: only allow signed packages, check signatures against the policy that my organization has defined around who’s allowed to publish and which types of packages we trust.

Priyanka Raghavan 00:26:33 Okay, so you can actually set up something in your config and then do a…

Dan Lorenc 00:26:37 I don’t use npm too much, but pretty sure…

Priyanka Raghavan 00:26:40 We’ll probably look it up and add it to the show notes. I also wanted to ask you: when you’re using Sigstore, how much of a maintenance impact is it on your project? Because it’s one more thing to do, right? Is it really easy? I mean, hearing what you’re saying, it sounds easy, but what are the overheads?

Dan Lorenc 00:26:58 Yeah. Signing stuff with it is really easy. It’s just running an extra command inside of every build. The hard part is the second half: making sure people are verifying them. That’s the second piece.

Priyanka Raghavan 00:27:08 And then when you sign an artifact, can you prevent supply chain attacks?

Dan Lorenc 00:27:13 Yeah, if people are checking them. Signing does nothing by itself. But if the maintainers of the first set of packages that were hit with Shai-Hulud had been signing things, the signatures wouldn’t have validated after someone else bypassed that and uploaded directly. It’s you taking action as a maintainer to allow your consumers to protect themselves; that’s kind of the way I would explain it. They still have to take that step to protect themselves, but you’ve now made it possible for them.
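The point that signing only helps when consumers verify can be illustrated with a small simulation. This is not a real Sigstore client; the package name and identities are hypothetical, and the "registry" is just a dictionary standing in for a package index.

```python
# Illustrative sketch: a signature only protects consumers who pin and
# check the expected signer identity at install time.
TRUSTED = {"left-pad-ish": "maintainer@example.com"}  # consumer's pinned policy

# What the registry serves: (artifact bytes, identity the signature chains to).
registry = {
    "left-pad-ish": (b"original code", "maintainer@example.com"),
}

def install(name: str) -> bytes:
    """Install only if the artifact's signer matches the pinned identity."""
    artifact, signer = registry[name]
    if TRUSTED.get(name) != signer:
        raise RuntimeError(f"refusing {name}: signed by {signer!r}")
    return artifact

print(install("left-pad-ish"))  # b'original code'

# A worm that bypasses the maintainer and uploads directly cannot produce
# a signature tied to the maintainer's identity, so verification fails:
registry["left-pad-ish"] = (b"malware", "attacker@example.com")
try:
    install("left-pad-ish")
except RuntimeError as e:
    print(e)
```

The unsigned (or wrongly signed) upload is rejected only because the consumer pinned an identity up front; without that step, the compromised version would install silently.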

Priyanka Raghavan 00:27:40 So if you see some kind of wrong entry on a package that you are maintaining, then you kind of notify all the people who are using your package?

Dan Lorenc 00:27:47 Yeah, you can take it down. You can publish something on Hacker News. Say, “hey this has been compromised. Stop using it.”

Priyanka Raghavan 00:27:54 Okay. It seems like things like Shai-Hulud will be less of a problem if you use things like this now?

Dan Lorenc 00:28:00 If everything was using it, yeah.

Priyanka Raghavan 00:28:02 We talked briefly about signing models. I saw that even Google is basically signing all their ML models with Sigstore. Can you talk a little bit about what you’ve heard?

Dan Lorenc 00:28:16 Yeah, it’s similar to signing anything else, just a little bit different. The models are much larger, and there are different protocols and things for where the signatures get placed. But when you’re a company distributing stuff like that and people are rehosting it on lots of different websites, if one of those were to get compromised, that would look bad for you, because people thought it was the Google model or something like that. So, it’s a way of kind of securing all those links. There’s a working group, again in the OpenSSF, around how to sign models and how to validate them, and baking that into all the tooling as well to make it easy. It’s all based on those core Sigstore primitives.

Priyanka Raghavan 00:28:47 One of the things that you’ve been saying is it all depends very much on how you verify the signed packages. And I wanted to ask you in this context, because we’ve also had many attacks in the software supply chain where there’s typosquatting: you might just pick up something whose name is similar to one of those popular PyPI packages or whatever. What I wanted to ask you is: do these kinds of attacks come down with Sigstore, or not really?

Dan Lorenc 00:29:13 Yeah, and I kind of glossed over this and caught myself right after, but typosquatting is real. I mentioned before that there is a flow in Sigstore to say who signed this package, and it’ll print out the email addresses. That’s actually not secure, because if you just read those email addresses, there are all these Unicode attacks where it might be a different version of a similar character in the middle or something like that. So, a better way to do it is to type in the exact email address and say only allow it if that exact one signed it, because of things like typosquatting. I could go make an email address today that looks just like yours, but with a Unicode character in the middle: it looks the same but isn’t actually the same at the end of the day.

Dan Lorenc 00:29:49 And if you’re relying on just looking at the words, people make mistakes, and it’s really hard to notice a lot of these. So that same kind of thing happens in Sigstore too, and they’ve changed the default to make it hard to do that by accident. You have to type in the email address, and it’ll tell you whether or not that email address signed it, instead of just printing out the email address and letting you do visual inspection. This stuff is subtle, but typosquatting attacks happen all the time. And again, you want to know the identities of the people, not necessarily the names of the packages, at the end of the day.
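The homoglyph problem Dan describes is easy to demonstrate. The identities below are hypothetical; the second string swaps the Latin "a" for a Cyrillic small a (U+0430), which renders identically in most fonts but is a different string.

```python
# Why "print the signer's email and eyeball it" fails, and why pinning
# an exact identity string (as cosign's defaults now encourage) works.
real = "dan@example.com"        # hypothetical trusted signer identity
spoof = "d\u0430n@example.com"  # Cyrillic small a; looks the same on screen

print(real == spoof)            # False: exact comparison catches the spoof

# Policy check: allow only byte-for-byte matches against a pinned set,
# never a visual inspection of printed output.
ALLOWED_IDENTITIES = {"dan@example.com"}
print(real in ALLOWED_IDENTITIES)   # True
print(spoof in ALLOWED_IDENTITIES)  # False: the lookalike is rejected
```

Both strings would look identical when printed for inspection; only the exact-match policy distinguishes them, which is the same reason the tooling asks you to type the identity in rather than read it out.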

Priyanka Raghavan 00:30:17 Okay, okay.

Dan Lorenc 00:30:18 It doesn’t solve it by itself, but again it gives you ways to solve it.

Priyanka Raghavan 00:30:23 I wanted to ask you: if there’s one practical step toward maintaining supply chain integrity, what can a small team do?

Dan Lorenc 00:30:32 Inside of a company? I think the biggest thing is: if you’re doing this today, stop doing it. Don’t ever publish anything from people’s laptops, right? Move to a build system and require everything that goes to prod to come from that build system. You then have to secure that build system and make sure it’s not compromised, but it’s a much lower attack surface than everyone’s laptops, and it forces you to adopt a lot of this hygiene and these practices to make sure that things are flowing through one point before they get into production.

Priyanka Raghavan 00:30:59 So definitely not from your laptop and go to a build system and do it from there. And the other thing I wanted to ask you is, is there a sample first project or demo for beginners that they can try out?

Dan Lorenc 00:31:11 There’s a bunch of getting-started guides on Sigstore, but the easiest is just to download the Cosign tool. If you’re doing anything with containers, just download Cosign. It’s really easy to use: just do cosign sign and the name of the container, and it kind of guides you through the rest. There are good flows if you’re using GitHub Actions, there are flows if you’re using GitLab, and there are flows for tons of the basic CICD systems out there.

Priyanka Raghavan 00:31:31 Okay. So, I now have to ask you a little bit about what Chainguard does. Can you tell us a little bit about that?

Dan Lorenc 00:31:37 Chainguard is a software supply chain security company, and at the end of the day we’re trying to build a safe source for open-source software. Open source is such a massive and distributed problem, and even though these tools are out there, you can’t just expect tens of millions of developers to adopt them overnight. That’s kind of what we do here at Chainguard. We take all of the open source that our customers need and rebuild it from source directly. We do all this stuff using Sigstore, hardened build systems, and the SLSA standards along the way, and bake the signatures into the final artifacts. We have our containers product, which is a set of over 2,000 container images where every piece of software in there is built from source by us, in our build systems. Everything is signed from start to finish, and then people get access to this catalog of safe container images.

Dan Lorenc 00:32:22 And then we also automate the other half of the supply chain security problem. Today we’ve kind of focused on the malware and tampering piece around Sigstore, but there’s also the ingredient problem too. We talked about this: there are tons of instances, and Log4j is probably the most well-known one. Nothing malicious, just a vulnerability that makes its way in. Code has bugs, and some of those bugs cause security issues. We automate the patching lifecycle of all those vulnerabilities that come out. So, everything is up to date, all patches are applied, and you get stuff that hasn’t been tampered with and stuff that has all the vulnerabilities patched.

Priyanka Raghavan 00:32:53 Okay, so it’s done automatically for you. I mean you take the newer version of what’s available on Chainguard?

Dan Lorenc 00:32:59 We have tons and tons of automation to keep everything up to date, whatever known vulnerabilities are found.

Priyanka Raghavan 00:33:04 Okay, that’s really good, because otherwise I’m spending most of my time doing sudo apt-get on everything. So that’s kind of taken care of. I wanted to ask you what’s different between, say, the CIS benchmarks, which are done for virtual machine images, right? Is something similar happening here? I mean, I guess this is a container image, but maybe. Is that what you’re attempting?

Dan Lorenc 00:33:27 Yeah, we are also doing that. So, we give you these artifacts, and then there are tons of different ways to run these things. If you get a super secure container image, or a VM that has been secured in getting to you, but it runs as root and has all of these extra components in it that don’t need to be there, you had a super secure supply chain give you something that is now insecure when you put it into production. And that’s what a lot of these benchmarks are: the CIS benchmarks, and the US government has another program called STIG. It’s kind of a checklist of, if you’re going to run this particular database, here’s how to do it securely: don’t have basic password access, make sure you change the default passwords, that kind of thing. So, it’s a checklist for making sure you’ve configured the software to run in a secure manner at the end too. And we also do that.

Priyanka Raghavan 00:34:08 And I think I have to also ask you another question. Sometimes you have these very slim images which are produced, right? And sometimes they’re a bit too slim, and then you can’t really do anything with some of them. You can’t really run a lot of things, and then you have to install things. So how does that problem play out for your customers? Because I’m sure you are also building slim images, but everybody has different wants, right? So what happens then?

Dan Lorenc 00:34:34 It’s a series of trade-offs. Depending on how sensitive of an environment you’re running in, or how regulated your industry happens to be, some people might need them to be fully hardened, fully locked down, no extra attack surface, that kind of thing. That’s our default for these images; we believe in secure by default instead of insecure by default and then making everyone go secure things. But it is a spectrum. Not everyone needs that, and if there’s no shell in your container image that’s running in production and something’s going wrong, it’s very hard to debug. It changes the equation on a lot of things. But depending on how secure you need to be or what your regulations say, you might need that in place. We have a whole spectrum for all this stuff we ship. That hardened version is our default, but we have full versions of all the container images that have the shell, the package manager, all that kind of stuff that you might be used to and familiar with. So, you can start there and work your way down.

Priyanka Raghavan 00:35:21 Okay, so different flavors based on what you’re developing. Okay.

Dan Lorenc 00:35:25 Different levels of hardening kind of.

Priyanka Raghavan 00:35:28 For the last segment, I want to talk a little bit about AI. We have so many coding assistants, and everybody’s producing so much code, and sometimes the coding assistants tell you to install this package. In certain programming languages, for example, if I’m not familiar with the language, I’ll just use the coding assistant’s help and then pick up a package. Can that also be vulnerable? Is Sigstore also trying to get in there, to check what the coding assistants are looking up? I don’t know, it’s just…

Dan Lorenc 00:35:59 There are a bunch of different problems here, I think, but overall I think we’re heading toward a good future, right? You know, coding assistants, all of these tools: right now, more people can write software, so there’s going to be a lot more software written. From the pessimist side, as I said, all software has bugs, and some of those bugs cause security issues. If there’s a ton more software out there, there are going to be a ton more security issues. But these same systems that produce code can also check for security issues. They can also fix security issues. Agents never get tired, right? They never need a break like people do. And we’re going to get to a world where there’s a lot more, a lot better software as a result here, but it’s going to require people to not just use the agents to write code and ship that to production tomorrow. We’re going to have to figure out what it means to be secure, how to automate that into the process, and then let the agents do that too. So, I think things are going to move faster. We’re going to get a lot more software built. It’s going to be cheaper; it’s going to be better for everyone. In the short term there’s a lot of slop published and all of that stuff, but I think over time it’s just going to improve things for everyone.

Priyanka Raghavan 00:36:56 Yeah. So probably we’ll have an agent for picking out the right images from Chainguard or an agent for making sure things are verified.

Dan Lorenc 00:37:03 Yeah, we have a lot of this going on internally. Okay.

Priyanka Raghavan 00:37:06 So, a different agent for each of these activities, I think. That makes sense; I think people are doing that. So apart from the coding part, you also have different agents to take care of the security angle and the sanity angle. And I think that ties in with what you spoke about, ensuring that the agents have a good machine identity, which is again part of what we discussed before. I think that’s all the questions I had for you. Before I let you go, I need to ask: where can people reach you on the internet if they wanted to connect?

Dan Lorenc 00:37:40 LinkedIn’s probably the best place. My name is Dan Lorenc, that’s L-O-R-E-N-C. Find me there; I post all the time.

Priyanka Raghavan 00:37:46 I’ll add that to the show notes. And thank you for coming on Software Engineering Radio. This has been a great conversation.

Dan Lorenc 00:37:53 Thanks for having me.

[End of Audio]
