In episode 20 of The Kubelist Podcast, Marc and Benjie are joined by Dan Lorenc of Google. They discuss supply chain security and the Sigstore project, a new standard for signing, verifying and protecting software.
Marc Campbell: Hi, again, and welcome to another episode of the Kubelist Podcast. Of course, Benjie's here with me for this episode and we have a fun conversation planned. Hey Benjie.
Benjie De Groot: Hey Marc. Thanks for having me back. It's always fun to co-host this with you. I'm super excited to dig into this week's.
Marc: Great. So today we're here with Dan Lorenc, a software engineer at Google, to talk about the Sigstore project that's part of the Linux Foundation. Welcome Dan.
Dan Lorenc: Thanks. Thanks for having me on here today.
Marc: Okay. So before we just start talking about the project, and security, and supply chains.
I'd love to learn a little bit more about your background, Dan.
What can you tell us about your role at Google and what led you to end up doing this on a day-to-day basis?
Dan: Sure. Yes, I've been at Google for a little over eight years now coming up on nine years.
And I've been in the developer tools and cloud space for almost the entire time since before Cloud was called Cloud at Google.
I started on the App Engine team, platform as a service. It's been around for a long time, since before platform as a service really became a buzzword too.
In that area, I worked with all of our tooling stuff and got on board with containers pretty early on, when Kubernetes started to take off both inside Google and outside.
I had a dev tools background and Kubernetes was missing an easy way to run it locally so I started the Minikube project pretty early on, and that was my first real foray into open source software.
It was my first time working on GitHub and working on upstream open source projects.
And it was a great experience, but that led me down this long journey of open source supply chain and supply chain security.
And I was used to working inside Google, where we had a crazy monorepo and custom build tooling. That's great and has awesome security aspects, but it doesn't really exist and it doesn't make sense outside of that internal environment that Google had set up.
And that's what I was used to: having really good provenance and records of everything we had deployed everywhere.
And then I was on GitHub pushing Minikube builds from my laptop, publishing on my GitHub, and people were just downloading these things and running it as root on their laptops all over the world.
It was terrifying.
I spent a couple of years just trying to do that as securely as I could, and just kept going down this rabbit hole of finding out how bad the state of open source and third-party supply chains is in general, and trying to improve that.
All of this led to some other open source projects like Tekton CD, which is another Linux Foundation project, in the Continuous Delivery Foundation, also aimed at modernizing supply chains and making secure supply chains a little bit easier.
And then this year we started the Sigstore project with a bunch of other great folks from Red Hat and a whole bunch of other organizations.
Marc: That's a great background.
And yeah, I think the early days of Kubernetes, the Minikube story, it'd be a great topic to dig into sometime.
I think we've talked to the K3s team and there's microk8s and there's this whole ecosystem of-- Run locally.
But yeah, I mean going back to your story, I can imagine you're used to this really low friction, high trust, supply chain process at Google and you--
All of the benefits that you get and then that, I don't know, lack of a better way to describe it, that dirty feeling of like, I just compiled some code and pushed it to a release and now people are running it.
Dan: It's a little scary. Yeah.
Marc: Yeah. So let's just jump right in here.
So Sigstore, the project that you started, it's a part of the Linux Foundation, and the website, by the way, the new website, it looks really great.
I think the website defines it as quote, a new standard for signing, verifying and protecting software.
I'd like to dig into, what does this mean? What's the scope of the project?
Dan: Yeah, that's a great question.
And there's a whole bunch of different parts of Sigstore too that make it a little bit harder to talk about in one sentence.
I think that was the best we could come up with in one sentence, but we can dive in quite a bit here.
The idea for Sigstore, the easiest way to describe it is probably by comparing it to other projects.
We studied the state of software supply chain metadata and integrity for a while, and talked to a ton of different open source projects, and ecosystems and communities to figure out why they were signing stuff, why they weren't, why they were struggling.
And it started to look a lot like what HTTPS and certificates in the browser looked like five or six years ago, before Let's Encrypt came into the picture.
In general, it's just too hard and too expensive to sign software in an open source context, so people just don't do it.
And that's what HTTPS and browsers used to be like: you had to pay some CA a bunch of money, you had to email them a fancy formatted certificate request, they would send you something back, you had to figure out how to copy the things around, you had to remember to keep doing it, you had to pay extra for fancier certificates, all that stuff.
Lots of people just didn't do it for the most part.
And then Let's Encrypt came around with their mission of making certificates free, automated, and easy.
And now all of a sudden, everybody does it in the browsers, right?
I don't remember the last time I went to a webpage that didn't have a certificate now.
So it's hard to remember, but five or six years ago, that was not the case at all.
So we copied that playbook. And a lot of the technology is the same too, or parallel to it. Instead of encrypting web traffic, we're signing software.
Sigstore has a certificate authority to issue certificates, just like Let's Encrypt, except instead of verifying websites, we verify people.
And so we tried to package all of this up in an easy way to make it so people can get these certificates with just one command, not have to manage keys, not have to worry about any of that stuff.
So that's the overall scope and idea for Sigstore.
It just leads down a whole bunch of other complicated technical paths and stuff we have to implement to make all of that possible.
Marc: That's great.
So drawing the parallels between Let's Encrypt making TLS creation management a lot easier that helps map it a little bit better.
One of the things that's important though, right? Is you're both publishing.
On the Let's Encrypt side, you're both creating the cert and then the browser has to be able to trust that cert and verify it.
So I assume that both parties are still involved here, right?
The ability to just sign an image isn't that useful unless somebody on the other side has tooling and is able to verify those images. Is that also part of what Sigstore is doing, or are you relying on other community tools for that?
Dan: Yeah, that's a great question.
The overall idea is yes, it's to make it easy to sign and verify containers, but that's just one piece of the overall problem.
The big problem is we've got to make it easy to find out what went into the software you're using, who wrote that, and what all the dependencies were. We can talk about SBOMs and some other techniques to do that a little bit later.
But yeah, it's not just signing. Signing is the easy part.
Verifying and finding the right keys to verify against is the hard part.
Let's Encrypt, and this is one area where we differ a little bit.
Let's Encrypt is trying to enter an entrenched area with a full Web PKI ecosystem, and process for getting trusted by browser and everything like that.
We have to be both sides of that coin here in Sigstore, because there really aren't any ways to verify signatures for certificates or identities in the open source world.
So it's a little bit easier, it's also a little bit harder in some ways.
We don't have a complicated process that we have to follow to get trusted.
We just have to make that up ourselves and get people to trust us.
Marc: So it's not a new problem to go solve, but you're trying to solve-- It's an unsolved problem.
There's no de facto major player that you're trying to say, "We're going to make this easier."
You're just saying, "Hey, this is a really complicated problem. There's various different pieces along the way, making it intimidating for me as an engineer or me as somebody who's going to ship some code, to ship something that's signed and verifiable, and has a secure supply chain."
Marc: Given that, there's actually this other let's call it prior art in this field.
There's tools like Notary for signing, or The Update Framework (TUF), or in-toto.
There's lots of other stuff, I'm not going to attempt to try to name all of them here.
I'm curious how the work you're doing at Sigstore either complements this, or builds on the work that already exists, or maybe attempts to replace some of it.
Dan: Sure. Yeah. That's a great question.
The other-- Before we get into TUF, Notary, and in-toto, there is some other prior art here.
If you look at the Microsoft ecosystem, if you want to ship a Windows driver, you have to go get a certificate signed by one of these code signing CAs, and they require the EV stuff for that: a background check on you, verifying that your business has the right address. You pay hundreds and hundreds of dollars for this, but otherwise you can't ship a driver that will run on Windows.
So there are a couple of other parallels like this, like Apple runs one of these for the Mac store.
They just-- Nobody's really aiming to do it in a general purpose way, especially for open source.
And so those are some pretty good parallels too.
On the open source side though, like you mentioned yeah, there have been some efforts here before too.
Notary was an effort built on top of The Update Framework and baked into Docker registries.
It was called Docker Content Trust when it was in Docker Hub, and a couple of other cloud providers implemented it.
This was a way to sign and verify the authenticity of container images using The Update Framework.
It unfortunately never saw a ton of adoption, there were a couple of problems with it, with the first implementation of it mostly around the registries themselves.
The container registries didn't support enough APIs and flexibility in the metadata they can store to do this in the registry protocol itself, which is managed by the OCI or the Open Containers Initiative.
So you had to stand up a Notary server. You actually had to run and deploy this whole extra server, an extra piece of infrastructure with a whole open source database.
It was really a pain to manage.
And so people just didn't do it since it didn't work across registries.
The Update Framework itself is a great tool set, and even more importantly, it's a great way to think about supply chain risk and key management and signatures and all that stuff together.
It started in academia. It's not really a piece of software you can just pick up and start using.
Although there are some client libraries and command line tools that are starting to get a little bit better, but it's not meant to be a drop-in signing tool or verification tool.
It's a way of thinking about the problem and a checklist of things you should keep in mind when designing an update system. And so, yeah, Notary was built on top of that and a whole bunch of the Sigstore components are built on top of that too.
I actually just wrote a blog post about some of the confusing nature of it, where we use The Update Framework to protect parts of Sigstore, but then we also allow people to use The Update Framework themselves on top of Sigstore.
I called it the TUF Sandwich, how The Update Framework is on top of and underneath everything that we're doing.
In-toto is another great project for metadata formats and policy formats that lets you attest to what happened in a supply chain: what happened in a build process, what inputs went into something, what outputs came out, what steps happened in the middle there.
It's got a bunch of envelope formats to describe that.
And then ways to sign those, link them together, do checks later to make sure what happened is what you thought should have happened.
And it also integrates with The Update Framework and stuff like that.
So with Sigstore, we also support storing a lot of that stuff in our transparency logs and querying it, so if people are producing these in-toto files, you can start to build up this awesome graph of what went into my software, and what went into that software.
Who built it, how, and all that stuff.
Benjie: Yeah. Backing up a second here, but to build on the dependency graph stuff that you were just alluding to.
Benjie: A layman's question here, what are we trying to solve with signing of packages?
Just going back up like 10,000, 40,000 feet here.
Ultimately I think I always, when trying to describe what I think of software supply chain stuff is, I'm like, it's dependency management.
I always got to break it down to that ultimately, but obviously there's some pretty important nuances there.
So just high level, I'm a noob, I'm walking into this, why do I want to sign my packages?
What's the motivation here? What is the supply chain? How does that all work?
Dan: Yeah, that's another great question.
I think signing itself is really, really simple, and I think it leads to a bunch of confusion on what it actually does, and why you might want to do it.
And in fact, there are a lot of ways to get a lot of these same supply chain guarantees without signing.
And depending on how you sign, you might not actually be getting anything.
So yeah, there are a lot of misnomers and stuff here, which would be awesome to talk through, and clarify a little bit.
You mentioned dependency management as one way to think about the overall problem.
That's a good way to talk about it. I think I have a slightly different take where I split it up into two different problems.
When I talk about overall supply chain risk, especially for open source.
When you look at it, there are two main problems here.
And one is that we don't know the dependencies we're using, right?
You might have an idea of it, but you can't tell, none of this metadata is verifiable.
One example here: if you go to PyPI, the Python package registry, and search for a package, it'll show you a little git repo link on the left-hand side, right?
That is not real, right? Anybody can just type in any git URL they want in there, and you have no way to prove that the Python package you downloaded actually did come from that git repo.
For the most part, maintainers do the right thing, but in security, we have to worry about the ones that don't.
And so we have the supply chain, but you can't actually write it down or verify it.
So even if you know what the top level things were that went into your code, you don't know what went into them, and you don't know where these things actually came from.
And so signing is one technique to actually allow people to start publishing that in a way that you can verify it later.
So step one is actually just knowing what went into your build, and what went into your dependencies, and what went into the dependencies of those dependencies, all the way down, because we're working across organizations, across companies, across open source projects.
And we can't really trust the communication channel where all these things get packaged and stored, so instead you sign metadata and put it there so people can verify it later.
That's one half of the problem, right? Just understanding the dependencies.
And then you can start to manage them, you can start to update them, and you can start to look for your vulnerabilities and that kind of thing, which leads to the second half of the problem, which is just that all open source software has bugs in it.
All software has bugs in it and open source software isn't magically secure just because it's public and you can look at it, right?
So even if you know all of the dependencies in your code, there might still be tons of known CVEs and unknown CVEs that you've got to constantly figure out how to manage, update, and patch.
But if you don't know the first half of the problem, if you don't even know what's in your code, you can't start to tackle the management problem.
And at the same time getting perfect signatures everywhere, doesn't mean your code is bug free so you have to tackle the two problems in parallel.
Benjie: So hearkening back to the older days of the internet, when I would go to download FileZilla, was that what it was?
What was the old FTP thing? One of these old FTP programs and it would say, "Hey, this is the hash of the executable you're downloading."
And I'm pretty sure that was mostly so that you could just validate there's no man-in-the-middle attack for the download itself, same type of thing for dependency management.
I am using this Python package from PyPI and this is just an automated way to say, "Hey, this is really what I think it is."
And so it builds that trust chain. So, that's just one side of the problem.
And the other side of the problem is, okay, well now I know what all this stuff is, but I have no idea what's broken in this stuff.
So that's a very descriptive way to put it: know what all this stuff is, and know what's broken in it. Right.
But this is a massive problem.
This is computer science 101 to me, in the sense that you've got to know what you're using, you've got to know your software bill of materials, if you will.
So yeah, this is a massively challenging project.
So the place you guys are starting is just the physical ability similar to Let's Encrypt to sign a particular package, because we haven't even defined that.
So that's what this whole-- The foundation of this whole project is.
And then we have all these other projects you're building on top of that.
Maybe there's a good opportunity to talk at a high level about some of those other projects that you have, and how you're leveraging the signing stuff.
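As a rough illustration of the published-hash check Benjie describes, here is a minimal Python sketch. The byte string standing in for a download and the function names are made up for the example; it only shows the comparison of a locally computed SHA-256 digest against a published one, not how any particular site or registry does it.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Compare the digest of the downloaded bytes with the published digest."""
    return hmac.compare_digest(sha256_hex(data), published_hex.strip().lower())


# Hypothetical example: these bytes stand in for a downloaded installer.
download = b"pretend this is an installer"
published = sha256_hex(download)  # what the download page would list

print(matches_published_digest(download, published))         # True
print(matches_published_digest(download + b"!", published))  # False
```

Note that this only helps if the published digest itself arrives over a channel the attacker cannot change, which is part of why the conversation moves on to signatures.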
Dan: Yeah. Perfect.
So yeah, just to wrap up the signing, I think it's a good transition into some of the other use cases here, like the SBOM and software bill of materials, like you mentioned.
A signature by itself doesn't really do much, right?
It tells you that, "I had a key and I used it to sign this artifact."
That's really all it means. It can't convey the artifact is good, it can't convey when I built it, it can't convey what went into it.
Just that I had that key, it might've been on a YubiKey, I might've just memorized it and I signed it.
Nobody else could have done that unless they stole my key.
And so that's the base step, right?
It means that I took some action to sign this thing and handed it to you, and you can check that I took some action, but that action is implicit.
The intent is all implicit. It doesn't tell you why I signed it or what I wanted to tell you about it, only that I made a Boolean statement: yes, I said yes to this package, kind of thing.
That might be a little meta and hard to understand.
So we'll make it a bit more concrete here, right?
That's great, because it's hard to do and it gives you a bunch of guarantees that somebody else didn't come in the middle and change the package or something like that.
But what we really want is the ability to start making more powerful statements about the packages that we're using.
If I built something from a specific git commit, right?
You might want to know what git commit it was built at so you can look at that and scan it for CVEs or something.
A signature by itself can't do that, but instead what we can do is just add another level of indirection.
That's one of the other comp sci 101 techniques.
So instead of signing the package, we can just write down a little description of the package and some metadata about it.
And so I can say, "Here's the package, this is the digest so it hasn't been changed. And this is the git commit I built it out of, right?"
That's a pretty easy way to start thinking about it.
And if you sign those two pieces of data together, now somebody can verify it and see, not only did Dan sign the package, Dan says that he built the package at this tag on this date, which is way more powerful than just a simple Boolean statement about the package.
And you can start to get creative here, and you can start to package up other information there and sign it.
And then all of these statements can be shipped together with the package, and let people start to do richer queries, richer lookups, and start to understand the bigger dependency graph of things that went into the things that they're using, which is really what we need to be able to do.
And so a lot of the other parts of Sigstore are designed around that, right?
We have a transparency log to publish these statements, and the statements are called attestations, if you've heard that term.
So you can start to do queries on the attestations, and look up other things, and automate that as part of build processes so people don't even have to think about it, that's the direction we're heading.
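The "extra level of indirection" Dan describes can be sketched in a few lines of Python: write down a small statement containing the artifact digest and the git commit, then sign the statement rather than the artifact. This is only an illustration. The field names, the commit value, and the date are made up, and HMAC with a shared demo key stands in for a real public-key signature, which the standard library does not provide; it is not the format Cosign or in-toto actually use.

```python
import hashlib
import hmac
import json


def make_statement(artifact: bytes, git_commit: str, built_on: str) -> bytes:
    """Serialize a small statement about an artifact as canonical JSON bytes."""
    statement = {
        "artifact_sha256": hashlib.sha256(artifact).hexdigest(),
        "git_commit": git_commit,
        "built_on": built_on,
    }
    return json.dumps(statement, sort_keys=True, separators=(",", ":")).encode()


def sign(key: bytes, payload: bytes) -> str:
    """Stand-in for a signature: HMAC-SHA256 (an assumption, not real signing)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key: bytes, payload: bytes, signature: str) -> bool:
    """Check that the signature matches the exact statement bytes."""
    return hmac.compare_digest(sign(key, payload), signature)


def statement_covers(payload: bytes, artifact: bytes) -> bool:
    """Check that the signed statement refers to this exact artifact."""
    recorded = json.loads(payload)["artifact_sha256"]
    return recorded == hashlib.sha256(artifact).hexdigest()


key = b"demo-key-not-for-real-use"   # hypothetical key
artifact = b"binary contents"        # stands in for a package
payload = make_statement(artifact, "0123abcd", "2021-06-01")
signature = sign(key, payload)

print(verify(key, payload, signature))         # True
print(statement_covers(payload, artifact))     # True
tampered = payload.replace(b"0123abcd", b"deadbeef")
print(verify(key, tampered, signature))        # False
```

Because the digest and the commit are signed together, changing either one invalidates the signature, which is what makes the statement more useful than a bare yes-or-no signature on the package.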
Marc: So the Sigstore project is-- I guess I like to think of it a little bit like an umbrella project, right?
There's like-- It's actually not like a CLI that I run, it's made of different projects like Cosign for signing, Rekor for transparency logs.
Is that the full scope of the project or are there other sub projects in there?
Dan: Yeah. So there's one more, that's subtle.
Another important thing to say about Sigstore compared to other open source projects is, yeah, it's these open source projects, and some of them you can download and build yourself. All of them, actually, you can do that with, because it's open source.
But we're actually operating some of these as services, just like, back to the Let's Encrypt model, where there's a transparency log you can build and run internally if you want to.
But we also have a public instance of this, that we're operating as like a public benefit where anybody can stick their data into the transparency logs.
So that's Rekor. The Cosign tool is something you download to sign containers, other stuff that gets stored in OCI registries.
And it can-- If you flip a flag, it can automatically push all that stuff into the transparency log, and if you're verifying, you can verify all this stuff out of the transparency log too.
The other one we should talk about, but I guess we can go over the name.
The name is Fulcio, F-U-L-C-I-O.
This is the certificate authority that can issue certificates for you so you don't really have to manage keys.
If you download Cosign, you can either create a key, right?
You can use KMS if you have it, you can use TUF if you've got that set up, or you can just have Cosign sign a container without a key at all.
And that uses the free certificate authority to issue you a certificate, a little browser window will pop up to prove your email address, and then you get a certificate issued to that email address for a short period of time that you can sign stuff with.
So you don't have to worry about losing anything.
You don't have to worry about people stealing your keys.
It's all ephemeral, in memory, gets deleted, never touches your disk.
And these all interact, the website tries to explain some of this, but it always ends up looking like a complicated spaghetti diagram, where to do that certificate authority we need a transparency log to put stuff in so people can check later to make sure that we're doing things correctly, all that stuff.
So the certificate authority and the transparency log are intertwined a bit, but hopefully you don't even need to see them or know about them for the most part.
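One consequence of certificates that only last "a short period of time" is that a verifier needs some trustworthy record of when a signature was made, which is one reason a log is useful alongside the certificate authority. The Python sketch below shows only that time-window check. The class, the ten-minute lifetime, and the email address are assumptions for illustration, and the signing time is assumed to come from a trusted record such as a log entry; this is not Fulcio's actual certificate handling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ShortLivedCert:
    """Hypothetical stand-in for a certificate bound to an email address."""
    email: str
    not_before: datetime
    not_after: datetime


def issue_cert(email: str, now: datetime,
               lifetime: timedelta = timedelta(minutes=10)) -> ShortLivedCert:
    """Issue a certificate valid from now for a short, assumed lifetime."""
    return ShortLivedCert(email, now, now + lifetime)


def signed_while_valid(cert: ShortLivedCert, signed_at: datetime) -> bool:
    """True if the recorded signing time falls inside the validity window."""
    return cert.not_before <= signed_at <= cert.not_after


issued = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
cert = issue_cert("dev@example.com", issued)

print(signed_while_valid(cert, issued + timedelta(minutes=5)))  # True
print(signed_while_valid(cert, issued + timedelta(hours=1)))    # False
```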
Benjie: Dan, two quick follow ups on that. One, is when you say, "We are providing the central authority stuff." Who's we in this particular instance?
Dan: Yeah. So it's the Sigstore community, basically, operating under the Linux Foundation, that is operating all of this infrastructure.
Google, where I work is contributing some of the funding for it. It's really, really cheap to run at this point so it's not a huge dollar amount or anything like that.
And then the Sigstore maintainers are operating that infrastructure as a community effort.
Benjie: Super cool.
And then my next question is, and this is a little bit of a layman's question here, can you explain to me with a little bit of detail what a transparency log is?
I think I can infer from the name, but just talk through the mechanics there a little bit and why that's important.
Dan: Sure. Yeah. We need a whiteboard for this.
It took me months and months to really understand all the details of transparency logs.
But they actually have a-- There's a website now that came out and it's actually from the same people that helped us with the Sigstore website.
So if you want to read more, go to transparency.dev, there's awesome animations and everything to really explain these concepts.
A transparency log, at its simplest, is an append-only log that's running somewhere; essentially, people can append stuff into that log. And then there are some techniques you can use to prove that the log was append-only. So if one person is operating the log, they can't tamper with anything that was in there, people can only add new entries, and then anybody can iterate over the entire log. And anybody can prove that the log has not been tampered with, and that entries have only been added to it. As a primitive, that's what it does.
And once you can make those guarantees, you can start to build up some cool systems on top of it.
And these are used-- The first time they started to be used in practice widely was certificate transparency coming back to the Let's Encrypt model a little bit.
So every time you get a certificate for a website, Let's Encrypt or anyone else, these CAs, the certificate authorities that issue them have to write those certificates to a transparency log.
That's part of the requirements for being a certificate authority now.
And your browsers actually automatically verify that every certificate they're shown is in that log.
And that means a certificate authority can't misbehave now.
If they misbehave, they can get caught, because every certificate they issue is in this log.
And if you imagine a sketchy CA sitting somewhere, they could issue a certificate for google.com or microsoft.com or something like that.
Nothing is stopping them, other than the fact that they have to put that in this log on the public record and Google or Microsoft is watching, they're going to say, "Hey, wait a minute. That doesn't look like a certificate that we issued."
And they can catch and remediate and fix that after the fact.
So it's a way to put all behavior on the record. It's not a way to say if something is good or should be trusted, it's just a way to give the world a global view on what has happened, and then you can build up other systems on top of it.
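The append-only property Dan describes can be illustrated with a simple hash chain in Python: each new head hashes the previous head together with the new entry, so anyone who remembers an old head can later check that history was only extended, not rewritten. This is a deliberately simplified sketch with made-up entries; real transparency logs use Merkle trees so that these checks do not require replaying the whole log.

```python
import hashlib
from typing import List


def next_head(prev_head: str, entry: bytes) -> str:
    """Hash the previous head together with the new entry."""
    return hashlib.sha256(prev_head.encode() + entry).hexdigest()


class AppendOnlyLog:
    """Toy log: entries can only be appended, and each append yields a new head."""

    def __init__(self) -> None:
        self.entries: List[bytes] = []
        self.head = ""

    def append(self, entry: bytes) -> str:
        self.head = next_head(self.head, entry)
        self.entries.append(entry)
        return self.head


def is_consistent(entries: List[bytes], old_size: int, old_head: str) -> bool:
    """Check that the first old_size entries still produce the head seen earlier."""
    if len(entries) < old_size:
        return False
    head = ""
    for entry in entries[:old_size]:
        head = next_head(head, entry)
    return head == old_head


log = AppendOnlyLog()
log.append(b"cert for example.com")
seen_head = log.append(b"cert for example.org")  # a monitor remembers this head
log.append(b"cert for example.net")

print(is_consistent(log.entries, 2, seen_head))  # True

# An operator who rewrites an old entry no longer matches the remembered head.
rewritten = [b"cert for evil.example"] + log.entries[1:]
print(is_consistent(rewritten, 2, seen_head))    # False
```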
Benjie: Very interesting. Forgive me, but you mentioned that it's centralized.
I think this begs the obvious question of isn't this a little blockchainy?
Should this be a blockchain thing?
And is that where you're trying to go ultimately, at least from a distributed perspective, not being centralized, but a distributed version of this?
Dan: Yeah. So they're very similar to blockchain. This comes up all the time.
The architecture is roughly the same, where it's a Merkle tree, which you can look up, but it's just a hash of a hash of a hash all the way down.
The main differences are it's centralized, which is good and bad, right?
It's centralized so it's really easy to operate.
One person can just stand up a server, you have a URL, you can start working with it.
And it's not as scary as it sounds to be centralized, right?
Because these are transparent.
The only thing you really have to trust is that the person will keep it running, because you can verify their behavior.
You don't actually have to trust that they'll act correctly.
You just have to trust that they can keep this thing running.
Whereas a blockchain is distributed, and all the computation is happening everywhere, all the time. But otherwise they're pretty similar, except I think of it as a blockchain where there's just one element, and everybody is writing into that same element.
In Web PKI, where certificate transparency is a thing, it's not quite as centralized, but it still hasn't taken the full jump to be completely distributed.
There are dozens of companies that operate independent transparency logs, and then there are requirements that they all gossip between each other, and all the certificates eventually make it across all of them.
So it's centralized, but there are dozens of independent, centralized copies of this.
I'm not a blockchain expert, but I think these let you get the benefits of a blockchain without having to do the complicated proof of work style stuff that allows a blockchain to operate without a single party just coming in and hijacking the entire thing.
You could build a lot of this on an established blockchain, if you're okay with the carbon footprint and all of that stuff that happens to make it so the blockchain can't just be taken over and hijacked, or you can just trust that somebody will keep one of these logs up and running.
You can have multiple people running their own logs too, auditing each other at the same time.
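The "hash of a hash of a hash" structure Dan mentions is a Merkle tree. A minimal Python sketch of computing a root and checking that one entry is included under it is shown below. The entries are made up, and the tree shape (an odd node is promoted unchanged to the next level) is a simplification chosen for brevity; it is not claimed to match the exact algorithm used by Rekor or by certificate transparency logs. The sketch also assumes at least one entry.

```python
import hashlib
from typing import List, Tuple


def leaf_hash(data: bytes) -> bytes:
    """Hash a log entry; the prefix keeps leaves distinct from inner nodes."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash two child hashes into their parent."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(entries: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, from leaf hashes up to the root."""
    level = [leaf_hash(e) for e in entries]
    levels = [level]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(node_hash(level[i], level[i + 1]))
            else:
                parents.append(level[i])  # odd node is promoted unchanged
        level = parents
        levels.append(level)
    return levels


def inclusion_proof(levels: List[List[bytes]], index: int) -> List[Tuple[str, bytes]]:
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            side = "left" if sibling < index else "right"
            proof.append((side, level[sibling]))
        index //= 2
    return proof


def verify_inclusion(entry: bytes, proof: List[Tuple[str, bytes]], root: bytes) -> bool:
    """Recompute the root from the entry and its proof, then compare."""
    current = leaf_hash(entry)
    for side, sibling in proof:
        current = node_hash(sibling, current) if side == "left" else node_hash(current, sibling)
    return current == root


entries = [b"entry-0", b"entry-1", b"entry-2"]
levels = build_levels(entries)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)

print(verify_inclusion(b"entry-2", proof, root))  # True
print(verify_inclusion(b"forged", proof, root))   # False
```

The point of the tree over a plain chain is that the proof is a handful of sibling hashes rather than the whole log, so a client can check one entry against a published root cheaply.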
Marc: I assume there's a transparency log and you're following all the best practices when building that hosted service that you have there, so we can actually see into that and have confidence in it too.
Dan: Yeah, exactly.
Every time you run one of these commands, you actually automatically start doing some of these checks to prove that the log is tamper-free and hasn't been mutated before.
There's actually another implementation of transparency logs going on now that most people, especially in the cloud native world, use every day without even thinking about it, though.
And that's the Go module sumdb. You see those go.sum files everywhere.
If you see those and you're using Go modules as part of your project, then you're also using the Go module transparency log, which is a system the Go team built to help protect the supply chain of Go modules.
The very first time anybody installs your package at version FOO, an entry gets created in their transparency log, storing the digest of that package and the hash of the source that went into it at that tag.
So you can be assured that everybody that accesses version FOO of your package gets the exact same contents in that package.
And anybody running Go commands is verifying that log and making sure it's consistent and hasn't been tampered with, without even really knowing it.
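The "first time anybody installs it, an entry gets created" behaviour can be sketched as a first-seen-wins record keyed by module and version. The Python below is only an illustration of that idea with a made-up module path and contents; the real Go checksum database uses its own hash format and is backed by a verifiable log rather than an in-memory dictionary.

```python
import hashlib
from typing import Dict, Tuple


class ChecksumLog:
    """Toy record: the first digest seen for a module version is the one that counts."""

    def __init__(self) -> None:
        self._seen: Dict[Tuple[str, str], str] = {}

    def check(self, module: str, version: str, contents: bytes) -> bool:
        """Record the digest on first sight; afterwards require an exact match."""
        digest = hashlib.sha256(contents).hexdigest()
        recorded = self._seen.setdefault((module, version), digest)
        return recorded == digest


log = ChecksumLog()
original = b"package main // v1.2.3 source"

print(log.check("example.com/foo", "v1.2.3", original))            # True (first seen)
print(log.check("example.com/foo", "v1.2.3", original))            # True (same bytes)
print(log.check("example.com/foo", "v1.2.3", original + b"evil"))  # False (changed)
```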
Marc: That's awesome. That's a good way to think about it.
I think a lot of us are like Go developers and we have that go.sum file, we've seen it.
We get merge conflicts on it. We have to clean that up. That's cool.
Shifting to the next part here, a couple of times we've been mentioning software bill of materials, or SBOMs. Is that in scope of Sigstore or out of scope? How do you think about that?
Dan: Yeah. So SBOMs are a hot topic now.
They're described as a way to describe the materials that go into a piece of software. It makes a ton of sense.
If somebody hands you a binary, you want to know what's inside it so you can hook it up to your CVE scanner or your notification system and figure out when you need to rebuild it or get a new copy of it.
And that's how I see the value in SBOMs. There are some challenges with it, right?
You have to trust that the person generated it correctly.
There's no real way to prove that the SBOM is correct, the contents inside of it, because they're giving you information you couldn't have otherwise obtained.
So you just have to trust them.
But if you're taking a closed source binary from somebody and running it anyway, you trust them implicitly, at least a little bit, otherwise you shouldn't run their binary.
But that's the future of SBOMs.
The US government is working on some regulations to start describing how they should be produced, how they should be distributed, that kind of thing.
So the actual generation of them is probably out of scope for Sigstore, right?
There's dozens of projects that are doing a great job at this today.
You can scan containers and generate SBOMs, or you can do it as part of the build process.
And there's a couple of different widely understood industry formats to ship the SBOMs in.
But what I think is in scope for Sigstore, and what we're trying to do is make it easy to distribute them, to sign them so they don't get tampered with after they are generated, and then to find them for code that you're running.
If you just grab a container, there's no easy way to go look up the SBOM for that container, if there is one.
And so that's the type of thing that we're trying to do with Sigstore, so people don't have to email you SBOMs, and you don't have to go through a whole bunch of other systems to find them if you do want them for the stuff you're consuming.
Marc: It turns out the distribution of those is tricky.
And if that's not through a trusted source, the whole-- All bets are off then.
Marc: And actually I love what you're doing there.
I've recently added SBOM generation as part of a CI.
Hopefully we're doing it accurately. It's an open source project. It's a CNCF project.
We added it in and we used Cosign to do it and publish it to an OCI registry.
And I'd love to give you an opportunity to talk about how you're building on top of the registry, and you're using some really cool methods to distribute that, that don't involve net new things that I need to run.
Dan: Cool is one word for it. Hacky is probably another one, but it works.
But yeah, so it's awesome to hear you're doing it as part of the build process too.
I think a lot of the people generating SBOMs today are doing them post-facto, scanning a container or doing binary analysis, which is good, but if you can scan your own container, then I could've done that too.
So the SBOM isn't really adding a ton of value. It's just saving me a tiny bit of work.
So doing it as part of the build process, doing it upfront, is a way to get more information into them to make them more useful, which is awesome to see.
The way this works in Cosign, the way actually all of Cosign works with OCI registries is a giant hack, but I love it.
And it just works everywhere and we didn't need any new capabilities.
OCI registries are really simple at storing things, right?
You can upload something there and you can get it back if you want to, but they don't really have a way to reference other objects, right?
If you upload a container image and then you generate an SBOM for it later, and you want to attach that SBOM to that container, that's not something you can do as part of the API today. It's pretty subtle.
The registries are all content addressable and everything is hashed and hashed different ways.
So you can't really attach it to something without changing the digest of the thing you want to attach it to.
So it causes a whole bunch of headaches.
So we solved it solely by following a little naming convention in Cosign, so when you want to upload something and attach it to an image, we take the digest of that image and just turn that into a name.
So it's just a random string. We take that digest, not a digest anymore, we convert it to a string and then we give it a little name.
It's like a dot suffix. So dot signature for signatures, dot SBOM for SBOMs, and we re-upload it.
So if you want to find all the SBOMs for an image later, you find the digest for that image.
You can calculate the name for where the SBOMs should be, and just go download that other separate object.
So the registry isn't aware that they're linked in any way, and that's the fun little hack that we had to do to get it to work across dozens of OCI registries in the wild today.
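The naming convention Dan describes can be sketched in a few lines of Python. The function name `attachment_tag` is hypothetical, and the exact suffix strings (`sig`, `sbom`) are an assumption based on Cosign's published convention rather than something stated in this conversation; the digest is made up.

```python
def attachment_tag(image_digest: str, suffix: str) -> str:
    """Turn an image digest into a tag name for a related object.

    The ':' in the digest is replaced because tag names cannot contain it.
    """
    algorithm, _, hex_value = image_digest.partition(":")
    if not algorithm or not hex_value:
        raise ValueError(f"expected '<algorithm>:<hex>', got {image_digest!r}")
    return f"{algorithm}-{hex_value}.{suffix}"


digest = "sha256:" + "ab" * 32  # made-up digest for illustration
sig_tag = attachment_tag(digest, "sig")
sbom_tag = attachment_tag(digest, "sbom")
print(sig_tag)
print(sbom_tag)
```

Because the tag is derived purely from the image digest, anyone who knows the digest can compute where the signature or SBOM should live and fetch it as a separate object, with no change needed on the registry side.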
Benjie: I might call that elegant, just for the record.
I just want to put that out there. I find it to be elegant, but we all know.
Dan: So hacky, I loved it. It crossed the chasm.
Benjie: Yeah. It's like when things are so hot that they're cold.
It's like, I feel like that's a fair statement there.
Marc: And, we've recently had Josh, one of the maintainers of the OCI spec on, and talking to him about that. And I think like-
Marc: One of the many takeaways from a really, really great conversation, but one of the many takeaways was these aren't hacks, this is the future of like how we want to do storage.
And let's do these. Obviously we need some standards around it.
Marc: It leads to like-- Maybe in the future, if you have the standardized naming convention and OCI registries exist, and there's these like it's a SHA so it's cryptographically somewhat linked to the actual source, the registries themselves, the creators, the developers can introduce tightly coupled tools that can verify it.
So instead of-- You see that today with CVE scanning, the registry will show, oh, this one has these many CVEs in it, but at some point it could just link to the SBOM, because it knows how to find the SBOM in its own registry.
Dan: Yeah. That's awesome.
The CNCF Harbor registry I've been following they have an RFC out now to start showing some of this stuff in the UI, because we already have published the spec for how we're doing this naming.
And so we'll line up little links that show you signatures and SBOMs for all of that stuff which is just going to be awesome to see when that rolls out.
There's also a whole bunch of efforts in the OCI itself to figure out ways to improve these APIs so that you can do these lookups in better, slightly less hacky ways that interact with garbage collection, and copying things around a little bit nicer.
They move slowly for good reasons, dozens and dozens of companies are going to have to implement any changes that actually get made here.
So it's good to be careful, it's good to be slow, but it's cool that we can innovate and build some of this stuff anytime without having to wait.
Marc: Yeah. And if it changes, instead of a different name or different tag, it changes to be like, oh, some content type or a different header, that's a relatively easy change, but you've already said, "Hey, we're going to push this thing after the build into the registry."
Dan: Exactly. And the UX for Cosign and everything won't change at all. The commands will still be the same. Our API will still be the same.
We would just switch over how we talk to the registry and it will just work faster and better for everybody without them even noticing.
Marc: So shifting gears for a second, Sigstore recently hit 1.0.
Dan: The Cosign Part of Sigstore.
Marc: Cosign hit 1.0.
Dan: Yeah. So Cosign itself hit 1.0, up next we're going to do the transparency log and then the certificate authority probably in that order.
Marc: Great. So what does that mean?
Cosign hit 1.0, does that mean I should start using it in production?
Is it GA, or did you meet a certain set of features? What did it take to get there?
Dan: Yeah, 1.0 for an open source project is always hard to define.
But yeah, it means we're comfortable with people using it in production.
We're not going to break the APIs, we're not going to break the CLI.
We've set it up so there are some parts that are still experimental and you've got to opt into those, and those are the ones we're still tweaking and tuning.
But it's clearly delineated so if you want to start building stuff to sign and verify containers on top of Cosign, then go for it.
We spent a while using it ourselves for a bunch of important container images that we release, including some that go into the core Kubernetes distribution, like Distroless, if you've heard of that image, that's the base that's used by upstream Kubernetes.
So we've been signing that, and having Kubernetes verify it before they do their builds for months and months now to test this out and make sure it's all solid.
So yeah, go for it.
The challenge with this always is that a lot of companies, big organizations don't want to look at something seriously until it's 1.0.
And we really want feedback to make sure everything is great before we want to call it 1.0, because it gets harder to change stuff around later.
So it's like a chicken and egg problem there where you've got to just encourage people to try it, keep encouraging people to try it, get your own feedback, dogfood however you can to get some level of confidence.
And then you just call it 1.0 and wait for the feedback to come in later.
Marc: You said, you're going to work on making the transparency log 1.0 next?
Dan: Yeah. So that's the operational service, basically.
There's an API for adding stuff, tailing things out of the log, that kind of thing. It's pretty stable.
We have a giant warning on there though that we might have to delete the data at any time, and that gets into some tricky legal issues we've got to work through a little bit, where anytime you put data on the internet forever, that starts to get pretty scary, and you let anybody write data into that without being able to delete it.
So just some stuff to think through there, and figure out playbooks and plans for what happens if the log gets screwed up, how we recover from it, that kind of thing.
And if you follow the Web PKI news much, there was a transparency log, actually maybe a month or two ago now, where a cosmic ray flipped a bit on an entry and invalidated all the stuff after that.
So they had to figure out how to recover from it and these things do happen. So that's next on our list.
Marc: Wow. That's one of those things that you think about and you're like, theoretically, that could happen, let's protect against that.
But then when you actually read that it happened, it's like, wow, okay, we need the new theoretical. What's the new fear?
Dan: Pretty cool. It's one of those things where it's this complicated data structure and it was just completely garbled, and people had no idea what happened.
And then if you actually look at the bytes, it was really just one bit that flipped that caused the whole thing to be screwed up.
There was one zero that should have been a one or something like that.
And then if you flip that, it was all completely good and valid.
So nothing nefarious happened, some machine bug or a cosmic ray or something caused that one flip and stuff happens at scale.
Marc: Yeah. And I think it's worth talking about that technical challenge, right?
You're trying to make this immutable provable audit log basically like the transparency log.
And it's like, this thing happened and now this thing happened.
And so they're all tied to the things that happened before, that's how a Merkle tree works.
But then if somebody realizes, oh, last week we accidentally pushed some really sensitive information and we need to delete that.
You can't delete that because it invalidates everything below that.
And so these are the types of problems that you're trying to solve to get it to a 1.0, so that you can provide some stability guarantees in a known way, how you're going to resolve that.
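To make the Merkle tree point concrete, here is a small Python sketch that computes a root hash in the style of RFC 6962 (leaf and interior nodes get different prefixes) and shows that both a single flipped bit and a deleted entry change the root. It is an illustration only, not the transparency log's actual code, and the entry contents are made up.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries):
    """Root hash; the list is split at the largest power of two below its length."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


entries = [b"entry-0", b"entry-1", b"entry-2", b"entry-3"]
good_root = merkle_root(entries)

# Flip the lowest bit of the first byte of one entry.
flipped = list(entries)
flipped[1] = bytes([flipped[1][0] ^ 0x01]) + flipped[1][1:]
flipped_root = merkle_root(flipped)

# Flip the same bit back: the root is valid again.
restored = list(flipped)
restored[1] = bytes([restored[1][0] ^ 0x01]) + restored[1][1:]
restored_root = merkle_root(restored)

# Delete an entry: the root no longer matches either.
deleted_root = merkle_root(entries[:1] + entries[2:])

print(good_root != flipped_root, restored_root == good_root, good_root != deleted_root)
```

Every root published after an entry depends on that entry's bytes, which is why one corrupted bit garbles the structure and why removing sensitive data after the fact is not a simple delete.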
Dan: Exactly. Yeah. So we're really careful about all the data that goes in, right?
We don't store data itself.
We just store the hashes of the data and the public keys and all of that stuff, but all code has bugs so yeah, we've got to keep going through making sure that we're not letting any crazy illegal stuff get into the log.
And yeah, we can go through these playbooks around what happens if something gets screwed up and how we recover from it.
Marc: So there's a lot here. This is a really complex topic.
And I think we're going to dive into it at a high level, some of why everybody's paying attention to this now, but while we're still thinking about the implementation details in the project, and how I would adopt it, there's two different sides here.
There's a publisher who's publishing software and you have dependencies, and you're thinking about how to share this information, how to get guarantees that you're publishing what you think you're publishing.
And then the other side is on the consumer, you want to take either a binary, a container image, open-source software and consume that.
And you've talked a lot about the various aspects of that, but I want to think like, okay, I'm an org and I'm new to this.
There's a lot of information here. Do you have any recommendations for where to start?
Should I start with signing? Should I start with transparency logs? How do I get my feet wet here?
Dan: As a publisher, you can definitely start by signing your stuff in some way or giving keys out to your consumers so they can verify everything back.
The one piece, if you can only include one piece of metadata on top of the signature, and you can do this with one flag in Cosign, you don't need to do anything crazy with in-toto or TUF yet.
The one thing that you include is the commit or something like that.
So build a container from a GitHub repo, add the commit SHA in there, sign that whole thing and push it, and then you're good.
There's tons of nuance around key management and everything, but the simplest possible way to do it for an open source project is to just check the public key you're using right into the GitHub repo, next to your Makefile, or next to your Dockerfile or something like that. And then people can... From that container, if they can follow it back to the GitHub repo, they can see the commit it was built at and they can verify that public key right there. It's not perfect. The Update Framework and a bunch of other things are better, but they're also a lot more complicated.
And this is so much better than just a URL or people having to guess.
If you have to rotate the key, then you just check in a new one, and then the new thing gets built from that new commit.
And it's automatically tied there too. So it solves a lot of problems, very cheaply for a small open source project.
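A rough Python sketch of what "add the commit SHA in there and sign that whole thing" might look like at the payload level. The field layout is modeled on the simple signing payload format as an assumption, the `commit` key and both function names are hypothetical, and the signing step itself is deliberately left out; in practice the tool produces and signs this for you.

```python
import json


def build_payload(image_ref, manifest_digest, commit_sha):
    """Build the bytes that would be signed: image digest plus a commit annotation."""
    payload = {
        "critical": {
            "identity": {"docker-reference": image_ref},
            "image": {"docker-manifest-digest": manifest_digest},
            "type": "cosign container image signature",
        },
        "optional": {"commit": commit_sha},  # hypothetical annotation key
    }
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def payload_matches(payload_bytes, manifest_digest, expected_commit):
    """Consumer-side check, meaningful only after the signature has been verified."""
    payload = json.loads(payload_bytes)
    return (
        payload["critical"]["image"]["docker-manifest-digest"] == manifest_digest
        and payload["optional"].get("commit") == expected_commit
    )


digest = "sha256:" + "cd" * 32  # made-up manifest digest
commit = "0123456789abcdef0123456789abcdef01234567"  # made-up commit SHA
payload = build_payload("registry.example.com/app", digest, commit)

ok = payload_matches(payload, digest, commit)
wrong_commit = payload_matches(payload, digest, "f" * 40)
print(ok, wrong_commit)
```

Because the commit sits inside the signed bytes, a consumer who verifies the signature with the public key checked into the repo can follow the image back to the exact commit it claims to be built from.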
Marc: That's great. And then TLS, Let's Encrypt, going back to that earlier analogy, right?
There's new. And maybe I'm going to take this too far, but we'll see.
There are new top-level domains coming out that require TLS; you can't serve services on them that aren't TLS. .app, for example, I think is one of them.
Do you envision that future where Kubernetes is, let's generically call it more secure by default or something like this, where all of this validation, when I want to run something that's just baked right in and I can't get around it?
Dan: Yeah. I hope so.
I think you've got to come at it the same way where you just get enough people doing it to the point where it's not like, oh cool, this person signed their container.
It's like, oh, gross, this one hasn't signed their stuff yet.
So we've got to start getting the momentum shifting in that way.
And then you can start to, as a large project, as somebody in a central position to affect a lot of supply chains, you can start to think more carefully, and take bigger risks and implement bigger requirements in the software that you're willing to consume.
Kubernetes is a great example again, because they are taking this seriously.
They should, it's one of the most widely used products in the world.
They have new policies for what Go modules will start to allow, and they're actively trying to reduce the number of dependencies in the tree which is awesome.
If you don't actively do that, then they just grow over time forever and stuff gets deprecated.
And then you end up with old unmaintained vulnerable stuff in there that's impossible to get out.
So yeah, I think it's on everybody to start doing things as responsibly as they can as producers.
And then it's on huge projects to start setting standards and requirements for the stuff they're willing to use until we get to a point where we can actually start to require and mandate and do this all the right way across the board, like the .dev and the .app stuff you're talking about.
Marc: In the latest version of Kubernetes, they're actually now starting to generate and publish an SBOM, a software bill of materials, with the Kubernetes release, which is awesome.
Is that using some of the tooling that you've built?
Dan: Yeah, they built their own SBOM generator, which is awesome.
There wasn't a lot in the Go container space when they got started so they built their own.
It can scan containers, it can scan Go projects, it can do all of that and stitch it all together into this mega SBOM.
It's a huge amount of work, and it's great because it's not custom to Kubernetes.
The tool is just called BOM or something like that, I think.
And other projects have started to use it too.
They generate stuff in the same format so you can upload them with Cosign and sign them.
Marc: Yeah. I think having generated an SBOM before for an app that's running in Kubernetes, I'll tell you like, there's a lot of like, how do I get started doing this?
I know we're a little bit off topic, but it's supply chain, and it's like, I can-- As a Go project, I can generate this from the go.mod that's part of it.
But now I'm building a Docker image and I have to think about the base layers here.
And do I stitch this back together? Do I deliver this as one? Two different SBOMs?
And I think I'm personally looking at Kubernetes and the way that they're doing it and saying, "Okay, in lack of a standard way to do this, let's follow that. Let's follow their lead."
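As a rough illustration of the "generate this from the go.mod" half of the problem, here is a Python sketch that pulls module names and versions out of `require` directives. The function name and the sample go.mod are hypothetical, the output is a plain list rather than any standard SBOM format, and it says nothing about the base image layers, which would have to come from a separate scan.

```python
def parse_go_mod_requires(text):
    """Collect {'name', 'version'} dicts from require directives in a go.mod."""
    deps = []
    in_block = False
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop comments such as "// indirect"
        if not line:
            continue
        if in_block:
            if line == ")":
                in_block = False
                continue
            fields = line.split()
        elif line.startswith("require ("):
            in_block = True
            continue
        elif line.startswith("require "):
            fields = line.split()[1:]
        else:
            continue
        if len(fields) >= 2:
            deps.append({"name": fields[0], "version": fields[1]})
    return deps


SAMPLE_GO_MOD = """\
module example.com/app

go 1.17

require (
\tgithub.com/pkg/errors v0.9.1
\tgolang.org/x/text v0.3.7 // indirect
)

require github.com/spf13/cobra v1.2.1
"""

deps = parse_go_mod_requires(SAMPLE_GO_MOD)
for dep in deps:
    print(dep["name"], dep["version"])
```

A list like this covers only the Go dependencies; whether it is merged with the base image's components into one document or shipped as a second SBOM is exactly the open question Marc raises.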
Dan: Yeah. It's the early adopters at this point trying to do it, and hopefully figuring out the right ways and the wrong way, and what works and doesn't, and they can tell all of us as we try to do it for our projects.
It's awesome to see Kubernetes being so far ahead of smaller projects.
Normally you'd expect something that big to wait for the path to get a little bit more paved.
They're in a great position to start pushing this stuff forward.
Marc: That's the fun part about this whole ecosystem.
Benjie: Yeah. I think there's--
We have you on, so I have to ask, and this leads into that as to why it's gotten so much prominence.
The supply chain is starting to really be talked about, and that's obviously SolarWinds.
How has that affected how you think about this, and what have you seen in the ecosystem in general about that?
And how significant was the SolarWinds event I think is what I'm trying to say?
Dan: It was huge.
I think, SolarWinds itself wasn't as terrible as some of the other things we've seen, or as scary as some of the other things we've almost seen.
And I think it was just the straw that broke the camel's back here. So it led to a huge change.
I don't want to blame them or call them out more than they already have been, but yeah, I think that that event, not necessarily the incident itself, but that event has led to Washington and all these other large organizations taking this whole area much more seriously.
I've been worried about supply chain security for years now, but up until maybe halfway through last year, people looked at me like I was crazy, and this wasn't a big deal and they didn't care about it at all.
It started to shift, and then December, January of this year, all of a sudden everything changed, with Biden writing an executive order about it.
Everybody, every company in the world now is trying to figure out what their supply chain is, whether they know or not.
Marc: You've gone from wearing the proverbial tin foil hat to saying like, "Oh, I know this. I told you. Come on let's..."
Dan: Yeah. I don't know how to fix it yet, but I was worried about it a while ago.
Benjie: So just going back a little bit on that one, because I'd love to hear it from your words.
Can you just explain to us the SolarWinds event just quickly, and then also follow up to that question would the Sigstore tooling have prevented it if it was in place and ubiquitous?
Dan: Oh man. Okay. Yeah. So this is a great topic too.
I was actually just talking to someone on Slack over it.
One of the biggest problems with talking about actual supply chain attacks and events that happened is that there's always two halves to it, right?
As opposed to a lot of other breaches. There's the initial attack, right?
Or somebody gets into the supply chain, and then there's a pivot or a move downstream in the supply chain.
Instead of just attacking a company and ransomwaring their database, you attack a company that's a vendor of software, like SolarWinds was in this case.
And instead of just attacking them, you stick something into the software they ship, and now it trickles downstream to all of their customers.
And that's what happened here. It was a relatively--
I don't know the exact details, but it was relatively standard breach of their build server, which then let the attackers put a backdoor into the build server.
So all of the software that was built on that machine was now backdoored, and got distributed to a bunch of sensitive places, because they're monitoring software that was only in privileged environments, US government and other huge companies.
So when you talk about preventing SolarWinds or something like that, you have to clarify which half you're talking about preventing, right?
Preventing the build server compromise is actually pretty easy, right? We know how to secure systems.
So that's been a pet peeve of mine forever that for some reason we're okay running Mac minis under desks in offices to do builds that get shipped into production environments, but we would never do that, you'd never run a production database on a Mac mini under your desk.
It's okay for some reason to ship the binary that can talk to that database from there though.
Right, so we know how to secure systems. It's really just about treating build systems as production systems.
Everybody's guilty of it. Really, Minikube builds ran on a Mac mini under my desk, and every company has builds sitting on old desktops in a closet somewhere.
So it's just about changing that, and making those systems harder to attack.
We're never going to get perfect, but we can at least make it harder than it is today.
On the downstream side, which I think is where a lot of people are focused, unfortunately it's a lot harder, right?
If you get a binary from someone that you trust, you can't prove what's inside of it.
An SBOM can't prove what's inside of it. There wasn't going to be a line in a SolarWinds SBOM that said, "Malware here."
That's not going to be something you can use to trust and protect yourself.
So, no, I don't think anything in Sigstore would have prevented either aspect of the SolarWinds attack from happening, but other efforts going on now around just securing and hardening build systems and using ephemeral environments and all these other best practices would have at least made that initial attack a lot harder.
Benjie: Well, I think everybody knows I love when you start talking about ephemeral environments.
So I tend to support that statement more.
But also I will say that what's been in the back of my mind now, having had this conversation with you is that as a compromise, if I was running the department of whatever, agriculture, and knowing what has happened.
And then if everything was using Sigstore, I could quickly figure out my internal systems potentially that were compromised without needing outside help from the SolarWinds community that's probably a little busy at that moment.
So I do see that there's self analytics I could be doing so my internal response teams can actually respond without needing guidance from up above.
My brain races on this, but we're running out of time and we have to ask you obviously, Marc and I are engaged in this, but if we want it to be specifically engaged in helping contribute to this, how does that work?
Do you guys have community meetings? How do we get involved if we want to be involved in Sigstore?
Dan: Yeah, there's a community meeting each week.
It's at 11:30 AM on Tuesdays, Central Time, or whatever that is in your own time zone.
And we can get some links in the show notes probably for that.
We have a Slack that's really active, people ask tons of questions in there.
That's also probably-- You can probably find all of this on github.com/sigstore/community.
Either the Slack or the community meetings each week are the best places to get involved.
Marc: Is there a specific type of feedback that you're looking for at this stage?
Either around Cosign 1.0, or getting Rekor up to 1.0, more folks to use it, specific types of environments or use cases that you'd love to see more of.
Dan: Yeah. I think just try it out and tell us how you want to use this.
A lot of this stuff-- I think we're still at the stage where organizations know they should be checking containers before they get deployed, but haven't quite figured out the best way that fits into their workflows, fits into their build systems.
So you can just play with this stuff, start to imagine how you might use it in your organization, and then let us know if it works for that or not.
Getting people to start to converge on patterns for where they sign, how they manage their keys, where they want to verify, whether it's at the node or in an admission controller or something like that.
It's going to be really helpful for us in designing the next wave of stuff here and making it usable.
Marc: Cool. Dan, I really enjoyed the conversation.
I learned a ton, and I'm actually happy to know that there's a really smart team of people out here literally doing nothing but working to solve, and advance this cause, make it easier to use, and think about these use cases.
Dan: We're trying at least so thanks for the conversation.