Ep. #101, Supply Chain Security with Feross Aboukhadijeh of Socket
Brian Douglas: Welcome to another installment of JAMstack Radio. On the line we've got Feross Aboukhadijeh.
Feross Aboukhadijeh: Thanks for having me here.
Brian: You've recently launched a new tool which is Socket.dev, and wanted to have you on to introduce yourself, tell us who you are and how you got here and then we'll get into the product that you launched recently.
Feross: Yeah, it was a good time, it was a fun time back then. Lots of modules being written, lots of server problems to solve and I got really interested in BitTorrent and Peer to Peer protocols.
I got obsessed with this idea of making torrents work in a web browser, so I started this project called WebTorrent to do that. It's something I worked on for many years after I started it, and it became this big project where there were so many different parts to build out.
So I ended up writing different NPM packages in the process, and splitting the whole project into these nice, independently separated packages. That's how I accidentally found myself as an open source person, an open source maintainer of different things. It's been fun doing it and being an open source person.
Brian: Are you still maintaining WebTorrent?
Feross: So I don't actually actively code too much on WebTorrent but it is still actively maintained. There's other people now who help out and work on it. It's definitely, I think, one of the coolest torrent apps out there because it does work in a browser and the desktop app has a really, really nice user experience so you can just drop in a torrent and it'll stream it. So yeah, it's definitely worth checking out still.
Brian: Very cool, yeah. That's a pretty awesome way to find a way into open source and some serious programming. Also there's got to be a lot of value you drew from having folks using the thing that you built, and then also even attracting other contributors. You had mentioned 2013, 2012 is when you started. We actually had the Astro founder, Fred, on for Episode 55.
Feross: Fred is great.
Brian: Then someone told me, "Hey." Actually, surprisingly enough, I was working on a Socket thing. Socket.io is actually what introduced me to Node code and why I started writing it. But Socket.dev is a little different, so can you give us an explanation of what Socket.dev is?
Feross: Yeah, Socket.dev is a tool to help you secure your open source supply chain. Maybe I should define what all that is. Yeah, a software supply chain is this idea of all of the component parts that make up your application. The term comes from factories, from the real world where you have all these parts that you pull together and produce some physical good from.
But in the software world, we have usually open source components making up our apps and most apps are about 90, 95% of the lines are open source, in the average app. So most apps are really built on the shoulders of giants, I guess. It's really powerful when you can put an app together in a few hours or days because you have all this open source code to rely on.
And so when we talk about the software supply chain, we're basically talking about where does all this code come from, who writes it, how can we trust it, what build servers are used when we build the code, how does it get all the way from where it's produced to when it's in a final artifact in the app that you're building. Then how do you make sure that it's safe?
Feross: Yeah, definitely. The situation, I think you calling it interesting is a little bit generous of you. Yeah, this maintainer, this has actually happened pretty recently, back in January of this year. There was a maintainer who had two popular packages, one called Colors and the other called Faker, and both of those together got, I think, around 100 million downloads a month. So very popular packages, very widely used by the ecosystem.
One day he just woke up and decided, "I'm not happy with how companies are using my code without paying me." Which, that's a whole separate conversation, I think. Anyway, he decided to sabotage his own code and basically fill it with spam messages that would get printed out in random Unicode characters and also infinite loops and stuff like that.
It completely changed the behavior of the package, it's just kind of like, "What's going on here?" And Amazon's CLI tool was using one of these packages, and so people who installed the tool that day ended up getting the new version of colors that he had published as one of their dependencies. Then using this Amazon CLI, they thought, "Whoa, Amazon might be hacked, because what is all this weird output that I'm seeing when I run the Amazon CLI?"
So a lot of companies that use this code were actually affected because of the way that dependencies are installed in Node, where you specify these loose version identifiers with the caret symbol or the tilde symbol, which allow new patch or minor versions to get installed automatically.
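As a rough illustration of those loose version identifiers, here is a minimal sketch of how caret and tilde ranges decide which versions qualify. It is a simplification that ignores prerelease tags and partial versions such as `^1.2`, and the function names `parse` and `satisfies` are made up for this example rather than taken from npm.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies(version: str, spec: str) -> bool:
    """Simplified check of a version against an exact, tilde, or caret spec."""
    candidate = parse(version)
    if spec[0] not in "^~":
        # No range operator: only the exact version matches.
        return candidate == parse(spec)

    base = parse(spec[1:])
    if candidate < base:
        return False
    if spec[0] == "~":
        # Tilde: newer patch releases only (same major and minor).
        return candidate[:2] == base[:2]
    # Caret: the left-most non-zero component must stay the same.
    if base[0] > 0:
        return candidate[0] == base[0]
    if base[1] > 0:
        return candidate[:2] == (0, base[1])
    return candidate == base


# Hypothetical version numbers, chosen only to show the difference.
for spec in ("1.4.0", "~1.4.0", "^1.4.0"):
    accepted = [v for v in ("1.4.0", "1.4.1", "1.5.0", "2.0.0") if satisfies(v, spec)]
    print(spec, "accepts", accepted)
```

With a caret or tilde range, a release published this morning can satisfy the range on the next fresh install, which is the mechanism described above.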
So yeah, that was a really interesting recent case, but there's been so many more too. It's not just that case. If you just go back a couple more months in November and October of last year, there were several other supply chain attacks against NPM.
Those ones were a little bit different, so those packages were hijacked or taken over by some random attacker. The way that it happened was pretty interesting because there was this maintainer who was totally trying to do the right thing, a maintainer of packages that were widely used, widely respected, really good packages.
But I believe he may have reused his password, his NPM password on other websites and when one of those other websites got breached... That's kind of what it looks like happened. Because there was this post on a Russian hacking forum two weeks before his packages were compromised, basically saying, "Hey, I have the password of an NPM maintainer with seven million weekly downloads. I'll give it to whoever pays me $20,000." And then two weeks after that, a maintainer who has a package that gets seven million weekly downloads was compromised. So it just seems like that's probably what happened there. Yeah, that was a bad one.
Brian: You had mentioned in passing about the funding thing, and I know it's another whole conversation, but I know you have a history in funding and attempts to sustain yourself in your open source as well. The state of funding right now in open source, is that a big security vulnerability for a lot of packages because maintainers might not be getting paid enough or not getting paid anything to support AWS CLIs?
Feross: I think it's part of the problem, it's definitely not the full problem. The part of the problem that funding is responsible for, it's actually quite similar to the Chrome Extension ecosystem actually. In Chrome extensions you'll see there's a lot of these extensions that are made by individuals for fun and then put out there, and then they end up accruing millions and millions of installs.
The person who maintains it may want to work on it more but they may have a day job, they might have other things going on in their life and they also find it pretty hard to monetize these extensions because people aren't used to paying for extensions.
Just like how people aren't used to paying for open source. So what will happen is sometimes bad actors will reach out to a Chrome Extension owner and say, "Hey, I'd like to buy your Chrome Extension for $50,000 or something like that, $25,000." And this person who's never made a cent from their Chrome Extension and is honestly maybe tired of getting users complaining to them for years and years, may just say, "Hey, you know what? That's actually a great idea. This company can take it off my hands and give me some money for it." And often the company will immediately turn around and put ads or a tracking code into the extension which is obviously the reason why they're buying it.
Open source has kind of a similar problem where a maintainer who may not be using their package anymore that they wrote, they may have written it at a previous job that they're not working at anymore. They may have written it just for fun, put it out there and then accidentally found themselves now the maintainer of a really important, really popular package.
Someone like that who just doesn't have the time that the package needs to do a good job with it, may be very receptive when someone reaches out and says, "Hey, I'd like to help maintain this package with you because we're using it at my company and you're not really responding to issues." And that's actually a great thing, that's how open source works. People share access with each other pretty liberally, especially if someone's done a good job contributing in the past. And I'm not saying that's bad, I think that's actually awesome.
But there's a bit of a risk that comes from that, and one of the most prominent attacks that happened was called EventStream, back in 2018, and that was exactly what happened. Dominic Tarr, a prolific maintainer, stopped using one of his packages and hadn't been maintaining it and someone just reached out and said, "Hey, I'd love to help you." And he said, "Yeah, sure." I don't think funding it would have helped in that case, that's the thing where I say I think it's part of the problem.
I think if people could really make a full time living doing open source, maybe they wouldn't just give away their packages, maybe they'd want to professionalize a little bit around how they maintain it because they're making a living from it. But yeah, I think it's only part of the problem, I don't think it would really fully solve things, yeah.
Brian: Yeah, so personally, we talked off air on the thing I'm working on in the future, but I think the other thing, not a solution, but something that would help the problem, is finding other folks who are contributing code and being able to trust who you're handing the code off to. Because I know with EventStream it was like he was done with writing the code, it wasn't his day job or interest.
But to be able to validate people in the ecosystem through historical commits or contributions or just attach to a real profile, it would make a lot of those decisions a lot easier if you could trace where these verified commits are coming from. But going back to the supply chain, I'm curious. There's a few other security scanning tools, so how does Socket.dev differ from those?
Feross: Yeah, great question. So if you think about how do you actually stop supply chain attacks, what would be a way to actually... If you were trying to solve this problem, how would you stop the next supply chain attack? It helps to look back and see, "Well, what did the previous supply chain attacks do? What were the indicators that these packages may have been compromised?" So if you just think about the ones that we've already mentioned so far, what did EventStream do? Well, EventStream was one of the trickiest packages because...
Maybe we won't start with EventStream, that's actually a harder case. Let's start with the ones that were compromised in October and November. One of those packages was called UAParser.js. It's a user agent string parser: it parses the browser user agent and tells you what browser the user is using. A package like that doesn't need access to the file system, it doesn't need to talk to the network, it doesn't need to run shell commands, right? It's a very simple package, it doesn't need these kind of powerful capabilities. What we saw when it was compromised was a new version was published, actually three new versions were published because they wanted to get as many people to install them as possible.
But if you look at those versions, they contain all this code that uses all these new platform capabilities that weren't used in the previous versions. And so one very obvious thing to look for is does a package suddenly start using permissions or capabilities, powerful features in the platform that it didn't use before? Specifically ones that are security relevant, such as running shell commands, running install scripts, talking to the network, reading files, right?
So that's something where we can just look at new versions and say, "Hey, you know what? This version is introducing this new behavior and that's probably something that a human should look at and be able to answer the question why all of a sudden does my user agent parser need to run shell commands? That doesn't make any sense, right?"
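The check described here can be sketched as a set difference between what two versions of a package do. The capability names and the example data below are hypothetical; how Socket actually represents this is not spelled out in the conversation.

```python
# Hypothetical labels for the security-relevant capabilities mentioned above.
SECURITY_RELEVANT = {"shell", "network", "filesystem", "install_scripts"}


def new_capabilities(previous: set[str], current: set[str]) -> set[str]:
    """Return security-relevant capabilities used by `current` but not `previous`."""
    return (current - previous) & SECURITY_RELEVANT


# Made-up capability sets for two versions of a simple string-parsing package.
old_version = {"string_parsing"}
new_version = {"string_parsing", "shell", "network", "install_scripts"}

flagged = new_capabilities(old_version, new_version)
if flagged:
    print("Worth a human look, newly used:", ", ".join(sorted(flagged)))
```

The point of diffing rather than listing is that a package that has always talked to the network stays quiet, while one that starts doing so in a patch release gets surfaced.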
So that's a thing where we think when the developer goes to update that package to a new version, Socket can come in, leave a comment on the pull request and say, "Hey, by the way, in case you didn't notice, this package is now doing these things that are worth a look." That helps the reviewer figure out that this is worth a look because most people, and I'm sure I'm not going out on a limb here when I say probably most teams are not clicking through when they see a package lock file was changed, they're probably not going and doing research to see what actually changed inside those packages. It's just not a practical thing to do.
Brian: No. Yeah, I did for a while back when I first included Dependabot into my projects because it was like, "Ah, cool. I can look at the code and look at the release notes." And then after a while it's like, "Oh, it's Dependabot day. I guess we're just going to merge these things in." So are these comments inferred from folks disclosing these changes or is this all through automation or machine learning or something fancy like that?
Feross: Yeah, so it's not machine learning although we may use that at some point. Right now it's just static analysis, so you can think of it kind of like a linter. We're running an automatic analysis of every NPM package that's published. We have a pipeline that we've built that analyzes all the packages that were published in the past as well as, in real time, the new ones that are being published. When we see a new package version, we check a list of 60 things that the package may be doing, and then we can annotate that package with these... We call them issues.
They're basically like potential red flags that that package version has, and so you can go to Socket.dev and just look up a package right now and see what issues the package has that our analysis has found and get an idea of the kind of things that we can find. We've already found some pretty interesting stuff, by the way, just looking for basic things and static analysis such as like an Angular calendar component, which you think would be pure web code. But it's actually doing all the things that I said earlier, shell, file system, network, install scripts.
That's actually kind of an interesting case, because that package, it's not malicious. It's gathering data about who is using the package and sending it to the maintainer so that the maintainer can... It's almost like Google Analytics for open source. I wouldn't say it's outright malware, but also it's not something that probably a lot of people realize their open source might be doing, and so I think it's useful to be able to go look it up on Socket and see, "Hey, we've actually tagged this package as doing all these things and also we've even tagged it as using telemetry." So we can summarize all these things and say, "This is actually phoning home, it has telemetry in it."
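To make the linter comparison concrete, here is a toy static check that scans JavaScript source text for imports of Node.js built-in modules tied to the capabilities mentioned above. It is only a sketch: the regex approach, the `CAPABILITY_MODULES` mapping, and the sample source are assumptions for illustration, and a real analysis would parse the code rather than pattern-match it.

```python
import re

# Assumed mapping from Node.js built-in modules to capability labels.
CAPABILITY_MODULES = {
    "child_process": "shell",
    "fs": "filesystem",
    "http": "network",
    "https": "network",
    "net": "network",
}

# Matches require('x'), require("node:x"), and `... from 'x'` imports.
IMPORT_PATTERN = re.compile(
    r"""(?:require\(\s*|from\s+)['"](?:node:)?([\w/]+)['"]"""
)


def detect_capabilities(source: str) -> set[str]:
    """Return capability labels for the built-in modules a source file imports."""
    found = set()
    for module in IMPORT_PATTERN.findall(source):
        root = module.split("/")[0]  # 'fs/promises' -> 'fs'
        if root in CAPABILITY_MODULES:
            found.add(CAPABILITY_MODULES[root])
    return found


# Made-up source for a package that should be pure UI code.
sample = """
const { exec } = require('child_process');
import https from 'node:https';
const _ = require('lodash');
"""
print(sorted(detect_capabilities(sample)))
```

Because nothing is executed, a check like this can run over every published version, though obfuscated or dynamically built imports would slip past a pattern this simple.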
Brian: Yeah. Even just that one feature of identifying telemetry, letting me know, "Hey, this package actually is sending your data back to wherever." It's enough for me to be, not concerned, but raise an eyebrow and be like, "What am I using this for?" For example, if I'm working, I was working with some tool, some CLI tools that were deployed to GitHub packages and it used some NPM packages.
It was looking at data, all public data, but it was pre GitHub Action, so it was taking a look at everybody who didn't have GitHub Actions and reaching out to those owners of those repos and be like, "Hey, Action is the thing. What can I do to get you to use it?" Type of deal.
In my DevRel role, it's like, "Ah, it feels very salesy." So we had to lock down a prior repo, but if it had telemetry that's going back to one of those packages that I'm using, it's like, "Looks like GitHub is looking at CI, basically." Because it was pre the CI change, and that raises red flags of like, "Ah man, is someone going to look into bdougie's libraries and stuff like that and figure out what new features are coming out of GitHub?" Maybe that's just a little paranoia from me, but the telemetry feature would be very useful for me for that reason.
Feross: Yeah. I also realized I didn't answer your question fully about how is Socket different from the other code scanning tools. Yeah, so I think if you look at other stuff out there, Dependabot is a good example, right? Dependabot, when it finds a security issue, what it's really doing is comparing the package version that you're using in the project to a public database of known vulnerabilities. A known vulnerability is when a security researcher finds a bug in an open source project and this bug has a security implication.
It may be exploitable by somebody, so the security researcher writes up a report, sends it to the maintainer so the maintainer can get it fixed, and then that report gets filed with this thing called the NVD, or the National Vulnerability Database, which is actually run by the US Federal Government. That vulnerability is assigned a number, like an identifier, called a CVE, and what tools like Dependabot or Snyk do is they basically just tell you, "Hey, you're running a version of a package which is known to have vulnerable code in it, based on what we found in this database."
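That lookup can be sketched as a comparison between an installed version and a list of advisories. Everything below is illustrative: the `Advisory` shape, the package name, and the identifier are invented placeholders, and real tools also handle affected ranges that are more complex than "everything before the fix".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    identifier: str                      # placeholder ID, not a real CVE
    package: str
    first_fixed: tuple[int, int, int]    # versions below this are affected


def parse(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def known_vulnerabilities(
    package: str, version: str, advisories: list[Advisory]
) -> list[str]:
    """Return identifiers of advisories that affect the installed version."""
    installed = parse(version)
    return [
        advisory.identifier
        for advisory in advisories
        if advisory.package == package and installed < advisory.first_fixed
    ]


# Hypothetical database contents.
database = [Advisory("CVE-0000-0001", "example-parser", (1, 2, 4))]

print(known_vulnerabilities("example-parser", "1.2.3", database))
print(known_vulnerabilities("example-parser", "1.2.4", database))
```

A lookup like this can only report what someone has already found and filed, so a version with no advisory on record comes back with an empty list whether or not it is safe.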
So I would say it's definitely a good thing to do, but it's pretty reactive because it requires someone to go and find this security issue and report it, get it added to the database and then the tool can warn you that you're using it. These types of things that are reported in the database are primarily accidents that the maintainer put in the code, whereas what we're seeing a lot more in the headlines these days are not accidental vulnerabilities but actually outright malware, where someone's actually compromised a package, someone's actually taken it over, right?
The maintainer reused their password and now some attacker has control of it and they're going to just put malware in there. If they do that, then what happens is anyone who installs the package for the next day or two, until this issue is discovered, is going to just get this malware on their computer. That's actually exactly what happened with UAParser.js back in October of last year: a cryptocurrency miner was added, which stole your CPU resources to mine cryptocurrency, and it also stole all the passwords on your computer.
That was not in this database of vulnerabilities because that was not a thing. It was basically like if a package is suddenly published, how do you know if it was published a couple of hours ago or yesterday? How do you know if it's safe? Looking up vulnerabilities in the vulnerability database, you're very unlikely to find some vulnerability published about a package that was published hours ago. That's just not the speed at which... And so that's what Socket is trying to find.
If you look at the actual code you can analyze it and see what is it going to do, right? Is this thing going to do something to my computer when I install it that I should know about? And if so, why doesn't NPM prompt the user? Why doesn't it say, "Hey, this code is about to do these things"? Almost like the way when you install a smartphone app and it wants to access your camera or your contacts, it can't just do that, it has to ask you, right?
And so that's kind of what Socket is trying to do. It's like, "Well, why don't we tell people upfront what this package is going to do? And then later on also, if a new version comes out and it's now doing this new thing, they should have to disclose that or the user should be prompted that this package is now doing this new thing, just like it would on a smartphone app." And so that's the difference between Socket and these other, older school tools.
Brian: Okay, yeah. That makes a lot of sense, and yeah, I appreciate you leading the charge on trying to improve supply chain security but approaching it in a different manner as well, because I don't want to look through all the updates. It took me a long time to even just update packages on a regular basis, so now that I'm actually doing that I don't want to have to go through and also read every single one of these release notes. So we can have some sort of tooling and automation, which, as I understand it with Socket, you can install in a GitHub repo? Is that the introduction?
Feross: Yeah. Like you were doing before, you said you were looking at all the changes for a while there and then you got tired and stopped doing it, which is totally understandable. I think with Socket you can install our GitHub app and then for the most part if a version update has good release notes, it seems good and your tests pass, you can merge those and not worry about them.
Then when Socket calls out a particular update as being interesting in some way because the behavior has changed, you can dig deeper into those ones but that's going to be a much smaller percentage of your updates. It hopefully won't be too many and you can devote your limited attention to the packages that are most likely to have issues and then just not worry about the rest. Most teams right now aren't even checking anything for any of their updates, so Socket says, "How about instead of just literally doing nothing, let's start with putting some attention on the ones that seem suspicious?"
Brian: Yeah, excellent. Congratulations on the launch and looking forward to... Actually I'm going to start testing it on some repos of my own to get some feedback, because some repos I heavily write code in but then also I take a lot of open source contributions as well. Every now and then I had a UI component library slip in when someone solved a UI problem and I was like, "Ah, I wasn't ready to add this to the project yet, but here it is." And so things like that, it's nice to have some sort of security, I guess Socket Security, if you will. So yeah, anything else you want to add to the conversation for folks who are listening before we jump over to the picks?
Feross: Yeah, I'll just mention one other cool thing that people can find if they go to the website, if they go to Socket.dev, because I mentioned we're following NPM. We're downloading all the packages as they're published.
When malware is discovered on NPM and it's reported and NPM takes it down, we actually get to see that happen and because of that we actually have this nice sample now of known malware or packages that had security issues. It's just interesting to look through it, so we decided to host what we've collected. Not to serve it to people to download or to run it, but just to read it on our website.
If you go to the footer of Socket.dev and you click on Removed Packages, you can actually browse through these removed packages. It's interesting to see what does the malware do, what is this stuff that's getting taken down actually trying to do to your computer if you were to install it, and there's some crazy stuff in there. Some of it's really straightforward, like it just grabs your process.env, which is all your environment variables, and then it just sends an HTTP GET request off to some server with all your tokens.
But then other stuff is like these giant blobs of obfuscated code, that it's just like a wall of completely gibberish stuff and you're like, "Okay, I don't know what this does but I probably don't want to install this." And then there's some stuff on there that's just spam, people are just posting spam links to random spam sites on NPM, I think just to get their links in the Readme on the website and so those get taken down.
You can just see what's out there and there's hundreds of these every week that are published and then removed by NPM, so it's just interesting to see. We're also reporting the stuff we find to NPM and getting it taken down and we plan to keep doing that in the future. So yeah, it's just interesting, if people want to check that out and play around with it. All the data is there, it's open for people to access.
Brian: Okay, cool. Yeah. I'm actually really intrigued as well. I don't want to look at release notes but I'm actually intrigued to see what people are actually doing. I don't consider myself a security researcher or anything like that, but I'd have some cycles on a Friday to go read and thumb through some of the stuff. That's pretty cool. That would make a great newsletter though, just throwing that out there.
Feross: Just post interesting tidbits, like, "Here's what we found this week."
Brian: For sure. I'm pretty sure that would be on Hacker News at least once a week, for sure. Maybe not front page, but it would be on Hacker News. Excellent. Well, appreciate the conversation, Feross. We're going to transition to picks, so these are Jam Picks, things that we're jamming on. Could be music, food, code related, all of the above. I actually see you already have some picks in there, so do you want to go ahead and go first?
Feross: Sure, yeah. I recommend people check out the podcast Darknet Diaries. Have you heard of this one, by the way?
Brian: Actually I'm an avid listener. I love it.
Feross: Yeah, it's so good. Right?
Brian: For sure, yeah. Honestly, working at a company like GitHub and seeing the acquisition by Microsoft and then also seeing all these... Maybe I shouldn't disclose, but yeah, definitely seeing pen testing type emails come through my inbox and then having the head of security be like, "Hey, I saw you open that thing." That is absolutely fascinating stuff. I only come from startups, so GitHub is the biggest company that I've ever worked for, but I feel like now my brain is turning. I'm like, "You know what? I'm never opening up any more emails." And I don't want to ever be the person who has the same password in multiple places, so I did a whole cleanup. But yeah, please proceed, we didn't even explain what the podcast is.
Feross: Yeah, I'll explain it. Darknet Diaries, it's basically all these... Their tagline is, "True stories from the dark side of the internet." And it's basically just this fascinating glimpse into internet crime and what hackers are up to. So they'll interview a lot of hackers and have them talk about the various, I guess, crimes that they did. A lot of people they interview have been caught and have already served their time and so they can talk about it now.
You'll hear all the stuff that they did and the details of their various shenanigans. Also they'll interview security professionals, and one of the things that I loved learning about listening to the show is that there's some people who have a job to be basically a physical pen tester where actually their job is to break into buildings.
Companies will hire them and say, "Test our security and also our physical security." So there's literally people whose job it is to break into banks for a living, and they have these little notes, printed out papers that if they get caught or arrested, they can take this paper out of their pocket that says, "Actually here's a letter from the CSO," the head of security, saying that, "They paid me to break into the building so please don't arrest me and call this phone number to confirm with the security person that I'm supposed to be here."
And they're supposed to pull that out if they get in trouble. But there's all these stories that they have of people where that went wrong and they got arrested and then it didn't go well, they didn't believe the letter or whatever happened. It's just a really interesting part of the security world that I had no idea about really before listening to the show, so it's really cool.
Brian: Yeah, it keeps me paranoid. I haven't been in the office in a long time, but definitely the whole follow through or the, "I forgot my badge," type of stuff. It's like, "Ah, I've seen this show before. Sorry, go to the front desk."
Feross: Right, yeah. Definitely it helps you up your game, it gives you a little bit of the paranoid mindset that you really need to have to do security well.
Brian: Yeah, for sure. You had another pick?
Brian: Not a pick, but I did get stuck into the Mark Rober series where there's the two guys. One does streaming where he picks up the phone with the scammers and then answers the question but in an old lady voice. Honestly, I'm blanking on who the guy is but I'm sure everybody listening is probably thinking, "Oh, it's this guy." But Mark Rober is a YouTuber, he's like an electrical engineer and he builds a lot of inventions and he built a box for Amazon packages. Basically, he gets a box as if it's like a package outside your door and then there's Porch Pirates, I guess is what they call them.
He actually has a spinning glitter bomb, fart spray thing so if someone steals the package and opens it up in their house they get a bunch of glitter and a fart spray spraying. Then the actual box has a GPS tracker so he goes and picks it up when they throw it in the trash. But it's a little fun, and I definitely watched pretty much every one of those YouTube videos on that.
They did a combo effort where they called the scammers and then tracked down the folks who were doing the ransomware type stuff, trying to get Bitcoin out of old folks, which kind of summarizes what's really happening. Let's just call that a pick, YouTube Glitterbomb Ransomware. Google that and you'll find what I'm talking about. I did have another pick which is the Burr Coffee Grinder. Do you drink coffee, Feross?
Feross: Yes, a lot, too much.
Brian: Okay, yeah. Being in an office, I didn't realize but most of them in San Francisco have Burr Coffee Grinders and the way it works is the normal spice grinder, which is like a flat blade that chops up the beans or spices or whatever... Apparently you're not supposed to use that for coffee because it doesn't really chop up the coffee or grind the coffee properly. But a Burr grinder is like this flat disc and it has angled blades and it gives you a more consistent grind of the coffee, and it actually does make a difference.
I feel like one cup of coffee is enough if I've used a Burr grinder, and I've only been using it for two days. Well, ironically I've been using it for years at work. But at home I've always had the garbage, $20 Amazon spice blender. So I do recommend actually jumping to the Burr grinders, it's a little more expensive than a normal coffee grinder but definitely worth the flavor for sure. I'm a black coffee drinker, so I like full flavor, I got to taste the beans.
Feross: Right. That sounds awesome. Yeah, I've been going hard on the espresso because it's just quick and easy. But I used to grind the beans and do the whole French press thing, but now just to save time I'm doing Nespressos.
Brian: Okay, that works. Pick your poison.
Feross: It's not as tasty, it's definitely not as tasty.
Brian: I did have a Keurig for a short amount of time and it just never got to the same palate level that I was looking for. Being in San Francisco, we are a little bit spoiled when it comes to coffee, so once you taste the goodness you can't go back to the regular stuff.
Feross: Yeah. Keurig is nothing compared to the Nespresso, Nespresso is the bare minimum I think. But we are spoiled in San Francisco, you can't really find bad coffee unless you go to, I don't know, a gas station or something.
Brian: This is true, and sometimes you can even be surprised there as well. Yeah, sometimes it can be on another level. Some of the small, corner stores, at least here in Oakland, can have some pretty good coffee on deck, for sure. But Feross, again, thanks so much for the conversation about security and Socket as well, and also the conversation in general. I think I learned a ton about things I can probably think about in my supply chain security. So best of luck with the new launch and, listeners, keep spreading the jam.