In episode 17 of The Kubelist Podcast, Marc Campbell and Benjie De Groot are joined by Keith Basil of SUSE. They explore K3s, the certified Kubernetes distribution built for IoT & Edge computing, and how it removes the cognitive overload of standing up Kubernetes.
About the Guests
Keith Basil is Vice President of Product, Cloud Native Infrastructure at SUSE. He has a wealth of past experience working in cloud, including senior positions at Cloudscaling, Red Hat, and Rancher Labs (acquired by SUSE).
Marc Campbell: Hello again and thanks for tuning in to another episode of the Kubelist Podcast.
This is going to be a really fun conversation this week.
Once again Benjie from shipyard.build is here as my co-host. Hi Benjie.
Benjie De Groot: Hello.
Marc: So we're lucky to be joined by Keith Basil from SUSE by way of Rancher.
Keith is the VP of Product, Cloud Native Infrastructure at SUSE. Welcome Keith.
Keith Basil: Hey Marc, it's really good to be here.
Marc: So before we start talking about all of the work that you're doing in the cloud native space, I'd love to start a little bit with just your background.
How did you get into the cloud native space?
Keith: So I've been in cloud for some time, starting probably around 2008, 2009.
And I did some work for a US government project to do a secure cloud implementation.
And from there I went to Cloudscaling. So some of your listeners may know Randy Bias of the Pets and Cattle fame.
So I worked at Cloudscaling for a while and then Red Hat started to get real serious about OpenStack.
And I was one of the first product managers to join Red Hat, and because of my security background with the US public sector I eventually got into doing security for Red Hat's cloud products.
And so that was my area of expertise.
But personally I've always had this passion for what I call decentralized cloud and small footprint cloud.
And that's kind of what really drove me to look at Rancher as an opportunity.
Marc: That's great, yeah. I mean definitely that-- I can see that progression from Red Hat to Rancher and then--
How long were you at Rancher before the acquisition?
Keith: That's a great question.
My first week at Rancher I was privately informed that we were going to get acquired by SUSE.
So, I think it was about six months.
Marc: Wow. So you go through the entire process of interviewing and joining and deciding whether or not you want to work there, and then the first week they drop that out on you.
Keith: It was actually worse than that, because I left Red Hat to do a startup around decentralization.
And I ran into Shannon and basically Shannon with this silver tongue in sales convinced me to join Rancher and put my startup on hold.
So you can imagine the psychological journey.
I went from a very stable job at a company with an awesome culture, Red Hat, well regarded there. I just wanted to do my own thing and express myself in the tech space with a startup, and then I joined Rancher.
And then the first week at Rancher we get notice that we're going to get acquired by SUSE.
And so six months later I am at SUSE.
So it's been a whirlwind of activity, and to put the cherry on top, in May SUSE went public.
Marc: That's awesome, congrats.
Keith: Thank you.
Marc: Yeah, but that mental context switch I guess from working for a larger organization to a small organization back to a large organization again.
And bigger, huge opportunity though for what you guys can do at SUSE.
Keith: Oh absolutely.
And I am in a very privileged position because when I made the decision to join Rancher, some of the core technology we were going to have to build for the startup, Rancher already had in the form of K3s.
As an example. So I mean, what better way to help shape that core technology than being the product manager for K3s and Rancher, helping to lead that team.
I'm not the only product manager, obviously. So that was really cool.
But when you fast forward to the SUSE world-- And we could talk about this if you want about the Edge positioning.
There's really three things that are needed for successful Edge deployment. One is this idea of managing things at scale. So we have a solution for that in the form of Rancher and a sub feature called Continuous Delivery. Okay. It's basically GitOps. The other thing is a lightweight Kubernetes distribution in the form of K3s. And so that was a powerful component of the solution. And the third which is what most people have, is a lightweight OS that's container optimized.
So coming to SUSE and going after the Edge space strategically with a great solution, those three pillars were in place with the acquisition of Rancher, and SUSE.
So SUSE Rancher, the solution for SUSE Edge is basically those three at the core plus some other value added things around that. But let me just say this.
We are in the middle of a storm that's phenomenal in terms of cloud native running into Edge. It's crazy.
Marc: Right. I want to actually dive into that a little bit.
So one thing I'd love just to understand is the inspiration for K3s.
I know it was created before you joined Rancher, but was it specifically targeting Edge?
Or was the idea, the inspiration a little bit broader than that?
Keith: It was not targeted for Edge.
Edge was in all transparency not really on Rancher's radar at the time of K3s' creation.
What was underway at the time was a PaaS solution called Rio.
A project upstream was a project called Rio.
And Darren Shepherd, who is absolutely a brilliant software engineer that I'm so happy to be able to work with, he was creating Rio and he got tired of always having to go through the complexity of standing up Kubernetes the Hard Way.
If I can reference Kelsey, right? So he says look, he just made a call to say, "Hey look, I've got to solve this Kubernetes problem first before we can continue the development of Rio."
And so K3s was born.
He essentially looked at the components of Kubernetes, decided which ones were not germane to having baseline clusters stood up.
Put all those into a single Go binary and 30 seconds later you have Kubernetes running that you can use locally to test against.
So that's where K3s was born. And this was all before I came to Rancher by the way, so-- But we saw it.
When I was at Red Hat and I was looking to do my startup we were evaluating the Kubernetes options and K3s was-- it looked positionally correct. Okay, if I can say that.
But back to the origin. Rancher then just released it upstream as a project and it was immediately adopted by hobbyists.
I mean I kind of fell into that category myself with some software that I was writing for the startup as a proof of concept.
And K3s, very much like Darren, solved the complexity of standing Kubernetes up very fast and quickly and cleanly.
One word I like to use, and I'm borrowing this from a talk that Darren did, he said that K3s removes the cognitive overload of standing up Kubernetes.
And I love that phrase because it's succinct and right to the point of what K3s does from a value perspective.
Marc: Yeah. I mean I think everybody should go through that Kubernetes the hard way from Kelsey.
It is great, you understand it all.
But I shouldn't have to go through it all the time, and so the value of that absolutely makes sense.
Complexity is a problem and introduces all kinds of risks.
More than just the time constraints on it, but the complexity introduces opportunity for security vulnerabilities.
It introduces all of these challenges, and throwing it all into a single Go binary sounds like a lot of that's removed.
Keith: No, you're spot on.
And going back to my Cloudscaling days and being in the school of Randy Bias, Cloudscaling would always say that complexity is the enemy of scale.
I mean we've got it on the back of the t-shirt.
So if you can remove complexity you can gain scale.
And K3s is a brilliant rendition of that phrase.
Marc: That's great. So given that, do you think that K3s is the default, everybody should choose that as a Kubernetes distribution?
But there's a lot of other ones: there's kubeadm, there's upstream Kubernetes, you can do it the hard way.
When does it make sense, if I'm thinking about I need to create a Kubernetes cluster?
When does it make sense for me to choose K3s versus other distributions?
Keith: There are some use cases.
What we're seeing on the business side is that companies that don't want to waste a lot of time, and we're kind of speaking about this complexity issue right?
If they just want to get something up and running quickly and test against, that's great.
But the other benefit of K3s is that it's production grade.
So just to give you some statistics, K3s is downloaded over 20,000 times per week.
So if you do the math that's like a million downloads a year.
So the traction on that's pretty phenomenal.
And what happens to us on the business side is that there will be POCs internally behind organizational firewalls.
Once those get to a certain level of maturity they'll call us for support.
So that's pretty much our business model.
But anybody who wants to quickly test Kubernetes, and spend more time higher up into the Kubernetes stack, K3s is a great fit.
It's multi-architecture, meaning it runs on ARM processors as well as Intel based machines.
So you can run it on a Raspberry Pi locally to create a home lab.
In fact that's part of the initial traction is that the hobbyist, the home lab scenarios, Intel developer, system on chip boards. I mean those use cases are very prevalent.
And on the commercial side, actually US public sector side, we've got use cases where they're running K3s on satellites on Raspberry Pis, for example.
So... Let me just say this.
The Rancher philosophy when we look at Kubernetes is the following: we expect Kubernetes to be everywhere.
Okay. We don't see ourselves as adding a tremendous amount of value to Kubernetes itself, and you hear this from other folks in the ecosystem where Kubernetes is just pretty much done as a thing.
And so we are trying to-- This is the point of K3s as well, we want adoption everywhere.
We want it to be a standard everywhere. And I think that's the mission for us.
And so we as a company build value on top of the expectation that Kubernetes will be everywhere.
K3s is just one form factor of making that a reality.
Benjie: Sorry, I got to back up a second here.
Did you say it's on a satellite?
Can you tell us a little bit more about that?
Or is that a little bit not appropriate to get into too much?
Keith: We can talk about it. There are some use cases and data sheets that we've built around that particular use case.
It's in conjunction with a company called Hypergiant.
The use case there is there's a cluster of four Raspberry Pis running K3s.
They have their own switching in all the fabric there. And they have a Baby Yoda doll. Okay.
And what they're doing is they're doing image recognition against that Baby Yoda against a space backdrop I guess.
There's more detail about this online that we can reference, but I think that's a preliminary test of something else.
I'll leave it at that for the use case.
Benjie: I would say that's a very good usage of the word Edge, that's as Edge as I've heard for Kubernetes.
So that's pretty exciting.
Marc: We'll make sure to get that link and keep it in the show notes here too, because that sounds phenomenal.
It's interesting, Keith, you mentioned Rancher's-- Your Kubernetes philosophy, we don't add value to Kubernetes.
That's what you just described.
But you actually, by simplifying Kubernetes and then adding support, paid commercial support sure, but that's a massive amount of value.
Kubernetes is still early and people are struggling with it, and so I imagine you're unique in the ecosystem with setting up that support org and being able to support a Kubernetes distribution out there.
Keith: Yeah. And let me course correct that statement, because there's a nuance there.
When I say we don't add value to Kubernetes I'm talking about-- We do, okay? So let's clear that up for a moment, okay.
Marc: Of course.
Keith: We absolutely add a ton of value to Kubernetes.
The point I was trying to make there, in clarity, is that we don't spend our time pushing a lot of feature evolution in Kubernetes.
So our mission is to make Kubernetes as it is today from an upstream perspective widely available in all of these use cases as we go forward to this new world in terms of fully cloud native infrastructure everywhere.
So that's probably a better way to say what we do. And to your point, K3s is a huge amount of value.
Because we've, again, reduced the cognitive overload needed to stand up Kubernetes.
It's repeatable, you can scale it. Everything that comes with that. So that's real strong value.
Marc: Yeah. I mean we at Replicated, every engineer gets their own K3s cluster.
That's their dev environment, they run their whole stack on.
We've standardized on that and honestly, before we were using different smaller Kubernetes distributions.
We've experimented with different ones and K3s has been the one that actually was just, it just kind of works.
It got out of the way and it allows us to be able to build code.
So that's great. So let's move on and talk about some of the challenges that it took to build K3s.
You took a really complex problem of Kubernetes the Hard Way and made it into a single Go binary.
That complexity didn't just go away, right?
You took it away from everybody out there who's trying to spin up a cluster, N number of people, and made it so that the Rancher team is responsible for making that process simple.
What were the challenges there?
Keith: Yeah, there's really two.
One is deciding how do you reduce the footprint of the Kubernetes core services, right? Into a single process. That's a challenge that's going to be always, from my perspective, in the hands of the engineers to figure that out.
But they've done it. So that's theme number one.
Theme number two was fairly innovative. Etcd is fairly heavy.
And if anybody's used it in a low resourced machine you'll quickly understand that it can be problematic from a resource utilization perspective.
And so one of the cool things that was net new that we introduced with K3s was KINE.
K-I-N-E. It's a recursive acronym for KINE is not Etcd.
And so essentially what that is, again going back to the brilliance of Darren Shepherd, is a shim layer so that you can back the Etcd store from an API perspective to something else such as MySQL or PostgreSQL or SQLite.
And so K3s as a default ships with SQLite, it's much more friendly to Raspberry Pis, low resourced hardware et cetera.
As a default. But we also now support Etcd as a first class citizen, as well, if you want to revert back to classic Etcd.
So the KINE shim layer was really cool, and I would like to also call out that KINE is not specific to K3s.
You can actually use this in other Kubernetes clusters, to scale out the back end.
There's several reasons that you may want to do that, but it's a really cool innovation.
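To make the shim idea concrete, here is a minimal sketch in Python of a revisioned key-value interface served from a SQLite table. It is not KINE's actual schema or API; the `SqliteKV` class, the table layout, and the example key are all hypothetical and exist only to illustrate how a SQL database can sit behind a key-value style store.

```python
import sqlite3


class SqliteKV:
    """Toy key-value store with a global revision counter, backed by SQLite.

    Hypothetical illustration only: KINE's real schema and API differ.
    """

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        # Every write appends a row, so the row id doubles as a revision number.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv ("
            "revision INTEGER PRIMARY KEY AUTOINCREMENT, "
            "key TEXT NOT NULL, value BLOB)"
        )

    def put(self, key, value):
        """Append a new revision of key and return its revision number."""
        with self.db:  # commits on success, rolls back on error
            cur = self.db.execute(
                "INSERT INTO kv (key, value) VALUES (?, ?)", (key, value)
            )
        return cur.lastrowid

    def get(self, key):
        """Return the latest value for key, or None if it was never written."""
        row = self.db.execute(
            "SELECT value FROM kv WHERE key = ? ORDER BY revision DESC LIMIT 1",
            (key,),
        ).fetchone()
        return row[0] if row else None


store = SqliteKV()
store.put("/registry/pods/default/web", b"v1")
rev = store.put("/registry/pods/default/web", b"v2")
print(rev, store.get("/registry/pods/default/web"))
```

The caller only sees `put` and `get`, so the storage engine underneath could in principle be swapped for another SQL database without the caller changing, which is the property the shim layer is described as providing.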
Marc: So if I want to spin up a K3s cluster that's a single node, like on a Raspberry Pi or something, that SQLite layer's great.
But would the idea be if I want to have a multi-node K3s cluster that's where I would swap SQLite out with potentially something like a managed PostgreSQL offering.
Or something like that?
Keith: Yes, correct. These Edge cases are really driving a lot of this.
So just real quick, in some Edge use cases you would have a Kubernetes control plane somewhere, not at the Edge.
So somewhere, either hosted in the cloud or in a data center on your own hardware, whatever it is.
In those scenarios you would have kind of a beefy backend.
So Etcd, or using KINE to talk to PostgreSQL or MySQL.
But the downstream clusters that would be under management would be something like SQLite, because of just the footprint of the hardware is very different.
And to your point also, if it's a multi-node cluster, you could run Etcd in, let's say, a three-node cluster at the Edge, given the requirements for quorum with the leader election protocol.
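The quorum requirement mentioned here comes down to simple arithmetic. Etcd needs a majority of its voting members to agree, so a minimal sketch of the calculation (the function names are illustrative, not part of any etcd or K3s API) looks like this:

```python
def quorum(members):
    """Smallest majority of a cluster with the given number of voting members."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be lost while a majority still remains."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(f"{n} members: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

A two-member cluster tolerates no more failures than a single member does, which is why three is the usual minimum when the cluster has to survive the loss of a node.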
Benjie: Sorry I just want to dive in a little bit more, obviously this KINE stuff is pretty spectacular.
What other kind of innovations do you have in the architecture that you can see kind of make it stand out a bit?
And what other reasons do I have as a developer, like why else would I want to use this?
Obviously you've listed them, a million great reasons, but what else is there?
Keith: So a few things.
One is that we are listening to the users intently about what we need to add going forward in the roadmap.
And so one of the things that helps with the scalability in massive deployment of downstream clusters is having a config file driven startup process for that cluster.
So now you can sling out config files, you can run K3s with that config file and boom, you've got a cluster up and running.
So that's cool. The K3s as a single binary, number one, has no host dependencies on the Linux OS.
So as long as your Linux is fairly modern then K3s will just run, which is cool.
It's a small thing, but it's very important.
Outside of K3s itself in terms of moving up the stack, we're about to release a project called Rancher Desktop.
So Marc, you mentioned that your developers have their own K3s clusters, well we're trying to make that even better where you have a desktop tool.
It's kind of like Docker Desktop, but with K3s underneath.
You can select the Kubernetes version that you want to have running and then you can test your Helm charts or whatever you have against that local cluster and just swap it on demand in real time.
So that's a new innovation that's coming out for us based on K3s, but not with K3s directly.
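The config-file-driven startup described above lends itself to generating one file per downstream cluster. The sketch below assumes the K3s convention that config keys mirror CLI flag names (for example `write-kubeconfig-mode` and `node-label`) and that the file is normally read from `/etc/rancher/k3s/config.yaml`. The site names and the `render_k3s_config` helper are made up for illustration.

```python
import json


def render_k3s_config(options):
    """Render a flat dict of options as YAML text.

    Strings are written as JSON double-quoted scalars, which are also valid YAML.
    Only booleans, lists, and scalar values are handled in this sketch.
    """
    lines = []
    for key, value in options.items():
        if isinstance(value, bool):
            lines.append(f"{key}: {'true' if value else 'false'}")
        elif isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"  - {json.dumps(str(item))}" for item in value)
        else:
            lines.append(f"{key}: {json.dumps(str(value))}")
    return "\n".join(lines) + "\n"


# Hypothetical Edge sites; each gets its own config text with a site label.
sites = ["store-001", "store-002"]
configs = {
    site: render_k3s_config(
        {
            "write-kubeconfig-mode": "0644",
            "node-label": [f"site={site}"],
        }
    )
    for site in sites
}
print(configs["store-001"])
```

Each rendered string would then be written to the config path on the matching machine before K3s starts, so the same binary comes up with site-specific settings.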
Marc: That's cool. And we've been talking about K3s, I want to shift for just a second.
There's an ecosystem of other Rancher projects around this. There's K3d, K3OS.
There's another distribution, RKE2.
How are these all related back to K3s? Do they share the same code or what?
Keith: It's interesting.
Because of the popularity of K3s, what's happened is that it's almost like the tail wagging the tiger.
So just to give some context, Rancher provides its own Kubernetes distribution.
There's actually two, three. So let's talk about that historically.
Historically there was RKE, Rancher Kubernetes Engine.
It was a container-driven Kubernetes distribution. Okay?
The next version of that is RKE2 and what we've done there positionally is that because of my public sector experience and understanding of that space, what we've decided to do strategically was go 120% into that space, capture the security requirements for both regular use cases from a security enhanced perspective, as well as disconnected Kubernetes environment use cases.
Which are very different, okay.
There's some-- It's corner cases there that we have to take care of, and we're doing a very good job of that.
So RKE2 is all about winning in that space, and the upstream project is called RKE2, the whole Git Repo.
What we've done is we've decided that that's the tip of the spear, so all of the security enhancements, the containerd and SELinux upgrades and things like that, the crypto modules, we put all of that into RKE2 to meet our government security requirements.
And then we package that up and brand it RKE Government.
Because it doesn't have complete feature parity today with RKE1 from a Rancher management perspective.
It is proper Kubernetes but Rancher's ability to deploy and manage that is just a little bit behind, but we're closing that gap very quickly.
And once that's done we're going to say okay, here's a new version of RKE, which is RKE2.
Hopefully that made sense. Okay.
So back to the original question, K3s because of the simplicity model and the single binary in the user experience from a developer or deployment perspective has been so popular that that model is bleeding over into RKE2's development.
And so they both share kind of a launch mechanism, I think it's called the supervisor, where you run a single binary. RKE2 is container driven, so it will pull down all the container microservices and sling out your cluster as per your specifications.
So think of it as a data center grade version of K3s where you can have finer control over the number of Etcd servers, API servers, et cetera. So it gives you the ability to tune the cluster for high performance environments in large clusters, so that's kind of the mission there.
K3s still has that supervisor model, you launch it as a single binary but all the services are within that binary.
So it's bringing everything to the table, there's no external dependencies to have Kubernetes running in that case.
So we see that being adopted for, in your case developer use cases, or for these Edge use cases where it's absolutely taking off there.
But they do share a similar code base there.
Marc: So limited bandwidth and stuff like this, or even completely offline disconnected environments.
K3s is great because everything's self contained right there.
Keith: Yes. And the question about the projects, we've got a few.
I'm going to speak on one in the US public sector that's getting a lot of traction as well.
And this is not going to really matter to most folks on the podcast, but we have a project called Hauler, H-A-U-L-E-R.
And it's kind of a build system where it will package up Kubernetes into a single TAR file.
And you get on the other side, or what we call the high side or disconnected environment, and you run Hauler and it unpacks everything, stands up the cluster.
If you've installed Helm charts it does all of that, pulls down the containers and you have a fully disconnected running stack, from the bottom to the containerized applications running on top of that cluster, within a fully disconnected environment.
I mean it's phenomenal.
So we're going to be releasing that officially probably within the next month.
But that's a really exciting project because, again, using the DoD and the intelligence community and US government as our leading requirements indicator if you will, sorry about that, we're building tools like that to serve disconnected environments.
And those solutions have direct applicability in other industries like financial services, banking et cetera, healthcare.
Where you see very similar disconnected environments.
For example, we're talking to some folks to do K3s enabled x-ray machines.
And they're going to run their containerized apps and all the x-ray data is going to come to the local cluster right there in the room with the machine, and then they process it and move it along, things like that.
But that whole environment is the same type of disconnected network that you'd find in some of the military use cases.
So it's a very, like I said already, it's a very exciting space and I'm very privileged to be where I am to see these opportunities and to help shape some of these opportunities.
Marc: Yeah and I think you talk about these government environments that you need to deploy to, but if you solve that you generally are solving the same highly regulated compliance environments in financial services, healthcare.
There's lots of commercial adoption that you unlock by solving those hard to get to environments.
Keith: And so that's really the overall strategy, to create a generic Edge solution based on K3s and Rancher and the SUSE SLE Micro OS.
Again those three pillars that are in my opinion an MVP for any Edge solution.
Whether it's from us or somebody. So we have this generic baseline solution that we can apply to different industries.
In addition to that though some industries are even further regulated, particularly the auto industry because there's safety Linux.
I don't think cloud native apps will be in that space for some while because it's just there's too much to review, there's too much regulation to go through.
So those would require a specialized solution.
So it's not going to apply to every industry, but for the most part we can solve I would say 80% of the far Edge use cases with a solution like that based on K3s.
Marc: Great. You guys shipped K3s, lots of traction, lots of downloads. Lots of use.
But recently SUSE donated the project to the CNCF.
Can you talk a little bit about that decision to make it a CNCF project and not a Rancher project any more?
Keith: Yeah I mean that really speaks to the thing I said earlier about we see Kubernetes everywhere, and that move for us was to further that philosophy.
Where we want to give it to the world, and to accelerate the commoditization of Kubernetes everywhere.
Almost like electricity, like a utility.
And so it's almost like the Elon Musk Tesla strategy where the faster they can build those charging stations the better the network overall's going to be in terms of adding value on top of that with better cars.
Not that we're anywhere near Elon Musk in terms of stature, but the idea is to accelerate Kubernetes being everywhere, at the Edge, in space as we mentioned earlier, in these disconnected environments, in your house.
And then build real value on top of that standardized API of the CNCF certification, which is great.
So that's the mission. That was our main push to doing that.
And the second is that by doing that we would like to see multiple organizations participating in the promotion, the evolution of K3s as that standard.
And to make sure that we're in lockstep with the overall Kubernetes ecosystem going forward.
Benjie: So going forward, what's next on the roadmap?
Keith: It's about security.
So we were one of several companies who helped the United States Air Force with their Platform One project.
And also submitted some documentation to DISA.
And what's created there was the first ever DISA security technical implementation guide, the STIG.
I'm trying to avoid being too technical in the US public sector space with acronyms.
But there is an official DISA STIG for Kubernetes.
Now the problem we had with K3s is that it's not individual services that have configuration files. It's a single binary.
And so we need to make sure from a security perspective, because this is being adopted heavily in the DoD space, that K3s meets those STIG recommendations.
So that's the next thing for us.
Second to that, we need to make sure that our internal crypto libraries are using FIPS-validated modules.
So that's big too. And then we get things like SELinux from the host operating system, so we need to play nice with that.
But security is our next broad theme to get K3s better adopted by public sector first, other industries second.
And it's like we want to mirror the RKE2 model specifically with K3s.
So that's next for us in terms of roadmap.
Marc: That's great. On that, the security side, you mentioned a lot of stuff around runtime security, encryption algorithms and things like this.
Are you also looking at things around the supply chain and software bill of materials?
Seems to be everybody's talking about it since the SolarWinds hack, which is great.
But it's also a hard problem to solve.
Keith: Yes. I'm glad you brought that up.
So this is actually one of the huge benefits that we get from the SUSE acquisition.
And so if you look at the legacy and the artifacts that we have from decades of operating system builds from the SLE family, we are actively looking to better our Rancher process based on what the SUSE folks and their expertise bring to the table.
So you're going to see some changes around that. In fact the SLE Micro operating system takes advantage of the SLE artifacts, as I said.
So Common Criteria certification, the FIPS-validated modules.
It's basically like we have a candy store of things to help with our security positioning, and we're very excited to put things together in the right order to meet our security requirements.
But yeah the supply chain piece is largely resolved by the value that SUSE brings to the table from the Linux side.
We will and are adopting a lot of that and embracing that and sending that value downstream to our customers.
Marc: That's awesome when there's such a perfect marriage there of you have this open source modern application K3s to bring it easier, but SUSE has the framework and the infrastructure to think about that and you can just join the two and actually have a better product.
Keith: Yes, exactly that.
Marc: Want to shift for one more second, there's another project from Rancher that's in the CNCF Sandbox.
Not a Kubernetes distribution, Longhorn which is a storage backend.
Are you working with the Longhorn project also?
Keith: I am indirectly. There's a guy on my team, his name is William Jimenez.
He is the product manager for Longhorn.
Marc: Cool. And then the question is just really at a high level, is the need for Longhorn was that discovered by the distributions of K3s out there in realizing hey storage is really hard and everybody's storage is too complicated, we need to do the same thing for storage that we did for Kubernetes?
Keith: Yeah that's the similar philosophy.
We wanted to make storage easy in the context of Kubernetes.
So Longhorn gives you a great way to deploy it via a Helm chart, and it's got a great user interface to quickly give you persistent volumes for your containers with Kubernetes.
It takes that simplicity model into effect and it carries it throughout the whole lifecycle of the product, and so that's what's really driving Longhorn.
Marc: Cool. So storage, Kubernetes.
Are you doing anything with networking?
Another hard problem that everybody has in Kubernetes.
Keith: No, we're not touching networking.
But we are looking to go after the hyper converged infrastructure space with a complete stack that's 100% open source.
That project name is called Harvester.
Where, to your point about Longhorn, it includes Longhorn in there for storage, there's KubeVirt.
All of this is built on Kubernetes. And again just to recap, you have to understand our philosophy.
So we expect the Kubernetes API to be everywhere.
And so at the bottom of Harvester is K3s, again, which gives us that standard CNCF API that we can target for infrastructure management.
And so Harvester is designed to solve three areas. Well, there's virtual machines that we may need to spin up, and we can manage that virtual machine lifecycle from the Kubernetes interface. Which is great. And then if you want to run Kubernetes clusters on top of that you can do that as well. If you want to do both, Rancher can manage all three of those scenarios.
So think of a future version of Rancher, or future capability within Rancher, that would manage a cluster of Harvester nodes and give you a vSphere-like capability against a hyperconverged infrastructure.
So it's-- Taking Harvester as an appliance that gives you all three of those things out of the box.
You get VMs, you get Kubernetes, and you get the storage of course to support the VMs.
But Rancher can see that cluster and manage that cluster at scale.
Marc: That's actually really useful when you think about large organizations trying to figure out how to adopt Kubernetes, and how they can take some of the operational practices that they have and apply them into a Kubernetes world without having to throw it all away, what they've done for years. So that's cool.
Keith: And we're seeing a lot of that, because people are running things like OpenStack or they're running VMware and they want to go cloud native but they can't drag the applications over fast enough. Right?
So some of the greenfield apps, they're obviously going to be cloud native from the beginning, but if we could stand up a few racks of gear we could now have a place to bring over the VMs and have the entire thing be cloud native at the core, which is really what the target is.
Marc: So if I'm thinking about hey I want to try K3s, do you have any recommendations on where I should start?
Should I think about it as a local Kubernetes cluster?
Or should I start thinking about it for a single app running in production? Where do you see success?
Marc: All of them.
Keith: Yes. So the thing is, I mean the quickest way to get started--
Again, this is coming from a product manager so I'm not the most technical, but go to k3s.io. There should be a curl command there. Get a local VM or a Linux box, run that curl command, and 45 seconds later you'll have a CNCF certified Kubernetes distribution at your beck and call.
That's the quickest way to get started.
I have a small three-node cluster of Intel boxes running it right now.
I've got Rancher running on top of that.
Just to test things out and stay fresh and manage some things downstream.
So that's the quickest way to do it, k3s.io is your answer.
Marc: Just curl, pipe to bash, and Kubernetes. Boom, you're done. Well that's pretty cool.
Benjie: Let me ask you a different quick question here.
Is there any reason not to use K3s and rather just go straight K8s?
Is there any reason why-- because yeah, I'm pretty convinced I should never touch raw K8s again after talking with you for this last bit of time here.
So tell me when I should use K8s?
Keith: It's a great question, and that will illuminate K3s' power, so let's talk about it.
So there may be some use cases where your cluster is sufficiently large that you need to have better precision over the landscape of your services within the cluster.
We talked about this a little bit earlier, I believe, where you may have a certain number of etcd nodes versus the API server count versus the other services. Right?
So with K3s you're limited to either control plane node or worker node.
And so it's kind of blocky in that sense, where you don't have the same level of granularity over the split between the services.
It's kind of all or nothing for K3s.
So that's going to limit you for extremely large clusters, because it just will.
So there is probably some cluster size limit where K3s is not the best answer.
But that's still a lot of nodes, okay don't get me wrong.
It's still a lot of nodes. Like again for very large clusters, K3s is probably not your best solution.
Benjie: Okay, that's fair.
Marc: But on that, you don't have to obviously name any customer names or anything like this, but just to get an idea of that order of magnitude what are some of the largest size production clusters or just large scale K3s installations that you've come across?
Keith: I don't have the answer to K3s numbers, but for RKE the original RKE and RKE2, it's thousands of nodes.
Marc: We're still talking a pretty large cluster.
Benjie: So that brings up another thing, just how do you look at meshes and that type of inter cluster communication possibly?
Or external cluster communication?
Do you guys have a-- is there a project that's coming up, anything like that?
Keith: We don't have a Rancher originated mesh project but we are participating in the Submariner project.
I think we're sort of eyeing some folks over at Red Hat.
That looks very interesting from a cluster to cluster perspective.
And there's some interesting use cases for decentralization around that as well.
As far as service meshes within a cluster, we're hoping it's just standard Kubernetes and you can load whatever you want to put there.
It depends on the use case, so we're agnostic on that front.
Marc: That's why Kubernetes is good, right?
You build a conformant cluster and you can bring in whatever you want to.
Keith: Exactly that, yes. Agreed.
Marc: So if I'm a developer and I want to get started contributing, maybe I think here's something that I want to add in the K3s.
Can you talk a little about the community meetings and how you're engaging the developers out there?
Keith: Yeah it's just standard upstream.
At my level I am not on a day to day basis directly involved in the community engagement.
So probably some other folks would be better suited to answer that question from an engineering perspective.
But we listen to the community intently as I said earlier.
And if you can go to the rancher/k3s GitHub repo and start there, that's probably the best place to go.
Marc: Cool. And is there a particular type of feedback that you're looking for as you're shaping the next couple of versions of the roadmap right now?
You talked a lot about the security aspect that you're focusing on and the effort the team is putting into that. Are there particular use cases or feedback that would help you right now?
Keith: Yes. I'm going to give you an answer with K3s and then we'll talk about something else after that.
So the use cases that we're seeing are largely these Edge cases, and so any requirements around, let's say, Arm architecture would be really interesting to get feedback on.
Any interesting use cases around Arm plus TPUs or Arm plus FPGA boards would be of interest to us.
Because we want to make sure that we can say yes to all of that from a supportability matrix perspective.
And the other thing that we're interested in is requirements-- Not necessarily related to K3s because we see K3s and K3s clusters as downstream clusters that should be managed by something like Rancher.
So we have this capability called continuous delivery, I mentioned it earlier where it's based on our project upstream called Fleet and it's our GitOps model to manage downstream clusters.
And if there are use cases around the Edge scenarios with K3s and Fleet we would love to hear those.
So any kind of crazy configuration.
I mean we talked about Hypergiant with satellites, and everything that comes with that from a disconnected and latency perspective we're very aware of.
But anything that's very out of the ordinary.
The more out of the ordinary or diverse the use cases, the more we want to hear about it.
Because we want to make sure that, again going back to Kubernetes everywhere, that we can actually support those crazy use cases and there are quite a few of them out there.
But yeah, that's the kind of feedback we'd love to have.
Marc: All right. Let's talk about Fleet for just a minute.
We're big fans of GitOps in general, is Fleet really targeted around just the Kubernetes cluster in managing those or applications?
What does Fleet do?
Keith: Yeah, so Fleet is Rancher's philosophy for GitOps.
We see it as the core technology to give us the management at scale capability.
There's a really cool blog article that Darren Shepherd wrote where we took Rancher, we scaled it up to manage a million downstream clusters, and he talks about all the road blocks that we ran into with Kubernetes.
For example, I didn't know this at the time, but etcd apparently has an eight gig key value store limit and the default is set to two gigs.
So you can do the math on the number of objects you might store and you basically tap out at like 100,000 downstream clusters.
So this is why KINE, the thing we talked about earlier, the shim layer to swap out the backend database was needed for this scale initiative.
So in the blog he talks about how using KINE we pointed to--
You can use PostgreSQL or MySQL, more specifically RDS with one of those two API protocols for the database.
And that gave us the ability to go beyond that eight gig database limit and crank up the scale pretty awesomely up to a million clusters under management.
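Editor's note: to make the arithmetic Keith alludes to concrete, here is a minimal Python sketch. The two gig default and eight gig ceiling come from the conversation; the 80 KiB of stored objects per downstream cluster is a hypothetical figure chosen only so the result lands near the 100,000 he mentions, not a measured value.

```python
GIB = 1024 ** 3

# Figures quoted in the conversation.
ETCD_DEFAULT_QUOTA = 2 * GIB
ETCD_MAX_QUOTA = 8 * GIB

# ASSUMPTION: hypothetical average footprint of one downstream
# cluster's objects in the key value store. Not a measured number.
PER_CLUSTER_BYTES = 80 * 1024


def max_clusters(quota_bytes: int, per_cluster_bytes: int) -> int:
    """How many clusters' worth of objects fit inside a storage quota."""
    return quota_bytes // per_cluster_bytes


print(max_clusters(ETCD_DEFAULT_QUOTA, PER_CLUSTER_BYTES))  # 26214
print(max_clusters(ETCD_MAX_QUOTA, PER_CLUSTER_BYTES))      # 104857
```

Under that assumed footprint the store fills up at roughly 100,000 clusters even with the quota raised to its ceiling, which is why swapping the backend database was the way past the limit.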
So Fleet was the technology in that use case.
So we're very comfortable with its ability to scale.
I mean we've not run into a customer that has a million downstream clusters today.
We see multiple tens of thousands though, so just to give you some perspective of real world use cases.
And so what Fleet does is it's GitOps driven, meaning that you can manage all of your sources of truth in the Git repo.
It's a two-stage pull model where it clones the Git repo and contrasts that repo against the RBAC that you assign to the downstream clusters.
And then it prepares what's called bundles.
And so when the downstream clusters phone in to say hey do you have something for me to do?
Fleet has operators on the Rancher side and has operators on the downstream cluster side.
And the Fleet operators talk and say hey look there's something for you here, download the bundle and then reconcile that against the local cluster and we're good to go at that point.
The pull model is the only one that really scales.
If you do a push model what happens is that you pretty much just wind up testing the network connectivity to the downstream clusters.
So that's Fleet in a nutshell and it's very powerful, it's very successful.
It's elegant, it's the right thing to do, it gives us that scale, that management--
It's the third pillar of our solution by the way, and we are slowly moving Rancher to use the Fleet model for 100% management of downstream clusters going forward.
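Editor's note: the pull model Keith describes can be sketched as a toy simulation in Python. This is not Fleet's actual API. The `Bundle`, `Manager`, and `Agent` classes and their methods are hypothetical names invented for illustration, and the reconcile step simply replaces local state with the bundle's contents.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Bundle:
    """Hypothetical unit of work prepared from a Git repo revision."""
    revision: str                # e.g. a Git commit id
    resources: Dict[str, str]    # resource name -> desired spec


@dataclass
class Manager:
    """Hypothetical stand-in for the management (Rancher) side."""
    bundles: Dict[str, Bundle] = field(default_factory=dict)

    def publish(self, cluster: str, bundle: Bundle) -> None:
        # Stage a bundle for one downstream cluster. Nothing is pushed.
        self.bundles[cluster] = bundle

    def check_in(self, cluster: str, current: Optional[str]) -> Optional[Bundle]:
        # Answer "do you have something for me to do?"
        bundle = self.bundles.get(cluster)
        if bundle is None or bundle.revision == current:
            return None
        return bundle


@dataclass
class Agent:
    """Hypothetical stand-in for the operator on a downstream cluster."""
    cluster: str
    revision: Optional[str] = None
    state: Dict[str, str] = field(default_factory=dict)

    def poll(self, manager: Manager) -> bool:
        # The downstream side initiates every exchange (pull, not push).
        bundle = manager.check_in(self.cluster, self.revision)
        if bundle is None:
            return False
        # Reconcile: make local state match the downloaded bundle.
        self.state = dict(bundle.resources)
        self.revision = bundle.revision
        return True


manager = Manager()
agent = Agent(cluster="edge-1")
manager.publish("edge-1", Bundle("abc123", {"deployment/web": "replicas=2"}))
print(agent.poll(manager))  # True: a new bundle was pulled and applied
print(agent.poll(manager))  # False: already at the published revision
```

Because the management side never opens a connection to a downstream cluster, a cluster that is offline simply picks up the latest bundle the next time it phones in.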
Marc: And it's open source?
Keith: Everything we do is open source.
In fact one thing that kind of surprised me when I joined Rancher coming from Red Hat was that the Rancher code upstream is identical to the Rancher code downstream--
In the past, we're changing that a bit to be more corporate friendly.
But we didn't have a notion of subscription keys or anything. And it really kind of shocked me.
It was almost like the purest implementation of open source that you could find in a commercial business.
And customers would pay us for support and we would just verify that they had support and then we would support them.
But the code, the bits, were exactly the same.
And we're still doing that, but the thing that we're changing going forward is that we want you to register in the support system on the SUSE side, and we're going to change the color so you can easily see visually that you have a supported install, but the bits are exactly the same.
So we're not going too drastically into the traditional model of subscriptions but we do have a little bit of a light layer on top of that, but the bits are identical.
There's no real notion of upstream and downstream on the Rancher side of the house.
Benjie: So all this stuff is pretty interesting.
Is there a world where the big clouds are just using K3s as my managed Kubernetes service?
Is that coming? Or is that the dream, is that the goal? Or--
Keith: It's coming. It's interesting, because if you look at the big three you've got Amazon, Google, and Azure.
They all have slightly different approaches to what we call the far Edge.
Some look at K3s and want to partner with us, and so we're having early discussions right now.
I can't go into details because SUSE's a public company at this point so there's very little I can say about that.
But there are interesting Edge use cases where-- let me give you an example of one generically, okay.
In what we call the near Edge, the near Edge is kind of nearer to the core services.
It's really the realm of the telecommunication companies.
The big communication providers, 5G networks, the whole thing.
Cable companies, multi service operators, et cetera.
They are servicing use cases called multi-access Edge computing, or MEC. M-E-C for short.
And we're seeing opportunities where the MEC side of the house is put into Amazon or Azure or GKE, and then the downstream clusters are K3s running.
So there could be Azure with control plane services and supporting services, but K3s at the Edge.
It could be the same thing with GKE from Google, et cetera. So those are really interesting use cases in conjunction with--
So it's really kind of a partnership with the big providers to provide best of breed solution.
And it really brings home the constant thing that I've been saying on this call is that we expect Kubernetes to be everywhere, and by having the best of breed, a CNCF certified API and distribution of Kubernetes we get a tremendous amount of value and choice for the customer implementation.
So we love providers, we love the hyper scalers.
Because we don't have that infrastructure, but we do have solutions that run very well at the Edge and we do have Rancher as a form of a control plane that is absolutely agnostic to Google, Azure, or Amazon.
We can see all three of those, we can provision to all three of those.
With the 2.6 release, pretty much anything that you can do from each one of those consoles you can do from the Rancher console.
Which is a very powerful statement when it comes to multi cluster management with Kubernetes.
And so that gives us the control plane side of the house, going back to the MEC use case.
Running on top of that provider's infrastructure and then K3s is the downstream piece for those joint solutions.
So it's a powerful combination and something that we look forward to extending in the future.
Benjie: Okay so this stuff is a little mind melding for me, but--
So okay. I'm running my new MacBook and I've got K3s on my Arm chip, perfect.
I don't do this, but if I were to run a Windows machine does K3s fit into that ecosystem whatsoever? How does that look?
Keith: Yes. So there's a few things. Rancher has support for Windows, number one.
So you can run worker nodes on Windows and you can run Windows containers on those worker nodes, but the control plane would be a Linux based control plane.
So that's a Kubernetes story. So that's one story.
The second story is that Rancher Desktop with K3s will run on Windows, day one.
So we're targeting macOS and Windows for the set of releases for Rancher Desktop.
Marc: Is that like using WSL to solve the control plane problem?
Or how are you getting around that?
Keith: It is. It is using the subsystem to do that, yes.
Marc: So kind of the last question that I had was, K3s right now is a sandbox project.
Lots of adoption, lots of traction, lots of use cases out there.
What goals are you using to measure what it's going to take for you to apply for incubation and get out of the sandbox?
Keith: Well I think the main one is having other organizations join us, having other individuals push it.
It's already very popular, but we don't want to be perceived as a single company open source project.
Because those tend to die over time.
So we want to make sure that we have good cross-pollination between companies to participate in the furtherance of K3s.
So that's kind of an internal goal for us.
And again it backs up the philosophy that we want to see Kubernetes everywhere.
So I think that now that it's in the sandbox and you have community engagement and there's more of an open governance model it's probably not going to be very hard to see now that everybody can do it and they know it's not a Rancher project specifically.
Marc: I don't have any other questions.
Anything else that you'd like to chat about, about K3s or any of the Rancher projects?
Keith: No, I think we've covered quite a bit of it today.
And I very much appreciate the time to have this discussion.