Ep. #10, Crossplane with Daniel Mangum of Upbound
Marc Campbell: Hi everyone.
I'm here today with Daniel Mangum, a senior software engineer at Upbound, to talk about their sandbox project called Crossplane.
Daniel Mangum: Hey, thanks for having me, Marc.
Marc: To get us started, for anyone who's not familiar with Crossplane will you start us out by just explaining what the project is?
Daniel: Yeah, absolutely.
So Crossplane has a number of different facets but the main one that folks are introduced to at the beginning is bringing cloud services to your Kubernetes clusters.
So Kubernetes handles workload orchestration in the form of containers really well.
A lot of times the different workloads and applications that you're running need to consume external services.
So kind of in the first iteration of that with a lot of Kubernetes projects, we tried to bring in things like Postgres and run it in a containerized fashion in the cluster.
And folks started realizing, why would I do that when there's really great cloud services available to me where I don't have to worry about uptime and that sort of thing.
I'd like to go consume those, but it's a pretty fragmented process to provision my applications in one way and then try to wire them up to infrastructure that's been provisioned in another way.
So Crossplane initially just brought those cloud services into a Kubernetes cluster in the form of CRDs.
So basically extending the Kubernetes API to allow you to create an RDS instance for your Postgres database in the same way that you're creating your Kubernetes pod or something like that.
And since then it's grown to involve packaging and composition as well which are topics I'm sure we'll get into later.
But basically allowing you to define a platform for your organization that presents kind of a customized user console like you get on a cloud provider but it's specific to your organization and the types of infrastructure you want to deploy.
Marc: Okay so like let's, let's dig in there, that's actually really interesting.
So it allows me to create a platform for the organization, with a dashboard.
Can you help explain to me a little bit what that means?
Like if I'm running a development team here and they're writing an application that needs new services--
Maybe let's dig into the example that you gave, you know a Postgres database.
How would I use Crossplane to provision or deploy that Postgres database with my application?
Daniel: Yeah so there's a number of different ways you could do it.
So Crossplane takes kind of a familiar model of using different plugins, which we call providers, which folks will recognize from infrastructure as code tools like Terraform.
It's a similar model except these are all running in Kubernetes.
And we'll talk about the implications of that in a bit.
But these different providers bring what we call the managed resources, which are the granular cloud provider types.
So that would be something like an RDS instance, an EC2 instance, or, you know, a GKE cluster or something like that.
And they just bring them to your cluster, right?
So now you can kubectl apply your RDS instance just like you would your Kubernetes deployment or something like that.
And then Crossplane itself has a number of different responsibilities.
First of all, it manages all of the different providers that you have installed.
And it makes sure that they, you know, don't step on each other: you don't want two different providers managing the same managed resource, right, and conflicting or duplicating infrastructure or something like that.
And then it also allows you to do what we call composition.
So, you know, within an organization the infrastructure team likely wants to have a lot of control over, you know the granular resources that are provisioned.
And they may have a lot of experience with cloud providers and know exactly how to configure the hundred possible fields on an RDS instance.
Application developers on the other hand, probably do not want to know all about how an RDS instance works or, you know even getting further down the line we're not even talking about single resources, right?
When you deploy an RDS instance or a Kubernetes cluster on a cloud provider, you're likely also having to set up networking, setting up different security policies, firewall rules, etc.
And bundling those all into an abstract resource that you can present to developers and say, as an infrastructure team we've kind of developed this template and now you can go into your namespace in the Kubernetes cluster and create an instance of that, right?
So you'd have an abstract type that would be something like Postgres database and that may be satisfied by a composition of managed resources, such as an RDS instance, a VPC, and some security groups or something like that.
But the users, the developers are only looking at kind of this abstract type.
So it's kind of like a build-your-own Heroku experience if you have familiarity with some of those more lighter weight cloud providers.
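The composition idea Daniel describes here can be sketched in a few lines of Python. This is only an illustration of the shape of the idea, not Crossplane's actual API: the `PostgresDatabaseClaim` class, the `compose_rds_postgres` function, and every field name below are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PostgresDatabaseClaim:
    """The abstract type a developer creates (hypothetical schema)."""
    name: str
    namespace: str
    storage_gb: int


def compose_rds_postgres(claim: PostgresDatabaseClaim) -> list[dict]:
    """Render one abstract claim into the granular resources behind it.

    The kinds and fields are illustrative, not real provider schemas.
    """
    prefix = f"{claim.namespace}-{claim.name}"
    return [
        {"kind": "VPC", "name": f"{prefix}-vpc", "cidr": "10.0.0.0/16"},
        {"kind": "SecurityGroup", "name": f"{prefix}-sg",
         "vpc": f"{prefix}-vpc", "allow_port": 5432},
        {"kind": "RDSInstance", "name": f"{prefix}-db", "engine": "postgres",
         "storage_gb": claim.storage_gb, "security_group": f"{prefix}-sg"},
    ]


# The developer only ever fills in the three fields of the claim.
claim = PostgresDatabaseClaim(name="orders", namespace="team-a", storage_gb=20)
resources = compose_rds_postgres(claim)
for resource in resources:
    print(resource["kind"], resource["name"])
```

The developer sees three fields; the template written by the infrastructure team decides everything else.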
Marc: So that all makes sense, what does this replace?
I mean, you take it all the way down to the core building blocks there where, you know--
Obviously I can use something like the AWS console and go create that RDS instance in the VPC and obviously there's all kinds of challenges there like you just described around uniformity, like the developer doesn't know all those settings.
But there is other tools in the ecosystem, right, that existed before Crossplane, maybe Terraform and Pulumi are two that come to mind.
Can you help explain a little bit about the pros and cons of the various approaches?
Daniel: Yeah, absolutely.
So broadly Terraform, Pulumi, going back a little bit further than that looking at things like, you know, Ansible and that sort of thing are considered infrastructure as code tools, right?
And we're moving more towards this kind of infrastructure as data future, which is sort of a buzzword that you may have heard.
But essentially the difference is that in an infrastructure as code solution you're going to have these one-off, kind of imperative runs of, you know, go create this infrastructure for me.
And then the process that does that spins down at the end of that and says, I've done what you asked and now your infrastructure is there.
Running in Kubernetes with Crossplane means that you have something that's constantly reconciling your infrastructure, right?
So when you create your RDS instance, it doesn't just go and say, I've created your RDS instance, good luck.
It says, I've created your RDS instance with the parameters that you specified, and I'm going to make sure that it stays up to date with those parameters, right?
So if someone comes into the console and messes with it or if there's some sort of service degradation it's going to make sure that those resources stay up to date.
So that's one difference. Another one is that you're kind of taking the abstractions and I like to say persisting them to the cluster.
So a lot of folks say, you know I can use Terraform or something like that to define these abstractions at creation time.
And, you know, that gives a user a kind of like object oriented programming or something like that a more friendly interface to interact with these managed resources and provision them on the cluster.
But then once they're there, right they're just the granular resources.
In the Crossplane model we like to keep this interface kind of as the thing that the user interacts with.
So in the cluster, you may have this abstraction, right? You create an instance of your Postgres database as a developer, and that object in your Kubernetes cluster continues to live on, right?
And you continue to interact with it.
It's what provides you the connection details and that sort of thing.
And you can reference that through various Kubernetes mechanisms to, you know include those connection details and wire them up to your Kubernetes workloads.
So it's a completely native Kubernetes process for provisioning and consuming this infrastructure.
And one of the benefits that comes along with that is there's a lot of other CNCF projects, a lot of them that you've had on this podcast actually before me, they do a lot of really incredible things.
And because we're all standardized on the Kubernetes API it allows you to use that functionality alongside Crossplane.
So a great example that I like to talk about and I'm not sure if you've had them on the podcast yet but the folks over at Open Policy Agent which is a really powerful tool that basically allows you to write policy for objects that are created in your Kubernetes cluster.
You know, once we've moved to represent our infrastructure as Kubernetes objects, we can start to use tools like Open Policy Agent to say, you know, this person or folks in this namespace can't create a database with these parameters, but they can with these other ones.
And so you start to get really big benefits from standardizing on this control plane which is really the most powerful part of the Kubernetes API, right?
The container orchestration part is kind of just an implementation detail.
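As a rough illustration of the policy idea above: once a database is just another object submitted to the cluster, a rule can accept or reject it based on the namespace and the requested parameters. Open Policy Agent policies are written in Rego, not Python, so the sketch below is only a stand-in for the concept; the `ALLOWED_SIZES` table, the `admit` function, and the object layout are all assumptions.

```python
from __future__ import annotations

# Hypothetical rule table: which database sizes each namespace may request.
ALLOWED_SIZES = {
    "dev": {"small"},
    "prod": {"small", "large"},
}


def admit(obj: dict) -> tuple[bool, str]:
    """Decide whether a database object may be created.

    Returns (allowed, reason). Unknown namespaces are denied by default.
    """
    namespace = obj["namespace"]
    size = obj["spec"]["size"]
    if size in ALLOWED_SIZES.get(namespace, set()):
        return True, "allowed"
    return False, f"namespace {namespace!r} may not create a {size!r} database"


print(admit({"namespace": "dev", "spec": {"size": "large"}}))
print(admit({"namespace": "prod", "spec": {"size": "large"}}))
```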
Marc: Yeah so it's like integration into that whole ecosystem just unlocks a lot.
Even things like ArgoCD or Flux, which, you know we've had on the podcast and talked to.
You know it sounds like if I'm using Crossplane I can integrate into that ecosystem exactly and use GitOps to deploy the database and the underlying cloud infrastructure also.
Daniel: Yeah, absolutely.
Argo is definitely a big one that we see folks using with Crossplane and, you know previously where you may have had an automated process that runs your Terraform for you, or, you know executes Pulumi code or something like that you know, now this is all included in a single process.
And once again, having those abstractions persisted to the cluster means that you can have those abstractions in your GitOps process as well right?
So the infrastructure team could have a GitOps process where they are creating the abstractions right?
Kind of defining the platform if you will.
And developers on the other hand can have GitOps processes that are consuming the platform.
So they may have something like, you know, a pod and a Postgres database, and the infrastructure team on the other side may have something like a composition, which is our unit for grouping these different resources, and an XRD, which is the unit for kind of abstracting them.
Marc: Yeah, and that's great.
Like as a developer, I want a database to be available for my application, but like, I don't care like at all about how that database comes into existence.
And it sounds like even if I were to, you know have a really modern multicloud infrastructure set up I could actually run this and if it's like, you know, Azure and Google and AWS I'm running on all three of those, the three main cloud providers--
I could deploy the application and it'll use the native database as a service provider on each of them and allow my infrastructure team to have all the settings and the configuration so that I know it's going to work.
But like as a developer I just think I have a Postgres database and here's an end point to it.
Daniel: Yeah, absolutely.
And what you're kind of touching on is the infrastructure team enforcing policy across an organization, which is a really powerful component that's built into Crossplane.
So you're exactly right.
You know, folks are probably familiar with CRDs, custom resource definitions, which are how you extend the Kubernetes API. Crossplane has a concept, which I mentioned earlier, of XRDs, which are composite resource definitions.
So this is how you essentially define a new abstraction type, which eventually just renders out a CRD. So let's say our XRD in this example that we've kind of been going through is a Postgres database and then I have a composition that is, you know RDS Postgres or something like that.
And it has, you know, the RDS instance and VPC, et cetera that we've talked about.
You can also have other compositions that satisfy that XRD.
So I may have my GCP Cloud SQL Postgres composition or my Azure one, or I might just have, you know, within my single cloud provider, a large RDS configuration and a small one, or a dev and a staging one, et cetera.
And you can have those enforced, right by the different environment that the user is deploying into, or, you know potentially different settings that the user has.
And those can all be enforced by the infrastructure team but abstracted away from the user.
So they just get what they need and the infrastructure team makes sure it's within the organizational policies.
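One way to picture several compositions satisfying the same abstract type is a label match: each composition carries labels, and the environment supplies a selector. The sketch below assumes that model; the composition names, the labels, and the `select_composition` helper are hypothetical and are not taken from Crossplane itself.

```python
# Hypothetical compositions that all satisfy the same "Postgres database" type.
COMPOSITIONS = [
    {"name": "rds-postgres-small", "labels": {"provider": "aws", "environment": "dev"}},
    {"name": "rds-postgres-large", "labels": {"provider": "aws", "environment": "prod"}},
    {"name": "cloudsql-postgres", "labels": {"provider": "gcp", "environment": "prod"}},
]


def select_composition(selector: dict) -> str:
    """Return the first composition whose labels match every selector entry."""
    for composition in COMPOSITIONS:
        labels = composition["labels"]
        if all(labels.get(key) == value for key, value in selector.items()):
            return composition["name"]
    raise LookupError(f"no composition matches {selector}")


# The selector would be set by the infrastructure team, not by the developer.
print(select_composition({"provider": "aws", "environment": "prod"}))
print(select_composition({"provider": "gcp"}))
```

The developer's request is identical in every case; only the selector, which the infrastructure team controls, changes what gets provisioned.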
Marc: Yeah, that's great.
I mean, if you have SOC 2 compliance or other regulatory requirements, you as a developer don't necessarily have to understand that.
Obviously it may cause limitations or other constraints on how you use the service, but you don't have to worry about making sure that every new service that you spin up is compliant with whatever compliance and regulatory frameworks that you have to follow.
Daniel: Exactly, you're just reducing friction, right?
Your infrastructure team is getting what they want and developers are getting what they want, right, but there doesn't have to be direct interaction each time that operation actually takes place.
Marc: Yeah, and I think I heard you say something I'd like to dive into a little bit more: because Crossplane runs as a Kubernetes controller there's the reconcile loop, so there's a concept of drift detection.
And so I've created my cluster and then, you know somebody on a team manually went into the AWS console and modified a setting.
Did I hear you talk about the ability for Crossplane's reconcile loop to be able to detect that change and then bring it back?
Daniel: Yeah, absolutely, so you're exactly right.
It is running as a reconcile loop.
So both within the cluster and externally it's going to make sure to always be driving the status of your resource to your specification for that resource.
So you gave a good example there of someone going into the console and messing with it.
There's also just examples of someone violating policy in some way and it driving it back to a state that you've defined.
One of the examples we like to show is team management with GitHub, right?
So we've talked a lot about using different cloud providers.
A Crossplane provider can talk to any API.
In fact, one we may want to talk about down the line, which is kind of a unique example, is our Helm provider, but we have a GitHub provider that essentially allows you to do things like create repositories as Kubernetes objects, manage users, et cetera.
So a good example of that would be if you defined a team type, you know, for GitHub, and you created one that had a list of users in it, and someone went in and added a new user, the controller for that type would go and actually remove that user from the team and say, this is not what the, you know, declared configuration should be.
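The team example maps onto a very small reconcile function: compare the declared members with what the external API reports, then correct the difference in both directions. The sketch below is a minimal illustration under that assumption. `FakeGitHub` is an in-memory stand-in written for this example, not a real GitHub client, and a real controller would run this on every reconcile rather than once.

```python
class FakeGitHub:
    """In-memory stand-in for an external API (not a real GitHub client)."""

    def __init__(self, members):
        self.members = set(members)

    def list_members(self):
        return set(self.members)

    def add_member(self, user):
        self.members.add(user)

    def remove_member(self, user):
        self.members.discard(user)


def reconcile(desired, api):
    """Drive the observed team membership back to the declared membership."""
    observed = api.list_members()
    actions = []
    for user in sorted(desired - observed):  # declared but missing
        api.add_member(user)
        actions.append(f"add {user}")
    for user in sorted(observed - desired):  # present but never declared
        api.remove_member(user)
        actions.append(f"remove {user}")
    return actions


desired = {"alice", "bob"}
api = FakeGitHub({"alice", "bob"})
api.add_member("mallory")  # someone edits the team by hand

actions = reconcile(desired, api)
print(actions)                  # ['remove mallory']
print(reconcile(desired, api))  # [] because nothing has drifted since
```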
Marc: That's cool, so with Crossplane, you know, the goal then, it sounds like, is really just infrastructure as data, but for more than just the commonly thought of infrastructure, which is, you know, AWS, GCP, or Azure. It might be like every SaaS service out there that I need to provision?
Daniel: Yeah, absolutely.
I mean, we talked a little bit and touched on the possibility of a multi-cloud kind of setup with these abstractions.
What we actually see more frequently than multicloud setups which is a little bit over-hyped in the industry space potentially is folks that have you know, hybrid cloud setups, right?
So they are either transitioning from an on-prem data center to the cloud, or they have some highly, you know critical and regulated workloads that have to run in an on-prem setting.
And they can actually write their own providers or extend providers in kind of a lightweight manner to, you know have some workloads that are going to go to the cloud provider and some that are going to go on prem and they can have different labels on those resources to configure that.
So you as a developer say, you know, oh is this going to have PII in it, or something like that?
If so, you know, the infrastructure configuration there is going to go ahead and send that to on-prem, right, for your database, as opposed to in the cloud, and various other scenarios like that.
Marc: That's great, and because it doesn't just run at initial deploy time, then also as if I'm a startup and I'm not even worried about SOC 2 compliance or anything right now using Crossplane to deploy that infrastructure--
If later on down the road I start to take on more compliance overhead and burden in order to, like, work my way into larger and larger customers, I can just modify those XRDs in order to modify the underlying resources to make them compliant?
And bring them up to the standards that I'm now setting?
Daniel: Yeah so it definitely depends on the type of infrastructure in that case.
We're not a silver bullet in that, you know we can't make the cloud providers do something that they don't natively provide right?
We can just provide abstractions on top of them.
So in some cases that's absolutely correct where you could say, you know modify this resource to make it SOC 2 compliant if that involves, you know some sort of replication configuration for an RDS instance or something like that.
In other cases, it may be necessary, you know, to actually stand up new infrastructure or something like that.
But touching on that a little bit, I want to talk a little bit about Crossplane's packaging system because it's basically the method for distributing these providers and configuration packages.
So provider packages contain a controller and some CRDs.
And essentially what you do is you package those up into an OCI image that can be pushed to any registry like Docker Hub.
And when you install them, Crossplane is going to see that, it's going to start up the controller, and it's going to install the CRDs and make sure that the controller you just installed, the one that brought the CRDs, owns them.
That way you can't have kind of conflicting reconciliation there.
And then the configuration packages have things like the XRDs and compositions in them and they can also declare dependencies on providers.
So right if you have an abstraction, once again we can go back to our Postgres database that you know has compositions of AWS and GCP resources, what you can say is I'd like to bundle this all up into a configuration package that is called Marc's infrastructure package, and it has dependencies on provider AWS and provider GCP.
You can just with one command actually with the Crossplane CLI say, install Marc's infrastructure. And what it's going to do is go and fetch GCP, fetch AWS and then also your configuration and abstractions there and bring those into the cluster. And so you'll immediately get this kind of infrastructure control plane, and you can do that across different clusters if you wanted to.
So one of the things that we're pretty excited about for the future of Crossplane as we get more adoption is folks actually sharing their infrastructure platforms, right?
So let's say a larger startup says this is kind of the infrastructure that we use and we use these abstractions and present them to developers.
You can actually imagine a future where you have open source infrastructure platforms that someone could publish and say, other organizations can use this, they can modify it, they can publish their own versions of it and you can one-click install that into your cluster and have the same kind of infrastructure console that some you know, well-respected large startup has as you're just bootstrapping your company.
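The install flow described above, where a configuration package pulls in the providers it depends on, comes down to resolving dependencies before installing the package itself. The sketch below illustrates only that ordering. The `REGISTRY` contents, the package names, and the `install_order` function are hypothetical and say nothing about how Crossplane's package manager is actually implemented.

```python
# Hypothetical registry: package name -> package type and declared dependencies.
REGISTRY = {
    "marcs-infrastructure": {
        "type": "Configuration",
        "depends_on": ["provider-aws", "provider-gcp"],
    },
    "provider-aws": {"type": "Provider", "depends_on": []},
    "provider-gcp": {"type": "Provider", "depends_on": []},
}


def install_order(package, registry, installed=None, visiting=None):
    """Return packages in the order they should be installed, dependencies first."""
    installed = [] if installed is None else installed
    visiting = set() if visiting is None else visiting
    if package in installed:
        return installed
    if package in visiting:
        raise ValueError(f"dependency cycle at {package}")
    visiting.add(package)
    for dependency in registry[package]["depends_on"]:
        install_order(dependency, registry, installed, visiting)
    visiting.discard(package)
    installed.append(package)
    return installed


order = install_order("marcs-infrastructure", REGISTRY)
print(order)  # ['provider-aws', 'provider-gcp', 'marcs-infrastructure']
```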
Marc: Yeah that's cool, being able to kind of stand on the shoulders of other companies and kind of take that marketplace idea of how they've packaged all that infrastructure together to solve a problem is definitely, really cool.
Marc: You know, you talk about the providers that actually connect to GitHub or, you know, the cloud provider, whatever. Let's dig in a little bit there and understand what providers are supported out of the box. If I don't want to, like, write my own, what can I get started with today?
Daniel: Yeah so we take a kind of familiar approach of the Crossplane community maintains a set of providers.
We also have a Crossplane contrib org which is kind of alpha level providers you can imagine and folks that want to develop providers kind of with shepherding from the Crossplane community.
And then also some folks just have their own providers right that they've written for their very specific infrastructure inside their company.
The major providers that we support and we see folks using the most are obviously large cloud providers.
So AWS, GCP, Azure, Alibaba, and a few other ones.
We also see some support for providers for things like GitLab or GitHub mentioned before.
And then the other big one that we see, and this is kind of taking usage of Crossplane to the next level, I guess, is Provider Helm.
So a good example for this is we've talked a lot about creating abstractions for kind of a single infrastructure unit.
When you start to think of Crossplane, and the Kubernetes cluster that Crossplane is running in, as your single control plane, you can imagine spinning up other things that can run workloads, right?
So not just something that's consumed by workloads but something that itself can actually run workloads and have infrastructure consumed within it.
And so a good example of that is we have a couple of different configuration packages that do things like provision a GKE cluster and then put a helm chart into it.
So an example of what that could look like within an organization is the infrastructure team says all right, we're going to spin up a new Kubernetes cluster, a new GKE cluster let's say and we want to install some common operators into that.
So ones we see really frequently are things like the Prometheus Helm chart to provide metrics and that sort of thing.
And then we just connect our users to that.
So a lot of folks start to use Crossplane as kind of the central hub from which they spin out other Kubernetes clusters or other, you know VMs or bare metal instances or something like that.
And then being able to provision into those, and we have other providers in the works for things like cloud-init, or just general kind of like SSH, that can really start to kind of take your Crossplane consumption to a new level.
Marc: That makes sense.
I think, you know, it's also interesting and it's a totally different conversation to think about you know, how organizations are using Kubernetes whether they're putting everything in one large cluster and separating it through namespaces or creating these, these specialty clusters to run certain applications.
It sounds like you have a little bit of exposure to that.
Like, do you have anything that you've seen, best practices that would be interesting to talk about and share?
Daniel: Yeah so this takes us to an interesting design decision of Crossplane that a lot of folks notice right off the bat.
So you mentioned different forms of multitenancy essentially whether it's, you know customers within a cluster or internal development teams across clusters or within a cluster.
So the unit of isolation in Kubernetes, as you mentioned within a cluster are namespaces.
All of our infrastructure, our managed resource CRD types, are cluster scoped, right? So when you create an instance of one, it exists at the cluster scope.
So if you're in a single cluster, right with different development teams if they were creating managed resources directly those would exist at the cluster scope which may feel a little counter-intuitive from an isolation perspective.
But we take a pretty strong stance of this kind of separation of concern model, of infrastructure teams owning actual granular infrastructure and application teams owning the consumption of that.
Right so these XRDs that I mentioned before, they actually allow you to kind of spit out abstraction CRDs at both the cluster scope and namespace scope level.
So the general thing that we see here is that infrastructure teams are going to be in charge of all the granular infrastructure that gets deployed and in charge of the abstractions that are going to be exposed to developers, and then developers consume those from within a namespace.
So you say, we call them a claim kind of modeled after the persistent volume claim.
You say, you know, I have a claim for a database and that gets satisfied at the cluster scope but consumed from within a namespace which is an interesting model and a bit of a paradigm shift for some folks that are already using Kubernetes.
But we believe that when organizations really buy into that model, that it provides kind of the most organizational benefit when looking at infrastructure and application teams.
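The claim model can be pictured as two linked records: a namespaced claim that the developer owns, and a cluster-scoped object that actually satisfies it. The toy `Cluster` class below sketches only that relationship. The naming scheme for the cluster-scoped object and every field name are assumptions made for the example, not Crossplane's behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of one cluster holding namespaced claims and cluster-scoped composites."""
    composites: dict = field(default_factory=dict)  # keyed by name (cluster scope)
    claims: dict = field(default_factory=dict)      # keyed by (namespace, name)

    def create_claim(self, namespace, name, spec):
        """Record a namespaced claim and the cluster-scoped object that satisfies it."""
        composite_name = f"{namespace}-{name}-composite"  # hypothetical naming scheme
        self.composites[composite_name] = {"spec": spec, "claim_ref": (namespace, name)}
        self.claims[(namespace, name)] = {"spec": spec, "composite_ref": composite_name}
        return composite_name

    def claims_in(self, namespace):
        """What a developer confined to one namespace can see."""
        return [name for (ns, name) in self.claims if ns == namespace]


cluster = Cluster()
cluster.create_claim("team-a", "orders-db", {"size": "small"})
cluster.create_claim("team-b", "billing-db", {"size": "large"})

print(cluster.claims_in("team-a"))  # ['orders-db']
print(sorted(cluster.composites))   # both composites, visible only at cluster scope
```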
Marc: Yeah, that definitely makes sense.
Kind of going back to earlier in the conversation we were talking about, you know infrastructure as code providers being Terraform, Pulumi tools like that.
And now infrastructure as data with Crossplane, but there was this area that we kind of skipped over: things like the Google Cloud Config Connector, there was the old service broker concept, Amazon has operators available.
How does Crossplane map onto those and provide value that those aren't providing, or how would I know whether to use, like, the Google Config Connector versus Crossplane?
Daniel: Yeah, absolutely.
So we actually work quite closely with those different teams, especially the AWS controllers for Kubernetes, the ACK project.
In fact, we actually use the same kind of like API source to generate our controllers from that.
So the number one thing that you'd see right off the bat about the difference between Crossplane and those is the multicloud slash multi kind of on-prem hybrid sort of model that you get from Crossplane.
For some organizations that's not super compelling, right?
They say we only use AWS, we're only ever going to use AWS and we don't really care.
There are a number of other benefits primarily kind of alluding back to what I was talking about there of the separation of concern.
So primarily other infrastructure management projects that are specific to cloud providers, will go for more of that namespace isolation approach, right?
So they'll have namespace CRDs that are available and probably controllers running in those specific namespaces but you don't get any abstraction in that model.
So you're going to say, you know, the development team in this namespace, if you want an RDS instance you need to actually configure an RDS instance and know the different parameters and maybe we give you a helpful template to do that.
Or maybe we use, you know an abstraction that's not persistent to the cluster.
So something like Helm or Kustomize or something like that.
And that's definitely a valid approach for some organizations, right?
Maybe they say, we want to give this full kind of configurability to our developers within namespaces.
However, we see a lot of organizations where the infrastructure teams are not that willing to give developers that level of freedom.
And we think Crossplane is a nice balance of allowing you to define kind of what level of flexibility you want your developers to have which could potentially be, you know all the options, all the knobs to turn.
Marc: Interesting. So the, my infrastructure team if I'm in a large enough company then could decide, okay we're going to run on Azure and AWS.
And so they provide these abstractions, the XRDs, the definitions of, like, we'll go back to the example we've been using here, a Postgres database.
They define what that looks like and that's going to probably require custom configuration for the provider that's running in Crossplane to run that.
But as a developer, when I want a database, like my request to create a Postgres database looks the same regardless of which of those environments it's running in.
Daniel: Yeah, absolutely.
And you know, it should be possible for the infrastructure team to say, you know we no longer run on GCP or Azure.
We're now going to run on AWS and for the development team to never know that, right?
For those kinds of implementation details to be switched out underneath them without them having any knowledge of it.
And so Crossplane definitely empowers infrastructure teams to do that.
Another aspect of this, and this goes a little bit back to the cluster scoping of these granular managed resources is that there are different types of infrastructure that exist on these cloud providers that are typically not provisioned by application teams, but are consumed by them.
So a good example of this would be any sort of networking infrastructure so a VPC or a private network or something like that.
Those are generally provisioned directly by an infrastructure team.
And then application teams will deploy things into those.
So it makes sense, right to have these infrastructure concepts at the cluster scope where the infrastructure team can say all right, we're going to set up, you know four VPCs or something like that.
And then we're going to create different abstractions that reference those VPCs.
So when the resources actually get rendered out they go into the VPC but that kind of level of configuration is not really exposed to the developers at the namespace scope.
Marc: Got it. Let's go back in time a little bit and talk about like the history of Crossplane.
Upbound is you said around 20 people right now?
Daniel: Yep, that's exact, or I think we're just over 20 at this point.
Marc: Cool so where did Crossplane come from?
Like, what problem did you have that, you know, as a small startup, right, where everybody's probably wearing many hats and doing everything, how did you realize this was something that needed to exist as a standalone project?
Daniel: Yeah absolutely so I was not at the company when Upbound was originally founded or Crossplane was, but for some history on the Crossplane project it was announced at KubeCon Seattle I believe in 2018.
I was actually at school during that time and noticed it as I was kind of keeping up with some of the KubeCon news and started contributing at that time.
At that point, it was very focused on just this multicloud approach right?
And so actually in the Crossplane project itself there was no concept of providers and configurations and that sort of thing.
It was just all of these different infrastructure resources in one big package, right?
So you installed Crossplane and you got AWS, GCP, Azure, whatever and everything else had to be kind of in-tree right?
We've seen that familiar model with Kubernetes of having different cloud provider extensions in-tree and then begin to be broken out. So over time it's moved from just kind of this basic thing of putting infrastructure into your Kubernetes clusters to this full kind of composition engine.
And I think at that point a lot of the founders of the project, such as Bassam and Jared, who are two of the leaders of Upbound now, had this realization that the Kubernetes API was going to be really important beyond the container orchestration aspect of it, that it was going to be kind of the way that we interact with platforms moving forward.
And infrastructure is such a key critical component of that and that area really wasn't served at that point which led to the creation of the project.
Then we've seen it grow over time as other parts of Kubernetes have matured and other projects have been introduced that we integrate with.
Marc: Got it, that makes sense. And Upbound has other projects, right?
Like Rook being one of those.
Have you worked on that project? Is there anything to chat about there?
Daniel: Yeah so I haven't personally done much work on Rook.
It came out of an earlier startup by the founders of Upbound, so we definitely have a lot of integration there, and Crossplane actually has some different points of how you can consume Rook and that sort of thing.
So I've interacted a little bit with it just from the Crossplane side but it's become a very mature and stable project.
So it doesn't actually require a ton of maintenance from our end any more.
I know there's some great maintainers over there that still do some awesome work on it but it's now a graduated project from the CNCF.
So it's kind of humming along and it has a good user base at this point.
Marc: Yeah and I know it's a little off topic here but like I think it shows the power of the Kubernetes API as an early operator right?
Rook, you know, it's driving Ceph clusters inside infrastructure, which is just notoriously difficult and takes a lot of specialized knowledge to get up and running, and Rook just minimizes the effort there as much as possible. And, you know, taking that idea across all infrastructure, it's great.
I mean, the Kubernetes API definitely does provide a lot of that value in the future.
Daniel: Yeah, absolutely.
And Rook, as well as other projects, has really been kind of a pioneer in a lot of the patterns that we see in the countless Kubernetes operators and controllers that kind of proliferate across the ecosystem at this point.
Marc: So I assume you're using Crossplane internally at Upbound?
Daniel: Yeah, absolutely, so Upbound kind of takes the traditional approach of an open-source company in that we provide a hosted Crossplane solution as well as a kind of Upbound distro, if you will.
And, you know, we talked a little bit about how folks move towards this model of using Crossplane as their control plane that then deploys their other kind of application clusters and that sort of thing. Customers that are really interested in that are a lot of times folks that are coming to us and saying, you know, we'd like for you to host Crossplane for us, right?
Make sure it's up to date, make sure that providers are right, make sure everything is running well, make sure that our infrastructure is being managed correctly.
And we just get kind of a nice user interface that our developers can log into and say, you know I'd like my RDS instance, just like you would, you know on Heroku or something like that. But we're defining the interface that they're presenting.
And when I say we that's the infrastructure teams and our customers.
So it's really just making that experience of using Crossplane and Kubernetes a lot easier.
Marc: Yeah I think it's worth kind of going back earlier in the conversation again, you know we keep talking about this example of Postgres but that's like a really narrow view.
I think you mentioned earlier Crossplane can actually create a GKE cluster.
So if I use the commercial hosted solution there, it solves that bootstrapping problem.
And I can actually just get Kubernetes clusters and get all of my infrastructure up and running then.
Daniel: Absolutely, absolutely.
And it could be, you know Kubernetes clusters, it could be you're spinning up EC2 instances, could be you're spinning up EC2 instances and running Kubernetes on them, right?
This provider model kind of gives you unlimited flexibility.
So it allows folks to kind of start out with a narrow use case because this is a paradigm shift, right?
A lot of organizations either have a hacky platform that they've put together that kind of pastes together different parts of cloud providers and on-prem APIs and that sort of thing.
Or they don't have one at all and they're kind of manually doing a lot of stuff.
So we like to give folks an on-ramp that's simple today but isn't going to constrain them in the future.
Marc: Cool and we love open source, love the fact that Crossplane is in the CNCF as a sandbox project.
That said, you know, we are in business and we have to make sure that we can pay the bills.
And so, you know, thinking a little bit about the commercial offering, are there features that exist in the commercial offering that don't exist in the open source or is it literally just the open source stuff managed by the team at Upbound?
Daniel: Yeah, so it's literally just the open source stuff. So we are not an open core business.
We're not interested in withholding any features from customers for an enterprise use case.
It's primarily, right, just making sure that that control plane is up and running.
And then in the kind of like on-prem distro and that sort of thing, we give you connection back to that user interface that I mentioned.
So it's really just providing some helpful tooling and hosting around the solution.
Marc: Were you at Upbound when Crossplane was donated to the CNCF?
Daniel: I was, I was at Upbound at that point.
So when it was initially announced, like I said I was in school and I was really excited about it.
I'd kind of done some work around infrastructure automation and that sort of thing through internships.
And I saw this new project, it was open source, so I started contributing and I did that for about six, seven, eight months, something like that just kind of nights and weekends sort of thing.
And then after a while they basically reached out and were like, hey would you like to, you know be compensated to work on this?
And I said, that would be great, you know being a student coming right out of school.
And so that was a really amazing opportunity for me and now having the opportunity to be a maintainer of it and that sort of thing and seeing, you know, leaders within our organization and the Crossplane community in general go through that CNCF donation process and the amazing benefits that we've gotten from that has been an awesome experience for me kind of earlier in my career.
And speaking of that, one of the latest benefits we've gotten for being part of the CNCF is being able to be part of the LFX mentorship program where we're going to get to have other folks, you know like myself when I was in school or who are, you know earlier on in their career, being able to, you know work on Crossplane sponsored by the CNCF and have meaningful interactions with maintainers and kind of grow their skillset at the same time.
So definitely big fans of the CNCF and I personally have gained a lot from Crossplane being part of the organization.
Marc: Yeah that's cool, that LFX mentorship program is really, really great.
Like your responsibility as the maintainer is just to provide mentorship, but you get, you know, you just introduce new folks who are interested in the ecosystem and interested in the project to the code base, you help them out and they're able to like make meaningful open-source contributions too.
Daniel: Absolutely, absolutely.
Marc: Cool, let's dive into the roadmap a little bit.
I think I get what Crossplane does right now. Where's it going? What's next?
Daniel: Yeah so we had a really big milestone before the end of this past year where we hit our 1.0 for Crossplane.
And so we've committed to stable APIs and that sort of thing.
That's just the core Crossplane components.
So the packaging and composition and that sort of thing.
And then the providers are released on their own schedule and have various levels of support at this point, mostly depending on demand by the community.
So, you know, one aspect of that is continuing to mature these providers and partnering with the cloud providers themselves to drive adoption of new resources into those so that, you know they're more usable for folks.
For instance, provider AWS I think over the last month or two, we've added, you know 30 to 40 new resources, which if you've ever looked in the AWS console, there's still plenty more to go but we do have coverage of some really great ones right now and we'll continue to see that grow over time.
So that's obviously part of the roadmap.
Another part is kind of the formalization of what we call the XRM model.
So these managed resources that all of these different providers bring, they kind of satisfy a common interface.
If folks watched TGIK last week with Scott from Knative, they had some examples of a concept that's gaining steam, one that we use extensively in Crossplane already, called duck types.
So essentially you're just saying that we have these different resources that all adhere to some common patterns that allow us to treat them generically.
And that's how we can compose them into higher level abstractions and aggregate up status.
So a good example of this is, you know when you provision infrastructure it usually takes some time to come up.
So if you have a composition of a lot of different resource types that take variable amounts of time to actually get provisioned, something like your database may take, you know, 10 minutes while your VPC could be instant. And so when they fit this common type, we can make inferences about, you know, when they're ready and that sort of thing, based on the status of those resources that conform to this XRM model.
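The duck-typing idea described here can be illustrated with a minimal Python sketch. It assumes each resource exposes a list of status conditions and that a condition of type `Ready` with status `"True"` means the resource is usable. The class names, resource kinds, and resource names below are hypothetical and chosen only for illustration; this is not Crossplane's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Condition:
    type: str    # e.g. "Ready" (assumed condition name)
    status: str  # "True", "False", or "Unknown"


@dataclass
class Resource:
    kind: str  # hypothetical kind names, for illustration only
    name: str
    conditions: List[Condition] = field(default_factory=list)


def is_ready(resource) -> bool:
    """Duck-typed check: works for any object that has `conditions`."""
    return any(
        c.type == "Ready" and c.status == "True" for c in resource.conditions
    )


def composite_ready(resources: Iterable) -> bool:
    """A composite is ready only if it has resources and all are ready."""
    resources = list(resources)
    return bool(resources) and all(is_ready(r) for r in resources)


# The network comes up instantly; the database is still provisioning.
vpc = Resource("VPC", "app-network", [Condition("Ready", "True")])
database = Resource("RDSInstance", "app-db", [Condition("Ready", "False")])
print(composite_ready([vpc, database]))  # False

# Some minutes later the database reports ready, so the composite does too.
database.conditions = [Condition("Ready", "True")]
print(composite_ready([vpc, database]))  # True
```

Because `is_ready` only looks at the shared `conditions` shape, the same function works for any resource kind, which is what makes aggregating status across a composition possible.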
So really formalizing that and partnering with other projects as well to say, you know what does it mean to be a good citizen in the custom resource definition space of Kubernetes and then connecting that up to applications, right?
So this infrastructure is not worth much, right if you can't consume that through applications, and there's a variety of ways to do that.
Frequently, we see people, you know use Kubernetes secrets that are published by these controllers that we run as part of Crossplane.
But there's also other ways such as service binding which is another kind of popular concept in the Kubernetes ecosystem that is gaining steam and also, you know, writing to external secret stores and that sort of thing.
So I think all of these can kind of fall under the idea of kind of like production hardening.
And so that's, that's what the near term of Crossplane looks like as we've committed to stability and you know reliability for organizations to rely on.
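As a rough sketch of the Secret-based consumption mentioned above: values in the `data` field of a Kubernetes Secret are base64 encoded, so an application (or a helper) decodes them before building a connection string. The Secret name and the keys `username`, `password`, `endpoint`, and `port` are assumptions made for this example, as is the database name `postgres`.

```python
import base64
from urllib.parse import quote


def b64(value: str) -> str:
    """Encode a string the way Secret `data` values are stored."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_data(data: dict) -> dict:
    """Decode every base64 value in a Secret's `data` map."""
    return {k: base64.b64decode(v).decode("utf-8") for k, v in data.items()}


# A hypothetical connection Secret, shaped like a Kubernetes Secret manifest.
secret = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "app-db-conn"},
    "data": {
        "username": b64("admin"),
        "password": b64("s3cret"),
        "endpoint": b64("db.example.internal"),
        "port": b64("5432"),
    },
}

conn = decode_secret_data(secret["data"])

# Percent-encode the password so special characters cannot break the URL.
dsn = "postgresql://{user}:{pw}@{host}:{port}/postgres".format(
    user=conn["username"],
    pw=quote(conn["password"], safe=""),
    host=conn["endpoint"],
    port=conn["port"],
)
print(dsn)  # postgresql://admin:s3cret@db.example.internal:5432/postgres
```

In a real cluster the application would more typically receive these values as environment variables or mounted files populated from the Secret, rather than decoding a manifest by hand.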
Marc: Yeah, there's definitely a lot there.
So it sounds like it's really just use cases, you know, kind of in summary, right?
Like making sure that people can consume it the way that they want to, making sure that you're being a nice player in the Kubernetes ecosystem or in my cluster, which is important.
CRDs are a little bit still like the wild west out there. When you install them into the cluster, you're a little bit unsure how they're all going to work.
Daniel: Absolutely, yeah, and Crossplane definitely has some opinions about how that should work that we enforce with our package manager. But yeah, just continuing to improve that space and kind of having standards, right, for what these different CRDs look like goes a long way, not just for Crossplane, but also for other projects in the CNCF.
Marc: Have you given any thought to requirements around moving from a sandbox project to an incubating project in the CNCF?
Daniel: Absolutely so we are in the process of moving to incubating right now, or applying for incubation I guess, would be more appropriate.
So a lot of the requirements around that involve demonstrating the use cases from end-users.
So those are things we've been collecting from folks in the community and continue to push forward for that.
So Jared who's one of the founders of the Crossplane project is leading that effort and he led the sandbox donation and also led Rook to graduation so he has a lot of experience doing that but we're hoping that, you know we reach that incubation stage very soon.
Marc: Great. Upbound's small, you know, 20 people. Can you shed some light on how many employees are working full time on the Crossplane project versus other projects?
Like what does the day-to-day look like for folks contributing to it?
Daniel: Yeah, absolutely.
So Upbound does take an approach of kind of having a Crossplane team which is the team that I'm on.
And that doesn't mean that other folks within the organization don't also contribute to Crossplane or various providers as well.
But we do have a team of four folks including myself that are maintainers on the Crossplane project along with some other folks and, you know maintainers of other Crossplane related projects.
And so a lot of our work revolves around, you know doing review and design and architecture and that sort of thing, as well as implementing features into Crossplane, as well as you know communicating with the community we have a very active and vibrant Slack channel where folks come and, you know get help or contribute or, you know, grow into maintainership roles within the Crossplane ecosystem.
And having that kind of separation of concerns, I guess, if you will, within our organization, even though it is quite small, really provides the ability to build a really strong open source community where you have a team that's dedicated to advocating for the open source community, right?
So obviously Upbound, you know, has different thoughts and opinions about things that they would like to be implemented in Crossplane.
And we can kind of serve as a buffer to say, you know, like is this the best thing for the community or not? We can run it by community members.
We can talk to folks, go through formal, you know, enhancement proposals and that sort of thing to make sure that open source users are served first by our open source Crossplane project.
Marc: Cool, and one step above that I imagine as you're building Crossplane, you start to kind of bump up against some edges of Kubernetes itself.
How often does the team get involved in KEPs or Kubernetes enhancement proposals or upstream work in Kubernetes?
Daniel: Yeah, so this is kind of a two-part answer I guess.
The extension model of Kubernetes is really strong.
We certainly hit some edges, but I would be remiss if I didn't call out how incredible the design of the model is and how extensible it really is allowing for projects like ours to build robust systems with the kind of like distributed system layer and API already implemented for us.
A lot of folks who write controllers are familiar with controller runtime, which is a helpful framework for being able to write controllers and run your reconciliation loops.
So we have contributed upstream when we've hit issues with that.
And we also have kind of our own opinionated layer which is starting to become a pattern with a few different projects on top of controller runtime that makes it really easy for folks to write providers which have some common patterns that differ from just generic controllers.
So in that regard, we certainly look upstream for any kind of, you know, advancements we make or things that we think can be more generalized for the community.
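For readers unfamiliar with the term, a reconciliation loop repeatedly compares desired state with observed state and acts on the difference. The sketch below shows that idea in plain Python against an in-memory stand-in for a cloud API. It is not controller-runtime's API (which is a Go library) and not Crossplane's provider layer; `FakeCloud` and `reconcile` are hypothetical names used only for illustration.

```python
from typing import Dict


class FakeCloud:
    """In-memory stand-in for an external cloud API (illustrative only)."""

    def __init__(self) -> None:
        self.instances: Dict[str, dict] = {}


def reconcile(cloud: FakeCloud, name: str, desired: dict) -> str:
    """One pass of the loop: observe, then create or update if needed."""
    observed = cloud.instances.get(name)
    if observed is None:
        # Nothing exists externally yet, so create it.
        cloud.instances[name] = dict(desired)
        return "created"
    if observed != desired:
        # The external resource has drifted from the desired spec.
        cloud.instances[name] = dict(desired)
        return "updated"
    # Desired and observed state already match; nothing to do.
    return "up-to-date"


cloud = FakeCloud()
print(reconcile(cloud, "app-db", {"size": "small"}))  # created
print(reconcile(cloud, "app-db", {"size": "small"}))  # up-to-date
print(reconcile(cloud, "app-db", {"size": "large"}))  # updated
```

Each pass is idempotent: running it again with the same desired state changes nothing, which is why a controller can safely re-run it on every event or on a timer.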
And then I personally, kind of outside of my responsibilities with Crossplane and Upbound, got involved with the Kubernetes release team when I started at Upbound.
So the Kubernetes release team has a shadow process which I think is really one of the best open source mentorship programs there is.
So a few years ago now I started that shadow program, shadowed a number of different positions within the Kubernetes organization and release team and kind of continued to take on more responsibility and now serve as a tech lead for SIG release.
So I build some of the tooling and you know do some mentorship in that role where I get to really contribute to a lot of the things around Kubernetes testing and releasing our different artifacts and getting those into the end user's hands.
So in that regard I'm very passionate about upstream contribution.
And one of the beautiful things about working at Upbound is they really encourage that, see the value in it, and empower me to, you know, spend time and energy on that.
Marc: Yeah that's amazing, all the work that you're putting into that too.
And I think, you know it's often easy to look at, you know, oh here's some code that I wrote, but like the effort and the process that the Kubernetes team goes through and you're, you know, you're a critical part of that to you know, make sure that it's not going to break my cluster and it gets released on time, like that's a massive effort.
Daniel: Yeah absolutely and there's so many folks that put in a lot of kind of like thankless effort to make that happen.
And it's really incredible to look at such a large group of people globally distributed but also just distributed commercially, right, with different commercial interests.
And you primarily see that with larger companies that want to have more influence over Kubernetes rather than startups that are more consumers of Kubernetes but it's really incredible to see such a large group of people work together and really respect each other despite sometimes conflicting interests, right?
And structuring a large open source project is a difficult, difficult thing to do.
And lots of projects have gotten it very wrong.
And that's not to say that Kubernetes doesn't have its bumps and bruises that it's had along the way.
But I think the caliber of people that work on Kubernetes is really impressive just from, you know an integrity and caring about each other perspective.
It's definitely one of the most welcoming communities that I've ever seen.
And, you know, I personally benefited from it a lot and so that makes me really excited about welcoming other people into it because I know the immense benefit that it's had on my career and will continue to have on other folks as well.
Marc: Yeah that's great. It's almost like if we all collectively put our minds to it we can build something really, really, really great.
Marc: So, you know, my last question is really just, you know, how can we help? Crossplane is cool, and we'd love to see Crossplane move to the incubation layer.
Like what's the biggest ask that you'd have from somebody who's listening right now?
Is it on the contribution side, the use case side? Or just in general?
Daniel: Yeah so I'd say all fronts.
Definitely always looking for new contributors and we've had some folks kind of grow from, you know early contributions to even maintainership roles at this point.
And there's a lot of different areas to get involved with all of our different providers and extension points.
So definitely looking for folks who want to get involved in that.
And there's also opportunity for mentoring in that as well, if you'd like to grow your skillset and kind of evolve in that way.
And then you hit on it, use cases are really huge.
I personally, since the 1.0 release, have been taking a look at our security model and the way we kind of use credentials to contact cloud providers, as well as propagate credentials to applications within the cluster for infrastructure that gets provisioned.
And a big impact on how we design that and improve that comes from feedback that we get in Slack from users who say, you know, my organization can't handle credentials in this way.
We have a policy against it. You know, we have this sort of compliance we have to have.
So if you're an end user and, you know, you would like for Crossplane to be better, you know, opening issues, joining us in Slack, even, you know pinging me and saying, hey do you want to jump on a call and I'll explain it to you, we're more than happy to do that and that's really going to be for the benefit of all users of Crossplane moving forward.
And we see this as kind of the primary way that organizations that adopt Kubernetes are going to manage their infrastructure.
So it's definitely a high impact on both the contribution and kind of user stories side.
Marc: That's great. So try to use it, and if you run into any problems that make it hard to use or incompatible with the way that you have to run stuff, reach out?
Marc: Great, is there anything else that you want to share that we should chat about Daniel?
Daniel: No, I don't think so.
I would like to go back to, as we mentioned before, that LFX mentorship program. Not just for the Crossplane project, but for all projects, there are some really, really great opportunities on there.
Definitely some maintainers who have committed to mentorship that I know to be really wonderful people that would be awesome to learn from.
So I definitely would encourage folks to look at that, as well as the SIG release shadow program in upstream Kubernetes that I mentioned earlier. Both of those will not only enhance your skillset and improve your career but also will probably lead you to meet a lot of wonderful people who become your friends.
So I would definitely encourage folks to get more involved in that way.
Marc: Yeah, take it from Daniel, who kind of walked this path himself, like creating open source contributions and ending up working for the company.
Marc: Well Daniel thanks a lot for your time today, it was really great to chat about Crossplane.
Daniel: Absolutely, thanks for having me, Marc.