Ep. #2, GitOps At Scale with Mukulika Kapas of Intuit
Marc Campbell: I'm here with Mukulika Kapas from Intuit, who is the director of product management working on the developer platform and other projects, including the Argo Project, which is a CNCF project. Welcome, Mukulika.
Mukulika Kapas: Thanks, Marc. Thanks for inviting me to the podcast.
Marc: OK, I'm excited to dig in and learn a little bit more about Argo today.
Just to get started, can you set it up at a high level and talk about what the Argo Project is, and the inspiration for the project?
Mukulika: Sure. The Argo Project is a set of Kubernetes native tools for deploying and running jobs and applications, including workflows, deployments, rollouts and events.
It uses GitOps paradigms such as continuous delivery and progressive delivery, and enables MLOps on Kubernetes.
The Argo Project consists of four sub-projects. These include Argo Workflows, which is a container-native workflow engine supporting both DAG and step-based workflows.
Argo Events, which is an event-based dependency manager for Kubernetes.
Argo CD, which supports declarative GitOps-based deployment of any Kubernetes resource.
And then Argo Rollouts, which supports declarative progressive delivery strategies such as Canary, BlueGreen and more general forms of experimentation.
The Argo sub-projects were open sourced at different times, so let me go through that.
The first project to be open sourced was Argo Workflows, in late 2017, by a startup called Applatix.
With the emergence of containers and Kubernetes, we saw that there was a major shift in how applications and services would be developed, distributed and deployed, and Applatix was started to make Kubernetes easy for enterprises. While working with Kubernetes, we found that an integrated workflow engine is fundamental in a distributed system for orchestrating jobs, as well as for distributing and deploying complex microservices-based applications. That's how we open sourced the first Argo sub-project, Argo Workflows.
Then in 2018, early 2018, Applatix was acquired by Intuit to build Intuit's new developer platform, also internally known as Modern SaaS platform on Kubernetes, on which all Intuit products are today built and run.
While building the platform we realized there was a need for an enterprise-ready continuous delivery solution, so in early 2018, around March, Argo CD was incubated at Intuit and open sourced.
Then in the middle of 2018, around May, BlackRock, who was using Argo Workflows in their data science platform, contributed Argo Events to integrate with Argo Workflows and trigger workflows.
Then while using Argo CD in production at scale at Intuit, we saw a need for progressive delivery like Canary and BlueGreen deployments, and that's how in 2019 we open sourced Argo Rollouts.
So that's the story behind all the four Argo sub-projects.
Marc: I see. So, four different projects. And currently Argo is an incubating project in the CNCF.
Does that encompass all four of those projects, as all under that same incubating project in the CNCF?
Mukulika: Yes. The overall Argo Project is incubating under the CNCF, and that includes all four of the sub-projects.
Marc: OK, great. Can you talk about what other systems--? Specifically, non-CNCF systems that Argo replaces?
Mukulika: Sure. Let's first talk about Argo CD.
This was when Intuit started rolling out Kubernetes across 100s of services; today we run thousands of services in production and almost 2,500 in preprod.
We were looking for a Kubernetes-native continuous delivery solution. We were using Spinnaker at that time for deploying to the public cloud, and we were looking for a more Kubernetes-native solution,
one which was also declarative: we wanted to define everything in git as the source of truth.
We wanted to make sure there's a clear separation between our CICD pipeline and the continuous deployment solution, and we also wanted some enterprise-friendly features like auditability, compliance, security, RBAC and single sign-on.
We looked at other solutions in the open source community at that time and we didn't find any, and so we decided to build and open source Argo CD.
Argo CD uses git repositories as the source of truth for the desired state of applications and the target deployment environments, so Kubernetes manifests are specified as YAML files or Kustomize files or Helm packages or Ksonnet applications.
Argo CD automates the synchronization of the desired application state with what is specified as the target state in the git repo.
Marc: OK, that makes sense. I'd love to dig into a little bit more about GitOps in general and how Argo CD sees GitOps being used.
So you mentioned you can have a declarative state in a git repository for Kubernetes manifests, Helm repositories and Ksonnet applications.
Once I have them declared and pushed into that git repo, how does Argo get it into the cluster?
Mukulika: Sure. Argo CD can run in both pull mechanism as well as push mechanism.
Argo CD is continuously monitoring your target state that is defined in git, and it is also monitoring the state of your cluster.
It flags whenever it sees that git is out of sync with the cluster, or the cluster is out of sync with the git repo.
It does not have to sync automatically: you can drive the syncing through a Jenkins or Argo Workflows pipeline. If you're in a very restricted company, like big enterprises, you can drive the sync through a pipeline.
Or, you can enable autosync where Argo CD is always listening and it can automatically sync the state of the cluster.
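Editor's note: the two sync modes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Argo CD's implementation (Argo CD is written in Go). The `Application` class, the `status`, `sync` and `reconcile_once` functions, and the dictionaries standing in for git and cluster state are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Application:
    """Hypothetical stand-in for one application tracked by a GitOps controller."""
    desired: Dict[str, str]                               # manifests read from git, keyed by resource name
    live: Dict[str, str] = field(default_factory=dict)    # what the cluster currently runs
    auto_sync: bool = False                               # assumption: a per-application switch


def status(app: Application) -> str:
    """Report whether the cluster matches what git declares."""
    return "Synced" if app.desired == app.live else "OutOfSync"


def sync(app: Application) -> None:
    """Make the live state match git. An external pipeline could call this directly."""
    app.live = dict(app.desired)


def reconcile_once(app: Application) -> str:
    """One pass of the monitoring loop: always compare, but only apply when auto-sync is on."""
    if status(app) == "OutOfSync" and app.auto_sync:
        sync(app)
    return status(app)
```

With `auto_sync=False`, `reconcile_once` only reports drift and leaves the decision to whoever calls `sync`, which corresponds to the pipeline-driven mode. With `auto_sync=True`, the same pass applies the change itself.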
Marc: OK, that sounds pretty modular and flexible to enable various types of environments.
What about different types of events?
So if I add a new Helm chart into that git repository, Argo will automatically deploy it.
Is the reverse also true? If I want to un-deploy something, can I just delete it from the repository and Argo deletes it from the cluster?
Marc: OK. I'd love to dig in for a little bit about the tech stack that you're using to build Argo right now.
Can you explain a little bit about the various technologies that you're using, and maybe any evolution or changes that you've made along the way?
Mukulika: Sure. Because Argo CD is very Kubernetes native, it is built as a CRD, and Golang is the language of choice today for building Kubernetes-native CRDs.
So Argo CD is written in Golang, and Dex is used for the single sign-on part. It has mainly three components: Argo CD API Server, Argo CD Application Controller and Argo CD Repo Server.
Argo CD Repo Server syncs with the git repo and takes care of retrieving the manifests from git.
Then Argo CD Application Controller, that's the main component, which compares the manifests in the repo server with what is actually running in Kubernetes.
Then Argo CD API Server is the gateway to Argo CD, so it serves the UI, the CLI, and the API. So, it's mainly all in Golang.
Marc: OK, that's great. All in Golang across all the various projects, and the CRD implementation-- Has it been that way since you started building the project in 2017?
Mukulika: In 2017 when Argo Workflows was first open sourced it was not a CRD, it was much heavier and it was not Kubernetes native at all.
But as we learned, Argo CD was built from scratch as a CRD, and then Argo Workflows version two was also built as a CRD.
Marc: I think that's a great trend, in building stuff to be Kubernetes native and in focusing on that really helps position the technology to solve the problem in a very Kubernetes-unique and Kubernetes-friendly way that just is more accessible to folks running Kubernetes clusters.
As adoption of Kubernetes increases, I'm sure that's really been a driver for more use cases of Argo Workflows and Argo CD.
Mukulika: Good. Also, people run Argo CD in different ways.
Some run it in cluster to sync that cluster, and some run it remotely to sync multiple git repos with multiple cluster endpoints.
So there are different ways people run it, and if you make it more and more Kubernetes native it's much easier to install and manage Argo CD.
Marc: That's actually interesting, and I'd love to talk a little bit more about that.
So there's the very common traditional GitOps workflow, it might be to install an operator or a controller in my cluster and have it watch a git repository.
You've described a workflow where Argo can do that, but there's some unique workflows that Argo creates by the modularity of it, by allowing-- I can run it in a separate cluster completely and not even install the CRDs into the target cluster.
Is that possible with Argo?
Mukulika: Yes. We have seen that people, when they manage their clusters using GitOps, they want to run within the cluster.
Whereas in Intuit, for example, we are also using Argo CD to do continuous deployment of thousands of services across multiple views, across 100s of clusters.
In that case, running Argo CD in each cluster versus running Argo CD centrally and then managing all the 200 clusters, we preferred the remote option.
Running centrally and then deploying into multiple clusters.
We do have multiple Argo CD instances for blast radius and different security reasons, so say one Argo CD can deploy say 5,200 clusters.
Some preprod, some prod. And you have to register those git repositories and those Kubernetes cluster endpoints to Argo CD as an application.
Marc: That's interesting. That actually creates a pretty flexible workflow.
If you have that cluster set up, that instance of Argo CD, and it's deploying into a couple of hundred different clusters or environments, how can I monitor those to ensure that the deployment is--?
Like, that the desired state that I want actually is applied and running in the target clusters?
Mukulika: That is the job of Argo CD. It is constantly showing you whether your cluster objects are in sync with what is defined in git repo.
Marc: Going back to building Argo CD originally and Argo Workflows, I'd love to talk a little bit about and understand some of the technical challenges.
Especially in 2017, which is not that long ago, but it was pretty early in the Kubernetes ecosystem.
I'm sure you talked earlier about going from Kubernetes compatible to a Kubernetes native and CRD driven architecture.
But what are some of the technical challenges that the team ran into early on that made it difficult to build?
Mukulika: Actually, the team ran into more technical challenges later on than early on because of the scale, so for example today at Intuit we manage 1.7 million Kubernetes resources across 200 clusters with 8,000 applications.
So as the scale is increasing, scaling Argo CD has been a challenge, and that's where our biggest focus is now because we have lots of enterprises and startups using it, and at scale.
Argo CD uses etcd as the persistence layer, we don't have a separate database.
So when we have too many users trying to access the UI, reading etcd at scale has been an issue, so we had to implement caching of the list of apps, and we are storing details in Redis.
So basically, scaling the products is the challenge that we are facing, and we are working through it.
Marc: 1.7 million Kubernetes resources across 200 clusters is large, and Intuit is obviously an early user and big user of Argo.
Are there bigger use cases of Argo orders of magnitude larger? Or is that pretty much the largest use of Argo CD that you know of right now?
Mukulika: If you go to the Argo GitHub repo, you will see a list of public enterprise and startup users.
There are big users like Tesla, Ticketmaster, Major League Baseball and Adobe, and many-- Red Hat is a big user of Argo CD.
We recently announced a partnership with Red Hat because a lot of their OpenShift customers are using Argo CD, so it's not just Intuit.
There are many other companies who are using Argo CD at scale.
Alibaba is using Argo CD at scale and they are also a contributing company.
Marc: I'd love to understand a little bit more about that partnership that you recently announced with Red Hat and OpenShift.
What are the goals and the use cases in particular that you're looking to see out of that, and the learnings you're hoping to get?
Mukulika: Red Hat has been working with us for almost a year, and they have open sourced an Argo CD operator that they built first, to have Argo CD in their catalogue of applications.
And then they heard more and more from their users about how to better integrate OpenShift with Argo CD, and that's why they wanted to be a core contributing company behind Argo CD.
So, we recently announced that partnership.
Marc: I see. I want to go back for a minute and talk more about GitOps in general, if that's OK.
Just the GitOps ecosystem, we've been running Kubernetes in production for quite a while and we're big GitOps fans.
But I think the term GitOps can mean different things to different people as far as the implementation of it goes.
It sounds like you're describing Argo as very modular and customizable, it adapts to what you need.
But I'd love to understand, three years ago you were building in the GitOps pipeline, which was pretty early for the GitOps term and ecosystem.
I'd love to understand a little bit more about Argo's position as, if I'm just getting started with GitOps how should I get started?
Mukulika: Sure. Most companies today already have a CI pipeline, and then they are looking for ways to automate CD, and then they are looking at whether to use a declarative continuous delivery way of doing things.
That brings them to GitOps. We also started the same journey at Intuit: we had a CI pipeline using Jenkins, and for some things we were doing CD by directly using Jenkins to deploy, or some other tool, or Spinnaker.
Then when we started building the modern developer platform on Kubernetes, we wanted to see how it goes. Kubernetes is completely declarative.
We wanted to do continuous deployment, but declaratively, and that's where we came across GitOps and we wanted to use GitOps.
But Intuit being a fintech company, we didn't want continuous deployment to be automatically enabled in every cluster, so that any time there's a change in git you automatically update the cluster.
What we decided to do, at least within Intuit, is we have a separate pipeline which orchestrates Argo CD across different environments.
So suppose the developer makes a change in the code: the code is built and then it updates the deployment configuration. It gets deployed first in QA, then staging, then production. In this orchestration, although git is being updated automatically, it's the pipeline which drives calling Argo CD to sync first with QA, then with staging, and then with production. We also have an approval gate before production, so Argo CD shows when an environment goes out of sync and then the pipeline is the one which finally makes the decision whether to sync the environment with git. So we are doing GitOps, but we are orchestrating it through a pipeline, and we see many enterprises who have a much more regulated environment doing it this way.
That is how we deploy applications to production.
Now, when our cluster management team, our platform team, is doing cluster upgrades using GitOps, we don't use that pipeline.
Then we just update the git repo, we have a git repo for different clusters.
We update the git repo with the new cluster configuration, whether it's an AMI version or whether it's a Kubernetes version, or any other release.
Automatically all the clusters are upgraded following GitOps, so we use it in different ways.
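Editor's note: the pipeline-driven promotion described above (QA, then staging, then production behind an approval gate) can be sketched as follows. This is a minimal illustration, not Intuit's pipeline. The `promote` function and its `sync` and `approve` callbacks are hypothetical; in a real setup `sync` would ask the GitOps tool to sync one environment to a given git revision.

```python
from typing import Callable, List, Sequence


def promote(
    revision: str,
    sync: Callable[[str, str], None],        # hypothetical: sync one environment to a revision
    approve: Callable[[str, str], bool],     # hypothetical: approval gate for gated environments
    environments: Sequence[str] = ("qa", "staging", "production"),
    gated: Sequence[str] = ("production",),
) -> List[str]:
    """Sync environments in order and stop at the first gate that is not approved."""
    synced: List[str] = []
    for env in environments:
        # Git may already hold the new revision; the pipeline decides when each
        # environment is actually brought in line with it.
        if env in gated and not approve(env, revision):
            break
        sync(env, revision)
        synced.append(env)
    return synced
```

For example, `promote("abc123", sync, approve)` with an `approve` callback that returns `False` syncs only `qa` and `staging` and leaves production untouched until someone approves.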
Marc: I think that's actually important to talk about and to really think about for somebody who's thinking about adopting GitOps, it can be intimidating at first because they might say "I'm in a regulated environment" or "We don't have enough confidence in our process to have everything automatically go out to production."
But GitOps and Argo don't necessarily mean that. I can have any kind of gate and I can have any change management process I want on top of it, but it's really about automating the process instead of having a manual process.
Mukulika: Correct. It's about making the process more declarative so that you can go back and see what exactly changed and what was deployed, and then automating the process.
We knew that different enterprises will want to do GitOps in different ways, so from the beginning when we built Argo CD, it was a very conscious decision that we will allow Argo CD to sync automatically as well as we will allow Argo CD to be synced by an external agent, which can be any pipeline.
That's why we didn't build any tight integration with Argo Workflows as well, because we knew that companies already had some kind of a pipeline, some kind of a CI pipeline at least.
So we said, "OK. Argo CD can automatically sync, or Argo CD can be synced from any pipeline."
So whether you're using Jenkins or Spinnaker or Argo Workflows or Tekton, you can call Argo CD from the pipeline as well.
And if you are very advanced or if you're doing cluster GitOps and you want to automatically enable as soon as the git repo gets changed, you can do that too without using any pipeline.
Marc: I think that that's super important to be able to hook into existing pipelines, because most of us have some CI and CD out there, and to make the bar to adopt a product be completely erasing all your CICD pipelines and rebuilding them, that's just a huge lift for anybody to be able to take.
Let's talk a little bit more about cluster GitOps that you were describing, you mentioned it again but earlier you mentioned that Intuit does it to deploy the clusters themselves, the upgrades to Kubernetes.
Can you explain a little bit about how that works into the Argo Workflows and Argo CD projects?
Mukulika: So we started Argo CD as Application GitOps Project, and then very soon we started realizing that, "OK. To manage all our 200 clusters instead of calling CLI to update or upgrade clusters, GitOps is the way to go."
And then some companies use their GitOps to do cluster GitOps first and then go to application GitOps, we started with application GitOps and then moved to cluster GitOps.
Today when we are doing cluster GitOps we do it at different layers, actually.
Our AWS accounts get created following GitOps, and then we have for every cluster we have a declarative file to create the cluster, which is stored in git, and then an upgrade to the cluster or any update is also followed using GitOps.
Argo CD is used for that as well.
Marc: Is that using cluster API, or can you explain a little bit about how you're actually implementing that?
Mukulika: We were using cloud-based clusters earlier, and now we are moving to EKS clusters.
We are not using cluster API, so we have cluster configuration files which define the cluster state, and then Argo CD uses a cluster CRD to update the cluster based on that configuration file.
Marc: You mentioned earlier, and I just want to make sure I heard it properly, if I want to create a new AWS account at Intuit and create a new cluster in that AWS account, it's literally just checking in a declarative manifest into the GitOps repo, and Argo will go ahead and provision the account and everything inside it that's needed?
Mukulika: Yes. We haven't yet open sourced that part because we recently built that, so our goal is we have a tool which can take CloudFormation templates and create accounts using GitOps.
We have a CRD for that as well as we have a CRD to then build the cluster once the account is created.
This part, we haven't yet open sourced.
Marc: Pretty cool, though. I think it leads me down a path of questions around governance and integrations into other policy-based tools in the CNCF, maybe Open Policy Agent is an example.
How does Argo integrate into tools like that, that allow me to declare the set of policies that anything has to pass before it gets deployed into the production environment?
Mukulika: Argo basically, given any Kubernetes manifest, it can apply.
Argo has application CRD, you can define your cluster as an application and whatever Kubernetes manifests you have under that application, Argo can apply that.
Marc: OK, and so I can run any type of rules in my workflow?
Mukulika: OPA rules, yes.
Marc: Great. That helps me understand where Argo is today and what it's doing, and I'd love to transition a little bit and talk about the roadmap and what you're currently working on and what you have planned in the future for the Argo Project.
Mukulika: We have a separate roadmap for Argo CD and Argo Workflows, and Argo CD also includes Argo Rollouts. One part of Argo CD's roadmap is focused on core functionality.
Any issues around core functionality that the community is finding, we try to be proactive in fixing those.
A big part of the roadmap is focused on scaling Argo CD to provide 2,000+ application support and 100+ cluster support per Argo CD instance, as well as mono repo support.
Argo CD is not optimized if you have a large number of applications in one repo, so a big focus on scaling Argo CD is going on today as the adoption is increasing.
Then we are working on a new concept called Application Set, where you want to apply the same change across multiple applications.
This use case is especially needed when you want to apply the same change across 100 clusters using GitOps.
Alibaba is one of the core contributors for Application Set, and it's part of the roadmap. Then we are working on--
We took the core of Argo CD and we created something called a GitOps engine, and it's a library that implements core GitOps functions such as Kubernetes resource reconciliation and diffing.
The reason why we made this core engine is so that if somebody else wants to build a GitOps solution, they can take this engine and build on top of it. It doesn't have the UI and all the enterprise features of Argo CD, it's just a small library. Then we are always working on integration with different configuration management tools.
Argo CD from day one supported Helm and Kustomize, and although we do support adding any custom config management tool, if tomorrow a new configuration management tool comes up we would like to integrate with that.
Then features like notifications, when people are deploying they want to send notifications.
Then automated registry monitoring, etc. They are there in the Argo CD roadmap.
Now Argo Workflows, as I said, is not only focused on CICD pipeline, but it is also focused on MLOps and then for large scale data processing pipelines.
There the roadmap is focused always on scalability of these workflows.
Features like memoization, where selected workflow steps can be executed much faster, supporting semaphores and mutexes, and then artifact management across steps in a workflow, and metrics and reporting.
Marc: That's a pretty ambitious roadmap.
Going back to the beginning of it, Argo CD application sets sound interesting and I'd like to talk a little bit more about that.
Can you give more of an example of a change that I would want to deploy across multiple applications?
Mukulika: Sure. Say you're rolling out a change across 100s of Kubernetes clusters, whether it is deploying an add-on or a DaemonSet, or even updating the Kubernetes version or releasing a security patch across multiple Kubernetes clusters.
You need a way to update multiple applications or clusters at the same time, and initially we had a concept of App of Apps--
The way we are defining applications declaratively, we wanted to have a way to define a set of applications, which represents a set of clusters, declaratively. That's why, the way we have the application CRD and application controller, with App Set we are creating an application set controller.
Marc: So, is application set going to replace the app of apps declarative object that you have right now?
Mukulika: App of Apps today is not really declarative.
Application is declarative, and yes, we want this application set, which is declarative, to replace the App of Apps pattern.
But it is still in the very early stage, we haven't even had our first release, but it is a big part of the roadmap.
Marc: It sounds important, though.
You mentioned a lot of the roadmap for Argo CD is around scaling and being able to scale to additional and higher numbers of applications and higher numbers of clusters and unique workflows, like mono repos.
But the application set idea sounds like it really scales process and change management control, and it minimizes that diff that needs to be committed and deployed in order to deploy a change through GitOps.
Marc: Then the next thing you mentioned was the GitOps engine for library reconciliation.
It's a library with which I can enable GitOps inside my application without the UI and all the workflow capabilities that Argo has.
Are there applications that you know of today outside of Argo that are using this, or is this really still very speculative?
Like, we put it out there just to see what might happen with it?
Mukulika: Actually, I'm seeing some of the vendors like GitLab are trying out this engine as a part of their product.
Marc: That's cool.
Just to make sure I understand, it's a go library that does full diffing?
So, given a set of declarative manifests, the GitOps engine will look at the Kubernetes cluster using client-go and then diff them and tell me the differences that there are.
Like, funnel those back up to the application so that I can then make decisions that I need to do around deploying resources?
Mukulika: Yes, it does the diffing, and it does the reconciliation part as well.
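Editor's note: the diff-then-reconcile idea discussed here can be illustrated with a short Python sketch. It is not the GitOps engine's API (that library is written in Go). The `diff` and `reconcile` functions, the `prune` flag and the use of plain dictionaries in place of Kubernetes manifests are assumptions made for this example.

```python
from typing import Dict, List


def diff(desired: Dict[str, str], live: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare declared manifests with cluster state, keyed by resource name."""
    return {
        "create": sorted(n for n in desired if n not in live),
        "update": sorted(n for n in desired if n in live and desired[n] != live[n]),
        "delete": sorted(n for n in live if n not in desired),
    }


def reconcile(desired: Dict[str, str], live: Dict[str, str], prune: bool = False) -> Dict[str, str]:
    """Return the cluster state after applying the diff.

    Assumption for this sketch: resources missing from git are removed only when prune is True.
    """
    changes = diff(desired, live)
    result = dict(live)
    for name in changes["create"] + changes["update"]:
        result[name] = desired[name]
    if prune:
        for name in changes["delete"]:
            del result[name]
    return result
```

A caller that only wants to surface differences to its own UI can stop after `diff`; a caller that wants the cluster brought in line can go on to `reconcile`.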
Marc: You mentioned that you had Kustomize and Helm support built in from the very beginning, and you said the roadmap includes potentially adding any other configuration management tools that surface and become popular requests.
Are there any right now on the roadmap that you're looking at and starting to evaluate?
Mukulika: No, currently we haven't seen any other that has picked up.
But we are always on the lookout. We actually started with Ksonnet, and then moved to Helm and then to Kustomize.
Marc: So, who is the target today that you would like to see using Argo CD?
It sounds like as you're pushing to more scale in the project roadmap right now, you'd love to see applications in clusters that have huge workloads.
But are there other specialty workloads and use cases that you'd like to see more of to help the project mature and get more learnings from those types of environments?
Mukulika: Sure. Again, we are seeing both enterprises as well as startups pick up Argo CD because, given that it's very Kubernetes native, setting it up and installing it is not hard at all.
It literally takes minutes. And because we are seeing enterprises using it,
they do need a lot of application support, and that's where the roadmap is.
Also, because Intuit is one of the core contributors and we use it at scale, it's a requirement from us as well.
Marc: So, if I'm getting started using Argo right now in my startup or in my enterprise, what type of feedback are you looking for the most?
Mukulika: Firstly, whether you're using it for applications, whether you're using it for cluster GitOps and at what scale you're using it, what config management tools you're using, and what processes you are using.
For example, just this morning I came across a unique process used by a company where they want to enable autosync between git and the cluster, but they want to do it in phases.
They want to say, "OK. Autosync this many clusters first, and if that is successful then autosync these many clusters--" and they didn't want to use a pipeline to orchestrate all of this, so we have a feature in App of Apps called Sync Waves.
We don't have it yet in App Set, and we realize we need to build it.
So different processes always help us to understand and build features.
Marc: An area that we haven't talked a lot about right now is the Argo Rollouts sub-project.
And it's interesting, I'd love to understand a little bit more about how long the Argo Rollouts Project has been out.
What created that, what was the inspiration for that one in particular? Just dive into that a little bit more.
Mukulika: Sure. We open sourced Argo Rollouts in the middle of 2019 because first we followed GitOps to deploy applications.
Then we needed to come up with a way to do the BlueGreen deployment because people earlier used to do BlueGreen deployments using Spinnaker on a cloud, and then we needed to provide Canary deployment in different ways.
So we were looking for a tool which can do different types of progressive delivery, whether it is BlueGreen or whether it's Canary, or some kind of experimentation.
We didn't find any, and that's where we again, open sourced Argo Rollouts.
Argo Rollouts can do BlueGreen in a GitOps way. We were looking for a tool where we can define the blue and green very easily in the git repo and sync accordingly, and transition accordingly.
We didn't find any, and so we built Argo Rollouts as another CRD.
So it is very similar to the Deployment resource; the only difference is that it has a rollout strategy portion in the manifest where you can say "OK, based on these results you can now move to the green strategy."
Marc: Since then, or really starting around the same time, there became another standard around service meshes called SMI.
Obviously Istio and Linkerd have been around, and Consul has been around for a while.
But now there's Open Service Mesh from Microsoft, and I think there's some overlap between the two.
I want to better understand where I might want to use Argo Rollouts where that makes more sense instead of adopting a full service mesh, where maybe something like Open Service Mesh or Linkerd might still make more sense than Argo Rollouts for a particular use case.
Mukulika: Argo Rollouts actually works with a service mesh. Say, for example, you want to use Argo Rollouts for BlueGreen.
That is OK: you have a blue stack and a green stack, and Argo Rollouts tells you how to roll over from blue to green.
Now if you want to do a Canary release, you can do it at a coarse level, where you have, say, five replicas and you can do traffic routing to one replica, and then you do traffic routing to the remaining four replicas.
So you're doing coarse-grained traffic routing at 20%, and if you want to do much more fine-grained traffic routing then Argo Rollouts integrates with a service mesh, where you can actually say "OK. Route 1% of traffic, then route 2%, 3%, 4%."
So, Argo Rollouts supports different service mesh options, Istio and SMI today.
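Editor's note: the 20% figure follows from simple arithmetic, which the Python sketch below works through. It assumes traffic is spread evenly across replicas when no service mesh is doing the routing. Both function names are hypothetical and are not part of Argo Rollouts.

```python
def replica_canary_weight(canary_replicas: int, total_replicas: int) -> float:
    """Share of traffic (in percent) the canary receives when routing is purely replica-based."""
    return 100 * canary_replicas / total_replicas


def canary_replicas_needed(target_percent: int, total_replicas: int) -> int:
    """Smallest number of canary replicas that gives at least target_percent of traffic."""
    # Ceiling division with integers; at least one replica is needed to receive any traffic.
    return max(1, -(-total_replicas * target_percent // 100))
```

With five replicas, `replica_canary_weight(1, 5)` gives 20.0, and asking for a 1% canary through `canary_replicas_needed(1, 5)` still yields one replica, which is 20% of traffic again. That gap is why finer steps such as 1%, 2% and 3% need traffic routing from a service mesh rather than replica counts alone.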
Marc: How do I define that progressive rollout strategy?
In the coarse-grained method that you described, say one of the replicas is running the newest version and the other four are not. How can I declaratively define the criteria that's going to allow that to become 100% of the traffic with Argo Rollouts?
Mukulika: If you go to the Argo Rollouts GitHub repo, if you go to the examples folder you will see there are many examples around Canary.
You will see that in the rollouts object there is a strategy portion where you have steps, and you can set weight, and then you can see how to move from 20% weight to 40% weight to 60% weight.
Then basically, it's similar to the Deployment object, so you have the number of replicas, and then using labels you can set how to roll over the traffic.
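Editor's note: the stepped weights mentioned above (20%, then 40%, then 60%) can be pictured with a small Python simulation. This is a sketch under assumptions, not the Argo Rollouts controller or its manifest format. The `run_canary` function and the `healthy` callback are hypothetical, and rolling back to 0% on a failed check is a simplification chosen for this example.

```python
from typing import Callable, List, Sequence, Tuple


def run_canary(weights: Sequence[int], healthy: Callable[[int], bool]) -> Tuple[int, List[int]]:
    """Walk through the declared weight steps in order.

    Returns the final canary weight and the list of weights that were tried.
    Assumption: a failed health check aborts the rollout and sends the canary back to 0%.
    """
    visited: List[int] = []
    for weight in weights:
        visited.append(weight)       # traffic is shifted to this weight...
        if not healthy(weight):      # ...and then checked before moving on
            return 0, visited
    return 100, visited              # every step passed, so promote fully
```

Calling `run_canary([20, 40, 60], check)` with a check that always passes ends at 100%, while a check that fails at 40% stops the rollout there and never reaches the 60% step.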
Marc: Cool. It sounds like a great addition into the ecosystem, even if I haven't adopted a service mesh I can--
The coarse-grained BlueGreen deployment definitely enables the CICD pipeline.
How is the adoption of Argo Rollouts going since you launched that in mid 2019?
Mukulika: Actually, along with Argo CD, Argo Rollouts is also picking up.
It's not yet used by as many companies as Argo CD, because different companies are at different levels of maturity.
But that's the next big one that is getting a lot of adoption, even within Intuit today, I think we have almost 2,500 services running on Kubernetes.
We have around 300 services using different rollout strategies, so people are starting with first continuous deployment and then they're moving to more progressive delivery, because Kubernetes allows rolling updates.
So people start with basic rolling update of deployments, and then they see that, "OK. No, it's not working out." Or they want to progressively point the traffic, and that's when they go for a solution like Argo Rollouts.
Marc: It makes sense.
We don't have 2,500 different services across 200 clusters, but our production cluster has hundreds of services and we're using GitOps. Argo Rollouts actually looks pretty interesting, and I wasn't that familiar with it until just now.
I think it's a great addition to the stack.
Mukulika: Yes. We are actually thinking whether there's some way we can contribute it as part of core Kubernetes, if possible.
Marc: Interesting. Maybe even potentially split it out of the Argo Project and put it into the core Kubernetes, like Kubernetes/Kubernetes?
Mukulika: Yes, because it is very similar to Deployment. Whereas a Deployment always follows rolling updates, this allows other traffic routing strategies.
Marc: I assume when you say deployment, it's probably also applicable to a stateful set, or something like this that has the same concepts built into Kubernetes. Right?
Mukulika: We haven't yet tried it with a stateful set, but good point.
Marc: I think the last question that I really have that I'd love to dig into a little bit more is Argo's been around since 2017, and it sounds like you started off with one project and you've expanded into four.
There's lots of stuff on the roadmap, lots of new challenges ahead of you that you're trying to solve.
But it's currently an incubating project and the CNCF ecosystem starts with sandbox and then moves to incubation, and then graduated for the final step, which is where projects like Kubernetes and Helm and others are.
I'd love to get some insight into what you're thinking about and what the team is thinking about, around the timeline in the steps and what's needed in order to take Argo and make it a graduated project in the CNCF ecosystem?
Mukulika: Sure. We are very actively involved with CNCF.
We are a CNCF gold member, and we participate in all CNCF end-user community activities.
That is from the end-user perspective. Now, about the Argo open source project: we definitely want to go to the next stage, which is graduation.
One big criterion for graduating is more contributing companies, and I think we meet all the rest of the criteria.
So that is the big focus. Although we have a lot of individual contributors, we are trying to partner with more contributing companies to meet that criterion as well as to meet the community's needs.
Because right now there are so many users that our team is not able to keep up, and so we are definitely looking for more contributing companies and partnering with them.
Marc: Supporting open source and keeping up with requests from the community is probably pretty difficult.
What is the primary mechanism? Do you have a Slack channel in the Kubernetes Slack, or in the CNCF Slack?
What's the best way to get ahold of the team if we just have ideas or questions?
Mukulika: Sure. If you go to the Argo GitHub repo, you will see that Argo has its own Slack workspace.
We are thinking whether to move to CNCF's Slack workspace, but right now Argo has its own Slack workspace with multiple channels.
We have more than 3,000 users already in the Argo Slack channel.
Marc: Can you give us some insight into how many people at Intuit are working on the project in either a full time or part time capacity?
How much effort is Intuit putting into building and continuing to mature the Argo Project right now?
Mukulika: Sure. Argo CD is used to deploy all applications at Intuit, and Argo Workflows is used by core platforms like our cluster management, the ML platform, and the data processing platform.
So Intuit is using Argo heavily, and so they are obviously very heavily invested also in Argo.
It helps also build our tech brand, so we have around 8-10 people working on Argo full time and as needed we will be adding more. It's a lean team, but great engineers.
Marc: Got it. It sounds like a fun project to work on, getting paid to come in and work on open source projects on Kubernetes on a daily basis sounds like not a bad job to have.
Mukulika: Yes, other than they are drowning in a lot of community questions and are really looking for help.
Marc: That's true. Is there anything else that you'd like to share with everyone around the roadmap?
Anything that we didn't cover that's worth talking about, that we can add in here?
Mukulika: No. I think we covered the roadmap more or less.
Again, we started with application GitOps; now the big focus is on cluster ops and scalability for Argo CD.
For Argo Rollouts, as and when more and more companies are running a service mesh in production, we will add more features around service mesh and traffic routing.
For Argo Workflows, again we are looking for more contributing companies, especially around the ML and data processing space.
But I think we covered most of it, the team and the community is now working towards taking the project to graduation.
Marc: That's great. I mentioned I've been using GitOps for a while.
We've been using GitOps here, and I've learned a ton about Argo today, even just some of the stuff you're working on around cluster GitOps, and then AWS account GitOps coming as a feature that hopefully you open source and that becomes a core part of the product.
It sounds like a great addition: GitOps everything. Then you have declared state for anything you want, and it can just be deployed.
Mukulika: Yes, I actually have already started seeing on CNCF GitOps channel, people using Argo CD for Lambda GitOps.
So, people are using Argo CD in different and innovative ways, and contributing back.
Marc: That makes a ton of sense. We've been here today talking to Mukulika Kapas, a product director at Intuit on the Argo Project.
Thank you very much for all the information today, Mukulika.
Mukulika: It was nice talking to you, Marc. Thank you so much.