In episode 4 of O11ycast, Rachel and Charity ask Adam Jacob, co-founder and CTO of Chef, how the culture of DevOps has evolved both for system administrators and the companies they help build.
Charity Majors: Adam, you've been a pioneer in the cultural change. DevOps, and so forth.
What was the hardest part of that?
Adam Jacob: "What was the hardest part of that?"
Charity: What's the gnarliest culture shift you've ever had to live through?
Adam: This one was the gnarliest culture shift. The shift from being a systems administrator, where I worked for a series of people who occasionally appreciated me but mostly didn't give a shit about me, to being the job and the function that without which none of the modern technology era gets to exist, that was gnarly.
Where your own self-esteem was like, "Hey, none of you care about me at all." And then going through the cultural shift to being like, "Oh, actually, you need me."
Charity: The nerds have inherited the earth.
Adam: You have inherited the earth, but then realizing that it wasn't because you were bad in the first place. You swallow that point of view by yourself where you're like, "Yeah I know I'm not a software developer. I'm not a business person," Or whatever. And, "I'm not as good as you guys."
Charity: I've never had that.
Adam: I believe you.
Charity: Every place I've ever worked ops has ruled the roost.
Adam: We certainly had power. I think there was a minute where as a team, or as people, we had influence and we could say no to things. And we could--
Charity: Oh, no. People would come to us with a platter, just like, "Here. Here's my thing. Is it good enough?" I loved that.
Adam: Yeah, I guess I got some of that too. This story is the perfect example of this.
So, we had a boss. He was actually my boss's boss. And it was Systems Administrator Appreciation Day, and this was a thing that happened. What was the first one? I don't know. It was sometime in the late '90s.
Charity: BOFH era.
Adam: Yeah, it was when that began. When we realized that we should do nice things for the Morlocks. And he took us to a bar, and it's a nice gesture.
So, there's this team of like, 20 sys-admins. He takes us to a bar, and he's like, "I don't know. You're all kind of the dumb guys. You're the ones who couldn't hack it as software developers."
We're like, "Motherfucker, it's Systems Administrator Appreciation Day. You brought us here to buy us beer because you were supposed to appreciate us." And he's like, "I mean, I would but," and I remember the money quote was, "I'm on the patent for the Windows Registry."
And of course this room full of systems administrators were like, "Yeah. And that was fucking dumb. Congratulations on bringing that one into the world, guy." What a dick.
Charity: That would be a dick move.
Adam: I feel like that was, in terms of the gnarliest culture change, it was that one. It was the one where I had to convince myself that that wasn't because I was actually not as good as those people.
There wasn't actually a qualitative difference between me and them. It was just that I liked this piece, and they like the other piece.
And in the early days of Chef we saw that all the time. We would sit around conference tables and there'd be a manager and then a line of five little bunnies sitting there and they'd be like, "Well, everything you just said sounds great, I understand why we try so many things.
"My people aren't smart enough to do that. This Ruby thing? That's a bridge too far." And the people he's talking about are literally at the table. They're sitting next to him. And you could watch them deflate. They were like, "Oh, I'm not as good." And you're like, "Oh man."
And so then you spend a lot of time calling out that dude and being like, "Alright, then you were set up. You as a manager. If that's true, if these four people sitting next to you are the chuckliest chuckleheads of all time, then it's time to quit. It's time to get out."
Charity: If there's one thing I have learned as a manager, it's that people are generally dying to be asked to step up.
Adam: All they want to do is be great!
Charity: They do, and they want someone to see that in them. But this is probably a great time for you to introduce yourself.
Adam: I'm Adam Jacob. I'm the CTO at Chef, I'm co-founder of Chef, I wrote Chef originally. I wrote another thing called Habitat here in the last couple of years.
I've been doing the operations systems administrator DevOps thing for a while. I'm 40 and I started when I was 16. Everything in-between, that whole period, was that.
Charity: I started when I was 17. I thought that I was early.
Adam: I ran my first bulletin board when I was 8. Technically that was systems administration.
Rachel Chalmers: Text adventures when I was 12, but only because we didn't have bulletin boards in Australia. We didn't get a transpacific link until '89.
Adam: But you had modems, right? You could have--
Rachel: We had modems.
Adam: You had bulletin boards, right? You must have.
Charity: They only got like, five packets at a time, though.
Adam: FidoNet took three extra days to get from Australia.
Rachel: Oh my God. Pictures? That was like, "Other people had that."
Adam: Yeah, exactly. "That was for them."
Rachel: So, you and I have known each other since Chef was Opscode, which makes us really old friends now.
Adam: Sure, a decade.
Rachel: One of the cool things about Opscode was that you were taking these ideas about configuration automation, and how to reinvent systems administration, into the enterprise.
A lot of the software companies here in the valley sell to other software companies here in the valley, and The Great Worm Ouroboros devours its own tail.
But, you went out into the real America. Tell us about that! Tell us about selling config automation to banks and insurance companies.
Adam: What happened was that we were consultants. We were selling absolutely to The Ouroboros Worm. We would do fully-automated infrastructure for startups. You'd pay us a flat fee, we would automate all your stuff. The fastest we turned it around was 24 hours.
But part of the plan was that we were going to have this code base that we could use to get leverage out of all these customers, and then we would just be Earth sys-admins. We would run thousands of companies on this hyper-efficient code base.
And that was just a dirty lie. It wasn't a dirty lie, it just didn't work.
Rachel: A consultant who lied? I'm shocked!
Adam: Well, we didn't lie to our customers, our customers were stoked. We lied to ourselves.
Every individual customer was great, I could have kept being a consultant for the customers forever. But our own lives were not what we wanted because managing that code base across everybody was awful. Turns out, that's the genesis of Chef, and it also is the genesis of the large enterprise.
If you think about companies as they grow, you've got thousands of applications. We were at a customer not that long ago, they gave us a list of 12 hundred and change commercial off-the-shelf software packages that they use to run a bank. And that's software they did not write.
Sometimes they had commissioned on spec, sometimes it's stuff that they bought as a commercial software package. And all of that software has to get deployed, it has to be managed, people's jobs depend on it and it runs a multi-billion dollar line of business much bigger than almost anything Silicon Valley has created in the entire history of its wealth creation.
There's exceptions, but the vast majority of them don't touch the GEs of the world, the Fords of the world, the Credit Suisse. I was in the Credit Suisse building not that long ago in New York City, and it's this massive marble hall. It could have been Roman for all of its gritty insanity, in a good and terrifying way.
And I think what I learned was that if you went to those places the number one thing people say to you is, "I don't know."
Essentially, what we said to ourselves on Systems Administrator Appreciation Day. It was like, "Well we're not good enough to do that. We're the garbage place where we don't know how to have nice things." Or--
Charity: "We can't learn things."
Adam: "We can't learn things, it's all a mess, there's all these politics." Which is true, of all of them.
But then if you just say back to someone, "Well, really? Because I've met the Facebook folks and they're great, but you're every ounce as smart as they are. There's nothing about your talent, it's not your intelligence. It's just the difference between you and them is that when they go to work in the morning they tell each other they should, and when you go to work every morning you tell yourself you can't. And that's it."
And at some point I think the enterprise will wake up. Someone, some large organization is going to wake up in the morning some day and they're going to look at their bank account, and they're going to look at their balance sheet, and they're going to look at Facebook, and they're going to go, "Wait a second. I don't have all of the nice things. I can have anything in technology that I want. And the only difference is that we're not asking for it."
And so, over the last decade or so of doing that in the enterprise, the big thing that has changed is that it's gone from us sitting in someone's building telling them they can have nice things, to them coming to me and going, "I know that I'm supposed to have nice things." Like, "I'm supposed to have continuous delivery and I'm supposed to have observability and I'm supposed to have all these things. I have no idea how to go from where I am, to this Martian amazing space."
Charity: This is the question that I keep hearing, is the, "How? How do we get there?"
This is literally why we decided to start this podcast, because people keep saying, "That all sounds great. How do we do it?"
Adam: "How do we do that?"
Charity: So, Adam, how do we do that?
I think that part of it, you were keying in on the perceived status differential between dev and ops, and I think this is why Phase 1 of DevOps is very much, "Ops people, you must be more like software engineers. You must learn to write code." And I feel like--
Adam: The developers can be more like us.
Charity: Well, I feel like it's just in the last couple years that it's really swung back the other way, and people are saying much more, "Okay software engineers, it's your turn. You need to learn to operate your own things." And then they're like, "Oh, but I can't ever be woken up. I didn't sign up for that." Literally people will say this.
Adam: Of course, because privilege is a bummer to give away. I don't want to get woken up either.
Charity: I don't either, I'm over 30.
Adam: It sucks. Yeah, I've got family. I'm done.
Charity: So, obviously there's ops and our masochism, and that hasn't really helped things. I'm over that, but I feel like this should be a not depressing message.
It's so much better when you're a software owner, not just a developer.
I feel like developers are like absentee parents. Just dropping their sperm and just walking away.
Adam: I think there's some truth to that.
Charity: But it's better!
Adam: Yeah, I think there's a misunderstanding.
Charity: Isn't work better when you care? When you feel viscerally attached to what you're doing, and you care about it and you identify with it?
Adam: It doesn't matter where you work or what you've done, you have a moment in your career if you've been in it for any really relatively useful meaningful period of time. This is a thing you can't see when you're year 2 in your career. You might be having it at year 2 in your career, but you can't see it because you don't have the benefit of hindsight. Just gnarly, grizzled age is what allows you to believe it's true. Which is that there are moments where you're awesome.
If you stay in the same field it's because there's a minute where you were the LeBron James of that thing, you were the best in the universe at that. Sports analogy is probably the wrong realm, but it's fine.
There was a minute where you were the best and those minutes carry you for years afterwards. You're like, "Man, we crushed it. That was perfect. We were the best." And for a lot of people you have that experience at some point in your career and then forget that that was a choice.
You made a decision to be that person, you made a decision to make a difference, or to change something, or to add, or to move at that speed, and to just make an impact like that. And that invitation to make an impact again often wakes people up.
And so, there's two pieces to the change. There's the technical, "How do we do it? How do we put the system together in a way that works that we can understand? What are the components we use?"
The cultural piece of it is mostly just about that. It's mostly about just waking up and going, "Hey, if I'm in a large enterprise, and it takes eight weeks to get a virtual machine provisioned," which would be normal.
So, six to eight weeks, I would call that your average for getting a virtual machine. For getting a VMware virtual machine that you can log into, six to eight weeks. And that was orders of magnitude faster than it used to take you to get hardware.
And now, if you're doing a good job, you can do the same thing with some more automation. Let's call it 17 minutes. That's amazing. You won, you do a victory lap on 17 minutes. But you worked at Facebook, if it took 17 minutes for you to get a machine to log into, what would have happened? It's just not a thing.
Rachel: Menlo Park would be burning.
Adam: Like, 17 minutes. It was interminable to even provision. There's a pool of leased resources so you can have them faster than that, because the minute it might take to spin them up was still one minute.
Charity: That's what I'm saying is we paint a better world, and we just be ambitious and look at the benefits.
Adam: Yeah. But if you're in the enterprise, no one has yet really gotten to a place where they've internalized that they're allergic to the 17 minutes.
This second they're still like, "17 minutes? That was dope." And it is dope, and you should celebrate it. It's amazing. If you're the person who got it to 17 minutes, God bless you. That was super hard.
Also, you got to get it down to 30 seconds.
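The "pool of leased resources" Adam mentions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how Facebook or any vendor does it: `Machine`, `WarmPool`, `lease`, and `refill` are hypothetical names, and the slow provisioning step is stubbed out. The idea is only that machines are booted ahead of demand, so a request is served from the pool instead of waiting on a boot.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass
class Machine:
    """Hypothetical stand-in for a provisioned VM."""
    machine_id: int


class WarmPool:
    """Keeps pre-provisioned machines ready so a lease never waits on a boot."""

    def __init__(self, target_size: int):
        self._ids = count(1)
        self._target_size = target_size
        self._ready = deque()
        self.refill()

    def _provision(self) -> Machine:
        # In a real system this is the slow step (seconds to minutes); stubbed here.
        return Machine(machine_id=next(self._ids))

    def refill(self) -> None:
        # Meant to run out of band so the slow step stays off the request path.
        while len(self._ready) < self._target_size:
            self._ready.append(self._provision())

    def lease(self) -> Machine:
        # Fast path: hand over a machine that already exists.
        if self._ready:
            return self._ready.popleft()
        # Slow path: pool exhausted, so the caller pays the provisioning cost.
        return self._provision()

    def ready_count(self) -> int:
        return len(self._ready)


pool = WarmPool(target_size=3)
first = pool.lease()
```

The design choice is to pay for idle capacity in exchange for latency: the 17 minutes still happens, but it happens before anyone is waiting on it.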
Rachel: What I'm finding really charming about this conversation is I have this theory that people create the software and the companies that are an expression of their core values.
So, Honeycomb is the distillation of Charity's restless curiosity, and what you're describing is taking these automation techniques out of big, very technical companies like Facebook and Google, and taking them to the rest of the world. But taking the culture as well and telling people, "Yes you can have best in class tooling. Yes, you deserve it. Yes, you can be just as awesome as those guys in the palaces in Menlo Park."
Adam: Yes. And it probably won't look like the place in Menlo Park. Like, to your question earlier about, "How will we do it?" Right now, the most common line of thinking is, that what we will do, is we will take some large enterprise. Pick one, you guys pick.
Rachel: Procter and Gamble.
Adam: Procter and Gamble. And what we're going to do is we're going to rewrite Procter and Gamble so that it works the way Google works. So, if Google had built Procter and Gamble, what would Google have done? And that's our strategy. That's our big theory.
And if you think about that from a software architecture point of view, if you were in a code review and somebody told you that their big plot was that they were going to take the entirety of your business, burn it to the ground, and then rebuild it as if they were another organization they have never seen before, in a pattern they've never seen?
Maybe they work, maybe. But boy, you better need it.
Rachel: They had to destroy the village in order to save it.
Adam: Yes. Everyone must move. And that right this second, that's the predominant strategic theory, "If I take the tools, and the culture, and the pieces of the palaces in Menlo Park and I somehow figure out how to go from being Procter and Gamble to being Google, then it's going to work." And I just think that's false.
I think the truth is what's going to happen is someone at Procter and Gamble, it won't be me and it won't be Charity, it's going to be someone at Procter and Gamble who picks up Honeycomb, who picks up Chef, who picks up Kubernetes, who picks up whatever. And they're going to make it Procter and Gamble.
What pops out the other side, it won't be us who tells Procter and Gamble how to transform. Procter and Gamble is going to transform, and then they're going to tell everybody else, after a decade or more of us being like, "Hey, you should try it this way. What about doing it like this? Hey, Google does it this way. How about this? Facebook works like that."
They're going to be like, "Hold my beer. This is how Procter and Gamble does it. We're deploying 100 thousand times a day, we're doing the world's greatest science you've ever seen and we're adapting technology every inch as fast. We can get resources whenever we need them. We spun up a cancer research project in ten minutes on internal hardware. We developed custom motherboards to do the blah-blahs."
And everybody is going to be like, "You did what?!" And Procter and Gamble's stock price will go through the roof, and next thing you know the enterprise will actually start transforming. But they'll be transforming to themselves, not to us.
Rachel: So, still my favorite example of this is FedEx in the '90s when it reinvented itself as logistics and you could track your package on the web. That was mind-blowing.
Adam: Such a big deal.
Rachel: And to me, it seems like a better metaphor is that book, How Buildings Learn, and rebuilding a modern building in the shell of a factory, or a brewery, or something like that. You want to retain as much of the structural integrity and the original bones while putting new facilities and new capabilities inside that envelope.
Adam: Yes, the soul matters. You can't just say that you don't need those things.
Rachel: One of the biggest anti-patterns I see with Silicon Valley companies going into the enterprise is precisely that arrogance you're talking about. The, "Burn it all down." Sorry, Charity, I know it's part of your catchphrases.
What they completely overlook is the domain expertise within companies like State Farm who know everything there is to know about insurance.
Adam: What do you know about State Farm? Nothing.
Adam: Also though, if you're at State Farm and what you want to know is, "What have other people like State Farm tried to do to get better at leveraging technology to do insurance?" I've been at five other insurance companies in the last decade and I can tell you what they did and what worked and what didn't work. And I can point you in the direction of where success lives.
I can't tell you precisely what to do, because I am not State Farm. I don't have that domain expertise so I can't actually fix it for you.
I can help you move that way, and I can remind you that it won't be an outsider, it won't be me that's going to fix that. It's going to be you that fixes it, and my job is to get you to a place where you realize that.
Charity: How do you inspire people to want to make the changes?
Adam: First off, you just ask them if they like where they are, and if they want it to be different or not. And I mean, the answer to that is, "Yes," but then the next piece is that it's a little like telling someone that they could go to Mars and no one's ever been to Mars.
Let's assume that we have a Martian colony, and in our little analogy Facebook or Google or whatever is the Martian colony.
Rachel: Isn't Las Vegas on Mars? Looks like it's on Mars.
Adam: You have led directly into my analogy. So, you're wherever you are, and you've never been to Mars, but you know there are people on Mars. And you're like, "But I've been to Utah," or, "I've been to Nevada. Is that like Mars?" And you're like, "I mean, kind of, in that there's rocks, and it's dry, I guess."
But no, it's nothing like Mars. And so the first thing you have to get, is very quickly you have to get people out of the environment they're in and have them spend a minute building something, anything, in the way that people who work and have this experience of the big web have.
So, how quickly can I get you to know what it feels like to be on Mars? And your first moment after you feel that is like, "Oh, that felt good." And you're like, "Yeah, you want that again?" And they're like, "Yeah." And you can ask them.
You could be like, "Hey, do you want to go back to the way you worked the day before?" And they'll be like, "Never again. No. Never, ever, ever." And they're like, "Okay, well can you do what you just did inside of your company?" And they're like, "No." And it's because there's this huge list of obstacles between you and that outcome. There's corporate proxies, and firewall rules, and network teams, and all sorts of stuff in between you.
So then, their bosses and their executive sponsors have to get to a place where they too have seen that their people can do that work at that rate of speed, and that they can exist in that way. And then you start saying, "Well you know you can do it because you did it. When we were in this hotel room together, you did it. You know that you can get to Mars. So, let's go do it." Now we have to go knock all these obstacles down.
When you do the reverse, and we tried this for years, where you just start arguing about why the obstacles don't matter, you get murdered. The whole organization will rise up and just find you and kill you in the night.
Rachel: That's one of the best descriptions of moments to joy I've ever heard.
Rachel: What is the appetite in the enterprise for this new generation of software development? Which is much more agile and responsive, and where you can push out features very quickly, and A/B test features with different populations.
Do you see traditional enterprise looking at those kinds of development methodologies with lust or thirst in their eyes?
Adam: Yes. The issue though, is that they've never been to Mars. So, the vocabulary gets adopted so quickly.
Adam: It takes no time at all. Observability is a good example. I don't know how long it's going to be before somebody who is running, I don't know, CA's monitoring stack from 15 years ago. I don't even know the name of the product, I'm just assuming they had one.
Rachel: Unicenter TNG?
Adam: There we go. Tells me that they don't need to do observability because they have Unicenter TNG. That's their observability answer. And if it hasn't happened already, it's like, 3, 2, 1.
You see it with Agile, where you go in and everybody's like, "I haven't met an enterprise in the last decade who didn't tell me that they did Agile software development." The number of those enterprises, who if I went to a software developer inside their teams and said, "Hey, tell me the story of why you're building what you're building today. Tell me the reason. What's the impact?"
And they can't do it. They got nothing. But the vocabulary of Agile, it's all there. We're doing Scrums and backlogs, and we're doing it all. But, it's not.
The risk is that the appetite is there, the hunger is there. But the confidence gap is real.
And so when somebody shows up and tells you, "Hey, I'll teach you Agile." OK. Who's teaching you? And what did you learn, and what was that experience like?
I think the appetite is there. I think the skills to do it in a really incredible way are there. I think the magical combination of appetite, and skills, and guidance hasn't hit in a large enterprise yet. Next three years it will, someone's going to pop.
Charity: What do you think is driving the actual need for it?
Adam: Depends on the industry. But if you look at retail, if you are in retail and you can't compete with Amazon on retail technology utilization, you're going to die. It just is what it is.
When I say you're going to die, it doesn't mean that you'll disappear necessarily. Will you grow at the rate that you want to grow? Will you wind up in a niche that you wish you weren't forced into occupying?
Rachel: Will your shareholders be happy on the quarterly calls?
Adam: Yes. And those answers are, "No." And so you don't have a choice in those cases. I think when you look at other industries what you're seeing is this cascade of that side effect.
Let's say you're a rent-to-own furniture company. I don't know how many rent-to-own furniture companies there are in the United States, more than one. But one of them is going to figure out that, "You know what would be great? Is if I could sit at home on my iPad and scroll through furniture and then add the furniture to the cart and have it all delivered to my house. And then you take away my old furniture and you bring the new furniture." And that's the new rent-to-own furniture experience.
Whoever does that first gets to win for a while, while everybody else figures out what to do next. And if they did more than just develop that product, but instead they developed the muscle that allowed them to innovate in their own market that way, they're going to extend that leadership. Their shareholder value is going to grow, and you'll dominate that market for some extended period of time.
And that's the driver. It's why it's increasingly all of the enterprise. It's not just banks, or those sorts of things.
Rachel: We talk a lot about the need for these new ways of developing and managing services in the context of very large scale distributed systems with tons of emergent behaviors and complexities.
Did those kinds of environments exist on the enterprise side?
Adam: Yeah, that is the enterprise.
Rachel: It's a different vector of complexity though, isn't it? You talk about the 1,200 applications and off-the-shelf software that they're managing.
Is it the same in terms of scale?
Adam: "Is it the same?" I mean, I think it is. It's different architectural scale. Let's use the Amazon homepage as an example.
So there's, I don't know, I'll just call it 300 services between friends. None of us work at Amazon and that number is over a decade old, so it's probably thousands now, or maybe it's a monolith again and it all went back to being written in C++.
Point is there's a bunch of services, they build up the web page. How different is that from the eight hundred pieces of commercial software that fulfill consumer loans in a large enterprise?
I don't think it's that different, except the factoring of the software. Architecturally it's different, no doubt. And the rate of transaction is different. So there's many, many, many more people doing many, many more things on the Amazon home page than are getting consumer loans. That's different, but the need to understand what's happening across the system isn't.
The difference is the amount of different, high-cardinality information. That flow is different. So there's many, many more individual users who might have a problem. There are many, many more of those things in the Amazon world than there are in the other one, because the transaction flow is different. But the way that we set up those enterprise systems originally was because we understood the enterprise architecture.
We could spend a bunch of time trying to analyze where failure was possible and then we would put monitoring, focused on those failure points, and you do that in the microservice world too.
You are getting to a place where once you start letting people deploy at random, and once you start letting the system emerge in that way, it drives you toward that observability.
Rachel: You can only plan for the outages you know are going to happen. You can't plan for the outages that you didn't see coming.
Adam: That's right. And so the velocity change in the enterprise is what drives the change in the way we think of observability.
It's not the architecture. It's not that you've factored the application differently, that's not it. It's the rate of change and when and how that change is triggered and the rate at which we can understand it. That's what drives it.
So, it's not because you're a microservice or a monolith. It's because the velocity of that thing gets bigger, and as it gets bigger your ability to understand--
Charity: And the number of components and their interactions, because for so long the way that we've debugged systems is with all of this intuition that we have. Just submerged intuition and scar tissue from undergoing all of these events.
And I have this motto, "That's why the person who is always the best at debugging is always one who's been there the longest." Because they have the most context, and our tools have been so bad that we just look at our dashboards and in our brains we try to make a model of the system that explains the data that we're seeing. Instead of taking it--
Adam: Right, I mean that's literally enterprise architecture.
Adam: What we do is we build a model that tells us what we believe the system is.
Charity: But we haven't really had the models. In BI, they can ask a small question, look at the answer, and ask another question and iterate on it. So the information is not just in your brain, it's in the tool.
Adam: Yes. And I would argue, in those enterprise architectures, that model is not executable. It's on paper. It's like it's in conversation, it's in people's heads, and when it comes time to debug the model the only reason it works is because the rate of change of the model is so low that whatever broke stays broke. And so even if you didn't refactor it, there's one thousand components already in the consumer lending thing.
The product is consumer lending. It's not the thousand other pieces, it's the final outcome. And if they can't debug that, they got nothing. They might as well be a weird cult-like religion that's built up around the consumer-lending-who's-a-whatsits.
They don't know how the system works. Nobody knows how the system works. They just know that if they don't mess with it, it's cool.
And as soon as you start messing with it at a high rate now it all pops back up again like, "Hey, Blake I have no idea why this broke." "What did you even do?!" And you're like, "I don't know. Couldn't tell you." And you're like, "Can we put a system around it?" And you're like, "Sure, in a year after I finally go spelunking to understand what it does."
And that's the difference, because in the model where what we're doing is just taking in all of that information and then letting me dig and explore and experiment against the information, that's why we need it. But I stand by that, it's velocity that gets you there.
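The loop described above, asking a small question, looking at the answer, and asking the next one, can be sketched against raw events. This is a toy illustration in Python, not any product's query API: the event fields (`service`, `user_id`, `build`, `status`, `ms`) and the helpers `where` and `count_by` are all invented for the example.

```python
from collections import Counter

# Hypothetical request events; every field name and value is invented.
EVENTS = [
    {"service": "loans", "user_id": "u1", "build": "b41", "status": 200, "ms": 120},
    {"service": "loans", "user_id": "u2", "build": "b42", "status": 500, "ms": 900},
    {"service": "loans", "user_id": "u3", "build": "b42", "status": 500, "ms": 950},
    {"service": "quotes", "user_id": "u1", "build": "b41", "status": 200, "ms": 80},
    {"service": "loans", "user_id": "u4", "build": "b41", "status": 200, "ms": 110},
]


def where(events, **conditions):
    """Keep events whose fields equal every given condition."""
    return [e for e in events if all(e.get(k) == v for k, v in conditions.items())]


def count_by(events, field):
    """Count events per distinct value of one field."""
    return Counter(e[field] for e in events)


# Question 1: which services are returning errors?
errors = where(EVENTS, status=500)
errors_by_service = count_by(errors, "service")

# Question 2, prompted by answer 1: which build do the failing loans requests share?
loan_errors = where(errors, service="loans")
errors_by_build = count_by(loan_errors, "build")
```

Nothing here was decided in advance as a failure point to monitor. The second question only exists because of the first answer, which is the contrast with placing checks on predicted failures.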
Charity: So what is the state of the art for deploying software today?
Adam: "For deploying software today?" Is it self-serving if I say it's Habitat?
Charity: I was expecting that.
Adam: But it is Habitat.
I believe that we have been approaching the way that we factor the systems wrong this whole time.
We've been building up infrastructure and then we've been holding the application as the final pinprick on the top of this massive infrastructure mountain. And whenever what we have is an application problem, our answer as an industry is more infrastructure.
So we go, "Hey, I have a deployment problem." And you go, "Hey, you know what I got for you? How about a container scheduler that replaces all of the network and all of the CPU scheduling and does all the deployment for you." And they're like, "Sure, sounds great. Do I have a deployment problem anymore?" And you're like, "No." And they're like, "Nope." And so next thing you know we're container scheduling.
But of course there's always a buzz and the buzz is, "Well, what's the software that goes inside? How does that behave? How does it update itself? How does it do dynamic configuration? Can I check health? Where do the statistics and the data I should be gathering for observability come from?
"How do I know how to pick it up? How do I know how to retrieve it? If I add a new one, how do I know it got into the system in the way that I expected? And, how do I run it on my laptop? Can I run all of the services I need to test the service? I want to develop locally on my laptop in the same way that I run them in production across a fleet of hundreds of thousands of systems spread across multiple data centers."
And the answer to those questions is kind of, "No." And whether it's Habitat or something like Habitat, the idea that the application is responsible for its behavior across the entirety of its lifecycle from cradle to grave, from how it's built, and its build environment, and its dependencies, to how it deploys, to how it updates itself, to how you check its health.
Charity: I think that, "Application-first," is clearly the way that we need to be thinking about everything. It kind of mirrors what we've been talking about, moving from the health of the system to the health of the event. All that matters is that your request can get the resources that it needs from the application to complete. That's it. Everything else is somebody else's problem.
Adam: That's right. And one way you know that you've found the right architecture, I think there's two ways you know that you're on to something clever.
Way one is that someone comes out of the woodwork who is relatively orthodox and goes, "That will never work. Your thing is a stupid thing. My thing already does that thing." So, that's Clue 1.
It doesn't mean that every idea you have that someone tells you is a bad idea is a good idea, because 99 percent of your ideas are bad ideas. Most of the time when people come out of the woodwork and they're like, "That's a bad idea," it's because it's a bad idea.
But I've never had an actually good idea that didn't have people coming out. I've had lots of bad ideas, where everybody was like, "Ok. Seems like it's good enough." But I've never had a really good idea that didn't have people coming at me being like, "That? Don't do that. You're dumb." And so that's thing one.
And then, where was I heading? I got distracted by my own--
Charity: You were talking about "Application-first."
Adam: I was talking about "Application-first," and then the other one is that the things you have to put around the thing you're automating, or the thing that you're abstracting. Everything else clicks into place.
When you find the right architectural shape, everything else gets easy.
It's like when you're putting IKEA furniture together, and you're like, "Go get the mallet, honey." That's when you know you're doing it wrong. You should not need a rubber mallet.
Charity: We talk about this a lot, because one of the hardest problems that we have is just convincing people that it can be so much easier than what they have now. They're like, "No, you can't just ask the question of a high cardinality blah-blah-blah. You need your logs, and your metrics, and your fucking everything."
Adam: I had this conversation as I was leaving with a new hire who was asking me a question about Habitat. And he's like, "You know, it would be great if we could send people an email when a dependency needs to be updated, so they would know that they need to rebuild." And I was like, "Do I have a story for you. I can take all of the software that's ever been uploaded to the depot, look at its list of transitive dependencies, build a graph, find how many independent spans of that graph exist and then compile the world." And that's precisely what we do.
So, when there's a new version of OpenSSL, we just recompile every piece of software that has a dependency, even a transitive one, on OpenSSL, and build you a new release. And it's there, and you deployed it in the morning, and you didn't do anything. I can do that no matter what infrastructure you're running the application on. And when you say it, you're like, "No you can't."
Saying, "No you can't," is the easiest thing in the world. It's like, "You are full of lies." And it's fine. I'm not full of lies, it super works.
But it's going to take a minute because there's a gap.
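[Editor's note: the rebuild Adam describes above (find everything that depends on a changed package, even transitively, and recompile it) can be sketched roughly as follows. This is a minimal illustration, not Habitat's actual implementation: the package names, the `PACKAGES` mapping, and the function names are all hypothetical, and the ordering step is a plain topological sort standing in for whatever the real builder does.]

```python
from collections import defaultdict, deque

# Hypothetical depot contents: package -> its direct dependencies.
PACKAGES = {
    "openssl": [],
    "zlib": [],
    "curl": ["openssl", "zlib"],
    "nginx": ["openssl"],
    "myapp": ["curl"],
    "docs": ["zlib"],
}


def reverse_edges(deps):
    """Map each package to the set of packages that depend on it directly."""
    dependents = defaultdict(set)
    for pkg, requires in deps.items():
        for dep in requires:
            dependents[dep].add(pkg)
    return dependents


def affected_by(changed, deps):
    """Return every package with a direct or transitive dependency on `changed`."""
    dependents = reverse_edges(deps)
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for pkg in dependents.get(current, ()):
            if pkg not in seen:
                seen.add(pkg)
                queue.append(pkg)
    return seen


def rebuild_order(changed, deps):
    """Order the affected packages so each is rebuilt after the affected
    packages it depends on (a topological sort of the affected subgraph)."""
    affected = affected_by(changed, deps)
    dependents = reverse_edges(deps)
    # Count only dependencies that are themselves going to be rebuilt.
    pending = {pkg: len(set(deps[pkg]) & affected) for pkg in affected}
    ready = sorted(pkg for pkg, count in pending.items() if count == 0)
    order = []
    while ready:
        pkg = ready.pop(0)
        order.append(pkg)
        for child in sorted(dependents.get(pkg, ())):
            if child in pending:
                pending[child] -= 1
                if pending[child] == 0:
                    ready.append(child)
    return order
```

[With the sample data, a new `openssl` release marks `curl`, `nginx`, and `myapp` for rebuild and leaves `docs` alone, since `docs` only depends on `zlib`.]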
Charity: Vendors have a history of not being trusted because they do make all these magical claims that--
Adam: Yeah! Look, I want my daughter to go to college, and I got a mortgage, and I sell software for a living, no question. And that's all true, but--
Charity: How can you tell when a vendor is reliable?
Adam: When they're focused on whether or not I'm successful, as opposed to whether or not I got a deal.
When a vendor shows up and tells me that they can solve my problem if only I give them money, then I don't trust them. When a vendor shows up and tells me, "I'm going to solve your problem, then you're going to pay me money." Those are the vendors I trust.
But it's in that order. If my problem gets solved. If so, money flows. Money is no problem.
Charity: So we've talked a lot about culture, but what comes next? What still needs to happen in engineering culture?
Adam: Well, I think we still have a dramatic inclusivity problem, and that inclusivity problem is part and parcel of what is also a diversity problem. But the inclusivity problem is bigger.
Earlier today there was a Twitter kerfuffle where The Register, everybody knows The Register, they have a tendency to write offensive headlines. And this one was ridiculous, it was essentially talking about sex and Intel. I could barely read the article, because you can't really get past the headline, because the headline was like, I kind of don't want to repeat it. It was awful.
And so one of the members of our community commented back to The Register. They were like, "Seriously? This is not good." And what they got back was 20 furious tweets from The Register being like, "Why are you such a wimp? Show me where the bad headline hurt you." And other people piling on being like, "Wow, why would you say that and you're blah-blah-blah."
And so we cancelled our sponsorship of their conferences.
Adam: And walked away. I don't think we deserve a cookie, but it's like, A) what that person did by saying, "Hey that headline is awful." If I came across that headline on my own, I don't know that I'd have taken to Twitter and said, "This is unacceptable, what are you doing?" I'm pretty sure I'd have just gone without. I'd have gone about my day.
Charity: Inclusivity is an interesting thing. I come from a very poor background. And I think that I've often felt more out of place due to that than my gender.
Rachel: Oh yeah, class is a huge one that we don't talk about. Age is another. But I think it's actually related to what we would talk about with the vendor Stockholm syndrome.
People get accustomed to a certain level of friction, a certain level of restriction in their daily life and they're really reluctant to change that even for something that would be easier.
I mean, it's very apparent to me that including more people in the software development process increases the number of points of view that go into that software. It improves the best practices, it improves the products.
Charity: It's kind of the same thing as why people are scared to ship more often because, "Oh but if we change it, who knows? It might all fall over, it might be really painful."
Rachel: They'd rather stick with the bad reality that they're familiar with than move into an unknown future.
Charity: Very risk averse.
Rachel: The bad news is we're moving into the unknown future whether we like it or not.
Adam: You're just going no matter what you do.
Rachel: Right, right.
Adam: I mean, I also think the biggest battle isn't the overt one, it's the implicit one.
We have a board member, I'll tell my own story because then I can name the truth of it. It was long ago, very early on in the life of Opscode and we were talking about getting some help, some outside software developers to help move pieces of the product forward. And we were talking about who we should hire and where we should hire them. And nowhere in the list was an Indian outsourced firm. There were none. There was zero.
We're in this board meeting and my board member was like, "Really? Not a single Indian software development firm is on the list? In terms of quality, or price, or any of those things?" And we're at the table, and everybody was like, "No, because they're all awful. Everybody knows that Indian software development is the bottom of the barrel." And my board member was like, "Really? Really. That is some racist shit."
And we were like, "Really?" And my first reaction was like, "I'm not racist, I have Indian friends."
Charity: "We just all know this to be true."
Adam: Like, of course I'm not. And then you let it sit for a second. And I was like, "Yeah. Okay. You're super right. I didn't mean to be, but that was super racist."
And it turns out, by the way, that we've had incredibly fruitful relationships with Indian software development firms for the better part of a decade. The same people, who might as well be part of the company given the great work they do and how long they've been there. And that stuff is the next frontier of it.
Because it's the obvious stuff, The Register writes a ridiculous headline. OK, that stuff is easy enough to attack. But all that implicit internal stuff? That's the bummer.
Rachel: It's a good story because it does describe the way those of us with power and authority need to sit with that discomfort, need to sit with the fact that we are biased. We are limited in our perspectives, and to move into a glorious future we just have to get past that. We have to own it and move on.
Adam: And you need people who will tell you that that's happening, and you need people who will give you grace. Who will be like, "Yes I have no hair and a beard, but I'm not--"
Charity: Yeah, it's about the call-out culture thing just as much as it is--
Adam: Yes, which has its downsides. It makes everything terrifying. It makes people less willing to speak. There's a bunch there that I think is tough.
Rachel: No one is motivated by shame.
Adam: Yeah. But you are motivated by wanting to be better.
Adam: That's not the person I want to be. And he said it in a hard way in that minute. But it worked, and he then backed it off and explained himself, but it was a loving prod. I think that's the next culture frontier for sure.
Rachel: Kudos to that guy. Adam, thank you so much, it's been so great having you on the show.