June 25, 2020
In episode 31 of o11ycast, Charity Majors and Ellen Chisa of Dark discuss the benefits of making observability tools more accessible, tailoring feedback to individuals, and the many ways software may change in the years to come.
About the Guest
Ellen Chisa: So one of the things I have noticed over the last few years is how much these big shifts in technology really change how we interact with other people.
So the printing press is obviously one pretty great example, where we all start to be able to communicate and record what we're thinking and share it much more, at least for a small segment of people, through books.
And then I think smartphones did a very similar thing with photographs. So now people can create words, people can create photos, they can share videos as an extension.
But we aren't really there with software, where we still think of it as this craft or specialty where you have to go to school for four years, where you have to be the type of person who's a good autodidact and able to learn things.
I think what we're going to see more and more is we're going to get tools that enable all of us to be able to build software. And I don't think those tools will be the same for everyone, but I think it'll definitely change.
Charity Majors: That's interesting. It's kind of a natural evolution of moving up the stack.
You move up the stack so far that it doesn't even feel like a stack anymore, right?
You're using building blocks or something like that.
And I feel like there's an acceleration that is happening as we are getting better observability tooling, just because so much of our cycles are wasted.
Every time you have to just switch contexts and dive way down into the weeds to understand why LD_PRELOAD isn't doing something or other, it slows you down, it takes you out of your flow.
It just takes you out of it, you know? And we've learned to enjoy that pain.
We've grown to identify with it and find enjoyment there, but that doesn't mean it's really moving the business forward or helping us innovate.
Ellen: And it takes so long; it's actually what I was doing right before this.
I do a little bit of angel investing, and I've been doing a little bit of a project with a company I invested in.
But they released a new build of something yesterday and it just made everything very slow.
And so we're trying to debug it, but it's only happening for me.
And so it's like we need to get on the Zoom so I can show the thing.
And having that visibility into things is so important.
Charity: Oh, that's so aggravating. Well, this seems like a good time for you to introduce yourself.
Ellen: Yeah, so I'm Ellen Chisa. I most recently founded a company called Darklang, where we made a programming language that was tightly coupled with its own editor and infrastructure.
But before that, I've done a lot of product work at a bunch of startups.
Charity: Yeah, I'm a big fan of Dark, you know. I think it's really visionary. You think of Parse, right?
It was a mobile backend as a service, right?
You had your SDKs and your APIs, and you could build mobile apps and not have to think about, you know, all that stuff.
Which meant I had to think about it, of course. It's not that there are no servers, it's that somebody else is dealing with the fucking servers.
But with Dark, it's you're just writing code and as soon as you've written it, it's live. And that scares the bejesus out of a lot of people.
Ellen: It really does. I think it's so funny, because the thing that I loved most about it, and you see it in toys or less serious things all the time, like Scratch or turtle graphics or programmable robots, is when you can do something and you immediately see it working, you immediately see what happens. That's how you learn.
Ellen: And so it's obviously scary to someone like you, who's been working on these big systems where if something goes wrong, it's actually bad. But if what you're doing is small enough, or the change is small enough, it doesn't really matter. It's just a better way to learn quickly.
Charity: That is so true. And I feel like right now, the entire industry is reckoning with the fact that we never did the D part of CI/CD.
We talk about CI/CD and we do CI and we get real religious about it, but nobody does CD, almost nobody.
And this whole debate about whether we should do deploys on Friday, really, if you were doing CD, then it's like saying, "Should we work on Fridays?"
If you're writing code, then it should be going live.
It is because we've just decoupled that so wildly that people are like, "Ooh, there's something special about deploying."
There shouldn't be anything special about deploying, right?
I've been advocating, I'm just off on one of my tangents now, but I feel like the number one thing that most teams could do to kickstart a really virtuous cycle in their entire organization is to make that feedback loop as short as possible.
If you can treat it as almost an atomic event: if you merge your code to main, it's going live. Just tightly couple that, make it as short as possible.
If you can get it down to a couple of minutes and automate it, you know, no manual flags, nothing, just one action, right?
You merge, maybe it takes a couple of minutes, but you know that it's going live, and then you look at it.
Right, that's how you learn! And so many things flow from that.
So many people are tinkering around the edges of their systems, trying to craft better on call schedules and do all these elaborate things when if they would just make it so that deploys were automatic and easy, so many of those problems would just go away and not even have to be solved.
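The merge-equals-deploy loop Charity describes can be sketched in a few lines. This is a hypothetical illustration, not Honeycomb's actual pipeline; `build`, `deploy`, and `health_check` are stand-in stubs for whatever your real CI system runs, and the point is simply that there is one automated path from merge to live, with no manual gates in between.

```python
# Hypothetical sketch of "merge to main means it's going live":
# one automated action, no manual steps. All function bodies are stubs.

import time

def build(commit: str) -> str:
    """Stub build step: returns an artifact id for the commit."""
    return f"artifact-{commit}"

def deploy(artifact: str) -> None:
    """Stub deploy step: ships the artifact to production."""
    print(f"deploying {artifact}")

def health_check() -> bool:
    """Stub check: a real pipeline would query production telemetry here."""
    return True

def on_merge_to_main(commit: str) -> bool:
    """The whole path from merge to live, treated as one atomic event."""
    start = time.monotonic()
    artifact = build(commit)
    deploy(artifact)
    ok = health_check()
    elapsed = time.monotonic() - start
    # The goal: a couple of minutes at most, and the author watches it land.
    print(f"{commit} live in {elapsed:.1f}s, healthy={ok}")
    return ok
```

The shape matters more than the details: the author who merged is still paying attention when the health check runs, which is what closes the feedback loop.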
Ellen: That's definitely true but I think there's mixed incentives, not just from the engineering side.
So on one hand, I think people are like, "Oh, well we want to mitigate risk. We'll mitigate risks by only doing it so often. We'll think about this and that's why we'll avoid the Friday," and I think that's kind of a false optimization.
But I think you still get the same pressure from the business side where you have, say, a sales team who wants to have a big new release so you can sell a big upgrade to something, or a product team or a marketing team who wants to have an excuse to write it, we have all this stuff.
And so I think you kind of end up with these conflicts between those.
Charity: I mean, you need to decouple the idea of deploying and releasing.
Ellen: Also very true.
Charity: Just because your code is going live does not mean that your users are seeing it, right?
This is where feature flags come into play and you know, all of the elaborate gates and everything.
I don't think anybody's advocating that as soon as you've written some code, everyone should be using it.
Ellen: I would like to use all of my code as soon as I write it. But yeah, that's exactly how we did it in Dark too, where as soon as you wrote it, it was deployed, but by default it was behind a feature flag. That is how you learn.
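A toy sketch of that decoupling: the new code path is deployed and live in production immediately, but users only see it once its flag is turned on. The flag store and the `new-checkout` flag name here are made up for illustration; real systems would use a feature-flag service.

```python
# Toy illustration of decoupling deploy from release:
# both code paths are deployed; a flag decides which one users see.
# The flag store and flag name are hypothetical.

FLAGS = {"new-checkout": False}  # deployed, but not yet released

def old_checkout(cart):
    """The existing, released behavior."""
    return sum(cart)

def new_checkout(cart):
    """The new path, live in production but hidden behind the flag.
    The 10% discount is just an example change."""
    return sum(cart) * 0.9

def checkout(cart, flags=FLAGS):
    # Releasing is now just flipping a value, not deploying anything.
    if flags.get("new-checkout"):
        return new_checkout(cart)
    return old_checkout(cart)
```

Flipping `new-checkout` to `True` is the release; the deploy already happened, which is why shipping the code stops being a scary event.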
Charity: Just being able to see it. When I think of what a senior engineer is, I think of somebody whose gut I trust.
And what has their gut been trained on?
Well, for me, if you aren't training your gut in production, you're training it on false data, right?
Which is why I actually find it difficult to think of engineers as being senior engineers unless they've spent a lot of time in production, which I recognize is very backend-y of me.
Nevertheless, how do you know to trust your instincts or not if you aren't watching users use your code in production?
Ellen: Yeah, I think that "you don't know if it works until you actually see it work" is how I would always think about it.
You can pontificate in a corner forever.
It's the same issue I have with management consultants: it's one thing to sit on high and write a general PowerPoint about how people should do business.
And it's quite another to do business.
Charity: Right, it always is. You know, I've been trying to talk about observability less in terms of this and that data type or whatever, and more in terms of the functionality.
And I think that the step function or the leap that observability provides over monitoring is that granularity.
It's like the difference between wearing glasses and looking through a microscope, right?
It's the difference between sort of hazily seeing on the horizon that a thing happened, and being able to peer into it and go, "Oh, these events where this thing changed are different from those," right?
Which is why I get so pissy when people are just like, "Eh, there's three pillars. Eh, do what you want, you know. Everybody's definition of observability is different."
And I'm like, "No, I don't accept that, because if you can't actually see what's going on under the hood, then it's no better."
We need better. Anyway, one of the reasons I wanted to have you on and by the way, my dear cohost Liz is, I think, feeling under the weather.
This is why it's just the two of us. I'm surprised that she trusted me with the keys to the kingdom.
But the reason that I wanted to have you on is because you guys were early Honeycomb adopters and you in particular used Honeycomb a lot for understanding your product.
Can you talk about that?
Ellen: Yeah. It's funny, when you're building a programming language and you're building a platform, some of the things you optimize for end up being amusing. But of course we wanted good observability, because we wanted to know what our users were doing in case anything broke, because we had this responsibility of running everyone's infrastructure.
So that was mission critical.
Whereas, did we need your typical CEO dashboards to see what's going on? Hmm, probably not that important.
We'll figure it out some other way. And so what I did instead was use Honeycomb to figure this out.
And so what I didn't have was long-term retention, but what I did have was every single day, I could go in and I could see which users were using Dark, how actively they were using it, did something seemed to be going wrong?
What times of day, what days of the week was it happening?
When do people tend to come back after their first onboarding session when I paired with them?
How many actions resulted from an onboarding session? So I could basically get anything I wanted out of it.
Or if someone reached out to me and said, "Hey, something weird is happening; can you look at it?"
I'd have a very easy way to go figure out what was going on.
Charity: It's almost like what you're saying is that the questions you need to ask to understand the business and the product are the same ones you need to ask to understand whether or not your code is working.
Ellen: Yeah, because I think at the end of the day if something isn't getting used, it could be that something is actually broken in the system or it could be that it's not working for the product and no one wants to use it.
But either way you've got a problem, because no one's using that thing.
Charity: This really points to something that I think is core, too.
So, infrastructure metrics, and you have a CS background.
You're familiar with the gnarly old systems dashboards, where you've got 50,000 dashboards.
Everything that's in /proc has its own graph, right?
Every single knob that you want for IPv4 or IPv6 gets its own dashboard and everything.
You know, those were valuable for infrastructure questions. They aren't so valuable for business questions, right?
And I think that this is what divides, in my mind, infrastructure metrics from observability: observability is about your core business differentiators, right?
And monitoring is about, you know, the health of the platform, the health of the service.
And it's fascinating to me 'cause I feel like, you know, I was a systems geek.
I started out running DNS and mail and doing everything. 90% or more of what most engineering teams were doing was infrastructure.
You had to do so much work in order to be able to get to the work that you needed to do, right?
And we should talk, you know, outsourcing and stuff.
But as all of these services are springing up that can take care of these components for you, what we're getting is to a place where more and more of your engineering team can actually focus on your core business, almost approaching 90%.
Ellen: Which is what you want, as opposed to right now, what's going on is every company that becomes large enough has to hire an entire team of people to run basically the same system that every other company is running, which isn't the thing that differentiates their business.
Ellen: Extremely inefficient.
Charity: Yeah, I just wrote a couple of articles about the future of ops jobs because I feel like, you know, some people are like, "Well, then we won't need any ops."
And like, no, if you have a business, you have ops problems because ops is how you deliver value to users and you still need to do that well.
And I think the work is looking at the entire socio-technical system and tuning it to ship more quickly.
Honestly, what I was just talking about, making merge-and-deploy that one atomic act, that is ops.
That is what every ops team out there should be doing right now to show massive value to their companies.
But I kind of regret, I never worked with product people or design people in my career as a systems engineer.
I didn't do that until Honeycomb and I regret that.
If I was to do my career over again, I would spend more time building products, I think. I think that's a key skill.
Ellen: What do you think you're missing because of that? Or how do you think it makes you make different decisions?
Charity: You know, it's funny, because I always prided myself on being very pragmatic and sort of ruthless when it came to technology, like I was a MongoDB engineer or I was a whatever, you know.
But at the same time, I feel like just the rhythm of shipping code, of trying to point to where I wanted to go and then go there, and trying to estimate the amount of work it would take me to do something, you know?
I'm more used to firefighting, and so if you had asked me, "How much time do you think it would take to build this thing," I would just cross my eyes and go, "How the fuck should I know? We'll see when we get there."
And I think that there are just core software engineering skills that I didn't learn.
Ellen: It's funny, I know it's controversial and everyone likes to argue about estimating, but for me, one of the markers of how senior someone is, is the interval they give me on how long it will take.
And it's not necessarily that they're correct.
It's like, how long will it take and what degree of confidence do you have around that?
Because that's what helps you actually know how likely you are to be able to deliver it and meet the goal you have for it.
Charity: Yeah. I feel like I mention the Stripe developer report every week, but you know, they surveyed all these engineers, and it turns out that, self-reportedly, something like 40% of our week is wasted.
Ellen: Yeah, it's a lot. I talk about that report all the time too.
I find it absurd that, oh yes, we've paid these people tons of money to not really do anything half the time.
Charity: Yeah. And that's an optimistic take on it, 'cause it's self-reported, you know, and that's just, it's insane.
There's so much to be done there. I read this great article by my friend Gergely.
I don't know how to pronounce his name so I'm not going to try it, 'cause it's embarrassing.
About the difference between how Silicon Valley treats their engineers, grossly categorizing, versus, how non-Silicon Valley companies do.
And he was contrasting non-Silicon Valley companies, where they're always trying to make sure that the developers are spending most of their time writing code, just as many hours a day writing code as possible. Shield them from meetings, shield them from decisions, just crank work out of them. Whereas in Silicon Valley, there's more of a habit of inviting engineers into the business, inviting them in earlier in the process, you know, asking for their opinion to varying degrees.
But he was pointing out that we're problem solvers and you can get a lot more out of us if you actually let us try to solve the problems instead of just cranking code, not to mention what a boring as fuck job if you're just taking tasks.
I can't even imagine.
Ellen: I think it's a very old school perspective on engineering 'cause I went to Olin College of Engineering for undergrad and the whole mission is like, let's educate engineers differently and let's do projects, rather than just training people to sit in a cubicle.
But tons of engineering disciplines have that problem, not just software, where people are like, oh, that's the cubicle with the mechanical engineer.
You send them a question, and they send you back the math for the thing they built and the CAD file, and that's it.
Charity: And maybe that even makes sense for more mature engineering disciplines, maybe?
Ellen: I don't think it does. I mean, I think something like half of Fortune 500 CEOs have engineering degrees.
There's a large overlap with ability to solve problems in any context and having some experience actually building something, be it software or a more classical engineering discipline.
Charity: Makes sense.
Charity: Well, you had some interesting thoughts, switching gears a little bit, around non-engineering involvement in incidents.
I think that was kind of one of the things that came up right away when thinking about having observability tools that everyone could use: the nice thing about having an observability tool and having it be accessible is that when something's going wrong, everyone can look at it if they want to.
If you've already set things up so it's easy for multiple people to spelunk through the data, you don't necessarily need to know how to fix the problem to be able to help find the problem.
Ellen: Or even be there to have the conversation to work through what's going on here.
It would be different at a larger company but especially in the early stages of a small startup, it's kind of all hands on deck when something like that happens.
Charity: And it's constantly a little funny to me.
I think there's some engineering snobbery that kind of creeps in, like we forget that pretty much any human is capable of reading and interpreting a graph, and clicking on it and changing it and futzing with it, right?
Especially since with observability, you're moving up the stack.
You're no longer talking about low level systems things that mean nothing to most people.
Now you're talking about it in terms of endpoints and variables and functions, things that are hopefully named intelligently.
So pretty much any person who's fairly literate in your problem space, whether it's sales, marketing, whatever, should be able to understand and pick out the patterns of what's going on.
Ellen: Yeah, exactly.
And then, same thing: once you have those people involved up front, understanding and hearing the conversations about the trade-offs, it becomes that much easier for them to help with the communication side too, when you're able to do the retrospective and the postmortem and communicate externally.
Charity: Yeah, I've often said that I feel like the edges of tools are where silos are created.
'Cause if you have to spend all your time and energy arguing about the nature of reality, instead of the actual problem, you know, I've seen so many times it devolved into, "Well my tool says this."
"Well my tool says that," And it's like--
Ellen: Doesn't matter.
Charity: Doesn't even matter.
Ellen: What is the problem and how are we going to fix it?
Charity: Yeah, so I'm a big believer in that.
You know, one of the things that blew my mind at Parse that kind of led to us creating Honeycomb was having the experience where Disney would ship an app and it would go to the top of the iTunes store.
And, you know, it would take us days to figure out why Parse was going down constantly because maybe they weren't doing lots of queries, but they were doing a couple and it was taking us down.
So everything was getting so, you know, it was just a mess. It was impossible to figure out.
It would take an open-ended number of hours or days to figure it out.
And once we started shipping some data sets into Scuba, the time that it took us just to diagnose and identify, put our finger on it, dropped to seconds.
It wasn't even an engineering problem anymore, it was a support problem.
Our sales team could do it, and these were non-technical, non-engineering sales folks.
They could just go click, click, click, "Oh, it's Disney," right?
And that's the level of complexity that you really want when it's 2 a.m. and you're trying to figure out what's wrong.
Ellen: Well, and when you have that and it's shared throughout the company, it becomes very easy for people to see who the big customers are.
Which then makes them easier to support, and it's easier for people to see, "Oh, who's new, what's growing quickly?"
And you sort of build up the shared cultural understanding of what matters to the business.
It's much harder to do when you don't actually know.
Charity: Exactly. Everybody does a better job at their job if they have access to data.
I love this story from Stripe from years ago.
I don't remember what the data set was, but they slapped a thing on top of it to just issue queries about their users or whatever.
And then the genius part was that they retained the history of the questions everyone was asking.
And after a while, I think it was over half the company would be there just asking questions every day.
And it became like a "Oh, what are other people looking at? Oh, I want to see what they're looking at too" and copying queries from each other or just digging into the data.
If you can make it into a game, then you've won.
Ellen: Yeah, when I worked at Kickstarter, there were some things you could look up for yourself, or I would muck around and write my own SQL queries 'cause I was curious about certain features.
But we actually had an entire work list where Fred Benenson, who was the VP of data science at the time, would answer highly specific questions for people when he had a few spare minutes.
And some of those were where some of the really cool Kickstarter data blog posts came from, where people internally would get curious about something and we'd learn something really interesting.
Charity: Nice. Yeah, in systems land, for so long we've had tools that would just shut you down if you were curious.
The archetypal thing in my mind is you put a software engineer on call for the first time ever and they're terrified.
They're like deer in the headlights, you know?
And they're looking at the graphs and their eyes widen, they're like, "I see a spike," or "I see an error," just like, "What is that?"
And you know, everyone's just like, "Oh, grasshopper. Happens all the time."
That's terrible, honestly. Wouldn't it be so much better if we were like, "Yes, let's go look at that, let's go understand that."
And the reason we don't do that is because we have gotten so worn down, we know that it's like a pit of despair, that you will never find the answer, that you could spend hours and hours just chasing clues. And all you're going to do is find more things that are broken that nobody knows about and that's going to blow your whole day and you're never going to sleep well again.
And it's not an experience that rewards your curiosity, it's an experience that punishes you for asking the question.
And the thing that I like about something like Honeycomb, is it makes it so easy.
This is what we do with BubbleUp, right? You just point, draw a little bubble: this is what I care about.
And we tell you what's different about everything inside that bubble, right?
It's curiosity rewarding, which draws people in and encourages them to learn more about the systems because who doesn't love that fuckin' dopamine hit?
Ellen: Yeah, no, that's a really great point.
I never thought of it in exactly this way, but I think there is a thing where we discourage the curiosity because it might create a new problem or it's kind of like, "Oh, I've done that and it's not fun."
And so instead of saying, "Oh, look at it and you'll find the same answer," it's kind of like, been there, done that, find something new.
There's sort of a cynicism there, when really everyone who looks at a problem is probably going to see it with fresh eyes and have a different take on it.
So it's not true that it's exactly the same. It's a different person, it's a different day.
The company is in a different place. You might as well go look.
Charity: Yeah, I've been on teams twice now where, well, I think of this as a defining feature of old school teams.
Old school, well, almost every team that anyone's ever worked on, is defined by this: the best debugger is the one who's been there the longest.
Because it's so much a function of your scar tissue, how many outages you've been exposed to, the stories you can remember, your catalog of experiences.
Which is a fundamentally very disheartening thing, right?
Like you could never really catch up. You can never be the expert if you haven't been there for XYZ years.
I remember being on my honeymoon in Hawaii and getting paged by the CTO.
They're just like, "We're so sorry. We can't get the site back up," you know?
And on the first couple of times, that's fun and it's flattering, and then when you realize that you can never really be offline, it kills your soul a little.
Ellen: It seems like that should be shared knowledge, and I don't think I've ever seen somewhere that does this well, but it should be easier to institutionalize that knowledge.
And it shouldn't be that it lives in the one person's head, but building intuition is hard.
Charity: The thing is that, moving from monitoring-based tooling to observability-based tooling, I've seen teams get out of that pit and shift to a model where the best debuggers were not the people who had been there the longest.
They were the people who were the most curious, people who use the tools the most, the people who were the most persistent.
You know, not everybody loves debugging and plowing around through data and looking for shit.
But the ones who do should be the best debuggers. And I find that just fundamentally very democratizing and encouraging.
Ellen: Maybe you've already covered this, but one of the things I loved most about Dark was that there were certain wow moments people would get: when you made your first API endpoint, you had a whole little world in three seconds and it was amazing.
Or you saw your first live introspection of a value, and it was amazing.
Maybe you've already talked about it, but what is the moment when someone first manages to get Honeycomb into their system where they see something and they're like, "Oh, now I get it."
Like, "This is the thing I want for my observability"?
Charity: You know, we have spent a non-trivial amount of the last six months trying to figure this out.
One thing that we figured out is it's different, depending on what your background is.
If you have a background in metrics versus logging versus tracing or whatever. For a lot of people, it is some version of the breakdown by a high-cardinality value.
When they realize that they can group by user or app ID, their eyes widen, they're just like, "Oh my God!"
They've been told it's impossible and they've blown out the key space in Datadog trying to do this, and all this stuff.
For people who don't have that background, it's often flipping back and forth between traces and events because they're used to having to copy paste IDs from one tool to the other.
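The "group by a high-cardinality field" moment Charity describes can be sketched as a breakdown of wide events by user or app ID. This is a toy stand-in for what an observability query engine does, not Honeycomb's implementation; the event shape and field names are made up.

```python
# Sketch of a high-cardinality breakdown: each wide event carries a
# user/app id, and we break results down per id, no matter how many
# distinct ids exist. Event shape and field names are hypothetical.

from collections import defaultdict
from statistics import median

events = [
    {"app_id": "disney",    "duration_ms": 900},
    {"app_id": "disney",    "duration_ms": 1100},
    {"app_id": "small-app", "duration_ms": 40},
]

def group_by(events, field):
    """Break events down by any field, however many values it has."""
    groups = defaultdict(list)
    for e in events:
        groups[e[field]].append(e["duration_ms"])
    # Per-group count and median latency, like a breakdown query result.
    return {k: {"count": len(v), "p50_ms": median(v)}
            for k, v in groups.items()}
```

With metrics-style pre-aggregation, every distinct `app_id` would need its own time series up front; here the breakdown happens at query time over the raw events, which is why the one huge app jumps out immediately.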
Ellen: It's so sad that we're still at the phase of software where we think not having to copy paste is the pinnacle.
Charity: I know. For a surprising number of people, honestly, it's the fact that query results never age out.
So we just store them in S3 as little files, and so you can embed them in Jira or wherever, just incident reports, whatever, and they never go away.
Right, versus people are just used to their elastic queries expiring or something on them.
And for a lot of people, and this is something that we're hoping to build on a lot more in the future, it's being able to see your history and your team's history, and just being able to follow in the footsteps of people who have debugged before.
Ellen: That was always very fun. I spent a lot of time going in and seeing what other people had looked for.
Charity: This is why we're investing so much in our design team.
We're going from a design team of zero to six within a year 'cause I believe that there's so much to be done here.
Just incentivizing people to add some words from their brain to the graph, just to describe or add tags or whatever, so that you can slice and dice and find things, patterns, things that your team is doing, things that you did.
So much of collaboration is a collaboration between your past self and your future self, as much as you and your teammates, 'cause you forget shit so fast.
And just being able to see what past you would have done six months ago when you were working on this part of the system is killer.
Ellen: I find a lot of things that past me did that I have completely duplicated.
And the funny thing is, past me and I apparently arrived at the same document.
Charity: Yeah. What's this about adopting new tools versus proven ones?
Ellen: It's one of the things I've been thinking about a lot recently, is this idea of, to some extent, when you're starting something new, you want to use everything that everyone else uses.
Because you want to know it's good. No one's ever going to criticize you 'cause you picked the safe choice. It's proven out.
Charity: Right. No one ever got fired for choosing AWS.
Ellen: Yes, exactly.
Whereas if you're going to pick something brand new like Dark, or even choosing to invest in an observability solution instead of a monitoring solution.
I think you're a little past that point now, but at the very beginning it would have been pretty out there for people, and I think trying to get people to adopt things that are core to their platform, when they are inherently taking on a lot of business risk, is challenging.
And so, yeah, I was just curious how you've been thinking about that too.
Charity: Hmm, yeah. You know, honestly, I used to wrestle with this so much and I almost feel like we've cheated in a way because I don't know that we have any better answers to it than we did, but the world has bent so far in our direction over the past three years that it's just not as hard anymore.
I remember how hard it was, trying to convince people.
Everyone was like, "It's a solved problem! You're too late, there's nothing left to be done."
And then everybody in the world started stealing our marketing language, and then it got much easier all of a sudden.
Ellen: It's better to create the movement than have to follow and change your branding to match the movement, in my opinion.
Charity: Yeah, I suppose. When is the right time to start thinking about observability and other analytics?
Ellen: I personally think there is no time that is too early.
I think in retrospect, there was a period of time with Dark where it was early enough, and it was mostly just testing with a few friends in alpha on small problems that didn't really matter, like people writing Caesar ciphers.
And so my Airtable spreadsheet was fine at that point.
But basically, as soon as you have any users that aren't people that you personally know and are sitting there watching, I think that's when you want to start to be able to observe what's going on. You're always going to see cool stuff that you don't expect people to be doing that'll help shape your product direction, or you'll learn that your personal intuition actually has nothing to do with what anyone else wants.
Which is, I think, upsetting for many founders, but often true. And so I think it's much harder to add it after the fact, and then it's kind of like, "Well, is this the month we finally do it?"
And if you just say, "Okay, we need this. We're never going to know what's actually going on without it," you'll have it.
Charity: Yeah, I feel like it's the same sort of paradigm shift.
You know, everybody comments their code now, but that was a fairly hard won battle.
I feel like building your observability at the same time as you write the code is just something that we have to get people used to doing.
It should feel wrong to not be writing in some sort of observability.
Your development process should involve a feedback loop that includes you consuming the output of that observability telemetry, right?
You should feel like you're driving blind without it, because you are.
Ellen: Right. (laughs) If we could go back to the beginning, you want to shorten the loop and you want to actually see.
You don't have a loop if you don't know what happens. You have a line, with no feedback coming back in other than guessing.
Charity: Exactly, you don't have a loop, you have a line. I love that; I'm totally going to steal that.
But you know what I mean 'cause people feel sort of guilty now if they're writing code without tests or without comments or something.
They know better, they feel bad, right? They wouldn't commit it because they know it wouldn't pass a code review.
And I think that the same needs to be true of telemetry.
You should never accept a diff, you should never submit a diff if you can't answer the question, "How will I know when this breaks in production?"
'Cause it's not if it breaks, it's when it breaks.
And it's just easier to do it from scratch. If you go back and try to do it later, you will actually have lost that original intent.
You will not have the same original intent in your head as you did when you were writing the code and it's impossible to recapture after that.
It's almost maybe a better idea to just start over from scratch.
Ellen: Well, honestly, chances are, even at the point you're doing it, it's because something has gone wrong, and now you're specifically writing it to fix that one problem, which isn't going to be the problem the next time.
Charity: Exactly. It's almost like you've written some software.
Ellen: How it always is. I do think it's interesting though, 'cause I feel like it becomes easy to say every piece of code needs a test and everything needs its telemetry and everything goes through the observability review.
Charity: Not everything, everything in production with users.
Ellen: Yes, everything. I guess in my product brain, everything is in production with users for me.
My goal was always to put it through production.
Charity: Me too, usually.
Ellen: With users.
Charity: Exactly. But that's not actually true, there's lots of code that never goes into production.
I would say that that's dead code. It's useless, but that's the world that I live in. But you know, there are hobbyists out there.
There are plenty of use cases for code that do not involve highly available, you know, 24/7 internet services.
But if you are writing for one of those, then your code should have tests.
Your code should have observability, full stop.
It's interesting because it points to how much of this is a cultural change, a social change, a change in people's heads about what feels good, what feels right, and what is acceptable.
And people are always saying not to shame people, but, and maybe this points to my upbringing as a fundamentalist, a little bit of shame is sometimes a good thing in the right context, you know.
Shaming people for not doing the right thing. Just a little bit, a little bit of salt. You don't want a cup of salt in your dish, but a little.
Ellen: And sometimes it depends on the person. I think it motivates some people. I think it sounds like it motivates you.
It will sometimes motivate me, but there's definitely people that just totally shut down, and so I think that's about knowing your team.
And making sure you're tailoring your feedback to the individual. And also making sure that people understand it upfront.
I feel sometimes, you work somewhere and you get 10 different answers about what code review is for and you're like, "What's helpful to me?"
Charity: Yeah, I guess the shame doesn't need to be inflicted by anyone. I guess it's just a thing that I feel.
Ellen: I do think, yes, I would feel shame if I shipped things without the ability to see how they were being used. Although mostly, I would feel frustration.
Charity: Well, that's the thing. I guess people haven't universally had this experience, but it's always easier to write code with observability.
It's easier; you're doing yourself a favor. In the short-term, in the long-term, in the midterm.
In all the terms, it is easier to write code that you can see and understand.
It's why I put my glasses on before I drive down the street, because it's easier if I can see where I'm going.
Ellen: Well, and I liked your point before, though, about this idea of when it goes down or when it breaks in production.
And I think when you shift your mindset from, "I'm trying to avoid any mistake or anything ever going wrong," to "It is going to go wrong and I'm trying to give myself the best set of tooling for when it happens," I think that really changes how you look at the work and what seems like the right investment.
Charity: Mm, that's probably true.
You know, people often ask me what they can do to get their engineers to care about observability and my answer is always, "Are they on call?"
Ellen: I think it's also, "Do they care about your customers?" I think if they cared about your customers, they would certainly care about the observability.
Charity: Yeah. Well, any final thoughts as a product person, ex computer science-y person who I think has a lot of really interesting thoughts about the future of technology?
Ellen: I guess I've realized twice today in this conversation: sometimes people will tell you that something is impossible and will never happen, which probably means there's something interesting to be done there.
And sometimes, someone's telling you everything has been solved and it's going to be really boring.
That also probably means there's something to be done there. Both ends of the spectrum, ripe for innovation.
Charity: Well, thank you so much for joining me, Ellen. This was really great.
Ellen: Yeah, thanks for having me.