In this episode of Don’t Make Me Code, David and Steve have Liz Bennett in the studio. Liz is a Senior Software Engineer at Loggly, a SaaS-based logging company. The three discuss the many benefits of dogfooding as well as some unique-to-developer-tools pitfalls of using your own product to build itself.
About the Guests
Liz Bennett is currently working as a Senior Software Engineer at Loggly, Inc, a SaaS-based log management service. Her main area of expertise is Java development, particularly in distributed, high scale, and low latency systems.
Steve Boak: We're calling this episode of Don't Make Me Code 'Dogfooding' and we've got our guest Liz Bennett here with us, a senior engineer at Loggly.
Liz Bennett: Hi.
Steve: So we always start with a little background. Can you tell us a little bit about what you do at Loggly and how you got there?
Liz: Yeah, sure. So I work on the infrastructure team at Loggly. I've been there for about two years. And yeah, I work on almost every part of the data ingestion side of things, so parsing, indexing, searching, caching, retrieving data. For those of you who don't know Loggly, we are a SaaS-based logging service. So you send us all of your log files and your log data and you can search through it or create monitors or do dashboards and reports.
Steve: Yeah, and before the show we were talking about topics and this one has been elusive for us. We've done a bunch of episodes and it feels like we should have talked about it already but, like, dev tools.
We're developers building software for other developers, and it's like one of the best things about this line of work, we get to make things that we feel fix a problem in the world.
And so how do you do that at Loggly? Or like how does that impact your everyday at Loggly?
Liz: Yeah we have this large infrastructure stack, a lot of services, a lot of machines running, and we have the same problem that a lot of other software companies have, which is what do you do about all your log files that are distributed across, like, you know, N number of machines? SSHing or grepping is super inefficient. So basically we have built this tool that makes running the tool easier for us, and hopefully makes it easier for a lot of other people to run their own tools.
Steve: Yeah, so it was funny, before the show when we were talking about all the topics under dogfooding, we came up with this one big, great thing which is we get to solve a problem for ourselves and then we couldn't think of anything else good to say about dogfooding. So there are all these what-ifs, like we try to build things for ourselves but our customers might not be the same as us. And so how do you check your assumptions?
Liz: That's a really interesting question because you can use your own product for your own use cases but that could limit you and it could limit the product that you build for your customers.
David Dollar: I'd be interested to know kind of, you know, at Loggly you're using your own products to monitor your own logs, like you said. How do you prioritize? You know, you have feature requests I assume that come from your customers and you probably even have feature requests that come from internal teams now, so how do you decide what to work on? How do you prioritize that feedback?
Liz: Yeah, sure. We have two things that we think about. And we have cute names for them. There's inside-out and then there's outside-in. And the outside-in priorities are things that our product managers help us decide by getting in touch with customers and figuring out what the customers' pain points are.
And then there's inside-out, which is things that we decide as engineers what do we think we want, what do we need, what are our personal pain points? And some of those could be directly having to do with the product or just, you know, infrastructure-related. Like, "Oh, we need to update our version of Java," or whatever. I think we try to keep it a healthy mix of half and half. And as backend developers, a lot of the work that we do is determined by what we think the infrastructure needs.
Steve: Do you involve customers in any of those internal decisions that, you know, we want to build X? Is that just entirely inside or do you then go out and find customers, too?
Liz: Yeah we go onsite, we talk to customers, we record videos of them using our product and trying to understand what areas are they getting stuck on, what things do they not understand how they work. Because internally when we use our own product, we know exactly how everything works. We're the ones who built it. So we don't get stuck on our own stuff.
Like, we just know how it works. That's an interesting thing about dogfooding is a customer might not realize how something works. They might get really confused about some feature that makes perfect sense to you because you built it and there was some limitation in the infrastructure that caused you to build it in such a way so you understand how to use it.
To them it might just come across as being really confusing. So that's an interesting side effect of dogfooding, I think. And so it's really important to balance. If you're using your own product, you must also get feedback from customers.
It's not enough to just use your own product and think that because you're getting a value out of it your customers must also be getting value out of it.
Steve: Yeah that's a really interesting topic to expand more because there are so many ways that our own companies can be different from our customers' companies or the way we work can be different from the way our customers work and all of those assumptions play into our decision-making. Like we, Opsee, are a very small company, like about half a dozen people, and so we make a lot of decisions with that in mind. But we want bigger customers using us and so we have to go and find them and ask them.
Liz: Yeah, definitely. If you focus too much on what you use and how you use your product, you might be limiting the scope of your product. A big enterprise customer might have a certain requirement and you might not even understand that requirement or ever think of it yourself. You would just never need it for your own product.
Steve: I thought it was interesting what you said earlier too about even if you internally disagree with your customers that it could be a good idea.
Liz: Yeah, because your customers are not you. They have slightly different use cases. If you're building a product for devops to monitor their stuff and you're using your own product to monitor your own stuff, so you're a devops company, you're using your own tool as a devops company. Another company might be a consumer-oriented company. They're using your product to monitor their consumer-oriented product. They might have slightly different use cases or different things that they need.
So yeah, I think if they come to you and they need a feature, they just absolutely need you to enhance your product in some way or another, and even if you think it's totally, you don't understand their product request, definitely spend the time to see why they want it, try to understand if other customers want it, and if that one customer wants it, the chances are other customers are also going to want it. You might just not understand their needs.
Steve: Can you think of a specific time that that happened to you at Loggly, like where your team internally thought something was going to be a really bad idea and then it ended up being awesome?
Liz: Yeah, totally. Yeah, we had this customer come to us and they had some really bizarre-sounding feature. It was just like, "How are we even going to incorporate this into our product?" Like, "This is really weird but okay, you guys are a really big customer and we want you to be happy so we'll just figure out a way." And we added it to the product and we deployed it and within a month or two we noticed a lot of other customers had started using the same obscure feature.
And when we were dogfooding we were using our product, we would have never, never needed that feature. Like, we would have never even thought to use it. And it ended up, you know, really improving the value that our customers were getting.
Steve: What was it?
Liz: It was basically like if a customer sends a log event we parse it. So if it's like a JSON event, we'll parse out the JSON and build a map that's structured data so that you can search on specific fields. But sometimes you might send a JSON that has one of the fields is like an escaped piece of JSON, so it's like an escaped JSON string. And we didn't really have the ability to parse that.
But then we added this feature so you could go in and sort of like recursively parse escaped JSON strings from nested documents. And it's just not something we ever log. We don't ever log nested escaped JSON in our log files because we know that doesn't work well with our product and we use our own product to monitor our own logging.
But other customers had this use case and especially if you're integrating with other products you might not have a lot of choice. Like, some monitoring product is going to just wrap your data in a JSON event and escape your data and it's really just a wrapper that needs to be taken off when you send it to Loggly for analysis.
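The recursive unwrapping Liz describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Loggly's implementation: the function name `unwrap_escaped_json` and the sample event are hypothetical, and the sketch assumes that any string value that looks like a JSON object or array should be parsed in place.

```python
import json


def unwrap_escaped_json(value):
    """Recursively replace string values holding escaped JSON with parsed data.

    Hypothetical helper for illustration; not Loggly's actual parser.
    """
    if isinstance(value, dict):
        return {key: unwrap_escaped_json(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unwrap_escaped_json(item) for item in value]
    if isinstance(value, str):
        stripped = value.strip()
        # Only attempt strings that look like a JSON object or array, so
        # plain strings such as "42" or "true" keep their original type.
        if stripped[:1] in ("{", "["):
            try:
                parsed = json.loads(stripped)
            except ValueError:
                return value  # not valid JSON after all; keep the raw string
            # The parsed document may itself contain escaped JSON strings.
            return unwrap_escaped_json(parsed)
    return value


# Build a sample event in which a wrapper escapes the payload twice over.
inner = json.dumps({"code": 7})
payload = json.dumps({"user": "liz", "detail": inner})
raw_event = json.dumps({"source": "wrapper", "payload": payload})

parsed_event = unwrap_escaped_json(json.loads(raw_event))
print(parsed_event)
```

After unwrapping, `payload` and `detail` are nested maps rather than opaque strings, which is what makes their fields individually searchable.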
Steve: That reminded me of something we've talked about internally, about how like the different ways as a company that we're lazy and not lazy. And you were saying something that made me think of it, which is like that if customers are used to getting something without doing much work, then they're going to want it to be that way, as opposed to like, "Well we internally have our processes set up and we've put in the extra effort to get it this way, and so, like, why would anyone do this the lazy way?"
David: That is pretty interesting. One of the things you said was that, like, you had never considered this feature needed because you had, like, built your systems with all of the limitations of your product already in mind.
Liz: Yeah, yeah.
Steve: It's pretty interesting how you can go almost recursively down your own assumptions based on that, like, "Oh, of course we're making the right assumptions, look everybody's using it because we built it that way."
Liz: Yeah, it's true, and as a developer when I'm writing my log statements I make sure to write them in a certain way because I know they're eventually going to end up in Loggly and there are certain kinds of formats that just end up being more flexible when they're indexed. Like I happen to know if I format it this way it's going to make my life a little bit easier when I try to make a dashboard later on.
So that's kind of an interesting thing when you're dogfooding is you know so much about the best way to use your product. And you could even tell your customer, "Hey, if you change your data to be a little bit different then you can use our product better." But that's not really something you should be asking your customers. You know, ideally you can just use data that's in any old kind of format and it's going to work just fine.
Steve: That is a really good point. That's what I was trying to come up with before. So like, because we know how our product works so well, we'll do other things in a way that makes them work well in our product.
Like so for us that's health checks. And we'll write code for both health checks because our application will read those for both health checks and do good things with them. But our customers don't necessarily do that and they don't necessarily want to spend the extra time to make that happen. And so if we assume that other people are willing to do that, we're doing something wrong. And that's kind of a problem we've run into, is like, we thought with access to these cool health checking patterns people would just be willing to do the extra steps to make those happen, but that was not correct.
Liz: Yeah, that's kind of tough because it's hard to just break out of that assumption or try to make yourself think like a customer. You know, you almost have to go in and sort of screw up your log data for us, like make it as unstructured as possible.
Steve: Yeah, or it's like that beginner's mind kind of thing that people talk about, where you have to assume that you don't know everything about your own product and how things need to be set up for it. Yeah, and then it's closer to the customer's mindset.
Liz: Yeah, crazy idea, if a new hire comes on, the first thing you say, "Okay, use our product, like, use our product for something. Like, that's your first assignment. And tell us what you thought. What did you run into? What happened?" And they have that fresh mind. They don't know how the product works. They maybe could give some interesting insight.
Steve: Yeah that would be a great practice for any dev tools company, I think, just every new person, have them onboard, have them try things out.
Liz: Mmm hmm.
Steve: Give us feedback.
David: Gotta get 'em while they're still fresh. It's kinda crazy, it seems like you almost have to, like, keep hiring to make that work. Because yeah, it is almost like you have to go out of the building for that, right? Like, even at Heroku it was interesting. We had a whole bunch of teams and some of them were sort of like kernel-space teams and some of them were user-space teams as we thought of them. Like, built the platform or used the platform. And even then, like, when you have people that aren't even involved in the day-to-day building of the thing, you still come to know all the shortcuts and like develop all of these habits and know how to design for the platform. And you're basically useless as a beginner's mind.
Liz: Yeah, and once you know that stuff you can't unknow it.
Steve: Yeah, and that's more of like a better way to say this. Not putting yourself in the mind of your customer. It's somehow trying to erase your own memory so that you're not corrupted by all that deep knowledge of the product.
David: There's even sort of like another side to this coin, right? So we've talked a little bit about how, you know, using our products sort of affects our own product feedback loop cycle, but there's also sort of like another side of this which is like operationally, right?
David: I imagine at Loggly if you're logging things to Loggly then when Loggly goes down, if you lose your logs that's a terrible thing, right? So you have to have some sort of way to think about that. And I think as dev tools companies it's sort of like across the board we use our own products but if our own products are having problems, we don't want to, like, compound that with this crazy recursive thing.
Liz: Yeah, and that's a really interesting problem. At Loggly we have that and we also have the issue of having an infinite loop where back in the early days our QA, we were logging from our QA environment to our production environment, and our production environment was logging to our QA environment. And we had this sort of infinite loop that happened where our QA environment took down our production environment, because it was logging so much. And production was logging and QA was logging.
So yeah, I think you can't have a cycle like that. It's a bit dangerous, not just for the infinite blow-up problem but because if one goes down, you have to use the other one. What I've seen a lot of companies do is they'll have their QA environment, they have their production environment, and then they have their monitoring environment as like their third environment. And that third environment is pristine. Like, it is the last one to get new code. It's the last one to get upgraded if you're upgrading a third-party service. It's just, you know, you need that environment to be up.
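The cycle Liz describes can be caught by treating the environments as a directed graph of "logs to" relationships and checking it for loops. The Python sketch below is an illustration only: the function name `find_logging_cycle` and the environment names are hypothetical, and it assumes the topology is available as a simple mapping from each environment to the environments it sends logs to.

```python
def find_logging_cycle(forwards_to):
    """Return a list of environments forming a logging loop, or None.

    `forwards_to` maps each environment name to the environments it
    sends its logs to. Hypothetical helper for illustration.
    """
    visiting, done = set(), set()

    def visit(env, path):
        if env in done:
            return None  # already fully explored, no loop through here
        if env in visiting:
            # We came back to an environment on the current path: a loop.
            return path[path.index(env):] + [env]
        visiting.add(env)
        path.append(env)
        for target in forwards_to.get(env, ()):
            cycle = visit(target, path)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(env)
        done.add(env)
        return None

    for env in forwards_to:
        cycle = visit(env, [])
        if cycle:
            return cycle
    return None


# The early setup from the story: QA and production log to each other.
looped = {"qa": ["production"], "production": ["qa"]}
# The three-environment layout: both log to a separate monitoring environment.
isolated = {"qa": ["monitoring"], "production": ["monitoring"], "monitoring": []}

print(find_logging_cycle(looped))    # ['qa', 'production', 'qa']
print(find_logging_cycle(isolated))  # None
```

A check like this could run whenever logging destinations change, so that a loop is rejected before it has a chance to amplify traffic.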
Steve: And that is an interesting characteristic, I think, of most dev tools companies as well, is that we're all very concerned with availability. These are not typical consumer apps where if it goes down for a few hours no one's really going to care that much.
Steve: These are business tools. They're tools that people are depending on. And you know, at least in our cases, like, they're monitoring tools as well, so they ought to not be going down.
Steve: When we were talking about consumer companies and how they dogfood, like, some seemed pretty natural, even.
Liz: Yeah, yeah, Netflix. Sometimes I think, like, "Wow, Netflix is such a great product. Oh, I bet it's because all of the people at Netflix use Netflix. If they see something they don't like, it's like the next day they come in, 'Hey guys, let's fix this.'"
Steve: Yeah, and I know a lot of companies like Airbnb I know gives all their employees credits to use the service. I think Uber does the same for all their employees. It's like a way, yeah, it's like a built-in mechanism for user testing.
Liz: Yeah, I used to work at LinkedIn and there was a big company-wide push to get everybody at LinkedIn to use LinkedIn more. Yeah, it really is just the best way, I think, to build an awesome product. But it does have drawbacks, which is what we've been discussing a bit.
Steve: Yeah, and there's something, I don't know how to unpack it just yet, but like the consumer companies because people can use those products in their everyday lives, like, you know, Airbnb or Netflix or whatever, like I'm just going to use that anyway at home. I don't need my company to tell me to do it, but LinkedIn or even Opsee something I'm using at work, there's a different kind of dogfooding that's happening there.
Liz: Yeah, because you kind of make yourself, maybe you're not making yourself use it but you might be trying to find ways to use it that maybe it isn't extremely well-suited for, just so that you can use it, just so you can have more time on the product. And that's kind of an interesting problem.
Sometimes we really try to force ourselves to use Loggly for things that it just doesn't do so well at this point, and maybe it never will do well because it's just not really well-suited for that problem or whatever. But we still force ourselves to do it, which can kind of, depending on the problem, maybe it holds you back, like you're not using the best tool for the problem at hand.
Steve: Yeah, and that's popped up in some customer conversations for us when we start talking about competitors, like if they're using some product to do something, we probably have gone through some extra effort to make our product work in some ways but customers who have used something else know that it's way easier on X and so they're still comparing us to that and unless we're good about doing all that research, we might be missing that.
Liz: Sure, yeah. And a customer's just going to try your product for a little bit. It's like, "Oh it's really good at this one thing. This other thing, it's not so good. I'm going to immediately move on to the next product. Or, you know, incorporate another product into my stack to make up for where your product doesn't handle so well." Whereas with dogfooding, maybe you're going to try to, like, make your product do all the things.
Steve: Yeah, and we'll be more tolerant of its faults.
Liz: Mmm hmm, yeah. But the idea is if it doesn't do one thing so well, the best way to fix that is to realize it doesn't do it well and then actually build out the functionality for that.
Steve: Mmm hmm.
Liz: So it's kind of a discipline thing. If you have the discipline to just kind of stick through it and put up with the flaws until you've gone in and fixed them. And the more it hurts you, the more painful it is, the faster you're probably going to fix it. It's a great way to go in and relieve the worst pain points as quickly as possible if you just make yourself do it.
Steve: So before we talked about a case where, you know, a customer came to you with an example of something that you didn't necessarily want but turned out great. The inverse also seems like a really interesting one where I think a lot of us, like, there's a huge reward in knowing that we created something that other people love, especially because it's developers, like people in our own community.
And the greatest feeling in all this is like making something that nobody necessarily asked for but all of a sudden, like, everyone wants.
Steve: Has there been a moment like that for Loggly?
Liz: I mean, I guess it's hard to say, like, "Yeah, we built this thing. Nobody wanted us to build it but we did it anyways and it was great." I mean there are features that were a lot more successful and just a lot more of a big deal than we thought they were going to be. And when I joined, the first thing that I did there was add the ability for customers to custom parse their data.
They can build their own regular expressions and just decide themselves how their data's going to be parsed. And yeah, as soon as that feature came out it's like instantly everyone was using it, like everyone was using it for a lot of their data.
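As a rough picture of what customer-defined parsing with regular expressions can look like, here is a small Python sketch. It is an assumption-laden illustration, not Loggly's feature: the rule, the log line format, and the helper name `parse_with_rule` are all hypothetical. The idea shown is that each named group in the customer's pattern becomes a searchable field.

```python
import re

# Hypothetical customer-supplied rule: each named group becomes a field.
RULE = re.compile(
    r"(?P<method>[A-Z]+) (?P<path>\S+) status=(?P<status>\d{3}) took=(?P<ms>\d+)ms"
)


def parse_with_rule(line, rule):
    """Return the fields a rule extracts from a log line, or {} if no match."""
    match = rule.search(line)
    return match.groupdict() if match else {}


fields = parse_with_rule("GET /api/search status=200 took=37ms", RULE)
print(fields)
```

Lines that do not match the rule yield an empty mapping here, so unparsed events can still be stored and searched as raw text.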
We were expecting it to be a pretty big deal but it just, almost no customer uses Loggly without that feature. It's funny because we don't use that feature that much because we know what Loggly parses out of the box and we format our data that way. I guess that's not exactly an example of the situation where we built something that--
Steve: It is. I think it's a good example of why we do what we do. Like, there's this built-in reward when you know you made something that other people want.
Steve: And other devs want.
Liz: Yeah, definitely. I'm trying to think of is it more rewarding because you didn't want it yourself but your customers really want it?
Steve: Well I know on our product roadmap we have these big ideas, like a couple of them in particular, that we haven't built as of yet because no one's come directly to us and said, "Hey, I really want this." And we feel really strongly that if we, and these are big undertakings, but you know, we don't have a lot of data to back this up and we couldn't because these things, they're just things that don't exist in the world right now and so, like, there's no way to go out and find data to back up our decision.
Like, we can't necessarily find the proof points for it. But at some point we just have to take a leap of faith and try it. And I think, you know, we all love, or at least I love the idea that we would do something like that and it would turn out really well.
Liz: Yeah. I guess I want to say at Loggly it's more cut and dry. It's like we're solving these problems that people have been having ever since they had servers. Like, it's a pretty clearly-defined problem, I think. You know, I guess it doesn't really happen that often where we think of some crazy idea that might just work if we just tried it.
Most of it's like things we know would be so awesome but it just, you know, they take us time or they're just difficult problems to solve so we slowly churned through it and we finally come out with the solution and it's as great as we thought it was and it's as great as our customers thought it was going to be and--
Steve: We kind of touched on this back-up tool, like the systems that we put in place to protect from our own products failing. I don't know if you both want to expand on that some more.
Liz: Yeah, it's not so much like we have a back-up tool in case our own product fails. I think in the end it's like we concede that our product maybe isn't the best fit for this use case so we end up using some third-party service. And there's a huge, huge ecosystem out there of monitoring tools and just a lot of companies out there that are focusing on system health and making sure APIs are up and running. And so yeah, we do use a handful of other services.
Having a back-up tool would kind of make sense but then it's like everybody has to know how to use multiple tools that do the same thing and you have to pay for multiple tools. So it's not a huge priority. You know, our back-up tools, we go SSH to the box and do grep. Like, that's what Loggly replaces. So if push really comes to shove we can always just do that.
Steve: Do you have anything like that in place, David? Like, back-up processes, tools, like something? You know, what have you done in case Convox breaks?
David: So it's actually pretty interesting. Convox is more a piece of software than a service. Like we do have, you know, one sort of small service but the majority of it is just software that we give you to install and run somewhere. So there's not, like, one central thing to break that is like a Convox self-hosted thing.
Steve: Mmm, yeah, and being built on open source you're relying on your customers kind of bringing that into their tool set and there isn't really a whole lot that it can do to break after that.
David: Right, it's more isolated environments, so it's not really, like, a global-shared thing that could go down. I mean, it is definitely interesting. I mean, I remember from I think the first iteration of the Heroku Status site. I don't remember what hosting provider it used but it wasn't Heroku. Actually the first one was Heroku and we figured out how bad of an idea that was pretty quickly.
A couple of engineers got pretty overzealous with using Heroku itself and it turned out we couldn't actually start the whole platform back up from zero because it relied on itself too much.
Yeah, it's definitely something that you have to think about and it's almost that in a way you want to be using one of your competitors' tools just as, like, the very last case. All of my stuff is down but I still need to figure out why.
Liz: It could be kind of interesting to use a competitor's tool.
Liz: That's not something that we do much at Loggly but kind of an interesting option that you have as a devops company.
Steve: Yeah, I mean if only for product research to see where you're falling behind or where you're lagging behind a competitor.
Liz: Yeah, exactly. Yeah, I think the back-up tools, I mean you don't have, even if you're not using your own product, a company doesn't really have multiple tools that do the same thing in case one fails. Like, you could. I mean, we have back-ups for our chat service because chat services go down a lot.
Steve: We had this certification process recently. You become a technology partner with Amazon and part of that process was an audit of our technology. And as a tech provider for Amazon and also a company that's hosted on Amazon, part of their audit is actually looking at how we store logs and customer data and making sure that they're in, you know like you were saying before, an isolated environment.
And it's actually part of the review process and I guess they've seen this enough times now where like technology providers have issues with this exact problem that they made it part of their certification process.
Liz: Like the problem of where they store their logs?
Steve: Mmm hmm.
Liz: Going down and not being able to recover or?
Steve: Yeah, like putting logs in a different environment and they want to make sure that you actually have that stuff isolated from the rest of the product so if it breaks you don't lose your data.
Liz: Yeah, that is true.
Steve: You know, we talked about the forgiveness that we have for our own mistakes using our product and like, you know, we're a small company, some of our customers are big companies. Like, what are the other ways that our customers might be different from us or the other assumptions that we make about our own product that other people might not make?
Liz: Yeah, the first thing I can think of is just, we kind of touched on it already, but assuming certain things because you know how the product is implemented.
Steve: I can't really speak to this from any personal experience but, like, the platform strategy of like, "We're going to provide APIs and a platform for our customers to extend our product, and then we, as the big company, if we like what they're doing, we're either going to acquire that company and make it part of our core product or we'll just steal the idea." And we've seen both of those, which is an interesting, yeah, we could talk about this a little bit.
This is another interesting thing that happens with dev tools companies. Like, we rely on platforms and I've seen this go well and very badly, where a company's building something that extends a platform, that company gets acquired, the core product becomes better. Or the inverse.
Liz: Yeah. That's happened with a couple of our competitors, actually. Like, people building these logging solutions and then they get acquired and a lot of times they might focus on some specific use case, like security log monitoring. You know, monitoring your logs, looking for security breaches. And that's kind of interesting to just take something that a small company does and then maybe they eat it up and turn it into something that they want as a big company or something that they themselves need, and then build that product.
So it's like they acquire a company so that they can use that company to dogfood their problems but then improve that company's product so that other people can eventually use that product.
Steve: Yeah, like the virtuous cycle of dev tools companies. Or maybe not always virtuous. I wonder, David, like, you talk a lot about open source and it's not something I've been involved in a lot, but, you know, dogfooding and open source are very closely-related topics.
Like how we all depend on open source tools now, all of us. You know, the entire internet is built on open source. And there's like that, you know, the story of the left-pad guy taking the ball and going home and a bunch of websites broke because of it. As a maker of an open source tool, you have a community that depends on you, and so if you go and break stuff, maybe that's not really dogfooding.
David: Definitely. I mean we certainly use Convox to host a lot of our own internal services, right? I mean all of our internal services. And we kind of share that with all of the people that are also using the same tool to deploy their own applications. It is sort of interesting in a way where we work directly in collaboration with our customers to build out new things in Convox because, you know, they can actually submit code changes to our product.
But yeah, we definitely use our own software to deploy every day, so it's something very near and dear to us is making sure that this thing remains functional for our own uses.
Steve: I feel like it sort of relates to that conversation we had with Mikeal at the Node Foundation about the technical committee for Node and how they as a group kind of have some ownership of what gets into the core platform but that the community can do whatever it wants, and if something good comes out of the community it can be brought into core. So there's this element of, like, internalization but also, you know, anything from outside that's good can be brought in and there's this whole flow to it in open source.
Liz: I think as a dev ops company you have such a unique opportunity to use your own product in the same way that your customers do. It's like if you're a surgeon you can't do surgery on yourself to see how good of a surgeon you are, but if you don't use your own product, then you're forgoing such an amazing opportunity to learn and see yourself, see your company through the eyes of a customer. I think the only thing is to just make sure you don't limit what you build just because you need it.
Steve: Yeah, like you have a really great opportunity to use a tool that you've built to solve a problem.
Liz: Mmm. And so few companies out there and so few people in their jobs get to have that sort of recursive relationship with their own work, and that's such an interesting thing.
Steve: And at its best it can change the way software gets made. Like, telephony was really hard to do until Twilio came along. And so now lots and lots of apps do that. And so like, yeah, as new tools and frameworks come along the nature of software development changes. I'm just thinking about all the things that have come along, like cloud computing, and like it totally changes the way we write code. And microservices changes the way we write code. All these different things, like these new patterns emerge, tools change, and like the industry conforms around them.
Liz: Yeah, definitely. A new tool, like, I don't know, Node.js coming out, it just radically changed the way front-end development worked. Like it radically changed so much about web development. NPM, if NPM weren't around maybe people wouldn't be using a million different packages in their Node.js applications.
David: The thing that's kind of interesting to me, I've just been sitting here thinking about, is like when you're dogfooding your own products, you do get to have sort of the experience of the features and you do get some useful feedback from internal people. But in a way it's almost tainted entirely because you get to use, like, the unlimited version of your product for free, right?
Liz: Oh yeah, definitely.
David: And it's almost like pricing is such an important part of this whole experience and how people choose tools and how they feel about them when they're using them and everything. How do you roll that into, you know, the experience of a Loggly engineer who gets to use unlimited Loggly for free and somebody else who has to, you know, pay for a small amount of it. Like, it could be completely different things.
Liz: Definitely, yeah. That reminds me of one of the most heated discussions I got in at work. We were talking about how we wanted to log in a certain way and I was saying, "No, if we send those logs to ourselves, we can't analyze them. Like, we are going to go over our limit that our production logging account has." We break Loggly when we do that.
And the person I was arguing with was like, "Oh, well we can just go into the configs and change our own limits and everything will be fine." I'm like, "We can do that but our customers can't do that. We need to feel the pain that our customers feel. We can't just go in and make our product much better for only ourselves."
Then we don't feel the pain that they feel, and then it's not on our radar and we don't complain about it to the product managers and the product managers don't put it on the roadmap. So yeah, I think that's a really great point.
Just because you can fix something for your own personal use case, you have to fix it in the way that fixes it for all of your customers as well. Don't do the fast, easy solution because you can.
Steve: Yeah, you can't just hack it for your own needs.
Liz: Yeah, yeah, exactly.
Steve: And that other point that you had, David, about the mental calculus that we don't have to go through: not just "is this valuable," but "is this valuable for the amount of money that we're paying?" We just had a conversation internally about pricing and it was really hard. And I think a big part of what makes it so challenging is exactly that, that we don't have that same touch point of comparing the costs, because we don't have to consider that for our own use of the product.
Liz: Mmm. Yeah, because it's just free. Yeah, if we sat down and figured out how much we would have to pay ourselves for our own account, it might be a bit of an eye-opener. Yeah, that's pretty huge.
I think every product manager should figure out, if you're dogfooding, how expensive is our internal account and how much value are they getting out of it themselves?
And maybe you might need to adjust your pricing if you needed a humongous account to get a lot of value out of it.
Steve: Yeah, or, like you were saying, Liz, are we using the product in ways that our customers might not because we don't have to worry about the limits of the product?
Liz: Oh, yeah, yeah, definitely. If a customer might file a ticket to a customer service rep, the rep talks to us, we have to go in and maybe adjust something with their account if it's a really unique use case. If all we have to do is just go in ourselves and adjust it, that's so much easier. We might lose sight of how annoying it is for that customer to get the same functionality and you might forget that oh, it would be really nice to build an automated way for that customer to adjust their account in that way.
Steve: Yeah, and definitely another thing that we found talking to customers is that if one complains there are probably 10 more that didn't say anything but are still seeing the problem.
Liz: Oh yeah, definitely, definitely. Yeah, if a customer comes to you and wants something or has some piece of feedback, like that's so immensely valuable. And if you don't have the same pain point that they do, it might behoove you to really examine their use case.
Are they using your product in a novel way that you could kind of advertise to other customers, or did they onboard incorrectly somehow? Like, did they make some false assumptions? If they did, why did they make those assumptions? Compare them to your own assumptions about your own product and figure out where the disconnect is. What things are so self-explanatory to you and totally confusing to your customers?
Steve: And how many other customers does this one talking to you represent? Are they the silent majority? You know, are there a lot more people out there with the same issue?
Liz: Yeah, and if they're coming to talk to you and giving you feedback, they had to take a lot of time to do that. They could have just gone to your competitor and said, you know, whatever. "I'm not going to use this product. It doesn't have this thing I need."
I think to backtrack back to the subject of pain, we were saying if something feels really easy for you because you have direct control over your own experience with your own product, you know, that's one side of the coin. The other side of the coin is when you're using your own product, I feel like there should be some amount of pain.
If you don't feel pain when you're using your own product, you're not stretching your product to the boundaries that it can be stretched to.
You're not using the edge cases. You're not figuring out new ways to use your product that maybe it's not super well-suited for. And if you're not feeling that pain, you're not growing. Maybe you're not finding new ways and new paths you can go down and I think that's pretty huge.
If you're dogfooding and you don't feel pain, you're not getting the most out of dogfooding.
Steve: Yeah, and you're talking about, like, sort of the comfort zone: if you are totally comfortable with everything the product does, and you're never really stretching in any way, then you're not exploring enough, you're not, like, looking for the opportunities.
Liz: Yeah, yeah, yeah. You're not thinking about new ways your product can be used or you're not stretching it to the limit of its scalability. That's another thing. Our internal monitoring account is a really large account, actually. The environment we use to monitor our production environment is intentionally under-resourced, like it doesn't have quite enough resources for the amount of data that we send it.
And that's a really interesting thing that we impose on ourselves because we see the scalability. Where does it break in terms of scalability? What services are running out of memory? Which ones are having GC issues? And we see the breaking point of our own product in our own dogfooding cluster before it gets to customers.
Steve: And this is done intentionally, so, like, you impose limits on it so that you will see scaling issues?
Liz: Yeah, I mean I think it sort of happened naturally. So we started sending more and more data and it started to pull apart at the seams. And we were thinking, "Okay, we can just throw a bunch more hardware at it" but we learn a lot of interesting things, too, when it's sort of coming apart.
You know, we learn where is it going to break? Where do we need to be really concerned about our scalability? Yeah, especially around Elasticsearch. We use Elasticsearch pretty heavily and we've discovered a lot of the scalability limitations of Elasticsearch just through our monitoring environment.
Steve: All right. Thanks again to our guest, Liz Bennett, for coming by.
Liz: Yeah, thanks for having me. This was really interesting.
Steve: And how can people get in touch with you online?
Liz: You can find me on Twitter @zzbennett. That's two Ns and two Ts. Or LinkedIn, Elizabeth Bennett. There's probably a million Elizabeth Bennetts out there but just search for Loggly, too.
Steve: All right, thanks again for stopping by. We'll see you next time on Don't Make Me Code.