In episode 16 of O11ycast, Charity and Liz are joined by Abby Bangser of MOO. They discuss observability from a testing engineer’s perspective, the key factors that lead a company to spin up a testing team, and the highlights of DeliveryConf 2020.
Charity Majors: Abby, you're a test engineer. How did you make your way to observability from all the way over there?
Abby Bangser: I think that testing has so much in common with what people are trying to achieve with SRE, and with operations, and being able to ask interesting questions of our data in our system.
I actually got in for a much more selfish reason though, which was that as a test engineer I was really sick of having to debate whether or not a bug should get fixed.
I thought we should be able to figure out whether or not somebody will ever do this, and I really wanted to start being able to track what was going on in production, and I didn't have a lot of access on some of my projects.
So, this was my way of finding more out and trying to get more involved.
Charity: It's amazing how blind everyone has been flying in production, isn't it?
Abby: It really is. I think the idea that we can validate everything before our users get hands on it is a fallacy for sure.
So as a test engineer, talk about being handcuffed if you don't get a chance to see how things are being used.
Liz Fong-Jones: Right? It's about the mission of, "What are you trying to do?" Versus, "What tools and access do you have?"
Your mission is to make sure people can use the software, and it doesn't stop the instant it hits production.
Abby: Absolutely. My journey as a test engineer went from "I'm going to write Selenium for everything," to learning how building quality in through good discussions early on is key, to learning about DevOps and building pipelines for quality delivery, and now into the observability space.
Charity: I feel like once you've been using observability for a little while, it becomes really awkward and difficult to answer the question of why you would want it, because it's so obvious.
Once you've taken the blindfold off, the idea of going back-- That's what we should be asking.
The question is, "How can you live without observability? How can you do your job without actually seeing the impact of what you've unleashed upon the world?"
Charity: Maybe you should introduce yourself.
Abby: Hi, I'm Abby Bangser, and I currently work as a senior test engineer at MOO, which is based in London.
Liz: MOO is a company that actually deals with physical stuff, unlike many of us that deal with only the digital world.
Charity: Only the bits.
Abby: It's fantastic. We have two warehouses, and part of our journey as a new joiner is to go to the warehouse here in London and see where the products are made and understand that.
So, it's a big part of our company and our jobs.
Liz: If your product has an outage, then it means that the printing presses physically stop being able to work because there is no work for them to do, or it's not working?
Abby: Yes. I actually was on call two weeks ago and there was a power outage in Rhode Island, and there really was not much we could do about that, but we got paged and needed to try and help out our teams there.
Charity: It's interesting to me how many of the early adopters of observability have been people with this link to the physical world.
People like delivery companies and stuff, because there are real consequences. Like, visceral physical consequences when outages--
Liz: Yeah, it's not like in ad serving where you just serve an empty ad and it's a tiny fraction of a cent of revenue loss.
No, the people stop being able to physically work and the machines stop working. So Abby, you're a test engineer and you're on call. How did you come to that?
Abby: I was very excited to be in that role with that activity as a part of my job, so I believe that testing is about quality delivery and that includes pre-production validation and adding test automation and exploratory testing, but absolutely also includes how the users are using your system and understanding that impact.
I really wanted to get closer to that, and software testers don't always get access.
So when I saw an opening for a platform engineer tester at MOO, it seemed like a great fit to make sure that I got that experience of on-call engineering.
Charity: Are all test engineers there on call?
Abby: We still have only a single team on call for overnights and weekends.
During business hours, it's essentially having our software engineers on call because they're fantastic about jumping in and identifying things.
They do incident commanding and debriefing as well, but out of hours it's just the platform team. I would be the only test engineer on call for now.
Liz: That's not a very typical thing for companies to have a platform test engineering role. That's super exciting. I hope more companies do that.
Abby: Yeah, it's been amazing. My first experience of this kind of work was being involved in the "DevOps work," the pipelining work.
Then from there I worked with Kief Morris on an infrastructure project. He is a huge advocate of a diversity of roles being involved in infrastructure and platform, and that kind of thing. That really helped expose me.
Charity: What leads a company to decide to spin up a testing team? I don't feel like I really have a good handle on that these days.
Abby: I can tell you how I never got a callback from a company, which was that they were looking for their first ever test engineer and they had five developers.
I was like, "That's really early. Normally I talk to organizations looking for a first test engineer with --"
Charity: So you talked them out of that?
Abby: I thought I was giving them the benefit of the doubt.
I was like, "What's made you think this? That you need a test engineer? Usually it's closer to 50 or even 100 developers before you get one."
They were like, "We're starting to move slower because we need more quality."
I was like, "OK. Do you see this role as writing tests for the developers, or about coaching or building frameworks, or whatnot?"
They were like, "Writing tests for the developers." I was like, "Oh."
Charity: Oh, my God. OK.
Liz: Oh, my goodness. We run into that problem a lot with ops too, people think that if you add a sysadmin, the sysadmin will do all the on call and then the developers won't have to worry about the on call. It's like, "No."
Abby: Yeah. So I then shared my beliefs around testing, and they very politely said "Thank you so much. We'll get back to you." And I never heard from them again.
Charity: So Liz and Abby, the two of you just got back from DeliveryConf. That sounded exciting.
Liz: Yeah. DeliveryConf is a first-year conference, and it has an interesting premise and an interesting format.
Abby, why don't you talk about the premise of DeliveryConf focusing specifically on CD as opposed to DevOps in general, and I'll take the conversation about the format. How does that sound?
Abby: Yeah, sounds good. I was really excited about it because I attended DevOpsDays a few times and absolutely loved the conversations and loved the topics, and some of the organizers from DevOpsDays are actually the organizers for this DeliveryConf.
So they obviously think highly of DevOpsDays as well, but they noticed that with the single track format and the larger format, that a lot of the topics revolve around culture and are more broad strokes.
They thought it would be worth diving into some of the details around continuous delivery and the technical "Gotchas" in there.
That's really what drove the content of the conference, and it really came out in all the conversations.
Liz: It's really the first time that there's been a dedicated conference specifically around CI/CD, as opposed to talking about CI/CD as part of the DevOps journey.
Charity: Interesting. So, what were the talks that most resonated with you all? Or did they have the Un-conference format?
Liz: They didn't do an Un-conference format. Instead, what they did was after every 30 minute talk there would be a 20 minute discussion, a group discussion afterwards.
Like a reverse panel, where you invite audience members to talk about what they learned from the talk, what they thought about the general area of the talk.
They made it so the speaker does not have a dedicated microphone during it, so you as a speaker are an equal participant in the conversation afterwards rather than being the person answering questions.
Charity: That's a great idea.
Abby: I think it was fantastic. I've been involved in something similar at the CAST Conference, which is the conference for the Association for Software Testing.
They have a red, yellow, green format where you can ask new questions or continue on the same thread of discussion, which I think is a really awesome format, but it would have limited us at DeliveryConf because they were able to record these discussions.
I think the value of being able to share those discussions was definitely worth having the format of coming to the front, even though I think the barrier of entry for some people was there and they weren't super comfortable coming up and talking.
I think definitely one of the more interesting things would have been both having a recorded discussion, as well as having the people have the option to break off into side groups that are not recorded and have mini discussions of their own. But it was structured in the form of, "What has been going well for you in this area, in your own experience? What has been going poorly, and what would you like to see happen over the next three to five years?" Those were the directions.
Liz: So, in terms of the actual talks that we especially liked, I don't know. Abby, which talk did you like the most?
Abby: There were a good handful, so obviously being in testing I want to give a shout out to --
There were two awesome talks about including testing within delivery pipelines, and how that works when doing exploratory testing, which is inherently manual.
As in, inherently offline and not automated. How do you do that when still trying to deliver quickly and via continuous delivery or deployment?
Liz: That was Lisa Crispin's talk, right?
Abby: That was the first one, it was with Lisa.
To my point about SREs being so close to test engineers, we had a lot of chats about how it's hard to get good test engineers because we don't always pay them as much as software engineers, but we should.
Someone shouted from the crowd, "Just call yourself a DevOps tester," and I was like "We're an SRE."
Because it's ridiculously close in skill sets, so yeah. It was quite entertaining.
Liz: I know Charity has gone on this rant before, about how when someone is a tester or someone is an SRE, that means that they have the skills of a software engineer and more.
Charity: Didn't Google, in the early days--? I've heard this story, I don't actually know if it's true or not.
But that Google in the early days of SRE had a hard time getting enough people to be SREs, and then they looked at the pay rates and they were paying them less than software engineers.
So they made the SRE pay bracket slightly higher than software engineers, and suddenly no more shortage.
Liz: Yes. There was a program called Mission Control to encourage software engineers to come into SRE, and they would give you a 20% pay bump for your first year that you were trying out SRE.
Charity: It's like capitalism might work in some circumstances, or something. I don't know.
I think it's really silly, but I think that the whole dick swinging about "My skills are better and your skills," whatever.
I think that it speaks to the difficulty of doing outcome based management, like being outcome oriented, because that requires so much more of your leadership to actually do the work of knowing what "Success" means.
Defining it, being right about it, breaking it down and helping coach people. It's really hard work, so we push all of that onto the ICs.
Like, "You're responsible for it." And then we suck at doing it and measuring it, so we're like "Let's look at some other thing that we can classify as being worth paying more for."
Liz: Yeah, I think that's the other interesting thing. When I first joined Honeycomb I asked you, Charity, about elements of the culture.
One of the things that you and Jin-Soo, the COO, did early on was defining the ladders upfront, defining "This is what we reward and promote people for."
Charity: Yeah, and I resisted it at first. I was like, "That seems awfully big company of us."
Then I realized, "Either we're compensating people for the impact that we hope that they will have, or compensating them for what? Their negotiating ability? That doesn't correlate with good engineering skills."
Liz: I got the same pushback when I was defining the developer advocate ladder. It was like, "We have one developer advocate and now we're going to two, and you want a ladder for it?"
And the answer is "Yes. We need a ladder."
Charity: We do, because otherwise people don't know how to grow. We all crave this, we crave the progress and achievement and mastery and growth.
It can be very frustrating to be stuck somewhere where nobody is willing to put in the work to tell you what that even is.
I feel very strongly that job ladders are-- They belong to the teams that they describe.
They don't belong to the manager, they belong to the teams, because you should be participating in "This is what it means for me. This is what growth would mean," not having it told to you.
Liz: I think to wrap this conversation back around to testing, Abby, we're having a conversation about what good testing is not.
It is not doing the tests for the engineers. How do you develop leadership? How do you develop the skill ladder for testers?
Abby: I was going to say, that challenge around how to define output from people, from teams, and from organizations is so difficult, and it's very true for any enabling role.
And I think test engineering is a huge example of an enabling role.
That was actually the other talk at DeliveryConf around testing: Maryam Umar and Jez Humble were talking about how Maryam, as a test engineer, doesn't do the testing for people per se, but she can expose them to information that helps them realize gaps.
The most common example there is something like test coverage through a tool like SonarQube.
Having access to that, making sure that tool is running, and making sure people are aware of where the gaps in automated test coverage are is one thing that test engineers can help shine a light on.
But there's so much more to it, including telemetry and flappy alerts and all sorts of other things as well.
Charity: I've always been really fascinated by the overlaps between test engineering and operations engineering, which is where I come from.
We're both software engineering-adjacent professions that historically are somewhat diminished, but it's impossible to do your job without us.
Abby: That's when I met you. I remember you saying, "I really hope operations doesn't go the way testing has, in that it doesn't become forgotten about or pushed against. It's included."
Charity: That was a dickish thing for me to say, but you know what I'm saying.
I feel like the ops profession is teetering on the precipice of people just saying, "It's all technical debt."
All operations work is just stuff that any software engineer should be able to do for themselves. It's just legacy, it's just mopping up after people.
Liz: It's like, "How dare you say that my skills have no value."
It is a skill set that is worth cultivating on your team, and maybe it's worth cultivating on everyone on your team, but that doesn't make it any less valuable.
Charity: I do feel like testers and the test community did a bad job of selling themselves over the past decade or so, because y'all completely dropped off my radar until I met you, Abby.
Then I was like, "This is still a profession?" I felt bad about that, but I don't want the same thing to happen to operational skills.
I'm really happy that testing seems to be staging a bit of a comeback.
Abby: Yeah, absolutely. I think that's about going to conferences like DeliveryConf and conferences where we share across roles, so I helped organize European Testing Conference last year, and it's running again in a couple weeks as its final event.
But it was a great example of where you get people across all the roles and you get testers and you get developers and you even get managers, and it's not even just individual contributors.
That's really where the conversations happen, where you realize the vocabulary is actually just different, but the topics are the same.
Liz: Yeah, we're solving the same conversion problems, just in different silos. We need to break down the silos.
Abby: We were joking at DeliveryConf that it would be a really entertaining talk at a conference one day if we were to get funny people.
So not me from the testing community, but somebody who has good standup chat for testing, software dev, and operations and put them all on a stage together, and maybe product as well.
Just have them start describing something and have someone else be like, "Wait a second. What you're talking about is actually this other thing."
It's just the different words, and if we come together as all people trying to deliver software we would get so much further.
Charity: I have learned so much from sitting closer to product engineers over the past couple years and trying to act like I am a product engineer from time to time. Thinking about things through that lens has really made me better at my craft.
Liz: I think the other element of this is the element of context.
You as a test engineer do not have the context that someone who is developing product necessarily has, and therefore it makes sense for them to write the tests corresponding to their context.
It makes sense for people who are writing product to write the instrumentation, because they have the context in their heads.
Charity: The people who are closest to it need to do certain types of the work, but then as you get to economies of scale, you also need someone to zoom out and look at the whole thing and systematize and add regularity to it.
One other thing I was going to say, I think that it does sound sometimes like we're telling everybody that they need to know everything.
That you need to be an expert in everything, break down the silos. But you need to be a product engineer and an ops engineer and a test engineer, and all these things.
I think what's actually happening is that we have the same amount of attention and focus and brain capacity as every other human throughout history. We don't get more of that.
But I think that what we're seeing is, as specialization occurs, we're paging out stuff that you don't need to worry about and think about just as fast as we're paging in things that you need to worry and think about, but on different scales.
For example, hardware. I know a fuck ton of things about hardware that I haven't had to access in my memory for a decade or more, but I used to go to the colo and flip the switch.
I used to know how to sling hard drives and all that stuff, but now my ops people work for Amazon.
I freed up those brain cells, but now I have to think laterally across more disciplines around product and testing and stuff, and I think that it can be very daunting for people who listen to us sometimes and are like "Are you literally just telling me I need to be an expert in everything?" The answer is "No, because we as an industry are developing, we're progressing and we're developing better abstractions. We're developing more specializations, economies of scale, and so forth, that are surfacing entire industries to you through an API that you no longer need to be an expert in."
Liz: I think that was exactly going to tie into what I was going to say about my favorite talk.
My favorite talk was the talk by Jessica Kerr, who talked about this build versus buy conundrum and how we need to extract as much complexity from our day to day activities, and outsource and make it someone else's problem.
They understand and carry all of that complexity so that we only have to think about that clean API, and to a limited extent know how to debug, or at least who to talk to in the event that the API doesn't work.
Charity: That sounds like a great talk.
Liz: Yeah, it was an amazing talk that really eviscerated this notion of "You have to build everything in house artisanally."
Abby: Yeah, that one was fantastic as well. I think that speaks to the idea of having specialists on your team or not.
You keep asking me about test engineering and how it relates, and all the time people will say to me "I've worked on teams without a test engineer," or "I think we should not have test engineers on a team."
And every time I go, "There's activities that need to get done. Whether or not those get done by somebody in the role of Test Engineer or QA or whatever you want to call it, I don't particularly care, but the idea that a software dev should both be able to test the quality of their implementation and think about the bigger picture is a lot of weight on their shoulders.
That's where having somebody who has a specialty in keeping the big picture and making sure to shine the light on the things that need to get talked about and focused on, while letting everyone else focus in their smaller areas is quite helpful."
Charity: Every organization is a special snowflake.
It is a special solution to a particular problem that has never existed before, will never exist in precisely that way again, and this is where the human creativity-- This is where the creativity and management comes in.
You're trying to size up the problems in front of you and apply humans who don't fit in any neat boxes, trying to cover the surface area of the problem with the resources that you have.
Liz: Yeah. There was a very wide range of people participating in the group discussions at DeliveryConf.
There was everyone from senior principal engineers at Shopify, which is a multi-thousand person organization, all the way down to folks like Danielle and myself who are on a 12-person engineering team.
We have very different constraints. The people at Shopify are much more worried about de-duplication.
How do they make sure that people don't build the same thing five times, whereas we're just trying to figure out "What are the things that we minimally need to build at all?"
Abby: We had CEOs and we had engineers, and it was really great, that diversity for sure.
Charity: All right. You've sold me on the conference.
Liz: I was arguing that we really should have sponsored it this year, and we'll sponsor it next year.
But it was really great getting Jez Humble and Nicole Forsgren and Rebecca Parsons, the CTO of ThoughtWorks.
Getting all of these luminaries who had come up with these concepts, and then to have them both talking about how they are seeing the concepts and also have people give them feedback on "This is what they're actually experiencing in the field." That was really cool.
Abby: One other talk that really resonated for me was Steve Pereira spoke about "Where's the map to your pipeline?"
Which boiled down to talking about value stream mapping your delivery process, and I think this is something that I learned--
I worked with ThoughtWorks for six and a half years and that was one of the ways in which we got understanding about our clients and their delivery processes, which was mapping out all of the steps, both online and offline, that were needed to get any products into production.
This is something that I've used in the past, and I've even evolved it into tracing here at MOO.
We now trace our pipelines to understand the processes to get from development machine up to production, and it makes such a difference when things are visible.
Here at MOO we were able to go to continuous delivery for our oldest application, our largest, oldest monolith, rather than a two week release cycle. Because we actually took the manual steps and turned them into steps in our pipeline.
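To make the idea concrete, here is a minimal sketch of what tracing a delivery pipeline could look like, with manual steps recorded alongside automated ones. The `Span` class, the step names, and the timings are all hypothetical; they are not taken from MOO's setup or from any tracing library.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """One pipeline step, recorded the way a trace span would be."""
    name: str             # pipeline step, e.g. "build"
    start_s: float        # seconds since the pipeline started
    duration_s: float     # how long the step took
    manual: bool = False  # True when a human performs the step


def total_duration(spans):
    """Wall-clock time from the first step starting to the last step ending."""
    first_start = min(s.start_s for s in spans)
    last_end = max(s.start_s + s.duration_s for s in spans)
    return last_end - first_start


def slowest(spans):
    """The single step that took the longest."""
    return max(spans, key=lambda s: s.duration_s)


def manual_share(spans):
    """Fraction of the summed step time that was spent in manual steps."""
    total = sum(s.duration_s for s in spans)
    return sum(s.duration_s for s in spans if s.manual) / total


# Hypothetical timings for one run, in seconds.
trace = [
    Span("build", 0, 600),
    Span("unit-tests", 600, 900),
    Span("exploratory-testing", 1500, 3600, manual=True),
    Span("deploy", 5100, 300),
]

print(slowest(trace).name, round(manual_share(trace), 2))
```

Once the manual step is a span like any other, it shows up in the same view as the build and the deploy, which is the visibility being described.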
Liz: Yes, you don't have to fully automate everything as long as it's a checklist with an API.
Maybe the API is fulfilled by a human, but there's an amazing talk by Max Louvea, who is at Google.
This is a talk where he describes the process of taking a Google Cloud region turn-up from a matter of months to a matter of weeks.
You don't get there by automating every single thing, you figure out what's important to automate.
Charity: Yeah, you make a system. Humans can be part of the system, but it's about making it repeatable so you don't have to engage the creative problem-solving novelty part of your brain, because that's what is open ended.
Abby: And it makes it visible.
Abby: Just understanding. For us, it was about visibility. We would do these offline wiki page fill-ins, which everybody knows from delivery processes in most organizations.
It turns out that actually the timeline of when those got filled in versus when code was changing wasn't even aligning, so there were times when we would do validation sign off.
But actually the code changed after that sign off, but before it went to production, and no one even really knew that because they were so disconnected.
By making those manual check offs be attached to every single commit, what would happen is you might not want to validate every single commit.
That's fine, but the commits you decide you want to push through to production, you can visibly see have been validated by any manual processes that you have.
Whether that be security or exploratory testing, or something else.
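A minimal sketch of that idea, assuming sign-offs are stored against a commit SHA rather than on a separate wiki page. The `SignoffLedger` class, the check names, and the SHAs are hypothetical and only illustrate the shape of the rule.

```python
from dataclasses import dataclass, field

# Hypothetical set of manual checks a commit needs before going to production.
REQUIRED_CHECKS = frozenset({"security", "exploratory-testing"})


@dataclass
class CommitRecord:
    sha: str
    signoffs: set = field(default_factory=set)


class SignoffLedger:
    """Keeps manual sign-offs attached to the exact commit they validated."""

    def __init__(self):
        self._records = {}

    def add_commit(self, sha):
        # A new commit always starts with no sign-offs, even if an
        # earlier commit on the same branch was fully validated.
        self._records[sha] = CommitRecord(sha)

    def sign_off(self, sha, check):
        if check not in REQUIRED_CHECKS:
            raise ValueError(f"unknown check: {check}")
        self._records[sha].signoffs.add(check)

    def can_promote(self, sha):
        """True only when this specific commit has every required sign-off."""
        record = self._records.get(sha)
        return record is not None and REQUIRED_CHECKS <= record.signoffs


ledger = SignoffLedger()
ledger.add_commit("abc123")
ledger.sign_off("abc123", "security")
ledger.sign_off("abc123", "exploratory-testing")
ledger.add_commit("def456")  # code changed after the sign-off

print(ledger.can_promote("abc123"), ledger.can_promote("def456"))
```

Because the later commit carries no sign-offs of its own, the gap between "validated" and "what actually shipped" becomes visible instead of silent.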
Liz: I think there's also this interesting thing about batch sizes.
At Honeycomb, we deploy things in batches of only one or two commits, and that reduces the surface area that has to be tested.
Whereas if you're batching up who knows how many commits, then you have to run all of the exploratory tests on everything as opposed to just testing the thing that you changed.
Abby: That was also a big part of our move to continuous delivery, was to look at the frequency at which something went wrong due to the size of the batch and the difficulty to identify and rectify anything that did.
So it went from two weeks as a deploy cycle to now deploying about four or five times a day. It's one commit.
It's sometimes a big commit, depending on the feature and things, but it's one commit.
Charity: In many of my rants about Friday deploys, I'm like "If you can do one thing to make your deploys less scary, it's deploy one thing at a time.
One change set at a time. Because the worst outages of my life have all been trying to git bisect which of this range of commits was responsible for the problem we just pushed out."
Nobody knows, and it can take days.
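For readers unfamiliar with bisecting, the search can be sketched as a binary search over the deployed range. This is a simplified illustration, not how `git bisect` is implemented; it assumes the commits are ordered oldest to newest, that the last commit in the range is known to be bad, and that once a commit is bad every later one is bad too. The commit names and the `is_bad` check are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Return (first bad commit, number of checks performed).

    Assumes commits are ordered oldest to newest, the last one is bad,
    and badness never disappears once it has been introduced.
    """
    lo, hi = 0, len(commits) - 1
    checks = 0
    while lo < hi:
        mid = (lo + hi) // 2
        checks += 1
        if is_bad(commits[mid]):
            hi = mid       # the culprit is at mid or earlier
        else:
            lo = mid + 1   # the culprit is after mid
    return commits[lo], checks


# A batch of 100 hypothetical commits where commit 73 introduced the bug.
batch = [f"c{i}" for i in range(100)]
culprit, checks = first_bad_commit(batch, lambda c: int(c[1:]) >= 73)
print(culprit, checks)

# With a batch of one commit there is nothing to search.
print(first_bad_commit(["c0"], lambda c: True))
```

Each check in a real incident means building, deploying, and verifying a candidate, so even a handful of checks can be slow. With one change set per deploy, the search disappears entirely.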
Liz: It's not even a problem necessarily of production reliability, it's also a problem of predictability, of delivery.
If you cannot get your individual commit in, and instead it goes in a batch that either gets kicked back along with 100 other things or goes through, that's super unpredictable and frustrating.
Charity: It's very demoralizing and disempowering.
Abby: The last thing I'd add about is it's just so amazing how expectations shift so quickly.
We went from a two week release cycle to being able to-- Our pipeline took almost two hours when we first did the continuous delivery pipeline, and within a week of doing that pipeline people were like "It should be faster."
I was like, "We could wait for two weeks if you'd like."
Charity: Your frame of reference shifts, and suddenly once it is tractable then there's this relentless pressure for it to get smaller and smaller, which is great because you suddenly see all the things you can optimize.
You see how much easier, but it has to get within shouting distance before--
Liz: You have to have observability into your delivery pipeline, and you have to have the agency to make changes to it.
Charity: The first time that I saw somebody apply the Honeycomb tracing to their delivery pipeline, it blew me out of the water.
I was like, "Why haven't we all been doing this all along?"
Abby: Yeah, it's fantastic. And that's one of the use cases for tracing for us. I can't say it's our major one, but we definitely use it.
But I spoke to Robby Russell on the Maintainable podcast, and he was asking me, "What do you think makes maintainable code?"
My response was, of course, everything that other people have said around testability and all that. But it's actually people's desire to make change to it, and I think doing a refresh on something brings people's eyes to it and it creates that creativity of how to make it better.
That's why all of a sudden, from going through that 2 week release cycle, going to continuous delivery, now people are seeing it and getting creative about how to make it better.
That is hugely beneficial for us on a lot of fronts.
Liz: That was another thing about Jessica Kerr's talk that was super interesting, which was the idea of zombie code.
The idea of "If you cannot repeatedly deploy even the same code unchanged, then you have zombie code that's shambling around that you cannot modify, that you cannot security patch, that you cannot make incremental change to because you've forgotten how to deploy it. You've forgotten how to build it."
Abby: I absolutely loved that. It fits with something that we've been doing recently as well, we have something like 40 pipelines within our platform team just because of all the small Docker images and tools and things.
It's hard, and we don't touch a lot of those very frequently.
We've started running the master pipeline on those every week, for every single one of them, and it's always during business hours and it's actually Tuesday to Thursday.
Only those three days of the week, because with bank holiday Mondays here in the UK, we might as well aim for people being in the office if anything goes wrong.
But so far, nothing has gone wrong. We've actually caught some things like package upgrades that cause an issue, and we're able to fix those on our time instead of having to do them immediately because something's on fire.
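One way such a weekly rebuild could be planned is sketched below. Spreading the pipelines evenly across the three days is an assumption made for illustration, as are the pipeline names and the `weekly_schedule` function; the conversation only says each pipeline runs weekly, Tuesday to Thursday, during business hours.

```python
from datetime import date, timedelta

# Tuesday, Wednesday, Thursday as returned by date.weekday() (Monday is 0).
RUN_DAYS = (1, 2, 3)


def weekly_schedule(pipelines, week_start):
    """Assign each pipeline to a Tuesday-to-Thursday slot in the given week.

    week_start must be a Monday. Pipelines are spread round-robin so that
    no single day carries all of the rebuilds (an assumption of this sketch).
    """
    if week_start.weekday() != 0:
        raise ValueError("week_start must be a Monday")
    schedule = {}
    for index, name in enumerate(sorted(pipelines)):
        offset = RUN_DAYS[index % len(RUN_DAYS)]
        run_day = week_start + timedelta(days=offset)
        schedule.setdefault(run_day, []).append(name)
    return schedule


# Roughly 40 hypothetical pipelines, planned for the week of Monday 3 Feb 2020.
pipelines = [f"pipeline-{i:02d}" for i in range(40)]
plan = weekly_schedule(pipelines, date(2020, 2, 3))
for day in sorted(plan):
    print(day.strftime("%A"), len(plan[day]))
```

Rebuilding unchanged code on a schedule like this is what surfaces things such as a breaking package upgrade while people are around to fix it calmly.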
Liz: I think there was also a thing that came up during Bryan Liles' talk on pipelines and standardizing pipelines, which is that when you have people who are treating their pipelines as this opaque thing of "Jenkins' pipeline number 32 doesn't work anymore. Help, Bryan."
It makes it harder to understand, as opposed to "Here are the well-known Duplo blocks and Lego blocks that you can assemble a pipeline out of. Here's what they're for," so that it's either "There is a bug in your pipeline" or "There's a bug in a pipeline component," but you can disentangle those two things.
Abby: Yeah, his talk was fantastic as well. I've never seen him speak, and he was very entertaining, very helpful. It was a great talk, as well.
Liz: I'm not necessarily sure that I'd agree with his assessment of "All your deploy pipelines have to be Kubernetes," but I understand who pays his paychecks and that's fine.
Abby: That's fair.
Liz: But definitely pick one thing as an organization to standardize your deploy pipelines onto.
Charity: This has been super.
Abby: Yeah. Thanks for having me. It was a long flight over, but absolutely worth going to DeliveryConf. I really hope that they do it again next year.