August 17, 2016
Hello, thank you for having me. It's always great to speak with employees and founders at early-stage companies, as you all have to deal with a mess of problems across domain boundaries every single day.
By that I mean problems that don't necessarily fit into these very specialized things that we get at these much larger, later-stage companies. So, I am going to use that as an excuse to go a bit broad in this talk.
I do want to get to an example of how we ship large, risky changes at GitHub, and how we use quantitative and qualitative methods to help make the decision about what to ship and whether or not the thing we've got is good enough to ship.
But first, we'll need to orient ourselves to a few key guide posts to get our way there, and since I've got you here, we're going to take the eclectic tour to get to that spot.
We're going to cover a couple of things: dual process theory, decision-making theory, what our obligations are as job creators in this economy. I heard some claps in the back there. And what it means to balance rational analytic work with design-thinking approaches.
A key theme through this talk is that our industry is quick to highlight and make decisions based on what it recognizes, to collapse down the infinite depth and craziness and otherness of humans into sameness. We'll keep poking on that as we go. It's an idea that I don't want to lose sight of, and you all in the audience here have had a good 20 minutes to ponder this title.
I'm sure, given the introduction, you're curious to see where this will go. All right, who am I? I usually lead this slide with my professional credentials, talking about my work life, but life would be pretty boring with just professional details. I like to take photographs, cook food, read sci-fi, read philosophy and psychology and other things. I like to read papers from different fields such as biology.
I think it's important to lead an interdisciplinary life, to be out of your comfort zone, to be pushed to think about things in different ways, to hear things that sometimes you do not want to hear. There are many reasons for this, but the key for me is to guard against sameness.
Sameness is when we collapse down that depth and range of humanity into tropes, habits, stereotypes and unthinking repetition.
The draw to sameness is strong, though, as it lets us make decisions faster. It's pretty key to day-to-day life. We'll return to this point in a few slides. But on to the rest of the standard, professional background.
My professional background, which I do think is helpful context, and as I mentioned a little bit in the introduction, is in distributed systems and infrastructure, building the backend systems that power sites with millions of users and services that process gigabits per second of data.
Why is that important? Because doing that type of work forces particular restraints upon you. Primary among them is that data, and through that, measurement and instrumentation is necessary to make decisions.
An oft-cited framework for making decisions is the OODA loop. Observe an unfolding situation, orient yourself to the information you have on hand, make a decision, and act. You then observe the outcome of your action, like a loop, and continue.
In practice, short of crisis situations, many actions are taken in the world without following up on their outcomes. For those of you building companies, shaping products and doing things, that's key. You have to actually look at what the outcomes of your decisions are. And many decisions are made without orienting oneself to the tools and information at your disposal. All of that is pretty hand-wavy, of course. We'll get into what the actual details of that mean.
On a similar note to the OODA loop, it's said that every infrastructure change, every infrastructure deploy, should change a metric. If the metric you're changing isn't there yet, you must first instrument it so that you can then actually go deploy your code, the thing that was actually supposed to change something.
Likewise, every product deploy changes a metric. These metrics are, of course, a bit different from the operational metrics referred to in our infrastructure day-to-day life. They are, instead, user and customer metrics, but these metrics change regardless of whether or not you instrument them.
A key note that I do like to point out is that I'm not a stats expert, I'm not a behavioral economics expert, I'm not a philosophy expert, not a lot of things, and most of what I'm going to talk about is self-learned or learned on the fly, or otherwise. I'd encourage you to research things yourself and find good practices for hiring folks that are trained in these things and know about them.
Step zero in any process: learn ancient Greek.
My job nominally entails data and analytics. And what does it mean when we say data and analytics? The first step in many a journey involves pulling up the etymological dictionary so we can actually tell what our words mean.
Datum, which is the singular of data, is from Latin, literally means something given, the neuter past participle of dare. My Latin pronunciation is not very good, but "dare" means to give. We'll skip the grammar discussion. We'll actually come back to that a little bit, but never mind that.
This is an important way of thinking of data, "that which is given to us." What will we do with this gift? Analytics means, roughly speaking, to loose a ship from its moorings prior to setting sail. Everything is in order, and it's that final act before departure. This is, of course, a great analogy given our propensity to ship things.
The next time someone says "ship it," picture in your head the moment where the moorings have been let loose and everything is ready to go out to sea.
To draw the meaning a bit more, analysis is the conceptual opposite of synthesis. This loosing means to break down an idea, concept or problem into its component parts, such that you can see it. This will probably be the time to dive into deconstructionism, but we'll have to leave that as a footnote. We can come to that in the Q&A.
So, when we speak of data and analytics, we need to take the world of given information and break it down into something comprehensible, something ready to set sail in the world. Speaking of ancient Greek, and the world, it's worth pausing, as well, to note aletheia, the Greek word for truth or, more literally, disclosure and unconcealedness.
In Greek, the alpha prefix is negation. Letheia, minus the "a," letheia means to be concealed or covered. The process of finding the truth is an uncovering or an unconcealing. We might also connect this to our idea of analysis, of finding those cracks in the surface that we can loosen up to reveal the truth.
To jump forward a couple thousand years: dual process theory, in cognition, posits that there are two distinct systems, pathways or processes that our brains use to make decisions. System 1 is fast, implicit, associative and quite possibly irrational. System 2 is slow, explicit, rule-based, and with some applied discipline, it can lead to rational, analytic thought.
Unconscious bias, which I hope a lot of you are aware of or at least have heard of, is also referred to as implicit association, probably in a more narrow, well-defined definition of it, AKA, System 1 thinking.
System 1 is key and necessary for our day-to-day lives; it's evolutionarily old and means that we don't have to spend 24/7 thinking through everything from first principles, but its associations are often biased.
We group together abstract concepts by gender. We connect back to grammar. We remember our childhood by the smell of a cookie. We think we know who should get a job, because they fit the picture in our head. We think we know who we should be building our product for.
The process of becoming an expert is training your System 1 to act correctly.
I think this is an important point to dwell on. What does it mean to become an expert in something? It's to take the slow, methodical, System 2 process and train our System 1 to do that and to do it right.
For instance, memorizing your multiplication tables, being able to rattle them off in sub-second time. A great way to become a better programmer is to take an LSAT test prep course and to actually try to take the LSAT and actually do well on it.
The LSAT test prep course will drill logical problem-solving skills into your System 1. You no longer have to eject back up to the slow method; you now have it built-in, fast. Short of training your System 1 in all things, you can also train yourself to do that ejection up to System 2, up to the slow path.
A great way to demonstrate this is to take an online implicit association test. You'll be asked to use your System 1 thinking to go fast, to answer questions so quickly that you can literally feel yourself trying to pull yourself up into slow mode, and you're like, "No, I need more time. I know this is not the right way to answer this question." But the context there is that you know that. You know you're being tested for it, and you're trying to fight it.
Part of the lesson here is to drag that into other areas of your life, other areas where you don't know that you're being tested. Since I have you here, we're going to take a brief digression onto structured hiring.
Pretty much everyone in this room, I imagine, will have to do hiring at some point, maybe doing interviews even today. I'm not going to dive into a full theory of how to do this well, but I think it's important to call out, because I have the opportunity to speak with you. I have the opportunity to help shape your companies at the phase that they're at right now, and there are some things that you can do up front that will pay very big dividends in the long run.
So, let's touch on a couple of bullet points. One, go look up the "unconscious bias at work" talk from Google Ventures. Go watch it, go read all of the papers cited. All right. Then, a quick summary of that, which is:
Write down what you're looking for before you look at resumes or go into an interview.
All right. You're all going to do that, right? Ask yourself, "What am I assessing for, and how am I going to assess for it?" The reason those things are important is because you're using your System 2 thinking in advance, and you're bypassing the other mechanisms that are going to come into play later.
If you do not write down, up front, what you're looking for, you will make implicit associations later on to change the goalpost, to change the way you thought about that interview. And you won't know that you did it, and you'll be able to justify to yourself that you made a logical, rational choice when you did not.
Beyond this, and beyond structured hiring, which I'll just bracket, and we'll move on from that, ask yourself, "How can I give people more opportunity? Where can we find opportunities to give people jobs, to bring people into our industry that are different from us?" All right. Cool. Thank you for allowing that brief rant. Decision-making theory.
So, now that we know that we should be ejecting up to System 2 for most analytical decisions, what do we do with that information? How do we actually make decisions? A great summary paper is available at the URL pointed to there. We'll figure out how to get you those URLs after the talk as well. That paper is a great overview of techniques used in the field of Decision Making Theory.
To summarize a bit from that paper: in the decision theory approach, you assign costs to the various possible outcomes you have in your situation, and you try to minimize those costs. The inverse of cost is benefit, right? You can think of these equivalently.
One common way of trying to assign those costs is through a decision tree. An example decision to think about in said tree: "Is it right to ship this particular product change today?" But you can also model out the decisions about the value of a change itself.
For instance, in your decision tree you can have a node representing making the decision that this change will be good for our users. It's a little hard to read on these screens, but the example here is from environmental sciences. And this is basically positing three different things at our three different stages here.
Step one in the decision tree is what a scientist should say about a particular topic. In this case, whether or not pollution is affecting forest health. And there's three branches: the top one is "yes," the middle one is "no," the third one is "no comment."
The second column is whatever the truth of the situation is, and it either is having an effect or it isn't. The third one is what the regulators will do with this information. Either they regulate something, or they don't, i.e., the decision that they make based on this information.
And then on the right you can see consequences, and you just lay out a table of your consequences. All right, so, we've got this weird analogy from environmental sciences. We probably don't care a whole lot about regulators in the tech start-up industry anyways, right? So, where the heck is this analogy going?
If we replace scientists with people who have to ship product, with CEOs, with people who have to make decisions, whatever, the truth stays the same. The truth is whatever the actual truth of the matter might be. And for regulators we could sub in our users, our customers, and what their response to the stimulus of this change is going to be.
More advanced versions of this decision tree incorporate probability data, either empirically gathered, or just taking your SWAG guess at an opinion. On a technical level, there are constraints and tradeoffs to this model. But I'll defer to the more than capable Wikipedia page on the details, which you can find by Googling "decision tree."
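As an illustration of the probability-weighted tree described above, here is a minimal sketch. The probability and the cost values are made-up guesses for a hypothetical "ship or hold" decision; none of them come from the talk or the paper it cites.

```python
# Hypothetical example: should we ship a product change today?
# All probabilities and costs below are illustrative guesses, not real data.

# P(change is actually good for users): the "truth" column of the tree.
P_GOOD = 0.3

# Cost of each (decision, truth) outcome, in arbitrary units.
# A negative cost is a benefit.
COSTS = {
    ("ship", True): -10.0,  # shipped a good change
    ("ship", False): 8.0,   # shipped a bad change
    ("hold", True): 3.0,    # held back a good change (missed benefit)
    ("hold", False): 0.0,   # held back a bad change
}


def expected_cost(decision, p_good=P_GOOD, costs=COSTS):
    """Probability-weighted cost of a decision over both truth states."""
    return p_good * costs[(decision, True)] + (1 - p_good) * costs[(decision, False)]


def best_decision(decisions=("ship", "hold")):
    """Return the decision with the lowest expected cost."""
    return min(decisions, key=expected_cost)
```

With these particular guesses, holding has the lower expected cost. Changing `P_GOOD` or the cost table changes the answer, which is the point of writing the tree down before deciding.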
I'm also, given our time here, not going to go into the full, in-depth analysis of said tree. Decision trees, though, are also important to understand, as they open up the world of decision-tree learning, which is like the opposite of this, machine-learning algorithms that generate decision trees for the purpose of categorization or regression.
That is, if you have some data set and some categorization method, they find the hyperplane, as represented by said tree, that maximizes the difference between your categories, or they fit your regression, or other fun things like that. This is a really fun one to try to model in hyperdimensional space, because we can't possibly write that down. But hopefully you have some idea what I mean.
The classic ML example of this is the elegantly named C4.5 algorithm, which, again, I'm not going to describe, but it's available via the Googles.
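To give a flavor of decision-tree learning, here is a sketch of the gain-ratio measure that C4.5 uses to choose which attribute to split on. It is only the split criterion, not the full algorithm (which also handles continuous attributes, missing values and pruning), and the tiny data set and its field names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, normalized by split entropy."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return (base - remainder) / split_info


# Hypothetical data: which attribute best predicts churn?
ROWS = [
    {"plan": "free", "admin": "yes", "churned": "yes"},
    {"plan": "free", "admin": "no", "churned": "yes"},
    {"plan": "paid", "admin": "yes", "churned": "no"},
    {"plan": "paid", "admin": "no", "churned": "no"},
]
```

On this toy data, `plan` separates the churned users perfectly and `admin` tells you nothing, so a tree learner would split on `plan` first.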
One further bit of orientation before we jump into shipping at GitHub: how good is our intuition at predicting the impact of product changes? Depending on how well-understood the domain is, we have between a one-in-10 and a one-in-three chance of guessing right. This usually scares the crap out of people.
If most of our ships don't do what we want, shouldn't we give up and despair? Do we run into the embrace of existential dread? I mean, we're still getting paid, so we probably shouldn't. But no, we shouldn't. We shouldn't run into the embrace of existential dread.
The precise unpredictability of our work is what makes it interesting, compelling, difficult and, I'd argue, ultimately rewarding when we do get it right. But the bad news is it takes a lot of applied work. Many of you are here, not at your first startup. Many of you had other startups that didn't work, but we got up and we made another one.
All right, so take that same effort, instill it into anyone who's thinking about product changes. But keep in mind you have to actually look and see if it works, right? And you have to figure out the methods of actually gathering that information and orienting yourself to it.
So, 15 minutes, a little more than that, 18 minutes, something like that, in, we get to shipping at GitHub. We've talked about a few things in here. What I want to come back to is the title: otherness, otherness versus sameness.
We talked about analysis and data, and thinking rationally, and avoiding all of these pitfalls in our head, but when it comes down to it, we're shipping code and software to people. But it's not the code and software that they buy, right?
As Ken Keiter said on Twitter the other day, "People don't buy code and software. They buy a way of thinking and working."
The things that we're measuring, as we change our product, are people. And no matter how much data we have on hand, we're going to be at a loss for complete information.
That's because people are wonderful and amazing and great and represent a huge, wide variety of ways of living and being and thinking, of languages, of cultures, and because of all of those things going on in their lives and worlds, we will never fully understand them, and that is the whole point.
They will never fully be encapsulated by our data, our models or our analysis. No matter how much our System 1 brains want to put them into little associative boxes, they won't fit.
So, why am I ranting about that? What do we do? The example I want to talk about is shipping out a change at GitHub about the way permissions work. The reason that permissions are important is because they literally control everything that anyone can do inside of GitHub, for one thing.
We had a remarkably simple model, and we went to an extremely complicated model. We foisted this change upon roughly 10 million users on github.com, plus some unknown number of users on GitHub Enterprise, and who knows however many future users. So, pretty big change. Pretty fundamental. But in essence, actually by itself, not this big headline-grabbing type of change, which I think is why it's interesting to talk about.
Previously, the way we managed organizations in GitHub was that individual accounts are on a team, and a team has access rights to particular repos. And one magic team, the Owners team, has administrative rights to the entire organization. And that's the whole thing.
Perhaps there's a small lesson here in how far you can get on an MVP. Maybe that's the whole lesson that you take away from this, which is, actually, don't build the complicated thing. Just get the simplest thing out there, and go. Because it actually works for years and years and years, and then you can go back and try to fix everything and come up here and talk to future generations of people.
But this past year, we changed it to be a much more fine-grained model. Users belong directly to the organization. Roles are fleshed out and defined, and have lots of different meanings. There are no constrictive magical teams or anything like that.
This is a pretty big conceptual change to impose on our users, and to go back to the data and analytics into this, we have a very key metric that we wanted to make sure that we didn't mess up, which is the number of private repositories being created, as our business model is to charge for private repositories. If we all of a sudden made it difficult to actually create these, we'd be making it harder for our customers to get work done and we would make less money.
That's actually not the total key to this whole thing, even though that's an extremely important piece, though.
The more important piece is the many, many different ways that this can fail for our users, making this change in kind.
So our approach, and at this phase I want to call out people that are not in front of you, that are not me.
The excellent and hard work done by our user-research group, who drove this process, was to work in two phases: a hands-on, qualitative phase working with pre-release groups, and a quantitative phase that was hands-off and done via controlled experiments. We're going to talk through what both of those phases mean, and we're going to connect this back up to that whole sameness-otherness thing, right?
To step back, and this is a gross oversimplification of both of these things. It won't give you the complete history, of course, of how we developed GitHub, on one slide. That's quite impossible. But back in the day, meaning two and a half years ago or something, when I started that, we built product for ourselves.
The way that we divined whether or not it was a good thing was we staff-shipped it, meaning there's a little bit on the users table that says whether or not you're staff, and we would use that effectively as a feature flag so that we could see new things and test them out.
We didn't really have dates around that. We would kind of get something out to staff, get it shipped out in front of eyeballs, get whatever kind of organic feedback you were going to get, loop back to it at some point, and if there was enough consensus that "Hey, this is great," let's ship it. Otherwise, "meh." Or like, "Hey, this is terrible," don't.
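A staff-ship check of the kind described above can be sketched in a few lines. This is only an illustration: the `User` class, the `is_staff` field and the `can_see_feature` function are hypothetical names, not GitHub's actual code.

```python
from dataclasses import dataclass


@dataclass
class User:
    login: str
    is_staff: bool = False  # hypothetical stand-in for the staff bit on the users table


def can_see_feature(user, shipped_to_everyone=False):
    """Staff always see the feature; everyone else only after a full ship."""
    return shipped_to_everyone or user.is_staff
```

The appeal is that it needs nothing beyond a boolean you already store, which is why it works as a first feature flag.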
I'm going to call out a few key things there, one of which is you've got to use what you have on hand. You're trying to get a company off the ground. You don't have resources to go do a giant research study.
You don't necessarily have time to go out to each one of your customers and look over their shoulder at how they're working. I would argue you actually do, but, you know, you've got to do something. And this worked really well for us, for a while.
Like, we're GitHub. A lot of you showed up probably just because of GitHub in the title and nothing to do with me. Which is great. It makes it a lot easier for me, by the way.
But at some point, it stopped working. And it stopped working because we stopped being representative of who our customers are.
It turns out there are other people in the world. There's a lot of other people that aren't in this room. So, how the heck do we get in contact with them?
To jump back to the slide up here, I'm not going to read off the slide directly. We'll jump back into our example. All right, cool. So we want to ship this big permissions change. We have, again, our two phases: qualitative phase, quantitative phase.
Qualitative phase, we go out, we find a group of people. In this case it's 20 customers in each group. We go bring the product, we're like, "Hey, we built this wonderful thing. Do you want to try all of our cool new stuff?" They're like, "Yes, this is awesome." We give it to them, and it breaks everything.
We go in and we find out what the heck it did. It doesn't technically break, sure there might be bugs in there, but what we want to find is how this collides with their idea of reality, right?
We have very few tools on hand to figure out what their idea of reality is besides encouraging them to try out this new thing and then seeing what happens.
It's actually not that complex, but you have to recognize that there's a lot that you will not be able to anticipate, and a lot that you actually can't see without going through it.
So, we did that the first time. We actually radically rethought it. We did it a second time. The second time, we brought in a new prerelease group of 20 people, and we gave the updated version to the first group. We ran into yet another set of problems.
We did this a third time, and I think maybe even a fourth time. And by that fourth time, we were finally getting to the spot where, all right, cool, qualitatively we can stick this in front of a set of people, and it does what it's supposed to do. That's awesome.
Because we actually did the hard work of seeing what it would do, we went out into the real world and did that. So, the question being from that point forward is how do you extrapolate up?
Moving on to the next phase, which is the quantitative phase of running controlled experiments. I'm going to elide most of the detail here for this audience, as it's unlikely that you're necessarily at the scale where that's even going to matter.
And that's not a bad thing; it's just not going to be a great use of time to dwell on it. Totally happy to talk more about it in Q&A or afterwards, though. But we should talk about the actual metrics, what the heck we were measuring against, given we don't talk about the methods of randomized controlled experiments.
So, this brings us to our next large thing which has, on the surface, nothing to do with product development, nothing to do with otherness, which is quantitative marketing. And the reason I mention this is because this is an important keyword to go Google, again. We've had a couple of them here so far, but this is a great one to go Google if you don't remember any of the other ones.
Quantitative marketing, you could also just skip right to the bottom, which is brucehardie.com, a link to a couple different papers that Bruce Hardie authored. And then you do the normal thing which is go through all the references and go follow that entire network out and go find other interesting things, and then bring back all of these crazy analogies from this totally different world of quantitative marketing.
This is born out of trying to sell you catalogs and send you mailers to figure out who's going to buy the next hot thing at Target. But it turns out that all of the analogies and all of the math from there are actually super important to understand in your own businesses.
So, we're going to talk about two things. One is customer lifetime value. Value here is typically referred to in terms of dollars. If you're running a subscription SaaS service, or something like that, you figure out how long your customers are likely to stick around for, how much they pay in each time period.
You sum all of that up, and you have customer lifetime value. If you're not on a subscription basis, you have some non-contractual people, you have an e-commerce site or something like that, you figure out how many purchases people are likely to make, the value of them, you add it up.
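The two cases above can be sketched as follows. This is a deliberately simplified illustration: it assumes a constant per-period retention rate and no discounting, and the function and parameter names are hypothetical. The models in the quantitative marketing literature are considerably richer.

```python
def subscription_clv(payment_per_period, retention_rate, periods):
    """Expected revenue over `periods`, assuming constant per-period retention.

    The customer pays in period 0, and survives into each later period
    with probability `retention_rate`.
    """
    return sum(payment_per_period * retention_rate ** t for t in range(periods))


def purchase_clv(expected_purchases, average_order_value):
    """Non-contractual case: expected number of purchases times their value."""
    return expected_purchases * average_order_value
```

For example, a customer paying 10 per period with a 50% chance of renewing each period is worth 10 + 5 + 2.5 = 17.5 over three periods under these assumptions.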
The thing that's important here is that this is not just money. You can do this for anything else that's quantitative, right? In particular, activity, engagement, time on site, other things like that. The important bit here is to think about, for your particular domain, for the area that you're in, what the actual metrics are that the other companies in your business area are looking at, your benchmarks, all that jazz. Not super interesting on that end of the scheme.
To get the customer lifetime value, you can hold on to transaction data. That's pretty straight forward. To back up to the question of, if you apply this to all your other activity metrics, do you need to store the entire history of everything that you have done? I would argue that would be great.
If you're starting a company now, starting a product now, there's somebody doing analytics and data stuff at your company. Just start recording everything, because it's actually possible. Because hard drives are really big these days. If you can't do that, or you have that on hand and you're trying to figure out how to summarize it, the great analogy is: recency, frequency, money.
And the reason I use RFM, this really weird term, is because it's Google-able. But the actual, specific one might not be. Money might not be the thing for you. Engagement or duration or other things might be important to you. The relationship between recency, frequency, money and customer lifetime value is the important thing to understand.
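As a rough sketch of the summarization described above, the following collapses a full transaction history into one recency, frequency, money row per customer. The `(customer, date, amount)` tuple layout and the function name are assumptions made for illustration; swap the amount for engagement or duration if money is not your metric.

```python
from datetime import date


def rfm_summary(transactions, as_of):
    """Summarize (customer, date, amount) rows into recency, frequency, money."""
    summary = {}
    for customer, day, amount in transactions:
        entry = summary.setdefault(customer, {"last": day, "frequency": 0, "money": 0.0})
        entry["last"] = max(entry["last"], day)  # most recent activity so far
        entry["frequency"] += 1                  # number of transactions
        entry["money"] += amount                 # total amount spent
    return {
        customer: {
            "recency_days": (as_of - entry["last"]).days,
            "frequency": entry["frequency"],
            "money": entry["money"],
        }
        for customer, entry in summary.items()
    }
```

Three numbers per customer is a lossy summary, but it is small enough to keep forever even when the raw history is not.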
Then you go read the following paper, which you can find, which establishes a formal method for grouping these out. There are many different forms of recency, frequency, money, many different behaviors that your customers can show. Yet they still have some relationship to the high-paying customers, the ones that are going to stick around and give you a lot of money.
The ones that you should be focusing on, the ones that are going to churn out, the ones that you need to go figure out what is failing, what's going wrong with them. If you don't have those things on hand, you don't know who to get more money out of, and you don't know who to go give help to, so you got to do that.
Besides that, thank you. We'll cover a few concluding statements though, which is, thank you for bearing with me in this semi-experimental talk. I definitely want to dive into more technical details if anyone wants to talk about that during Q&A, but I do want to highlight, again, this whole point of getting outside our own heads and getting outside our own experience and being comfortable with being uncomfortable outside of ourselves.
And trying to push each other to do that more, because that's the only way that our industry's going to get up and out of this really weird, immature stage, where there's really a small number of us from a really small set of backgrounds. Get up and actually participate in the world, and get the whole world into what we're doing.
We have two systems, one of which is open source. It's jnunemaker/flipper, I believe, a GitHub open-source Ruby gem. So there's feature flipping, and then there's randomized controlled experiments, and while I diagram them as being non-overlapping sets, they're actually, of course, quite similar.
The systems that we use internally, primarily, developers are using the feature-flipping software whereby you put in whatever bit of conditional logic into the code, and then we have an entire UI around that that determines, based on different criteria, who's going to be in either the control group or test group, or multiple groups, or however you have it set up for your particular feature flipper.
We don't have, as far as I'm aware, any automated processes around that in terms of automatically ramping that up or down based on outcomes. Because we do all of that on the experimental end.
So, for feature-flippers, we're primarily doing that for things where we want to send out a new feature to five percent of users, then go check by hand the operational metrics, make sure nothing fell over. For randomized, controlled experiments, it's substantially more on the product impact, user impact.
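A percentage rollout like the five percent example above is commonly implemented by hashing the user into a stable bucket. The sketch below shows that general technique in Python; it is not the API of the flipper gem or GitHub's internal system, and the `bucket` and `enabled` names are hypothetical.

```python
import hashlib


def bucket(feature, user_id, buckets=100):
    """Deterministically map a (feature, user) pair to a bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def enabled(feature, user_id, percent):
    """True if this user falls inside the rollout percentage for the feature."""
    return bucket(feature, user_id) < percent
```

Hashing the feature name together with the user ID means a given user keeps the same answer across requests, while different features get independent five percent slices rather than always hitting the same users.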
First of all, we probably should talk about experimental design, how you actually randomize things and determine their independence. We'll elide that a little bit. But the system looks primarily the same in terms of, you define an experiment, you have conditional logic that has each one of your variants in it, and the system itself is tied to particular outcome metrics.
And then whichever variants are performing better based on those outcome metrics, more people get assigned to those until you've reached a point where you can make a statistical conclusion.
The math we use underneath that is Thompson sampling. But there are other approaches as well on that end. Are there more specifics I can get into that would be helpful?
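For readers unfamiliar with Thompson sampling, here is a minimal Beta-Bernoulli sketch of the idea: better-performing variants gradually receive more of the assignments. It assumes a binary outcome metric and a uniform prior, and the class and method names are hypothetical; the talk does not describe GitHub's implementation in this detail.

```python
import random


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over named variants."""

    def __init__(self, variants, seed=None):
        self.rng = random.Random(seed)
        # One [successes, failures] pair per variant.
        self.counts = {name: [0, 0] for name in variants}

    def choose(self):
        """Sample a plausible success rate per variant and pick the highest."""
        draws = {
            # Adding 1 to each count corresponds to a uniform Beta(1, 1) prior.
            name: self.rng.betavariate(s + 1, f + 1)
            for name, (s, f) in self.counts.items()
        }
        return max(draws, key=draws.get)

    def record(self, variant, success):
        """Update the chosen variant with an observed binary outcome."""
        self.counts[variant][0 if success else 1] += 1
```

Each incoming user is assigned with `choose()`, and the observed outcome is fed back with `record()`. Early on the draws are noisy, so every variant gets traffic; as evidence accumulates, the draws concentrate and the stronger variant wins most assignments.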
Audience Member: I could geek about this all night. Does everything become an experiment? Do you have a threshold where you decide whether or not to feature-flip something?
Maturen: Yeah, that's a really good question. Ideally, and this is totally me speaking as me, and not me speaking as GitHub, we want experiments to be in the process for every single deploy, meaning every deploy is going to have some impact. We'd better figure out what it is, right?
But there's a lot of work to do there in terms of convincing people of what the value is of that approach, getting the tooling in place to lower the barrier to entry, and generally changing hearts and minds.
In terms of the heuristics around that, it really depends on the team. The way that we work is we have a product department and an engineering department. Under product there's user research and design, and a product-management group. And for the most part, each engineering team has a group of software engineers, an engineering manager, a product manager, and one or more designers.
It's up to that group of people, primarily the product manager, at the end of the day, to determine what they want to be experimenting on. And that really goes to, what are the goals and objectives of that particular team? And the ones that are substantially more metrics-driven are relying heavily on experiments, because otherwise they don't know whether or not their work is effective. For other folks, it's maybe less important.
Maybe we could back up a second and talk about GitHub the product versus GitHub the business. GitHub is one product. There's github.com, and we sell that as an on-premises version as well, but it's four, roughly speaking, four businesses, right? And this is kind of by analogy, not necessarily literally.
We have the open-software development, which basically takes the form and shape of a social network. Meaning that we have following and stuff in GitHub, but that's not really the primary thing. It's the actual relationships between people that makes up that network.
We have a B-to-C SaaS product, which is personal accounts. They're subscription based. We have a B-to-B SaaS product, which is organizations. We have a B-to-B Enterprise, on-premises, full services-integration business as well. And they all use the same product, and it's all the same code.
When we ship a change over here to help out with, by analogy, the social network, it's also going to have some impact over here with some on-premises thing at big company X. So perhaps this goes to the heart of what you're talking about, which is that it's hard to distinguish those things.
When we run experiments, what we try to do is we want to understand the independence of the assignment. In a network, it's obviously frequently difficult to find points or breaks where you're not going to have a user that's part of multiple different organizations, and is an administrator on them, seeing different behaviors across different ones because they're going to be in different feature-flippers or in a different experiment enrollment.
But generally speaking, with enough applied effort, we can figure out where those breaks are to do a good enough analysis, and we treat, on top of that, just to back up to the general analysis, we treat the metrics for each one of those business conceptualizations a little bit independently. And we try to be mindful that changes will impact different places, and that everyone's going to see everything at the end of the day.
But primarily if we're trying to make an open-source software development community bigger, we think about the metrics and things that go on there, and it's going to have some side effect elsewhere. But we focus on that. So that may be the super general answer to your question. Do you have more specifics? Again I'm probably going to follow up everything with, "Do you have more specifics?"
Audience Member: As a follow-up to that, does that mean if I'm a user, and I have a private repo but I also work for a company that has a repo, when I'm taking actions, when are you segmenting me into the B-to-B versus the B-to-C? Because, you know, one user could conceptually fall into three of those four, 'cause you can't do the Enterprise, but could fall into one session, be working on open source in their own private repo, and their company's repo. So how do you segment their actions across the three different kind of business opportunities?
Maturen: Primarily it boils down to the repository that they take the action on, right? We basically have this matrix diagram of org-owned to personal, for .com, so that's org-owned to personal-owned, private, public, and that's the primary quadrant.
In terms of everything, it's broken down through that lens because, for the most part, all four of those behave differently. Conceptually speaking, you can think of all of the open source activity that you know of on github.com. Now multiply that by 10, and you've got the org-owned to private quadrant, but the behavior there is so radically different, because everyone's getting paid to use it day in and day out. But that's how we distinguish that context.
When we think about user analysis, we think about the things that might influence them, the things that might bring them back day to day, in terms of there's a lot of variables at play about a user. But for the most part, when we do that, just the baseline reporting, we think about repository, because that helps distinguish that so easily.
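The repository-based breakdown can be sketched as a small classifier: each event is attributed to a quadrant according to who owns the repository and whether it is private. This is only an illustration of the idea; the `Repo` type, the quadrant labels, and the event tuples are hypothetical and do not reflect GitHub's actual data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    owner_type: str  # "org" or "personal" (assumed labels)
    private: bool


def quadrant(repo: Repo) -> str:
    """Map a repository to one of the four ownership/visibility quadrants."""
    visibility = "private" if repo.private else "public"
    return f"{repo.owner_type}-owned/{visibility}"


def count_by_quadrant(events):
    """Count (action, repo) events by the quadrant of the repo acted on."""
    return Counter(quadrant(repo) for _action, repo in events)


if __name__ == "__main__":
    # One user's session can touch several quadrants; each action is
    # attributed by repository, not by the user.
    session = [
        ("push", Repo("org", True)),
        ("star", Repo("personal", False)),
        ("push", Repo("org", True)),
    ]
    print(count_by_quadrant(session))
```

Keying on the repository rather than the user sidesteps the question of which business a multi-context user belongs to.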
Audience Member: So does that mean when you're thinking about the events the users are taking, like in a user journey, each event is categorized based on whatever view that they're viewing on that time being? They're viewing this repo, therefore the events they're taking while they're viewing this repo fall under this quadrant. They switch repos, now we're associating all their events in this other quadrant, is that kind of how that works?
Maturen: For the user onboarding or user-growth milestones, we look at that performance relatively independent of the context that they're in for that type of reporting. So the team that's responsible for user growth, which owns onboarding and acquisition, and to some degree retention, things like that, which is another team that I manage if anybody wants to talk about that, besides the data and analytics. Also pages and gist, and search and some other things. Search I don't have to manage anymore, but have managed for a while.
It depends on the context that you're in. So, what we see is, this goes back to the whole heterogeneity of all of our data points, right? The fact of the matter is that we have however many sign-ups per day, every single one of them an independent human in the universe with a huge number of things that we're not going to have data on, and we kind of take that for granted in that, OK, well, they're going to exist in some particular context, and that's going to, in aggregate, give us some distribution of the percentage of people that accomplish given things in a certain period of time.
So when we're trying to do things like make sure that our onboarding flow is working for the bulk of people, we do a few things, one of which is go talk to user research who are doing a lot of the long-term, ongoing, qualitative and quantitative studies, which is very different at our scale necessarily than at a smaller company, but nonetheless an extremely important activity and thing to think about.
Because that helps summarize a lot of that "What the heck is actually going on here?" But when we look at the actual numbers, that's going to wind up being some distribution. We think that within this persona or scope of it, the set of people that are onboarding that are not likely to be onboarding into a paid organization, they're going to behave differently than the other folks. We're going to provide them with other things.
The way that we really think about that though is like, "Are you a professional developer or not? Yes or no? Are you even a software developer? Are you coming to GitHub to accomplish something else? How much experience do you have? What are your interest areas?"
But for us, do you know Git? If you don't know Git, we teach you Git. If you don't know GitHub, we teach you GitHub. Primarily for us, it turns out that our theory and our thesis around this, we'll see how much of this plays out, is that actually it's that tutorial type of information, and educating everyone so that they can actually participate, is the key piece. But nonetheless, it's whatever that makes up some big chunk of people. But there's a whole bunch of use cases in there.
Yeah, that's a great topic. So, you've got your SaaS company, and you're going along and you're like, "This is awesome. We have a bug. We'll just go fix it." And then somebody says, "Yeah, but we got to sell this thing." So then you make an on-prem thing, and then you can't fix your bugs.
Our approach to that, which I'm not an expert in, but I'll talk through what I do know about it. You certainly have a release team. They manage that process very closely. We have pretty hard deadlines around when features need to be in a particular branch by. And then the QA process, both automated and manual. A QA team that works on both ends of that to go through the entire QA process.
For us, our releases are roughly quarterly, something like that, to do that major release, to do the communication outward, to go get people to upgrade, of course, and then do whatever point-releases that are necessary due to security vulnerabilities or otherwise.
Conceptually speaking, though, it's a difficult thing for an organization to wrap its head around if you have these two very, very different deployment models. My best advice is to jump in with both feet. My second best advice is if you don't have to, don't do it. But it's up to you, obviously, in your own particular context, where you think the market's going to be, where you think the opportunity's going to be and how much drag that's going to put on your ability to do product development in the extremely agile way that you can with a SaaS product.
Audience Member: You mentioned the phrase, "to get them to upgrade." Does that mean they elect to upgrade, or do you push the changes to them? And if not, do things get out of sync?
Maturen: Yes. For us, our model started out being extremely hands-off in the sense that the people who wanted GitHub Enterprise wanted nothing to do with us whatsoever. They didn't want us having any access to their box. They want to be completely cut off from the entire internet, like in a bunker in Iowa.
Our value proposition there was, we won't know anything about what's going on there. We don't have any stats or any insight. We won't even know what the most recent version is that you're running. And that starting point has tracked a lot of our history, and it isn't until recently that we started even offering the ability to phone home and see if your version is up to date.
I think we only launched that relatively recently, but even beyond that, it's up to the particular administrators of that GitHub Enterprise instance to choose to upgrade. And I know we have processes in place to, basically, snapshot the instance and do a test run of it and things like that. We don't, at this point, as far as I'm aware, have any push-button, completely automated processes, but I'm happy to get you in touch more with the folks that know that system.
I did. This was circa 2009, something like that. I moved here after the 2007, 2008 crash, where I was having a hard time finding a job, and it seemed like people wanted credentials that showed that I worked at a big company or that I had a CS degree.
I had an undergrad degree in philosophy, and no one seemed to care about that, which is fine. And so the thinking was, "Cool, I guess I can go get like a master's in comp sci or something, and if I'm going to do that, I could do something more interesting. I could go to law school and become a lawyer."
I did the LSAT prep and took the LSAT, which was actually quite a bit of fun in itself. I went to law schools and sat in on classes, the classes were great. I talked to my colleagues from undergrad that had become lawyers, every single one of them said, "For the love of God, don't do this. You'll be saddled with debt, and you won't have a job."
And then I applied, while waiting for law schools to get back to me, I applied to startups again, and that was spring of 2010 or something by that point. Everything was coming back. The money was coming back. And I all of a sudden got job offers again, and dodged that bullet.
Yeah, so first of all I can say you should go watch Chrissie Brodigan's talk from Monitorama, at the least from last year. Tenaciouscb on Twitter, you should go follow her. She has done a couple more talks recently, one at O'Reilly. I'm not sure if that's published yet. But she has a lot to say on the subject and is a great expert on it. She also has a great blog and does some excellent work.
Beyond that, I'm not super well qualified to talk about, from a theoretical perspective, what the approach of user research is. But it primarily boils down to, for me, being willing to get up from your desk, go find your customers and sit down with them and see how they use your product, see how they use other products, see what their problems are and how you're fitting into that and how you're not fitting into that.
You can work your way backwards from there, probably ways to scale that up, and other things, running surveys, of course other great things like that, more lightweight tools than having to get all the way up in there. But that's a great starting point, and I would point to Chrissie's talks and her blog posts for, again, that entry into the entire field, and follow all the references. Find the whole network and unravel that thread.
The basic premise is you have to figure out what it takes to do the job well, right? That's what you're assessing for. And the logical problem is you can't do that. You don't actually know. You're going to have all these things that you're going to try to triangulate, and all these things that have told you that a person should be great at this thing that may or may not be completely relevant.
Even at GitHub, we're 500 people, we still don't have the interview volume nearly to take quantitative approaches to answer that question, either. But nonetheless, you have to go through the effort of thinking hard about what it takes to do the job well, and only assessing for that.
Because at the point where you're assessing for anything else, or leaving it open-ended or doing anything else that's unstructured, you end up hiring for some reason other than their ability to do the job well. So, that's my primary advice.
In terms of what you need to get done at a company, it probably depends on your context, but likely you need generalists, you need people that want to be able to, are able to, solve lots of different problems. You need self starters, you need autodidacts, you need people that are going to try to make the people around them better.
I think a scale-invariant property of what we try to look for and what we try to review on is, regardless of the particular domain that you're in, are you making yourself better? Are you making the people around you better? How the heck you assess for that in two hours or four hours or six hours is very difficult. But keeping that in mind, keep in mind the technical things you know a person is going to need to do, and figure out whatever other leadership-type things that they can get in the door.
Audience Member: I have a follow-up question to that. Particularly at that early stage, whoever that analytics person, that data person, reports to tends to color what type of data gets surfaced. In your opinion, honestly, where should that person report? I mean, is it a business metric? Is it a community- or user-metric?
Maturen: It depends on what your thesis is. For GitHub, our primary motivation was to build a community and to put a moat around that community and protect it at all costs. And until very recently, we didn't have a marketing budget. We didn't have anything else, because our thesis was that if we build up the community, it'll help get us the rest of the way there.
There's a lot of other baked-in history about why GitHub made the decisions it did, but those are going to be, probably you might try to learn from them, but it's probably going to be independent of your own bet as to what the right way to do things is.
So, in terms of who the person reports to, whoever your founders, your CEO, whatever your board structure is, you have to be making an intentional bet around, OK, this area deserves the most. This is the lens we're going to apply, primarily or to start with, and to make that choice intentionally. And that really depends on the context that you're in.
The other way to do it is, of course, to have a "flat organization," and not have them report anywhere, because no one reports anywhere, and then you're just kind of leaving it up to the roll of the dice. I'd say you might want to make bets about like, "OK, is the fundraising market going to get better or worse? Do we need to be more or less mindful of cashflow than we would be, say, a couple years ago?"
I imagine you all have been talking about that, but that might help inform where you might want to focus. But then again, maybe you think focusing on that now is going to drift too far from your actual ambitions and the actual scale that your company can achieve.
The super simple version of that is it boils down to something like creating, I mean like the super-qualitative version of it, creating super fans in terms of that user journey of, "How do we create somebody who's going to go into an organization and rally around getting us on GitHub and talk about why it's so great, and make it happen, and then convince everyone, and then once they're on there, they'll be happy."
There's that train of thought which kind of conceptually connects up to the bottom-up growth model of, "Go get individual developers. Get them to do things." We have journeys for different entities, though, so we have journeys for particular users.
We have journeys for organizations themselves, and we don't necessarily care what an individual user or organization did before or after, or on the side while at an organization. What we care about is what they did in the context of that organization, if that's what we're focusing on.
So, for the most part, we don't necessarily have a huge reporting issue, though maybe it would be better offline to kind of hash through some of the details work of what you're talking about. So I don't know that I'm quite getting to exactly what you're asking about.
A lot of this goes back to the really blurry lines between concepts. On the one hand, company strategy and what the company can even hope to achieve is mirrored and constrained by the structure and organization and set of people, regardless of their position of power or whether or not they're on the board or whatever else.
For us, I've been there for the last two and a half years, when I joined we didn't have management. I didn't have a boss. I didn't have a title. We didn't have any levels or career paths or anything else like that. I still got paid, that was awesome. Not quite sure how that all worked out, but it kind of did. It really mirrored what GitHub as an early product was. It was being representative of that community. By analogy, not necessarily when you look at the actual composition of people, which I think would be a significant criticism.
It's the open-source community of people in an ad-hoc way coming together to do things. That was happening inside of the company. And we got to this point where, "Early phases, self-funding through personal accounts. We are thinking through just the lens of a person paying for an individual personal account. At some point, launch organizations, because clearly there's the demand for that. But it takes a while, and at some point that becomes our main revenue source, by percentage."
At that point we're like, "OK, well, directionally we know that orgs is going to go like that, and personal is going to be constrained by the size of the total number of software developers in the world and how much we actually want to charge them." At that point we're like, "Wait. It no longer makes sense for us to try to charge individuals. It makes sense for us to charge companies."
But the shape and form of the company was different. It wasn't set up to do that. It wasn't even set up as a company, which is kind of awesome and crazy, and a great experiment. What we did is, through a process of a couple years, we were like, "OK, well, we have to change shape and form in order to better mirror what we want to accomplish. Figure out where we're actually going to have opportunity in the future. Figure out where those levers are and what we're going to have to do to ourselves to get there."
And it's going to be different. It's going to be a lot different, and it is very different today than it was two and a half years ago. And, you know, where we're headed, it's probably going to be very different in another two and a half years. How that change happens is definitely a very big trade-off between consensus of what's working and what's not working, leadership and authority, and who actually has the job to speak to a given thing.
In that case, for us, it's our founders, the authority that is given to them as a founder, a person who's been around since the beginning. But that's not always the case at every other company. At other times, you have to find those refounding moments to bestow authority unto a new person, and that whole process is a whole interesting thing to dive into. Awesome. Thank you.