Alexis Lê-Quôc
Advanced Reporting & Analytics

As co-founder and CTO, Alexis Lê-Quôc brings a strong focus on technical elegance and operational efficiency to Datadog. Prior to co-founding Datadog, Lê-Quôc served as the Director of Operations for Wireless Generation, where he built the team and infrastructure that served more than 4 million students in 49 states.


Introduction

One question you might have is, "Why is somebody from Datadog talking about reporting?" The way I thought about it is, when I think about what we do (by the way, what we do is observability as a service: analytics about the performance and availability of applications in the cloud), it's ultimately some form of reporting, and we do that for a living.

Advanced Reporting & Analytics

It's some form of reporting because we're talking about observing what's happening in your stack, looking at transactions, consuming vast amounts of data, and providing some condensed version of it so you can understand what's going on. Where we may be a little different from you is that this is our bread and butter. How well we report matters immensely: how effectively and how fast you can find the answer by looking at our dashboards and responding to the alerts we send, sometimes through PagerDuty.

It's really important to our core use case. We get to invest a lot, we test and experiment. In that sense I've certainly learned my fair share of lessons, in terms of reporting. Not to say that we do it perfectly, but hopefully I can share some of what I've learned.

Why enterprise? Why am I here on an enterprise topic? We started in 2010, and back then our customers were cloud-native tech startups. Along the way we graduated to hipster enterprises, which are great to work with, and then we also moved on to traditional enterprises as customers. By traditional enterprise, I mean the beauty business in the Midwest doing a billion in sales per year; it could be healthcare, banks, and so on and so forth.

Questions First: Answers Second

Now as we continue to grow, we onboard more and more of these guys, and I hear on a regular basis what it is they want when they talk about reporting. One way I was thinking about it is, "How do we think about it internally?" Going back to first principles, it's really thinking about, "What are the questions that reports are designed to answer? What are the questions your users are going to have when they want to use your reporting features?"

Answers, in the sense of what answers you'll produce, come second. Once you start thinking about what questions users have when they come to your product and to your reports, you fundamentally get to a natural tension between open-ended reports and pre-canned reports.

Open-ended is really, "I want to ask a lot of different questions. I don't really know which one it is." On the other hand, pre-canned is, "Here's the report. It's in an email." It's barely interactive, or maybe not interactive at all.

It answers very specific questions in a very specific way. Its format really constrains how you think about the data you're consuming, and that's that.

You'll face that tension when you think about reporting for your own product, and you will think, "Do we need to build open-ended, or do we need to build a lot more pre-canned?" In reality you'll need both.

But one way to decide which way you're going to lean is to ask yourself, "Who's asking? What's their role, who's the person, what's the context of the person asking?" Then you ask yourself, supposing we produce the answer, "Would they be able to understand it?"

Lastly, now that I've given them the answer, "What are they going to do with it?" When you ask yourself these three questions you get to an important piece, which is, "What is the intent of the person asking for the report?" In enterprise software, sometimes you'll have "I need reports, my boss needs a report," and what they don't say is the reason: I'll produce a report on a weekly basis because it makes us look good, or it justifies the purchase of the tool. You shouldn't discount that.

It's an important piece. It shouldn't be everything, but you'll definitely see it. Then you'll have all the cases, sometimes the same person in different contexts, sometimes totally different people, saying, "I need to do a deep dive into the data you have on my behalf, because I want to validate some theory." That obviously points toward a report that should be really open-ended.

Ultimately you'll find everything in between. We obviously find ourselves much more in the deep-dive, open-ended camp, because our users are highly technical, there's a ton of data to consume, they face novel problems in their infrastructure, and their applications fail in novel ways each time. There's a lot of this quick cycle of open-ended questioning: ask a question of the data, get an answer, and continue until you're done.

Then we also produce plenty of graphs, because the brain is really good at consuming tons of data in a compressed fashion, through the eye and the brain behind it. Let's take a look at how we approached open-ended.

Open Ended or The Blank Sheet Problem

So, open-ended. One of the key issues with it is what I call the blank sheet problem. On the left-hand side you have a blank dashboard, right after you say "create dashboard." It's full of promise, but you're like, "OK. Now I can drop widgets, I can create graphs. But where should I start? What should I graph first?" So, that's great.

But it's not a great first experience if that's all you provide out of the box as reporting to your user. They're going to be like, "OK. Maybe I'll come back." The reason is that they won't have an intuition of what the report can do, or in this case what the dashboard can do; what your reporting can and cannot do. The way we worked around that for open-ended reporting is we provide a lot of examples you can play with: you start with a template and you can modify it.

You can learn by copying. I have two young kids, and one thing I've learned from them is that you learn a lot by copying. They see you do things and they replicate them. Ultimately you don't lose that as an adult; it's still a great way to learn. Just see what's been done, make some modifications, move on, and graduate.

Once you've learned, once you've got an intuition of what the reporting tool can do, then you can face a blank sheet and say, "OK. I know I want this and that, sliced and diced by these dimensions, and so on." Then you're in business. But you need to work through that first phase. If you provide your users an open-ended reporting platform, you need to work through that initial hand-holding phase.

Pre-Canned: Frozen in Time

The other end of the spectrum is pre-canned reports, which is, "Here's the answer." This is really how it goes. It's not even "What is the question?", it's directly, "Here's the answer."

One of the upsides is it's obviously very easy to consume, and you constrain the problem a lot, which in that case is desirable. One of the issues we've seen with pre-canned reports is that they tend to die. In other words, they live only as long as you invest time in keeping them up to date. Otherwise, there's this long, sometimes slow, sometimes fast, spiral of death.

You've built this great report and you send it every week to your users, but then your data set becomes richer and the report falls behind. Then maybe you measure open rate through email, and you see that open rate gradually decrease, and it's like, "Maybe it's not as valuable as before."

When you think about investing, "Should we bring it up to date?", the data will tell you otherwise. The data will tell you it's less used, because it's actually less useful, so you'll tend to leave it by the wayside. "Maybe it was an experiment. It was useful, but we're not going to invest in it."

So pre-canned reports have this built-in weakness. They're really good at conveying information when you want to provide the answer and not have the user think hard about the questions to ask.

The problem is, they get out of date really fast. So we thought about a couple of ways to prolong their lives. The first is what I call escape hatches, which is pretty common: a user just comes and asks, "Hey, can you give me a CSV extract? Dump it in S3," and so on. That's fine. It's relatively easy to do. It's not a heavy investment.

The problem there is you're giving your users work.

It's like, "OK. Here's the CSV. Good luck." It'll only work with the very dedicated users, because they'll have to not only figure out what analysis they want to do on the data, but also have to import the data and keep it up to date and so on and so forth. A second escape hatch we found is building open source integrations to reporting tools. Third party reporting tools because maybe that's what data analysts use it for.

For a given customer, maybe they use R and they use Tableau and so on. That's usually a nice one-off investment, and it sidesteps the pre-canned report that is frozen and ultimately going to die. But it only goes so far; only a small fraction of your audience is going to respond to that. Those are the ways we think about not having to build reporting that's not valuable or not used.
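To make the first escape hatch concrete, here's a minimal sketch of a CSV extract dumped into S3, assuming boto3 credentials are already configured; the bucket, key, and row fields are illustrative stand-ins, not anything Datadog actually ships:

```python
# Minimal sketch of the "give me a CSV extract, dump it in S3" escape
# hatch. Assumes boto3 credentials are configured; bucket, key, and
# field names below are illustrative.
import csv
import io

import boto3

def export_report_csv(rows, bucket, key):
    """Serialize report rows to CSV in memory and upload to S3."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=buf.getvalue())

export_report_csv(
    rows=[{"service": "web", "p95_latency_ms": 120}],
    bucket="customer-report-exports",
    key="weekly/report.csv",
)
```

The cheapness is the point: it's a one-off export path, and from there the data, and the work, belong to the user.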

The Hook

So, if you step back a minute, I'm sure you have backlogs of features you want to build, and my guess is that reporting is not at the top. There were some brave souls who said, "Yes, I'm excited by reporting," but generally speaking it's not top of the list. It's not what makes or breaks the company.

One way we thought about it is, "What's in it for us? Customers are going to ask about reporting, but why even build it? Why not say no? Why does it matter?" One way to rationalize it, and the way we found, is to say, "Reporting is actually a great way to bring people back into the product." This is especially true for SaaS platforms.

That justifies the investment. For us, that totally justified the investment. So we discovered two hooks to help you bring people back into the product via reporting. For us those are alerts, and what we call "AI"-generated stories, AI with quotes on this slide.

Datadog's Hooks

Alerts: you use Datadog, you plug it in, and basically alerts are a way to tell you, usually in the middle of the night because it's more pleasant, "I have a problem." Your phone rings. The system is going down. You need to fix it. That's a very powerful way to bring people back into the product, but only if they expect it.

You wouldn't page somebody to say, "Your weekly report is ready. Now wake up." That doesn't work. But in some use cases, and that's for you to think about in your particular context, it's a very powerful tool. It works all the time. The downside is that once you go down that path, you have a very high quality threshold.

You really need to get your alerting right. When you notify users to bring them back in, you need to get it right, because otherwise they'll just slowly discount you. Their trust in you will erode.

That hook is extremely powerful, and it lasts forever if you spend a lot of time on it. The other hook we found, thought about, and started to deploy is "AI," which for us covers anything from statistical processing to machine learning. It's an interesting play on open-ended versus pre-canned, because it's pre-canned in the sense that the AI is a system that tells you, "This is what I found in the data." You didn't ask a question; it just says, "This is what I found in the data."

It's open-ended in the sense that, through whatever you deploy, you control the questions it asks. We found it a really interesting way to have reports that evolve over time, that don't necessarily require ongoing investment, but are not completely open-ended. The quality threshold here is pretty high too, but for different reasons.

The issue with this hook into reporting is that the last thing you want is for your AI system to generate work for users. By that I mean, you get a report and you look at it like, "I don't understand how the machine could come up with this," and all of a sudden you've given your users work. Whereas you wanted to give them answers, now you've given them work: potentially a lot of questions which they may or may not be able to answer.

So there are ways around it. Obviously, when the machine generates an insight it says, "You may be interested," not "You should do this," because of the quality level that exists. Generally speaking, whenever we put AI inside reporting, we err on the side of being conservative, on the side of the false negative. In other words, we don't tune the system to report on everything it finds. We'd rather miss a few things and maintain trust by only putting forward things that matter and make sense.
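As a hedged sketch of what "erring on the side of the false negative" can look like, here's toy thresholding logic; the z-score threshold and minimum sample count are illustrative knobs, not Datadog's actual tuning:

```python
# Toy sketch of conservative insight generation: stay quiet unless the
# evidence is overwhelming, preferring missed findings (false negatives)
# over noise that erodes trust. All thresholds are illustrative.
import statistics

def surface_insight(history, latest, z_threshold=6.0, min_samples=100):
    """Return a hedged insight string, or None to stay quiet."""
    if len(history) < min_samples:
        return None  # not enough data to be confident either way
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return None
    z = abs(latest - mean) / stdev
    if z < z_threshold:
        return None  # plausible finding, but not certain enough to surface
    # Hedged wording: "you may be interested", never "you should do this".
    return f"You may be interested: the latest value is {z:.1f} standard deviations from its mean."
```

The asymmetry is deliberate: every suppressed borderline finding costs a little coverage, but every wrong finding costs trust.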

Reporting (Vexing) Must-Haves

There are reporting must-haves as well. What I've talked about so far is where you have a lot of latitude and control, and you can decide where on the spectrum from open-ended to pre-canned you're going to fall. But there are things you're going to have to do no matter what.

I know I've misappropriated this common joke: there are only three hard things in computer science, cache invalidation, naming things, and printing web pages. We've spent an inordinate amount of time trying to produce PDFs that look like the web page, getting the fonts and layout right, and it's so infuriating.

The truth is you are going to have to do something like that, because in the world of enterprise, usually not hipster enterprise, PDFs still reign supreme as a way to share information. It is what it is.
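For what it's worth, one common way to attack the web-page-to-PDF problem today is to render the page with a headless browser. A minimal sketch using Playwright, where the report URL is a placeholder and this is one approach, not necessarily how Datadog does it:

```python
# Minimal sketch: render a report page to PDF with headless Chromium via
# Playwright. The report URL below is a placeholder.
from playwright.sync_api import sync_playwright

def report_to_pdf(report_url, out_path):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(report_url, wait_until="networkidle")
        # print_background keeps the CSS that makes the PDF look like the web page
        page.pdf(path=out_path, print_background=True)
        browser.close()

report_to_pdf("https://example.com/reports/weekly", "weekly-report.pdf")
```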

I'm already happy that people are not printing emails anymore, so that's baby steps there. You're going to have to spend time there. You're also going to have to spend time on things like IP whitelisting, which in the world of BeyondCorp and SPIFFE doesn't make a lot of sense, but nonetheless for traditional enterprises you want the ability to say, "This report will only be accessible from the following ranges of IPs," as in the sketch below.
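The check itself is simple; here's a minimal sketch using only the Python standard library, with illustrative CIDR ranges:

```python
# Minimal sketch of an IP allowlist check for report access, using only
# the standard library. The CIDR ranges below are illustrative.
from ipaddress import ip_address, ip_network

ALLOWED_RANGES = [ip_network("10.0.0.0/8"), ip_network("203.0.113.0/24")]

def report_accessible_from(client_ip):
    """True if the client IP falls inside any allowed range."""
    addr = ip_address(client_ip)
    return any(addr in net for net in ALLOWED_RANGES)

assert report_accessible_from("10.1.2.3")
assert not report_accessible_from("198.51.100.7")
```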

That's relatively easy to do, but you have to do it, and maybe you'll think, "But that doesn't make sense." Well, for a number of people it does. Another must-have is SSO, and in particular SAML. That's something we found we had to do early on for reporting, and you'll have to do it too. The last must-have, which is really a topic in itself, is role-based access control.

To RBAC or Not to RBAC

"Who can see the report, and what can they see?" This is something that's going to come up like almost immediately in terms of some enterprise customer is like, "OK great. I'll send you all this data and you'll do some wonderful stuff with it, and I want reports but who can see that? Because if the wrong person in the organization can see them, we have a problem." Our approach there has been ultimately to try and defer. You don't say no. You say, "Interesting."

There are ways to do it. Like, "It's in the enterprise package, it's more expensive." Or you say, "Interesting. We'll put it on the roadmap and talk to you next year," kind of thing. The reason why, for reporting in particular, it's RBAC that we try to defer is that it's friction, particularly when it's very granular, and it kills certain patterns.

For instance, when I talked about open-ended reporting, where you want people to explore the data you have, slice and dice, and learn by copying things, RBAC gets in the way. RBAC kills conversations, because now I talk to somebody, they can't see the data, and you're dead.

It kills that conversation, and it's super tedious to set up. So the way we found that works is what I call requirements jujitsu. We say, "No, RBAC is not what you want, you want to break down silos," a Jedi mind trick: "You don't want RBAC." Ultimately, when you push, sometimes it's not that they want to control who has access; they just want to know who looked at the data. It's more of an audit trail. Ultimately, "Should we surrender to RBAC and implement it?" The answer is probably yes, and we're in the process of doing that.
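If the audit trail really is what's wanted, it's far cheaper than granular RBAC. A minimal sketch, where the storage format and names are hypothetical rather than Datadog's implementation:

```python
# Minimal sketch of an audit trail for report views: record who looked
# at what, instead of gating who can look. Names and storage format are
# hypothetical.
import json
import time

def log_report_view(audit_log_path, user, report_id):
    """Append one view event to a JSON-lines audit log."""
    entry = {"ts": time.time(), "user": user, "report": report_id, "action": "view"}
    with open(audit_log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")

log_report_view("report_views.jsonl", "alice@example.com", "weekly-usage")
```

Notice that this answers the compliance question ("who saw the data?") without adding any friction to exploration or to conversations around the data.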

Meta-Reporting

The last thing before closing is what I call meta-reporting, which is reporting on how your users are using your product. This is often overlooked, but it's great because it helps with adoption, especially in the enterprise, where the big risk is not that your product doesn't have the right features.

The big risk is that it's left on the side because there's too much stuff going on.

Building reports about who uses your product and how well they use it actually helps your champions. It helps them justify, "Look, there's traction, but we've got to go see this team to help them understand the value of the product, and then they'll get on board."
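As a toy illustration of that kind of meta-report, here's a sketch that rolls hypothetical product-usage events up by team, the sort of view that gives a champion their adoption story:

```python
# Toy sketch of a meta-report: distinct active users per team, computed
# from hypothetical product-usage events.
from collections import defaultdict

events = [
    {"user": "alice", "team": "payments", "action": "viewed_dashboard"},
    {"user": "bob", "team": "payments", "action": "created_monitor"},
    {"user": "carol", "team": "search", "action": "viewed_dashboard"},
]

users_by_team = defaultdict(set)
for event in events:
    users_by_team[event["team"]].add(event["user"])

for team, users in sorted(users_by_team.items()):
    print(f"{team}: {len(users)} active user(s)")
```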

Parting Thoughts

Parting thoughts: when you think about it, deep down reporting is not very different from your core use case, whatever it might be. You're generally gathering data, and then you want to answer questions your users have. To guide the design, again, the things to think about are: "If I'm building reports, who is it for? Can they understand the answers it produces? And what are they going to do with it?"

Generally speaking, I say keep it open-ended, because it's more fun and it's more flexible. But if you do so, be aware of the blank sheet; it can become discouraging. Ultimately, think about reporting also as a way to re-engage your users and bring them back into your universe on a regular basis. Thank you very much.
