
SLA & Support In The Enterprise Heavybit

When it comes to the enterprise, Service Level Agreements are no laughing matter. Enterprise customers pay to ensure less than 1% downtime, dedicated technical account managers, and stellar dev ops and incident response. It’s not enough to leave communication to the community and a status page.

Join PagerDuty Co-Founder Alex Solomon, Gainsight CEO Nick Mehta and Intuit’s SVP and Chief Product Development Officer of SMB (soon to be CTO) Marianna Tessel as they discuss what it means to offer high-availability and support to some of the world’s largest enterprises.

Alex Solomon is the CTO and Co-Founder of PagerDuty and responsible for the strategic direction of the company’s product and technology.

Nick Mehta is the CEO of Gainsight where under his leadership the company has become the Customer Success platform-of-record with an enterprise customer roster that includes companies like Box, Citrix, and CA Technologies.

Marianna Tessel is SVP and Chief Product Development Officer of Small Business and Self Employed Group at Intuit Inc where she has developed strategic partnerships, led engineering organizations, and catalyzed tremendous technology ecosystem growth.

Marc Campbell: First of all, thanks for coming out everybody to talk about support and SLAs, and how to build that into a SaaS vendor’s product offering. I’d love to start by hearing a little bit about your current company, PagerDuty and Gainsight. Maybe talk about VMware, from your past experience, what you’re offering at a high level. Do you have different tiers of support? Are you offering special packages for enterprise customers now?

Alex Solomon: Sure. So, PagerDuty. I suspect a lot of you guys may have heard about us, or may even be on call right now. We started out with on-call alerting and management, and now we’ve broadened our vision to cover the entire incident lifecycle and to become the platform for real-time work. That means dealing with your IT alerts and IT incidents, responding to them and hopefully preventing some of them, and expanding to security events and alerts, dealing with those quickly and in real time.

We also see customers using us for the support use case, where they have high priority customers needing support right away and they don’t want to send that into a queue for the most junior rep to handle, they want that to be handled in real time as well.

We’ve evolved a lot over the years in terms of support, but one of the things we did early on is we invested a lot in making sure that all of our support reps are very technical and knowledgeable around the product. We provided a high level of support across the board.

We didn’t sell support as a separate product initially, we have it now for our enterprise customers. The ones who want to have 24 hour round the clock support, we do that now. We’ve always been fast whenever we have any issues or any alerts or incidents of our own, to get on that right away, to update our status page and to communicate proactively with customers.

Marc: Nick, how about at Gainsight?

Nick Mehta: Cool. Nick Mehta, CEO of Gainsight. Some of you probably know us. We’re a SaaS company and we sell to SaaS, subscription and Cloud businesses and build software that helps them drive adoption of their products, minimize churn and get customers to grow with them. In doing what we do, we sell to a lot of support teams and service teams so we know a lot about the world of SLAs for customer support.

Now admittedly, in the early days of our company we didn’t have any SLAs. It was just my phone number or somebody else’s phone number going direct to solve problems. But over the years you grow, and we’re about a 600 person company now. We’ve got a few different tiers. We can talk about it more, but a lot of companies will stratify their support from a standard package that everyone gets.

Typically in Cloud it’s hard to charge for standard support nowadays. That’s built in for most companies.

But many companies will introduce a premium support business, especially if you sell to large enterprise, they’re often willing to pay for that. We have an elite, above premium, so every year you want to add more revenue and you add one more package on top of that. So, we have three tiers right now, and we can talk more about it but they’re differentiated based on response time. How fast you get back. Whether you get a dedicated person or not, whether you’re going into a queue or you get a named person, and then some additional reporting that we give for the higher levels.

Marianna Tessel: I’m Marianna. I’m glad you have me here to talk about SLAs and support. Such a great topic. I work at Intuit, and at Intuit our mission is to power prosperity around the world. You probably know us from products like TurboTax and QuickBooks, and we serve consumers, small businesses and the self-employed. But in my background I worked at quite a few infrastructure companies, most significantly at VMware and at Docker.

I’ll talk a little bit about my perspective from the buyer side and from the selling side, but let me start with the selling side. At VMware in particular we focused on building products that just work. That was some of our magic, at least in the early days. In there, obviously we had all sorts of support agreements, etc.

The one thing I would like to note about the different tiers and agreements is that for the really high tiers and some of the important companies, we actually viewed it more like a partnership. We worked hand-in-hand with them to try to understand their business, to share roadmaps, etc. Somebody mentioned that at some point it becomes more of a relationship that you build with that customer over time, and you move away from this transactional agreement and often you go above and beyond the agreement.

With Docker, obviously we also focused on creating products that just work, which created a whole different problem for an open source company. There, the support etc. was not something that we emphasized in the early days. We wanted to emphasize more value-add on top of that, and again, similar in that you start from small support agreements and you enhance it over time to get more and more sophisticated.

Marc: Great. Thanks. I definitely want to come back and talk a little bit about the buyer experience as we go on. One area I’d love to chat about, Nick to start with, is when you see the different pricing structures for a support plan. You’ll have a startup, a business and an enterprise. One of the items that often gets listed under the Enterprise is a dedicated account manager. You just mentioned it a minute ago. Can you help define what that means to have a dedicated account manager? What you have to do to qualify and put that onto your website as an offering?

Nick: It’s super confusing because there’s a lot of different terminology, so I’ll try to break down what I see as industry standard. You’ll hear a lot of different terms. One term you’ll hear is you’ll have account managers, and then you’ll have technical account managers, and then you’ll have named support engineer or different terminology like that. So, I’ll try to break it down. Typically what people find is there is the reactive, and there’s a proactive. Just the way to think about it simplistically.

Reactive is customer has an issue and they want you to fix it, proactive is the vendor is trying to think about that customer.

In the reactive world they call in to support typically, and the question is, “Are you getting routed into a queue and you’re going to get some agent? Or are you going to get routed to a person and you’re going to get your agent?” That’s what people talk about as a dedicated support engineer, which is a little different from a dedicated technical account manager.

I’ll explain that in a little bit. A dedicated support engineer, I’m going to get Steve or Sally. I’m not just going to get a Zendesk queue or a Salesforce queue, or whatever. That’s the core thing. Typically people charge for going from that queue to getting a dedicated person, and that dedicated person, the benefit to the customer is “They know my environment. If there’s an issue, they know the last issue, and they know how they fit together.” That’s usually very valuable for large customers. Now, people use the term dedicated liberally. Sometimes the customer thinks that person’s just working on my account, dedicated, but what they mean is they’re dedicated. They’re hardworking, right?

Different definitions of the word dedicated. But that’s what you’d call a dedicated support engineer, and there are different terms, but that’s reactive. Then there’s proactive dedicated technical account manager. The subtle difference is I’m a company using VMware, I’m using Splunk or some enterprise wide infrastructure. I need somebody helping me plan my overall architecture looking at all the different instances, and sitting in design reviews with me, and helping me plan my upgrades.

Often you can sell that on top of the dedicated support engineer. The dedicated support engineer you might call a premier support package, so there’s basic and premier. Then separately there’s a technical account manager that you could sell on top of that. Most companies don’t get to that technical account manager till they’re a couple hundred employees, or in that range, but that’s something that you can know about to make your customers a lot more successful.

Marc: Great. Marianna, you spent some time at Docker, which is different because it’s an open source company. Is all that applicable to an open source company? If somebody is out here building an open source product, how should they think about those exact same questions?

Marianna: For open source companies it’s interesting. Some of the support structure that you put in place is similar. What could be tricky for open source companies is that suddenly somebody else can come in and offer support for your product. So you need to start competing with that to some extent, and start comparing the support that you provide to support that maybe somebody else provides on your product. That creates a bit of tension in the system, but similar rules apply in terms of what you provide.

The other thing that changes the dynamics is that the bar for support is obviously much higher.

People say, “I don’t need support for your software,” that’s why I’m saying, “If you create your software to be really good, then suddenly the bar of support is really high,” and there’s a bunch of companies that will say, “Thanks but no thanks.” Then you are now in a whole different tier of companies you deal with that cannot live without support. So, you eliminated some of the market and then you also invited competitors in, so you’re not the only provider in the market sometimes for your own software.

Marc: That’s definitely interesting. Alex, we were just talking about support and different tiers of support, and how to define what you put on your website. But I’d love to shift for a minute and talk about SLAs. You see four nines of SLAs listed on somebody’s website. Can you define what it means to be able to put that there?

Alex: Yeah. SLAs, one of the big parameters that’s going to be in an SLA is the uptime, and a lot of times that’s defined with the number of nines. So if you say, “Three nines of availability,” that’s 99.9%. You can get four nines, but I don’t see five nines that often because that means you can only go down five minutes in an entire year. You have to have such a reliable and resilient system that the problem must heal itself within those five minutes, because you can’t even have humans involved. We see three nines and four nines a lot, and then the other big variable to think about when thinking about an SLA is, “What is the penalty when you breach it?”
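The arithmetic behind the nines is easy to check. Here is a minimal sketch, assuming a 365-day year and availability measured over the whole year; the function name is made up for illustration and is not from any particular tool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def downtime_budget_minutes(nines: int) -> float:
    """Allowed downtime per year, in minutes, for a given number of nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** -nines  # e.g. 3 nines -> 0.001, i.e. 99.9% uptime
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    budget = downtime_budget_minutes(n)
    print(f"{n} nines: about {budget:.1f} minutes of downtime per year")
```

Under these assumptions three nines allows roughly 8.8 hours of downtime a year, four nines a little under an hour, and five nines a little over five minutes, which matches the figure Alex quotes.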

Those are the two things. You can make yourself look good by saying, “We do four nines.” I’ve seen companies say, “We do 100%.” It doesn’t mean they never go down; it means that’s what they guarantee, and when they breach the SLA they pay the penalty. Generally the higher you make your nines the lower you make your penalty.

What we’ve noticed, being in business for nine years now, is that customers care about SLAs, especially enterprise customers. But it’s more of a check the check box thing, it’s more of a feel good, “This company has a guarantee.” Oftentimes in practice it’s about, “Is this service generally reliable? Does it work when I need it to?” Whether you have an SLA or not, generally the penalties aren’t that strong. For example our SLA for full transparency, we cap it at 30% of the month’s spend in terms of refunds. Even if we’re down a lot, which we’re not– Knock on wood. But there’s a cap there and each incident is 10% of your monthly bill up to 30%.
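The capped refund Alex describes (10% of the monthly bill per incident, up to 30%) can be sketched as follows. This is only an illustration of the arithmetic: how an incident is counted and how the credit is actually issued are not stated in the discussion, and the function name and sample bill are made up.

```python
def sla_credit(monthly_bill: float, incidents: int,
               per_incident: float = 0.10, cap: float = 0.30) -> float:
    """Refund for one month: a fixed share of the bill per incident, capped."""
    if monthly_bill < 0 or incidents < 0:
        raise ValueError("monthly_bill and incidents must be non-negative")
    share = min(incidents * per_incident, cap)  # never refund more than the cap
    return monthly_bill * share


# Hypothetical $1,000 monthly bill.
for count in (0, 2, 5):
    print(f"{count} incidents: ${sla_credit(1000.0, count):.2f} credit")
```

With these numbers the credit stops growing after the third incident, which is the point Alex makes about penalties being bounded no matter how bad the month was.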

We’ve seen SLAs from other companies like Splunk that are watered down. They could be down for days and they only return 2-3% of your monthly bills. The penalty is really watered down, and it’s the check the checkbox thing. At the end of the day it’s about, “OK. If we’re not happy with your reliability, the SLA’s not going to make your reliability better.” It’s about, “Are we happy with it? Then we’ll continue to pay you. If we’re not we’re going to use someone else, or build our own.”

Marc: Great. Marianna, at Intuit from the enterprise buyer experience, is the SLA always just check a check box and you trust the numbers that they’re telling you? Or, how do you validate that when you’re doing due diligence on a new vendor?

Marianna: When we’re looking to buy software, we look at a variety of things. I will also note that there’s a difference if you buy something on prem or if you buy it in the Cloud. On prem, you’re going to get the support and the response and things like that. Where in the Cloud you’re looking more at performance, availability and quality. We’re not just looking at, “Are you down or up?” but, “How do you perform? What’s your performance number?” etc.

As was mentioned, yes, there’s an agreement and there are refunds, but we don’t care about those. Because for us the damage to our image and the damage to our customers is far greater. We’re not really after the savings that we get as a refund. It’s more that you damage your reputation and we will not be happy with that.

We will look at a variety of things in that. We also look way beyond that, in that the previous talk was about security. Security is a big deal for us and we will test your software. I know I can go a little more in detail, if this is a good point in time, but we take it seriously. By the way, any TurboTax users here? OK, good. We store a lot of data for a large part of the consumers in the US and we feel there is a great responsibility.

We take security as a big deal, and we have teams inside Intuit that we call “The red team,” and “The blue team” that will on an ongoing basis attack our software. We would attack ourselves all the time, and we have what we call war machines, and we do all this crazy stuff.

Again, I can talk more in detail about that, but when we get new software often we will say, “We will turn our war machine on you, just to see if you’re going to withstand our attacks,” and that will be part of the criteria. If you don’t meet that we will say, “Go fix that and let’s talk again,” or whatever that is.

Another thing that the previous speaker mentioned is that compliance is more and more of a topic. For example with GDPR we have some obligations to customers to be able to retrieve the data and delete the data. You’ll have to comply with that as well, so we look for a variety of compliance things. There is way more than just, “Are you up or down?” There’s a variety of things we’re going to look at.

Alex: To interrupt a little bit, with our own SLA, the initial part of it is not an up or down thing, it is “Are we delivering notifications on time or not?” Our SLA’s centered around, “We guarantee that all of your notifications are going to be delivered within five minutes.” Almost 100% of notifications within five minutes. The main parameter is the five minutes, and then more recently we added another SLA around, “Is the website available?” That one we measure in terms of nines.
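A latency-based SLA like this is measured differently from an uptime SLA: you look at each notification and ask whether it arrived inside the window. Below is a minimal sketch of that measurement. The five-minute window comes from the discussion; the function name, the sample latencies, and the choice to treat an empty period as fully compliant are assumptions for illustration.

```python
from datetime import timedelta

DELIVERY_WINDOW = timedelta(minutes=5)  # the five-minute figure from the discussion


def on_time_fraction(latencies: list[timedelta]) -> float:
    """Share of notifications delivered within the window (1.0 if none were sent)."""
    if not latencies:
        return 1.0  # assumption: nothing sent means nothing was late
    on_time = sum(1 for latency in latencies if latency <= DELIVERY_WINDOW)
    return on_time / len(latencies)


# Made-up sample: delivery latency, in seconds, for four notifications.
sample = [timedelta(seconds=s) for s in (12, 45, 290, 400)]
print(f"{on_time_fraction(sample):.0%} delivered within five minutes")
```

The resulting fraction is what would be compared against whatever target the contract names.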

Nick: One other thing I’d add, just thinking about SLAs: if you’re building in the Cloud, which most people are, your infrastructure obviously depends on the Cloud infrastructure, and most of the Cloud infrastructure providers don’t provide significant SLAs. One of the challenges we’ve always found is signing up to an enterprise customer SLA around infrastructure, not around support but infrastructure, and not being able to hold our downstream provider to that. That’s a big challenge in the Cloud these days.

A lot of enterprises will ask for the moon, but it’s hard to sign up for it.

One thing I’ve found though is a lot of times the procurement department will throw an SLA at you, and frankly most of the time you can decline to agree to a big SLA and just use your standard one.

Marc: How do you deal with that? With upstream providers that are out of your control that might go down, that violate your SLAs, but there’s nothing you can do about it.

Nick: I don’t think there’s a great answer to this. Like when Amazon goes down, the funny thing is when Amazon had– Some of you remember the outage related to Dyn, their DNS provider. Everyone went down in the whole world. I remember calling our customers and they were like, “We’re down too. It’s no big deal.” Everyone in the world was down at the same time. There’s an interesting thing where when the mainstream Cloud providers go down, people are a little more understanding, because they’re likely down too. But I don’t know, other thoughts on that?

Alex: At the end of the day you’re going to use these providers, and we look at their track records.

Five or six years ago Amazon had one or two outages that were significant, but these days their track record is pretty solid. We also have some usage in Azure, so we are multi-Cloud because we need to have such high reliability. There are other ways to design for failure and to have workloads in multiple Clouds, if you really need that high availability.

Marianna: To add something, often we will look at what we call a blast radius. For example, you can imagine that around April 15th we have a really good setup to make sure that nothing goes down. But for other things, that maybe don’t quite impact a lot of the system, or whatever, we are a little bit more nuanced. We understand that not all software is equal, and it might fit in different places in our development cycle, or even in our production.

Or we might sometimes have our own redundancy, sometimes we’ll have two systems that do the same thing. But in there we can tolerate a bit more of an error rate, so a lot of it is nuance. It’s not like you have one number and you just go with that, you just look at where it fits in the system and particularly how it’s going to impact your customer.

Alex: There’s something to be said also about the tier of component or service that you’re providing. For us, notification delivery is one of the core things we do so that’s a tier 1 component. But for example, if you’re looking at one of our analytics products, where maybe we’re okay if the data is not up to date to the hour. If we can fall a little behind then that’s not a tier 1 component for us, so we don’t provide the same SLA there.

It’s to your point. It’s a nuanced thing where you have to look at the component, you have to look at the customer requirements, and you have to figure out what can you guarantee there versus what is the customer more okay with if it’s not perfect, or it’s a little bit degraded?

Nick: One final point I’d make on the SLA area, which if you’re a startup might be challenging. Actually, for everyone it’s challenging. There’s the SLAs, but then inside a contract there’s the liability that you have. Which is slightly different because it’s not just about uptime, but it’s about “What can your customer hold you responsible for?” This is one of the most complex areas in selling to the enterprise.

It’s called limitation of liability, if you don’t know what that is, and basically “What’s the limit on how much they could sue you for in the case of a data breach or data theft, or revealing confidential information?” Getting that right is a real challenge, because honestly most enterprises will say, “We want unlimited liability.” That’s their starting point. “We want unlimited liability so we can sue you for anything,” but you’ll be able to negotiate some limit. It could be two years of our service contract, like if we pay you $100,000 a year that’s $200,000, or it could be up to a million dollars, or whatever. That’s another thing to think about.

Marc: When you do have a problem and you have to communicate that to your customers, Nick do you have any insight into communication channels that are working well today? How do enterprises like to receive that information?

Nick: Totally. There’s different kinds of communications people have. Most companies, if you sell to large enterprise you need a high touch approach. The world I live in, we work with our customers to create what’s called a Customer Success Manager. That person is often the main point person with the customer, and communicating to them about everything that’s happening, including a disruption or an issue.

Then often if you’ve signed up for an SLA around support, for example, in the customer support world it might be “We need to respond to every severity one ticket within this period of time.” Let’s say it’s 30 minutes. You’re going to report on that once a month and say, “Here’s all the tickets we got. Here’s when you reported it and here’s when we responded. Not when we fixed it,” by the way, nobody signs up for a fix SLA because it could be a bug that takes a month to fix. But you report on what that actual SLA was, and people like to see that. Usually people have some kind of status page or notification system for system wide stuff. Alex knows a lot more about that than I do.
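The monthly report Nick describes boils down to comparing, for each ticket, the time it was reported with the time of the first response. Here is a minimal sketch, using the 30-minute example target from the discussion; the `Ticket` fields, the ticket IDs, and the timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RESPONSE_TARGET = timedelta(minutes=30)  # the example target from the discussion


@dataclass
class Ticket:
    ticket_id: str
    reported_at: datetime
    first_response_at: datetime  # when support first replied, not when it was fixed


def response_report(tickets: list[Ticket]) -> list[tuple[str, timedelta, bool]]:
    """One row per ticket: id, time to first response, whether the target was met."""
    rows = []
    for t in tickets:
        elapsed = t.first_response_at - t.reported_at
        rows.append((t.ticket_id, elapsed, elapsed <= RESPONSE_TARGET))
    return rows


# Made-up severity-one tickets for one month.
tickets = [
    Ticket("T-1", datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 9, 12)),
    Ticket("T-2", datetime(2024, 3, 18, 22, 5), datetime(2024, 3, 18, 22, 50)),
]
for ticket_id, elapsed, met in response_report(tickets):
    print(ticket_id, elapsed, "met" if met else "missed")
```

Note that the report tracks response time only, in line with Nick’s point that nobody signs up for a fix SLA.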

Alex: There are one-to-many channels for communicating to customers when you have an incident, or degraded performance, or outages. So we have a status page and we update that, and that is sometimes challenging as well because we now have a lot of complexity to our system. We have some level of segmentation between customers, so we can have one fraction of our customers being impacted by a service degradation. Like 1/64th, and then when we update the status page it’s hard to convey that it’s not an all-or-nothing “We’re hard down,” and our customers read every single word very carefully in there, and they freak out sometimes. The ultimate goal for us, and we’re not there yet, is we want to get to the point where we can give customers a status page that’s just for them.

They can see, “I’ve bought these three products.” They don’t care about all the other products that they haven’t bought, the uptime of those or whether they go red or whether they’re green. Then they could see, “Am I being affected by this service degradation? Is the shard that I’m on being impacted? What is the impact?”
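The idea of a per-customer status view can be pictured as a filter over incidents: keep only those that touch a product the customer bought and the shard the customer is on. The sketch below is purely hypothetical and says nothing about how PagerDuty’s systems are built; every type, field, and value in it is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    product: str
    shards: frozenset[int]  # shards affected by the degradation
    summary: str


@dataclass(frozen=True)
class Customer:
    name: str
    products: frozenset[str]  # products this customer has bought
    shard: int                # shard this customer is assigned to


def incidents_for(customer: Customer, incidents: list[Incident]) -> list[Incident]:
    """Keep only incidents on a product the customer bought and on their shard."""
    return [
        i for i in incidents
        if i.product in customer.products and customer.shard in i.shards
    ]


# Made-up data: two incidents, only one of which is relevant to this customer.
incidents = [
    Incident("alerting", frozenset({7}), "Delayed notifications"),
    Incident("analytics", frozenset({3}), "Reports running behind"),
]
acme = Customer("Acme", frozenset({"alerting"}), shard=7)
for incident in incidents_for(acme, incidents):
    print(f"{acme.name} is affected: {incident.product}: {incident.summary}")
```

A customer on another shard, or one who never bought the affected product, would see an empty list and a green page.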

And ultimately the customers want to see that impact, and they want to know “What is my plan B?” For when they don’t get their alerts for their critical systems, maybe they’ll put people to watch the monitoring systems directly. That’s the workaround while we’re having issues, or something along those lines, and for each business it’s going to be different. “What is the plan B?” Sometimes it’s just wait it out.

Marianna: Just to add, I’ve talked about how from a VMware perspective or a Docker perspective, you start looking at some of your customers as partners. I highly recommend that too, especially for your first few customers, whether they’re SaaS or you have a SaaS product or on prem product, if you develop that relationship that’s great. I know on the buyer side, even today with really major providers for us, sometimes we have people on site at any given time because the relationship is so tight and there’s that dependency. Then you go into whole roadmap exchanges, etc. So it’s a whole different level.

You can have the agreement, but for the most part you leave it aside, and you partner at a much different level.

Again my recommendation for your first few strategic customers, that’s always a good idea to develop a bit of a relationship, so that they can also be highly reference-able.

Alex: To add to that, what we’ve done for some of our big customers is we’ve set up a Slack channel that’s dedicated to them, with a customer success representative and the sales rep in there always talking to the customer about new features that we release. But also, when we have incidents, proactively letting them know about it, so that it’s not a surprise, so that their boss doesn’t yell at them, that sort of thing.

We’ve done that for a lot of our top customers, but we need a more automated way to do that, and as part of our incident response process we also need– And we’re working on this now. A way to let our internal stakeholders, so customer success and sales, all the customer facing people know about the incident as well so that they can proactively be able to update customers. Right now we do that through Slack. We have an internal incident updates channel for all of our internal incidents.

Marc: Customer success, sales, and incident management through Slack, have you tried doing just regular straight up support through Slack for some of your largest customers?

Alex: We do for the large customers. Instead of sending a thing that goes into an inbox, we just have a real time conversation with them over Slack.

Marc: Alex, you mentioned that there’s different levels. You may have the website down, or the report is an hour slow, so that’s a degraded service performance. If you think about different types of SLA violations, maybe there’s security violations or degraded or full outages. Do you use different methods of communication across those?

Alex: Since you mentioned security events, those I would say are very different than incidents. For an incident or a degraded performance type issue, you want to be proactive, you want to be fairly transparent, and you want to tell folks quickly. For a security incident, that’s different. You want to make sure that you’ve mitigated the incident, that the attackers are out, that you know what happened and who was impacted. You want to contact the authorities, like the FBI in that case, so they can do an investigation as well. And only then once you’ve wrapped everything up and you understand everything can you disclose to customers, because there’s legal and compliance requirements there that are very different than in the IT incident.

Marc: Marianna, at Intuit, from the experience of buying from software vendors, do you ever end up saying “I’m on this side of the plan. I’m buying the enterprise plan”? Not every enterprise is going to want the same SLAs and the same support agreements. So, are you going to ask for specific customizations?

Marianna: Totally. In fact, we start from our own paper. We have our own SLA agreement and we’ll start from that. We’re not trying to squeeze our buyers. We’re looking for a good outcome for everybody and if you come in and say, “No. I have my paper.” Sure, we might go with that, it just prolongs the process. That’s something to consider when you work with some companies.

Sure, you can squeeze them to do it on your paper, but you’re just prolonging the process and you’re starting from that point of view.

But we have our own paper, we have our own agreement. Like I said, we have anywhere from liability to what we expect in terms of response, what we expect in terms of fixing, what we expect in terms of availability, what we expect in terms of security and in terms of regulations, and other things.

This also comes on the heels of a process where we are actually going to go and try to understand your product and your architecture quite a bit before we even– Again, depending on the product, we come to that stage. It comes after much understanding, and after we have that understanding we again might have extra things in the SLA. Maybe things that we expect for you to do in a certain amount of time, after we decided to purchase the product.

Nick: Be prepared, if you haven’t done an enterprise agreement before: you send your agreement over in a Word document and it’s going to come back all red from the other side. There’s going to be a back and forth usually. Your point is valid, that if you want to optimize for standardization, using your own paper is great. But if you want to optimize for speed sometimes using the client’s paper is good.

Marc: What types of customizations, like specifically? Do you have any examples of ones that you’ve gone through recently that you can share?

Marianna: For example, we can get very specific right now around GDPR. Sometimes we will ask for our own dedicated instance. An example would be, you would– This is not GDPR related. The first thing is GDPR, but we could be very specific because we might have some commitments to our customers and then we want you to be within that framework. But in case of an instance where you might have a SaaS offering, but again we might think that the exposure of that or the way it’s structured, or the architecture is not secure enough or maybe not performant enough.

Maybe we’re one of your first big customers, and we’re going to overload your system, and we’re not confident. So often we will ask, “You need to start scaling it this way,” or, “We want to have our own instance and we would like you to support our own instance.”

Things like that, we will ask during the negotiation process and they will go into the SLA.

That’s our expectation. We have our own dedicated instance, you have this response time, we have these parameters around it.

Like I said once in a while we are going to turn our war machines, our security beasts at you to test your software, and that will be something we tell you in advance. Then how fast you’re going to, if they find something, how fast we want you to respond specifically. So this will be, maybe it’s not what you considered a security threat before, but we found this threat and we consider it a threat for us so we want a specific response for us on this thing.

Nick: One other thing I’d add in on this is, as a vendor you can often charge for this. You can have an enterprise tier where you get dedicated instance, or IP white listing, or custom vanity URL or dedicated mail routing and things like that. You can often honestly significantly increase the ASP, average selling price, for customers in that type of tier.

Marc: When the enterprise customer is asking for that, they’re expecting to pay for it.

Nick: I’m not saying you should charge Intuit more.

Marianna: No, others. Don’t listen to him. Charge us less.

Nick: I’m saying charge other large enterprise companies more. Exactly.

Marc: We’ve been talking about support and SLAs and how to position it and how to put these numbers on it. I’d love to talk about how you suggest– Nick, we can start with support. Quantifying, measuring and reporting on the effectiveness of an account manager program. If you sell that dedicated account manager, how do you show the value?

Nick: There’s multiple tiers like we talked about. There’s the basic support rep, then there’s the dedicated support rep. They are the value, and largely it’s actually the relationship. What you want to do is you want to report on that person, and they should be reporting to the client on a regular basis on the cases they worked on and what the response time was. But a lot of it is the interaction they have with that person, so frankly a lot of people will renew that dedicated support purely because they enjoy working with that person. That’s one tier.

Then you’ve got a customer success manager proactively working with that client, and sometimes you can charge for that. I mentioned a technical account manager. There what you’re trying to do is show the business value. That’s a whole other conference we can have on that, which we do, on customer success. Which is all about showing the customer the ROI from your product. What’s the value they’re getting? What’s the money they’re saving, the revenue they’re driving? There’s a whole discipline around customer success management that can talk a lot more about that.

Alex: I think you would agree with this, but one of the big things that customer success can drive in an account is they’re going to be the trusted advisor for that customer.

They can drive upsell not because they’ve got a quota, but because they recommend and they show new features, they show new products that you have.

They say, “These might solve your problem.” They genuinely are there to help, not to sell, and they don’t carry a quota. They’re not incentivized to sell. They’re just there to increase the customer satisfaction for those accounts.

Nick: Exactly.

Marianna: In fact, it’s not a secret that companies like Amazon, they will actually be incented to help you save. Which sounds counterintuitive, but it actually works really well for a customer, because you’re like “You’re on my side.”

Marc: Alex, on the SLA side do you have any tools or tips for somebody who says, “Great. I want to be able to put three nines or four nines on my website for uptime, and I’ve defined this SLA.” How do you measure that and how do you report on that to your customers?

Alex: It’s a great question. Generally the standard I’ve seen with SLAs is there is not honestly a ton of transparency around them, like you don’t publish your uptime. I haven’t seen too many companies do that. When you do have an SLA violation, when you do have an incident or an outage, the onus of the SLA is on the customer to call in and say, “There was a breach.” If they don’t call in, nothing happened. This is the standard that everyone adheres to.

Now at the same time, you should have a system internally to monitor your own uptime. We use Datadog, we use New Relic. We use a few different tools like that. What I’ve seen more and more lately in the industry is better traceability tools, things like LightStep or there’s a few open source projects around that. We’re looking into those because one of the things that we can’t quite do today, but we want to, is be able to tell in real time which customers are being impacted by an incident or degraded performance.

We can always tell after the fact, after we do an analysis, enough to conduct a post-mortem. But it would be really advantageous to be able to tell that in real time, and then broadcast that to folks internally so that they can communicate to their own high value customers.

Let them know, “You’re impacted right now,” versus “You’re not. You’re OK.” Then for the high value customers, give them the performance data of, “Here’s how you’re doing. Most of your notifications are being delivered within a minute, and we’ve had this one blip here and this one blip there.” Because again the status page is this coarse-level thing, you can’t tell what’s going on. It’s all or nothing for all customers.
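
To make the "three nines or four nines" numbers in Marc's question concrete, here is a minimal sketch of the arithmetic behind an uptime SLA. It is not how PagerDuty or any panelist measures uptime. The function names, the 30-day measurement window, and the sample outages are all assumptions made for illustration.

```python
from datetime import timedelta


def downtime_budget(target_pct: float, period: timedelta) -> timedelta:
    """Maximum downtime allowed in `period` while still meeting `target_pct` uptime."""
    return period * (1 - target_pct / 100)


def measured_availability(outages: list[timedelta], period: timedelta) -> float:
    """Uptime percentage over `period`, given the duration of each outage."""
    total_down = sum(outages, timedelta())
    return 100 * (1 - total_down / period)


if __name__ == "__main__":
    window = timedelta(days=30)  # hypothetical measurement window

    # How much downtime each target leaves room for in the window.
    for target in (99.9, 99.99):
        budget = downtime_budget(target, window)
        print(f"{target}% uptime allows {budget.total_seconds() / 60:.2f} minutes down")

    # Hypothetical outages recorded by internal monitoring during the window.
    outages = [timedelta(minutes=20), timedelta(minutes=30)]
    print(f"measured availability: {measured_availability(outages, window):.3f}%")
```

Over a 30-day window, three nines leaves about 43.2 minutes of allowed downtime and four nines about 4.32 minutes, so the two sample outages above (50 minutes in total) would miss a three-nines target. The same calculation could be run per customer, which is the real-time, per-account view Alex describes wanting beyond a single status page.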

Marc: Marianna, when you are buying something like PagerDuty and they have an SLA, but you have to be monitoring it to report it, what systems are you putting in place in order to hold them accountable to the SLA that they’re selling you?

Marianna: Obviously, we have our own monitoring systems and we look at things like availability, etc., and we will monitor all the time. What I want to say is, at the end of the day, whether your software is working or not is going to live or die less by the numbers you put on the SLA and more by, “Did it generally work for me, or not? Was the pain worth it or not? Did you respond well when we had a problem? And did you improve if we did have one?” Because some of the relationships that we have started pretty difficult in the beginning, maybe there were more problems.

But you see that over time, “That’s a good vendor for me and we actually go some places together.” But yes, we have tools. For example, we use Splunk, we use Wavefront, we have a bunch of other stuff, like AppDynamics. And we’ll often ask you, “Are you connected to this tool? Do you already have some things we could monitor? If not, we will connect or use APIs to connect, and then we will immediately know if you’re up or down.” Sometimes we know before our vendors, which is kind of sad.

Marc: Alex, on SLAs I know we talked about how to define it and how to measure it, how to report on it. When’s the right time to add one to your website? Should it be before you make that first sale?

Alex: It depends on the customer. If you’re selling to higher-end enterprise customers with larger contract sizes where you actually have a contract, versus a click-through agreement, that’s when it’s time to think about it. Listen to the customer. If they’re asking for that, do it. If they’re not, then you can get away with not doing it yet. But get those early customers in a high velocity way. Early on we didn’t have an SLA on our website or in our contracts. We started with a frictionless model, with a free trial, sign up with a credit card. We got through that for a few years before we finally implemented an SLA.

Marc: You started getting pushback from some enterprises?

Alex: Some enterprises started to ask for it, and then we put one together. But we didn’t do it before that happened.

Nick: One thing I would say though is that I agree with what you said. When they ask for the first one you should be prepared, and you shouldn’t take the one they wrote. Sometimes in enterprise they’ll say, “Here’s the SLA that we want you to sign up for.” Except for Intuit, you should do whatever Intuit says. But for other customers my recommendation is to have one ready. That’s something you can do and you can live with it. I have seen some people sign up for one that somebody else came up with, and then they had to live with it for a long time, and things like that.

Marianna: You can always find an SLA.com somewhere. Is there like SLAWarehouse.com?

Nick: There probably is.

Marc: When a couple of years went by and you actually did have your first SLA, is that the same SLA you have today or has it changed over time?

Alex: It has changed. Not a lot, like the five minute guarantee is still there. We’ve added to it, we’ve added the uptime guarantee in terms of our APIs, in terms of our actual interfaces on mobile and web. Now that we have multiple products we’re looking to make it again more nuanced, and add SLAs for all of those new products. But again, you have to think about the tier of service you want to provide. “Is this a mission critical service, or is this something that the customer can live with if it’s degraded a little bit?” There’s a lot of nuance to it and ultimately I keep going back to this: the SLA’s nice, and it’s a legal tool that your customers can hammer you over the head with. But at the end of the day, are they happy with your product or not?

The SLA is not going to guarantee that your product is going to be good.

Marc: They’ll leave.

Alex: Exactly.

Marianna: By the way, I know for example at Docker, a lot of the first few SLAs were like “I’m not sure about this. I’m not sure about this liability limit.” But the next one, you’re like “I know what to do on this one. I know what to do on this one.” It does change over time and you become more and more savvy dealing with customers.

Marc: Has the support offering changed over time too?

Nick: Totally. We added different tiers. We didn’t have dedicated reps early on, and we didn’t have a support SLA in the beginning, so over time you evolve it. Very similar to what we just talked about. One thing I would say in general for both support and uptime SLAs is there’s a fine balance. You want your standard one to not be onerous for you, but you also don’t want it to be unfriendly to the customer because frankly what’ll happen is they’ll just try to negotiate it and you’ll have to go back and forth. That slows down your sales cycle if that’s what you’re focused on.

So finding something that’s reasonable for both sides, a lot of times the early SLAs people try to be cute and be like, “If we’re down on a day ending in a ‘Y’ it doesn’t count,” or something like that. They’ll add things that are ridiculous to their SLA that’s kind of weaselly. That’s not smart. The customers are pretty smart. Find something that’s pretty reasonable. One recommendation I’d have is make sure you have a lawyer that knows how to do SLAs and all this stuff. There’s lots of lawyers if you’re in the Bay Area that know tech contracts. Make sure your lawyers know tech contracts.

Marc: Another question. Outside of PagerDuty, Gainsight, Intuit, VMware and Docker. Can you give some examples of companies that you look at and they’re doing best in class support? You looked at them as examples, and how you want to model your own programs?

Marianna: I have to highlight AWS here, honestly. Because they are so customer focused that it really feels like they want you to have success. Like I said, they will go to the extreme of, “How can I help you gain more discounts on my platform?” As opposed to, “How can I get more revenue from you?” They’re very responsive, and things like that. While they’re not perfect, we all know that, we can all cite the big outages. They’re so customer focused that it’s quite impressive.

Alex: To add to that, I agree with 100% of what you said. The only one caveat is you have to be at a certain size before AWS will pay attention to you, because they have so many customers that they can only dedicate–

Marianna: At Docker they also paid attention, but that’s for another reason.

Alex: But what I was going to mention is that their status page is great, because it’s customized to you based on what products you’ve bought. That’s the direction that we want to go into. That’s where I see status pages going in general. The other one that comes to mind is Slack, just based on their fair use pricing, where if you have a user on Slack that never logs in for an entire month they will refund the money for the entire month. They’ve done a pretty cool, innovative– I guess it’s more of a pricing thing than an SLA thing.

Marc: Slack also does the opposite of what we were talking about earlier, where they’re proactive about SLA violations, and they’ll just refund you.

Alex: I heard that too.

Marc: One final question for you, to give you an opportunity. Is there any advice you would give to a SaaS vendor right now who is thinking about creating a support or an SLA program and selling at enterprise?

Marianna: I’ll start. From the few examples that we have cited with Slack and AWS. If you look at the common thread, it’s about building some trust with your customers.

Let me assume that your software works. Let’s start there. Because if it doesn’t work you cannot– There’s no SLA, there’s no relationship, there’s nothing that’s going to overcome that.

So let’s assume that is a given. But aside from that, focus on building trust with your customers. Your customer is not your enemy. They pay your paycheck, so go that extra mile to build trust. Be transparent. So what if you had to refund them more? All these little things, they build a lot of trust in the relationship, but at the end of the day that’s what is most important when you have this kind of relationship.

Nick: Broadly the world I’ve been talking about on the support and account management side, it’s called customer success. That’s the bigger picture. There’s a lot you can read. We wrote a book on it. You guys can get it and people can check it out. One specific thing I would say is you could charge for a lot of this, and it’s a win-win. Whether it’s a named support person or a technical account manager, there are a lot of things you can charge for. It’s to the customer’s benefit because there’s somebody there, and you benefit too because you can fund this. A lot of companies don’t realize that some of these revenue items can help both sides. So, consider charging for some of these services.

Alex: For me, I’d say start with the product. We talked about this earlier. If the product is good and it’s easy to use and easy to get started with, you can avoid a lot of support and a lot of support interactions, a lot of needing to go to the customer’s site to start implementing the products or professional services. We didn’t have that for a long time. We just started with it a year ago. Then from there, I would say also invest early in support, and having competent technical folks so that when the customer talks to one of these people they’re not like, “This person has no clue. They’re telling me to unplug and replug my computer.” You don’t want that support. Those are the two takeaways for me.

Marc: Great, thanks so much. I appreciate it.
