It's a true honor to be here. Thank you very much for allowing me to be here this day. I'm Eugenio, you can tell from my sexy Latin accent that I'm from Seattle. I've been there for a while.
What I plan to cover today: the first thing is whether this is a real problem. I get this question a lot: "Is identity management really an issue today?" The second thing I'm going to cover is why it matters if you're building enterprise software.
Then I'm going to go into a couple of solutions and the solution space, some of the architecture and some of the approaches that we've seen are successful if you're building along these lines. Now, a few resources to get you started.
So, is this a real thing? Is it really a problem? Many people I talk to say, they ask me, "Really? People pay you for a screen with two text boxes and one button?" Yes. This is their window. This is the most deceitful component in any piece of software.
Why? Because this little window with a username and password and a button to log in is full of traps and pitfalls that people fall into. You are as secure as the weakest link in your system.
If you think about it, a login screen is the door to your application, and companies get hacked all the time because people make mistakes.
It's really easy to make mistakes, even in the simplest of use cases.
Even in the simplest of use cases, where you have credentials like a username and password, there are concerns like, "How do you store passwords?" You don't store passwords. You need to hash them. "How do you hash them? Which algorithm do you use?" There are many.
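The point about hashing rather than storing passwords can be sketched in a few lines. This is a minimal sketch, not a recommendation from the talk: the choice of PBKDF2-HMAC-SHA256, the iteration count, and the function names `hash_password` and `verify_password` are all assumptions made for illustration, and PBKDF2 is only one of the many algorithms the question alludes to.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 600_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these are stored, never the password."""
    salt = os.urandom(16)  # a fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Even a correct version of this only covers storage. It does nothing about credentials that were stolen somewhere else and are being replayed against your login screen.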
If you search, there are a lot. "How do you know that any of those are the right ones? How do you know whether, even if you do all the right things, people are trying to log into your site with credentials that have been stolen?" Even if the passwords are hashed and properly stored and dealt with, people may still be using credentials that are out in the wild.
That's a good website where you can try and test whether your usernames have actually been breached or not. But here's the other problem: if you sell into the enterprise and you're really successful, you might get away with selling software into an enterprise with username and password credentials, but only for so long.
You will be successful if you sell into parts of a company that are not really visible to the entire company, if you're selling software into the shadow IT. That is, a group of people, five guys somewhere in a big bank, who really love your software, who are OK entering their own username and password and OK paying with a credit card.
What happens is that once you get into an enterprise and you fly over the radar of real companies, big companies, they will get interested in control. When you allow your applications to accept custom credentials, in essence, those companies are losing control of who goes into their applications.
Think about it: those five guys that were using the software originally, they enter a username and password. They might actually enter a credential that is a company credential, but then they are fired and they can go home, open the website, and still log into your site. So, companies want all the things that you will see listed here. Some of them were highlighted in the opening of the session.
They want to know who has access to what, when. They want to be able to revoke access. Many of the attacks in a company are not external attacks, they are actually internal attacks. People who have malicious purposes inside an organization.
They want to be able to show compliance. They want to be able to show, "Yes. I fired somebody. That somebody doesn't have any more access to these critical components of my system." Besides everything I need to do to be able to show all the things I described, there's also the user experience aspect.
Hopefully, the software that you're building is software that is going to be used frequently, all the time.
Imagine jumping from one system to another and having to enter your credentials again, then going back and entering your credentials yet again.
The user experience is the second big driver of why people want advanced identity management in enterprise software. That's what single sign-on means. That's what SSO means, not security. Whatever you said before. It's single sign-on. It's the seamless access across a myriad of systems that might be completely different: they are heterogeneous and built not just by you but by many others, including people inside the company.
So, what is the solution to this problem? The solution to this problem is to make it somebody else's problem, not your problem anymore. That's the traditional approach that we have used in software development: abstracting it and shipping it somewhere else.
If you look at the traditional way of building software and identity management software, there is an application and the application interacts with a database where credentials are stored. You don't store passwords, you store hashes of passwords. The way of making this somebody else's problem is to just delete the database so you don't have that database anymore, and you interact with a database of users and credentials that is somebody else's infrastructure.
Namely, the company that you have sold your software to. They already have one of those, so you don't have to build yet another one. When you do that there's obviously that arrow. It's oversimplified in my picture here, but it is an arrow that implies an exchange of information.
There is an intent that your application needs to initiate to this system where users exist, and then there's messages or information that comes back and forth that is communicated between these two parties. That's essentially a protocol. The first lesson is do not invent that protocol, because that protocol has been invented, and it has been tested and battle tested for many years.
It is very tempting to say, "I'm going to send you this and you're going to send me this thing back. If there's an error, I'm going to send you a query string with error equals something." Don't do that.
The principle that you will see in all the protocols that are in this space is this notion of trust, which is obvious once you think about it.
Once you delegate the responsibility of logging in or authenticating users, essentially answering the question, "Are you a legitimate user of my system?", you're implicitly trusting that system to do the right thing. In this protocol there's this notion that there's a trust relationship between these two entities. Keep that in mind.
One of the most widespread and widely adopted protocols in the industry is something called SAML. It stands for Security Assertion Markup Language, and as you can tell from "markup" and from the date, 2005, this is like a big blob of XML that goes back and forth on the wire.
It was created at a time when XML was great. It was awesome. Like the grandfather of JSON. There are two terms that you need to learn: one is something called an IDP and the other is something called an SP. SP stands for service provider. That's essentially your application. Unfortunately, the identity world is also full of terminology that overlaps, so another term that you will hear for this is "relying party."
Relying party is an app. Relying party is a service provider. Relying party is something that is relying on something else to do the transaction for them. An identity provider is just that, the system that you delegated to perform the authentication for you.
It was published in 2005, so it's been around for a while. It's old but it's tested, and it's widely adopted. There are very few companies on the planet, probably the companies that you aspire to sell to, that have not adopted or are not using this in one way or another. That's actually part of the problem too, which we'll come to in a second. So how does it work? It's actually fairly simple in principle.
The way it works is somebody opens a browser and goes to an application, and they're trying to read or access some resource. The resource might be a page in that website. Let's say some page. What happens is that the first time you go there you're not authenticated, there's no session between your website and your browser. What the application does, if it is configured to do this and if trust is established between these two entities, is redirect the user. This is like an HTTP redirect from the website to the identity provider, so some place like "IDPSomeplace.com."
It will attach to that request the intent for that authentication to happen. That's called a SAML request, and it's a document. You see there in this example that it's going on a query parameter. That's common. It's a big encoded blob that is really an XML document inside. It will have information like, "Who am I? What is the application information?" And other things. The user will go to the identity provider, and now the user will see in the browser the identity provider page, and it will typically have a username and password, only it's not yours.
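The "big encoded blob" on the query parameter can be sketched as follows. This is an illustration under stated assumptions, not the code behind the slide: the URLs, the request ID, and the function names are hypothetical, the XML is a bare-minimum request, and the encoding shown (raw DEFLATE, then base64, then URL-encoding into a `SAMLRequest` parameter) is the one used by the SAML 2.0 HTTP-Redirect binding.

```python
import base64
import zlib
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values for illustration only.
IDP_SSO_URL = "https://idp.example.com/sso"
AUTHN_REQUEST = (
    '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" '
    'ID="_example123" Version="2.0" IssueInstant="2018-01-01T00:00:00Z" '
    'AssertionConsumerServiceURL="https://app.example.com/saml/acs">'
    "<saml:Issuer>https://app.example.com</saml:Issuer>"
    "</samlp:AuthnRequest>"
)


def build_redirect_url(idp_sso_url: str, request_xml: str, relay_state: str) -> str:
    """Pack the XML request into the query string of the redirect to the IdP."""
    # Raw DEFLATE (no zlib header), then base64, then URL-encoding.
    compressor = zlib.compressobj(wbits=-15)
    deflated = compressor.compress(request_xml.encode("utf-8")) + compressor.flush()
    query = urlencode({
        "SAMLRequest": base64.b64encode(deflated).decode("ascii"),
        # Opaque value the IdP echoes back; here, the page the user wanted.
        "RelayState": relay_state,
    })
    return f"{idp_sso_url}?{query}"


def read_saml_request(redirect_url: str) -> str:
    """Reverse the encoding, as the identity provider would on arrival."""
    encoded = parse_qs(urlparse(redirect_url).query)["SAMLRequest"][0]
    return zlib.decompress(base64.b64decode(encoded), -15).decode("utf-8")
```

Calling `read_saml_request(build_redirect_url(IDP_SSO_URL, AUTHN_REQUEST, "/reports"))` gives back the original XML, which is all the blob really is.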
It's not you capturing those credentials. It's the identity provider capturing that credential. That's important, because the identity provider can choose to authenticate you in many different ways.
It could be a username and password, it could be a certificate, it could be a card with a chip, which in some places is still used. It could be other things. It could be whatever method the company that is doing authentication has chosen to configure in there. It could be things like Kerberos; if you're old like me, you remember what that means. It's like the great-great grandfather of SAML. Then once that transaction is finished and it's successful, a result, which is called a SAML response, will be posted back to your application.
That's another encoded thing, but in essence, if you decode it, you will see it is another big XML document with the result of that transaction. The final thing is that the application validates the response, because obviously it needs to validate that it's coming from the trusted place, and then the user goes to the original place they were trying to go.
This is one of the many so-called SAML profiles, and it is the most common one: web SSO. It's a web redirect binding, is what it's called. It's one of the many ways these two entities can interact.
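The last two steps of the flow, receiving the posted SAML response and checking where it came from, can be sketched like this. It is deliberately incomplete: it only decodes the document and inspects the issuer and status, and it does not verify the XML signature, the audience, or the validity window, all of which a real deployment must check with a maintained SAML library. The function name and the trusted issuer value are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

NS = {
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}
SUCCESS = "urn:oasis:names:tc:SAML:2.0:status:Success"


def inspect_saml_response(form_value: str, trusted_issuer: str) -> str:
    """Decode a posted SAMLResponse and return the subject's NameID.

    WARNING: signature verification is omitted in this sketch, so this
    must not be used as-is to decide whether a user is authenticated.
    """
    # The POSTed value is base64-encoded XML (no compression on this leg).
    root = ET.fromstring(base64.b64decode(form_value))

    issuer = root.findtext("saml:Issuer", namespaces=NS)
    if issuer != trusted_issuer:
        raise ValueError(f"unexpected issuer: {issuer!r}")

    status = root.find("samlp:Status/samlp:StatusCode", NS)
    if status is None or status.get("Value") != SUCCESS:
        raise ValueError("identity provider did not report success")

    name_id = root.findtext("saml:Assertion/saml:Subject/saml:NameID", namespaces=NS)
    if name_id is None:
        raise ValueError("response carries no subject")
    return name_id
```

In a web framework, `form_value` would be the `SAMLResponse` field of the POST body that the browser delivers to the application.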
This is a fragment, and that's a tool our company built, which you can use to actually see what is in the document that is being exchanged back and forth. It's very verbose, and it involves things like cryptography on XML documents: digital signatures and encryption on XML documents. It becomes really complicated really fast.
One of the problems with SAML is that it's like the Latin alphabet of identity. Many languages in the world use the Latin alphabet, but that doesn't mean their speakers can understand each other. You can have two systems that are SAML compatible, following the spec to the last line, and still not be able to talk to each other, because there are so many parameters and so many options that they're not using the same options at the same time. As I said, two systems.
The fact that the company is using SAML is not a guarantee that it will work.
So, you are successful. You moved to SAML and you implemented SAML on your own. You have the first win with the first customer with their implementation of SAML, but if you are really successful you will have many of these.
You'll sell to Boeing and then you'll sell to Bank of America, and then you'll sell to some other big corporation, and then you have these, hopefully, hundreds of relationships between your app and all these systems. Even though they are talking SAML, the burden of all those translations is on the application. The better approach is to use an intermediary, which is called a federation provider.
That's another system that you push the responsibility to. You connect your application to a federation provider, and the federation provider handles the fan-out to all those systems and all the nuance of the connections between you and the different companies.
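The fan-out can be pictured as a small routing and translation layer that sits between one application and many enterprise connections. The sketch below is purely illustrative: the class names, the idea of routing by email domain, and the attribute names are assumptions, not a description of any particular federation product.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Connection:
    """One enterprise identity provider, with its own quirks (hypothetical shape)."""
    name: str
    sso_url: str
    attribute_map: Dict[str, str]  # IdP attribute name -> normalized claim name


@dataclass
class FederationProvider:
    """Holds every customer connection so the application does not have to."""
    connections: Dict[str, Connection] = field(default_factory=dict)

    def register(self, email_domain: str, connection: Connection) -> None:
        self.connections[email_domain.lower()] = connection

    def route(self, email: str) -> Connection:
        # Assumption: the user's email domain decides which IdP to use.
        domain = email.rsplit("@", 1)[-1].lower()
        return self.connections[domain]  # KeyError for an unknown company

    def normalize(self, connection: Connection, attributes: Dict[str, str]) -> Dict[str, str]:
        # Translate each IdP's attribute names into one shape for the app.
        return {
            claim: attributes[source]
            for source, claim in connection.attribute_map.items()
            if source in attributes
        }
```

The application only ever sees the normalized claims. Adding the next customer means registering one more `Connection`, not changing the application.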
One of the advantages of that is that once you connect to the federation provider you can keep adding new applications, so if you're building software that is a portfolio of systems, you get automatic single sign-on between them.
A better and nicer side effect of this approach is that you can have what is called protocol transition. You can deal with the SAML that is required for all these enterprises on the right, but you can choose to use a more modern protocol on the left, so your application is not overburdened with all that stuff.
One of those protocols, a modern SAML, is something called OpenID Connect, which is not to be confused with OpenID. That's a different thing. OpenID Connect is an identity protocol equivalent to SAML, but in 2018. That's a URL to another tool that allows you to see what those will look like. So finally, some of the lessons learned. I mentioned this: applications are as strong as their weakest link.
Identity is usually a very weak link. Not just that, but there are also reliability considerations. Your system can work perfectly well, but if your identity stack is down it's as if everything was down. Reliability has to be built into this as well.
There are no applications, or very few applications, that don't require identity. You have to take this into account.
It's never one thing that you do once and then forget about. It's very tempting to hire a very smart developer with all the crypto background and all the technical chops to build a system with this complexity, but forget about it if you think that is going to be a one-month sprint. You're going to have to have that person involved in this all the time, because it's always a race against a new attack, and a new vulnerability, and a new flaw in SSL, and a new something in a signature algorithm.
Some links and resources to get you started, some of which I mentioned as well. JWT stands for JSON Web Tokens; that's the modern SAML for the actual token. There are two tools. One is ours, samltool.io, which helps you with that. OneLogin, another company in this same space, has really nice tools, in some respects better than ours, so I put it there because I thought it would be insightful and useful for all of you. Thank you very much.