The subtitle for this talk is "Lean Containers," first as an homage to the expression "lean startup." I guess many of you know that lean startup idea, and also for more practical reasons that will soon be obvious as we dive into the topic.
First, a short introduction. I work for Docker. I'm, as you probably guessed, or will shortly guess from my accent, I'm French, but I live here in San Francisco. I joined Docker four years ago when it was still dotCloud, so I was doing containers before it was cool, so that probably makes me a hipster of containers.
I helped to build and scale the dotCloud PaaS, and we were running containers at scale in production. And so I picked a few tricks that then I tried, as long as I could, to inject into Docker to make it useful and valuable to people willing to run things in production with containers.
The outline for tonight: I will do a very brief intro about Docker and containers so that people who don't have much info about that topic are not completely lost by subsequent parts.
Then I will explain some technical differences between VMs and containers, not just to make a comparison, but to explain why and how we can get to this lean container idea and how we can achieve parity between dev and production environments with containers.
Then I will dive into some functional differences between VMs and containers, and from those differences, we will extract those patterns, those best practices that we can use to simplify the dev-to-prod workflow. That's how we'll get to lean containers, and then I will talk about how to compose applications using multiple containers, and so basically making stacks of containers.
Brief intro about Docker and containers: you probably have already seen that tagline, "Build, ship, and run any app, anywhere." I want to break that down into its components.
The idea is to build, so build means take any Linux program and put it in a container.
Any Linux program really means anything. The primary target for Docker was application servers and databases, but it quickly expanded to deal with more things, like little tools that are sometimes a bit of a pain to install, such as the Amazon command line tools. They used to require Java. Now it's simpler with Python.
But, nonetheless, if you just have a tiny little operation to do with them, it's kind of a pain to spend five, 10, 20, 30 minutes on installation just to carry out one little operation, so it would be nice to have them in a container and run them easily.
They are also increasingly used for desktop applications, and that one is a kind of a big surprise, because nobody was really expecting containers to take over desktop applications.
But we have people, even though they're in a pretty reduced number now, that use Docker to run their Web browser, to run Open Office, well, Libre Office now, applications like Skype and even Steam games in containers. So it kind of proved the point that, yes, we can run absolutely anything in containers.
About non-Linux programs, you might have heard that Microsoft has committed to bring Docker support on Windows, and there have been demos of that very recently at Microsoft Build.
So soon we'll be able to run Windows applications on Windows machines but with the Docker ecosystem. That doesn't mean running Windows programs on Linux or Linux programs on Windows. It means each program on its own platform, but using the same APIs, the Docker API, and the same orchestration tools to control everything from a single point.
There are people working on Docker for FreeBSD, and we know that it will be technically possible to someday have Docker natively on OS X. It's just that we don't know exactly if that would serve any use, so that's why not many people are working on it.
But Docker is not Linux-specific. The implementation is Linux-specific, but the concept applies pretty much anywhere. Now that we have our application in a container, Docker lets us ship that container pretty much anywhere.
The container is an image, and Docker comes with a distribution format, a protocol, an API, to move those images around.
That again goes one step further than the classical VM or old-style LXC container approach, where we have something to run VMs or containers, but once we have created one of those containers, moving it around is extremely old school and bulky.
You have to make a tarball, or move that gigantic five-or-10-gig disk image, so we want to go one step further and have something natively from the very beginning to move those containers around extremely easily. That distribution protocol is open. There is a reference implementation which is open source, and it's the one that we use for the Docker Hub, our library of public and private images.
So the reference implementation of that thing is not just, "Hey, look. This is the reference implementation, but it's really just a toy." No, it's a production-ready, deployed, at-scale distribution library that we use, and that many people are using as well.
Another thing that Docker brings us is a way to move those images efficiently by moving layers instead of whole images. Say I have my application based on Java and Tomcat, with tons of dependencies.
But when I'm coding, I'm just making small changes. So that wouldn't make sense to transfer multiple gigabytes each time, so instead I'll be able to move only the small layers that I need, and I won't have to redeploy an entire environment for each single tiny change that I make in my application.
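To make the layer idea concrete, here is a minimal Python sketch. It is not how Docker implements transfers; it only shows the principle that layers the other side already has don't need to move. The layer names are hypothetical.

```python
def layers_to_transfer(image_layers, remote_layers):
    """Return the layers of the image, in order, that the remote side lacks."""
    already_there = set(remote_layers)
    return [layer for layer in image_layers if layer not in already_there]


# Hypothetical layer identifiers, ordered from base to top.
old_image = ["base-os", "java", "tomcat", "app-v1"]
new_image = ["base-os", "java", "tomcat", "app-v2"]

# Only the small application layer changed, so only that one has to move.
print(layers_to_transfer(new_image, old_image))
```

The heavy base layers (OS, Java, Tomcat) are shared between both versions, which is why a small code change results in a small transfer.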
The last step in "build, ship, and run" is run, so those containers can then run anywhere. When we mean anywhere it means they can run on physical machines or virtual machines, all alike. Running on physical machines is neat because it means I can achieve native performance with those containers.
Running on virtual machines is neat as well, because it means I can take those containers and I can run them on a VM on my machine, but also I can run them on a VM on EC2, on Linode, on Rackspace, or whatever provider I want, which is different from what I can do with VMs. Because with VMs, if I have VMware, or VirtualBox VM on my machine, if I want to move it to EC2, that will be much more painful. I have to convert it, and so on and so forth.
There are even new implementations of Docker made completely independently of the original one that just follow the same API, but then have a completely different runtime. So for instance, Joyent has developed their own implementation of Docker that uses Solaris Zones to run Linux programs but with the safety and isolation characteristics of Solaris Zones.
Now let's look a little bit at the differences between VMs and containers, why they matter, and what kind of special characteristics we can draw from containers to have something that we didn't have before.
First, containers are easily portable and moveable around. I said if I have a VM image on my OS X machine and I want to run that on EC2, that will be painful. I have to convert that image to an Amazon AMI.
I have to transfer possibly gigabytes of data, and when I change a little something, I either need to do it all over again or set up some really clever differential synchronization between my local VM and what I have on EC2.
With containers, all this work is done for me by Docker or by my container runtime. Sometimes people ask, "Okay, so containers are a kind of virtual machine, so it'll be like the JVM or something like that, so how can you make the difference? What's exactly the parallel between Java VM, for instance, and running Java applications and containers?"
There is a good parallel to be made here. You can see Docker containers as being just like Java Virtual Machines, except that the code being executed, instead of being Java bytecode, is Intel 64-bit CPU code.
The API and ABI, the interface between your code and the external world, instead of being the Java APIs and the Java standard library, is the Linux kernel system calls. Why is this important and relevant? Because those two things, the Intel 64-bit instruction set and the kernel system call ABI, are extremely stable.
As you can guess, Intel doesn't want to make a new CPU tomorrow that will break compatibility with all existing CPUs of the past 10 years or something, and likewise, Linux tries to keep an extremely strong compatibility across versions.
When someone breaks the kernel binary compatibility, they generally receive a load of profanity from Linus Torvalds and other kernel maintainers for doing so, so there is a pretty strong incentive to not do that.
Another interesting thing with containers, compared to VMs, is that they have a very low overhead. Containers are just normal processes running on top of a normal kernel.
The only little difference from entirely plain, vanilla processes is that they have little labels, little tags, saying this is a process running in a container. And so processes that wear different labels can't see each other, so processes in different containers can't see each other.
Otherwise, they're completely normal. When they do I/O on disk or the network, they follow exactly the same path, the same rules, the same code as a normal process.
There are people who have done benchmarks to try to see what's the difference in performance and as long as you do the proper tuning, there is exactly zero difference between running code on a physical machine and running code in a container.
Containers generally let us have a higher density. On the other hand, VMs have much stronger isolation characteristics. A VM can't just poke at another VM. When a VM needs to talk to another VM, it typically has to do that over the network using good old TCP/IP and other protocols.
There are some exceptions, like there are some hypervisor or some virtualization technologies that let you have direct communication channels between VMs, but as far as I know, no public cloud providers expose that, and they are also very specific to the hypervisor you choose.
If you decide to go with Xen, then use XenBus. Then if someday you migrate to KVM, you have to scrap everything and re-implement it from the beginning.
From a security point of view, VMs can also run as a non-privileged process on your machine. That means that when someone attacks your programs, gets into your server one way or another, and then breaks out of the VM, they end up being a non-privileged process on the host.
They have to start over from the beginning. They didn't gain anything. It's a little bit like someone breaking out of a prison and realizing that they're on a desert planet and there's nothing out there. Now they need to build a spaceship to get out.
With containers, however, the story is different. Containers are standard processes running on a normal kernel, so if there is a kernel vulnerability, then you get a full-scale security breach. That's something to be much more careful about.
There is a nice analogy that I like to make to kind of compare both. It's that VMs are like solid brick walls, so it's slow to build, it's extremely sturdy. If you want to move them, you have to break them down and rebuild them, so it's not extremely convenient.
Containers, on the other hand, are more like those rice paper screen dividers that are super easy to set up, like in seconds, and super easy to move. However, they're also very easy to knock down.
That being said, the lines between containers and VMs are constantly blurring because, on the one hand, we have new mechanisms like new kinds of lightweight VMs that are extremely quick to boot and that have almost the same flexibility characteristics as containers. And, on the other hand, we have containers that become stronger and stronger from a security point of view and that try to address that issue of the room divider that you can knock down.
It's pretty safe to guess that, at some point, the differences will be so close that VMs and containers will be pretty much the same thing, just implemented differently.
What does that mean from a functional point of view? That's where things get really interesting as we design applications in containers. No pun intended, but VMs contain everything they need. Everything that the VM needs, like remote access, like a logging daemon, backup mechanisms, everything has to be within the VM.
So your VM doesn't only have your Node.js, or Go, or Java, or Python, or Ruby application. It also has a ton of things that are not related at all to your application. It might have an SSH server. It might have Puppet, or Chef, or Salt, or Ansible to do configuration management. It can have tons of things that have nothing to do with your, I would say, your job in the first place.
Containers, on the other hand, provide the possibility to be extremely bare-boned, stripped down. A container can have just what you need and everything else, like the logging, remote access, can be provided by the things around the container.
Another big difference is the life cycle of VMs and containers. Containers will typically be created from an image, so you prepare an image and then you run that image, and you can run it multiple times, and it's pretty quick and straightforward. With VMs, however, we will typically use some kind of configuration management.
A VM will be created and updated, and updated again, and again, and again, and again, and at some point it will be destroyed. While for containers, when you need to make a change, you just make a new image, run it, and discard the old container.
It's also possible to do that with VMs, but it's more complex. People like Netflix or Amazon have described how they do that, this immutable server approach, where each time you need to make a new version you make a new VM image, you start it, and then you destroy the old one.
But I would say for normal people, so to speak, it's much harder, much bulkier to set up, and you lose a little bit in agility, because instead of just deploying your change right away, you have to wait for this whole machinery to take place. With containers, we can achieve that mechanism more easily and faster.
Now let's look at the deployment process. If, without going all the way to microservices, we break down the application into multiple parts, each part will typically run in its own VM. So I will have maybe 10 VMs running at the very least, because when I scale each part independently, I will have more and more VMs.
When I run my application on my local laptop, I will probably not have 10 VMs, because that would be a big waste of resources, and I will typically have just one of those and all my components will be crammed into it.
With containers, I can achieve something nicer. I can have tons of containers on my local machine without paying more, so to speak, than if I had one VM containing my 10 or more components. And then I can use the very same containers from my development machine to my production environment.
That being said, there will be differences between my development environment and my production setup. Locally, I will probably just write logs to plain files, but when I deploy to production, I want to send them to maybe an ELK cluster or to Papertrail, or Splunk, or Loggly, or syslog, or something like that.
Same thing for backups and monitoring. I probably don't want to have all that stack for my small, modest, local development environment, because I'm just developing Python code, for instance. So I don't need all those graphs and things reporting the amount of errors and traffic and latency, because the only person using that environment is me.
So there isn't much chance that latency or things like that will happen. I don't want to make my local environment too complex. I don't want to be in a situation where it takes five days and three ops engineers to set up my local machine, so I will have a way simpler environment locally. But then I will have differences between dev and prod, so how do I reduce those differences?
One way is to have everything in my containers. For instance, in my application container, in addition to my, let's say, Python and Django app, I will have all the things I need for production, so I will have the things to send the logs to my Splunk cluster.
I will have the things to report errors to Bugsnag or something like that. I will have all the metrics, like with New Relic, or Datadog, or SignalFx. All the things I need for the production environment will be in that container.
That leads to bloated containers, because I have a container, and 60 to 80 percent of the stuff in that container is useless for me in development. It could get in my way, or worse, I could get in its way.
I could be, "Oh, I need a newer version of that library for my application," and by installing that new library, I could break, let's say, New Relic, for instance. Then we give the container to the ops team, the ops team deploys it to actual production, and it's:
"Well, New Relic is broken, and we don't have metrics anymore. What's going on?" To avoid that, we get to this idea of lean containers.
The idea is to do one thing and do it well, just like the Unix philosophy. We have one container for the component itself, for our application code. Then we will have a separate container for logging. Another for monitoring. Another for backups if we have some behind this container. Another for debugging when we need to, and so on, and so on.
So, what does that look like in practice? How do we do that? How is it possible to have our application in one container but all the logging logic in a different container? This all stems from the fact that containers can share pretty much anything they want, so instead of being strictly isolated, just like VMs are, we can break down some of the walls between containers when we need to.
For instance, we can share files, but when we share those files, we're not just using some network protocol or shared folders mechanisms. We really have two containers accessing the same location on disk at the same time with no overhead in the process.
We can also share the network stack. That one is a little bit weird when we don't know exactly how containers are implemented, but under the hood, it means that we have two containers that seem to be entirely different. They have different file systems. They're running different programs. They have their own memory quotas and disk I/O quotas and everything. Except that if we use network commands, like ifconfig or ip or route or netstat, we will see exactly the same things.
It will appear like both containers have the same IP address, the same sockets appear to be open. That's because they will actually share the same network stack. This means that I can run my Tomcat app server in one container, and in the other container, I can run something like netstat to see how many sockets are open. I don't need to poke into my Tomcat container or do anything particular with it. I can access its whole network stack without actually being in it.
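As a rough illustration, here is a small Python sketch that builds the kind of `docker run` command this describes, using the `--net container:<name>` option to join another container's network stack. The container name `tomcat` and the tools image name are assumptions made up for the example.

```python
import shlex

TOOLS_IMAGE = "example/nettools"  # hypothetical image that ships netstat


def netstat_in_network_of(target_container, tools_image=TOOLS_IMAGE):
    """Build a `docker run` command that joins another container's network stack."""
    return [
        "docker", "run", "--rm",
        "--net", f"container:{target_container}",  # share the target's network stack
        tools_image,
        "netstat", "-tan",  # list TCP sockets with numeric addresses
    ]


# Print the command line; nothing is executed here.
print(shlex.join(netstat_in_network_of("tomcat")))
```

The Tomcat container itself stays untouched: netstat lives in the tools image, not in the application image.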
It can also go as far as sharing process space and memory. For instance, if I want to attach a debugger to a program running in a container, I can run another container and say, "Okay, reuse the same process space."
Say this time I have a Go network server, and I want to attach a debugger live. Normally I would have to log into that machine and make sure that we have the debugger and all the debugging symbols and everything there. So I would have to kind of break things a little bit.
Instead, I can have my production image, then a debug image, which is pretty much the same, but with debugger and symbols and everything, and I can use this to attach to the process running in the other container.
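Here is a hedged sketch of what that could look like with more recent Docker versions, which accept `--pid container:<name>` to join another container's process namespace. The image name, the container name, and the choice of gdb are assumptions for illustration; depending on the host's security settings, more than the `SYS_PTRACE` capability may be needed.

```python
import shlex

DEBUG_IMAGE = "example/goserver-debug"  # hypothetical image: same app plus debugger and symbols


def attach_debugger(target_container, pid=1, debug_image=DEBUG_IMAGE):
    """Build a `docker run` command that attaches gdb to a process in another container."""
    return [
        "docker", "run", "--rm", "-it",
        "--pid", f"container:{target_container}",  # see the target's processes
        "--cap-add", "SYS_PTRACE",                 # allow ptrace, which debuggers rely on
        debug_image,
        "gdb", "-p", str(pid),  # assumes the server is PID 1 in its container
    ]


# Print the command line; nothing is executed here.
print(shlex.join(attach_debugger("goserver")))
```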
We can imagine all those containers as being in rooms in a building in multiple dimensions, and sometimes we can break a wall on that side, on that side or the floor, depending on what we want to share between those containers.
Let's see exactly how that works, and I have some examples for you here. So, logging: there are two strategies for logging in Docker.
The first one is to get your processes to write on the standard output. It's the recommended thing in the long term. It has some flaws in the short term, because it's kind of being improved and developed, but when you do that, it means that you have kind of moved the whole logging issue outside of the container.
The container just writes on its standard output, and Docker collects that. Then you configure Docker to, say, put this in files, or send that to syslog. Those are the two options available right now. Another option is to write to a directory, so plain log files, the normal /var/log thing, and then to share that directory with another container.
I show here how it's done in practice. When you start the first container, you indicate that /var/log will be what we call a "volume," and a volume is just a special directory that can then be shared with another container. Now, when I need to inspect those logs, here, for instance, I'm doing the simplest command of them all, like a tail on those logs. We do that in a separate container. Of course, if it's just to run tail, it's not extremely exciting.
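For readers without the slide, here is a minimal Python sketch that builds the two commands involved: one starting the application with /var/log declared as a volume (`-v`), and one starting a second container that reuses that volume (`--volumes-from`). The container name, the application image, and the log file path are hypothetical.

```python
import shlex

APP_IMAGE = "example/webapp"  # hypothetical application image


def start_app(name, image=APP_IMAGE):
    """Start the application with /var/log declared as a volume."""
    return ["docker", "run", "-d", "--name", name, "-v", "/var/log", image]


def tail_logs(app_name, log_path, image="ubuntu"):
    """Run tail in a separate container that shares the app's volumes."""
    return ["docker", "run", "--rm", "--volumes-from", app_name,
            image, "tail", "-f", log_path]


# Print both command lines; nothing is executed here.
print(shlex.join(start_app("web")))
print(shlex.join(tail_logs("web", "/var/log/app.log")))  # hypothetical log path
```

Swapping `tail` for a log shipper is then only a matter of changing the image and command of the second container.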
However, if I want to run a log collector that takes my logs and ships them somewhere, it's much more meaningful. Because instead of having to synchronize between the people working on the log platform that will tell me, "Okay, so you know our logging system needs Java or Node.js because it ships logs that way, so in your application container you need to have Java or Node.js."
No. Instead I put what I want in my application container. They have their logging container, and we don't have to agree on specific dependencies, and we don't risk breaking each others' code.
The key idea here is to make the application completely oblivious, like completely dumb about logging. The application just writes the logs knowing that something else will take care of everything.
Another good example of that is when you want to run some custom log analyzers. It could be Apachetop, for instance, which does a text-mode, real-time display of the most frequent requests. It could be some fancier log analysis tool.
Sometimes, if you have a lot of traffic or a very specific kind of traffic, those tools can be real memory hogs, and it's pretty interesting to see that by moving the log analysis tools in their own container, we can protect the application.
I have a real-world scenario here. We used to have, a long time ago, in the beginning of dotCloud, we had something to count the number of unique visitors on the platform. We were running a PaaS, so there were quite a lot of unique visitors, not on our website, but on the websites of basically all the people who had anything on dotCloud, so that was like tens of thousands of people and tens of thousands of applications.
The unique visitors were in tens, sometimes hundreds of millions, and the thing that would aggregate that and compute the number of uniques would need absolutely indecent amounts of RAM. And sometimes we would see some load balancers crashing, always at the same time of the week, and say, "Okay, something must be happening outside at very specific times, so let's look."
It turns out that it was the log analysis tool. The log analysis tool was eating up all the memory on the system, and at some point, instead of being the one that would be taken down, the load balancer would be taken down because the system would be like, "Okay, I'm out of memory. I need to shut down something," and it will randomly pick something, and that something would be the load balancer instead of the log analysis tool.
By moving the log analysis tool into a container, we can set a memory cap for this tool. And when the system goes out of memory, well, in fact, the system doesn't go out of memory anymore. This individual container does.
This individual container dies, and we see a specific message telling us the log analysis process crashed because it was out of memory instead of having traffic disruptions because some load balancer was randomly terminated by the system.
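A sketch of how such a cap could be expressed, using the `--memory` option of `docker run`. The image name, container names, and the 512 MB figure are assumptions for the example, not values from the dotCloud setup.

```python
import shlex

ANALYZER_IMAGE = "example/log-analyzer"  # hypothetical image


def run_log_analyzer(source_container, memory_limit="512m", image=ANALYZER_IMAGE):
    """Build a `docker run` command for a memory-capped log analysis container."""
    return [
        "docker", "run", "-d", "--name", "log-analyzer",
        "--memory", memory_limit,             # the cap applies to this container only
        "--volumes-from", source_container,   # read the logs written by the other container
        image,
    ]


# Print the command line; nothing is executed here.
print(shlex.join(run_log_analyzer("loadbalancer")))
```

If the analyzer exceeds its limit, it is the analyzer that gets killed, not some unrelated process on the host.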
So, sometimes people ask about, "Okay, what's the performance overhead of doing that? Because I have multiple processes accessing the same file. So what happens? Can I have concurrency issues?" There is no problem.
It's exactly the same scenario as when you have multiple processes on the same machine accessing the same files. Exactly like when you have Apache or Nginx or Unicorn or any web server writing logs and you're logged in at the same time on the machine, and you're doing tail -f on those logs.
There is no concurrency issue. There is no performance overhead because under the hood, remember, containers are just processes running on the same machine.
They just have those little room dividers between them, but nothing preventing them from accessing the same files as long as you allow them to.
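The "tail -f while the server writes" situation can be reproduced in a few lines of Python. This sketch uses two file handles in one process rather than two containers, which is an assumption made to keep it self-contained, but the file access pattern is the same: one appender, one reader, same file.

```python
import os
import tempfile


def tail_while_writing(lines):
    """Append lines through one handle while reading them back through another."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    seen = []
    try:
        with open(path, "a") as writer, open(path, "r") as reader:
            for line in lines:
                writer.write(line + "\n")
                writer.flush()  # make the line visible to other readers
                seen.append(reader.readline().rstrip("\n"))
    finally:
        os.remove(path)
    return seen


print(tail_while_writing(["GET /", "GET /about", "POST /login"]))
```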
Another example is for backups, file-based backups. So you have a bunch of files and you want to save them to S3, or some remote server once in a while or on a daily basis. The usual approach is to install your backup tools on your server and have something in crontab, for instance, to automatically back those files up.
The container approach is to have those files in a volume again, and to have a separate container to ship those files elsewhere. Again, that gives you a nice separation of concerns between the dev team, which is like, "Okay, we store the images or a bunch of binary data, whatever, in that directory."
And then the ops team says, "Okay, you need backups, and we do backups over S3, so you need to install this, and this, and this in your containers, and we will set it up for you."
If there is any kind of incompatibility, or other library, or if, at some point, once again, the developers create something and break the backup process, you're in trouble. Here instead, we end up having our application container with its files in its own directory, and then the backup container just accessing the same directory.
It has its own file system, its own libraries, its own dependencies, everything, and nothing that the application container can do will prevent the backup container from doing its job and vice versa.
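As an illustration of what the backup container's job could look like, here is a minimal Python sketch that archives a shared directory into a compressed tarball. The function name is made up, and the step of shipping the archive to S3 or a remote server is deliberately left out.

```python
import tarfile


def backup_directory(data_dir, archive_path):
    """Archive everything under data_dir into a gzip-compressed tarball.

    archive_path should live outside data_dir so the archive does not
    end up inside itself.
    """
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(data_dir, arcname=".")
    return archive_path
```

In the lean container setup, `data_dir` would be the volume shared with the application container, and this code would ship in the backup image only.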
Another example: when your backups are network based. So that's when you do like a pg_dump or mysqldump. When you have a backup process that connects to the database and gets all the data and shoves that somewhere, so pretty much the same thing here.
That seems less interesting with VMs, because if I tell you, "Oh, we have that VM that we only run once a week to make the full backup," that seems a little bit weird to have a VM that just runs a few minutes per week.
With containers, however, it's much less of an issue because we don't need to physically spin up something to start a container. We're just starting a process, but with all those extra safeguards and limits that I described earlier, so that, once again, the backup process, for instance, doesn't end up breaking the database, which also happens, by the way, in some scenarios. MongoDB, I'm looking at you.
Another example: network analysis. Whether it's like, "Oh, we have some weird traffic going on the Web servers, and we really have to get to the bottom of the issue," or "We want to see how many sockets are currently in use, how many connections are open, what's the state and everything."
The traditional approach is, once again, I have my VM, I am in my VM, and I run something like tcpdump, or ngrep, or netstat to see the open sockets. As I described earlier, we can do that with a container that will have those tools, and that will be able to access our network state.
So instead of polluting my servers with all those tools like tcpdump or ngrep, for instance, I can have a separate container that has my whole Swiss Army knife of network analysis, and that will be able to poke at the network stack in many ways and report on it. Again, we nicely and neatly split our application's concerns from our ops team's.
One of the last examples: service discovery. That one is a little bit more complex. The idea here is pretty powerful. Normally, when you have, let's say, a Web application that needs to connect to a database in a very classic, old-school approach, in your code you will have something like, "Okay, let's connect to the database on IP address 192.168.-something," or maybe, "Connect to some DNS entry."
If you follow the 12-factor app principles, then you will put your database connection information in environment variables, so that a deployment is some code plus an environment.
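A minimal sketch of that idea in Python. The variable name `DATABASE_URL` and the default value are conventions assumed for the example, not something the talk prescribes.

```python
import os
from urllib.parse import urlparse


def database_settings(environ=os.environ):
    """Read database connection info from the environment instead of the code."""
    url = urlparse(environ.get("DATABASE_URL", "postgres://localhost:5432/app"))
    return {"host": url.hostname, "port": url.port, "name": url.path.lstrip("/")}


# Same code, different environment: only the variable changes between dev and prod.
print(database_settings({"DATABASE_URL": "postgres://db.internal:6432/shop"}))
```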
But, when you start to introduce some failover mechanisms, things get more complicated than that because you will maybe need something like, "Okay, when my connection to the database is broken, I need to connect to ZooKeeper or etcd, or Consul, or I need to do something special to find out if there was a failover to a replica and connect to that replica."
I need to do something a little bit more complicated than what 12-factor allows me to do. I need to have extra code in my application to cope with those situations. One thing we can do with containers is that we can move away that complexity to a different container.
This means that, for instance, for a database, my application will connect to a local thing that we call an "ambassador." You can think of it as a relay or proxy, and that thing has the responsibility of finding out where the actual database is running.
For instance, when we deploy a stack of containers and we want to use this principle, it means that when I have a service "A" that needs to talk to service "B," first we make sure that service B is running somewhere.
We find out the end point for service B, like where is it running now, then we start this famous ambassador, telling it, "Okay, all the connections that you receive now, you will relay them to that remote container."
Then we start container A linked to the ambassador. So we have, for instance, my database running somewhere, and I have my Web app and ambassador running somewhere else. And from my Web app point of view, the database is still local. I'm connecting on it, and there is nothing special.
My database could even appear to be running on local host, but actually what's running on local host is not my database. It's the relay, the gateway, the proxy, however you want to call it, that takes me to my actual database, and my application doesn't have to know about that.
Somebody has to know about it. Somebody has to write the clever logic that finds out in ZooKeeper or Consul, or whatever, where is the new replica that has my data, but not my application.
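To show how little an ambassador needs to be, here is a small Python sketch of a TCP relay. It is an illustration under assumptions, not the ambassador used at dotCloud or Docker: `resolve_backend` is a hypothetical callable standing in for the "clever logic" (a lookup in ZooKeeper, etcd, Consul, or anything else) and returns the current `(host, port)` of the real service.

```python
import socket
import threading


def pipe(source, sink):
    """Copy bytes from source to sink until source closes."""
    try:
        while True:
            data = source.recv(4096)
            if not data:
                break
            sink.sendall(data)
    except OSError:
        pass
    finally:
        try:
            sink.shutdown(socket.SHUT_WR)  # tell the other side no more data is coming
        except OSError:
            pass


def handle(client, resolve_backend):
    """Relay one client connection to wherever the backend currently lives."""
    backend = socket.create_connection(resolve_backend())
    upstream = threading.Thread(target=pipe, args=(client, backend), daemon=True)
    upstream.start()
    pipe(backend, client)
    upstream.join()
    client.close()
    backend.close()


def start_ambassador(resolve_backend, listen_addr=("127.0.0.1", 0)):
    """Listen locally and relay every connection; returns the listening socket."""
    listener = socket.create_server(listen_addr)

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                break  # listener was closed
            threading.Thread(target=handle, args=(client, resolve_backend),
                             daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

The application connects to the address returned by `listener.getsockname()` and never learns where the database really is. Because the backend is resolved for each new connection, a failover only has to change what `resolve_backend` returns.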
Also, it means that those responsibilities can be handled by different teams, right? So the general pattern for all those things is that we keep the application container as simple, as lean as possible, and all the other tasks, all the things that would typically require logging in to the VM and adding new stuff, all those things are in separate containers.
Now, the last item is how do we compose stacks of containers? How do we manage all that stuff, because instead of having a simple app with a Web tier, an API, a data layer, and maybe some caching, now for each of those components, we have the thing responsible for backups, the things responsible for metrics, and so on and so on.
Again, an explosion in the number of components. There is a tool called Docker Compose, which, as the name implies, helps to compose applications. Docker Compose uses a YAML file describing a stack, so this is the stack that I would write as a developer.
This is a microservice-based application. I have a random-number generator; I have a hasher; I have a Web interface; I have a worker; I have Redis as a data store. So I describe my stack like this, and then, with Docker Compose, I can start that stack or just one specific component, because it's the one that I'm working on, and Compose will be smart enough to start its dependencies automatically, that kind of thing.
It will manage the whole lifecycle of those containers. For instance, if I have a data container and I want to upgrade it, it won't wipe out the data. It will make sure that the data is moved correctly from the old version to the new version. And then, in practice, we realized that it was pretty easy to transform this YAML file into, for instance, this.
This is an example where I want to insert this ambassador into the story. In the original stack, I have the worker connecting to the hasher, for instance. Now I have inserted a hasher proxy, and my worker connects to the hasher proxy. The hasher proxy connects to the hasher, and for the application, nothing has changed.
But I have inserted that proxy, and it allows me not only to move things around, but also to get some metrics. When I start my application and I see that something is slow or not going well, those proxies give me some insight into what's going on, and where the contentions, the choke points in my application are.
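The proxy insertion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script used in the talk: the stack is assumed to be already loaded from YAML into a plain dict whose entries use Compose-style `links` (`service` or `service:alias`), and the function name `insert_proxies` and the image name `example/tcp-proxy` are hypothetical.

```python
import copy


def insert_proxies(stack, proxy_image="example/tcp-proxy"):
    """Return a new stack where every link goes through a proxy service.

    `stack` maps service names to dicts shaped like Compose service
    entries, e.g. {"worker": {"build": "worker", "links": ["hasher"]}}.
    The input stack is left untouched.
    """
    new_stack = copy.deepcopy(stack)
    linked = set()
    for name, service in stack.items():
        rewritten = []
        for link in service.get("links", []):
            target, _, alias = link.partition(":")
            alias = alias or target
            linked.add(target)
            # The application keeps seeing the name it already used (alias),
            # but that name now points at the proxy.
            rewritten.append(f"{target}proxy:{alias}")
        if rewritten:
            new_stack[name]["links"] = rewritten
    for target in sorted(linked):
        # Hypothetical proxy service; it reaches the real service by link.
        new_stack[f"{target}proxy"] = {"image": proxy_image, "links": [target]}
    return new_stack


dev_stack = {
    "worker": {"build": "worker", "links": ["hasher", "redis"]},
    "hasher": {"build": "hasher"},
    "redis": {"image": "redis"},
}
instrumented = insert_proxies(dev_stack)
```

With this input, the worker ends up linked to `hasherproxy:hasher` and `redisproxy:redis`, so from the worker's point of view the service names have not changed.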
The nice thing is that the transformation between my basic development stack and this instrumented stack was handled by a very simple Python script that just processes that YAML and gives me a new stack. So, for those of you that are already wrangling Docker and containers, either in production or almost production, Docker 1.7 is almost out.
We cut the release candidates a few days ago, and there will be a huge overhaul in networking. This doesn't mean that what you had so far has to be thrown away. Quite the opposite. It means that we will have new network mechanisms so that containers can communicate between hosts in a more transparent manner. So you might want to stay tuned for that if this whole ambassador thing was appealing to you.
Okay, so, conclusions. The whole thing about this lean container approach comes from the fact that containers can share more context than virtual machines. We can share pretty much anything between containers, which allows us to decouple complexity, and have my application in one container and the sidekicks, the ops' jobs, in different containers.
All those things that would typically require VM modification or access can be done in containers. This means that we can break down deployment complexity.
Our developers need less interaction with ops in the production context, and we reduce the likelihood of either breaking production because of something that changed in the development stack, or vice versa, breaking the development stack because of a new production constraint.
The last thing is that Docker by itself is not a silver bullet, solving all those problems. It's just bringing us new tools that make solving those problems easier or simpler. For instance, the service discovery thing. Docker by itself doesn't provide automatic service discovery, it just helps us to decouple the service discovery mechanism from the application itself.
So instead of having to cram the service discovery and failover code inside the application, it can be split from the application logic, and it can be reused or moved or maintained more easily, for instance. Thanks. If you have questions, I will be extremely happy to address them.
Compose has a new --x-smart-recreate option. Those are more improvements in the way Compose manages the lifecycle of containers, so they won't directly help with that specific part. It's more like if you've been using Compose for a while, there are some little pain points.
The first thing is like, "Oh, this is great. That stuff used to be complicated, and now it's simple." But then, very quickly, the initial "Oh yeah" kind of fades, and it's like, "Okay, that stuff used to take 10 minutes. Now it takes one minute, but that's too long. I want it to take 10 seconds." So that's the kind of improvement that the new versions of Compose are bringing.
On the whole service discovery challenge, there is almost a kind of Mexican standoff between Swarm and Compose, for instance. You know, okay, I have Compose that lets me have a bunch of containers that are connected to each other, and I have Swarm that lets me take a bunch of Docker machines and expose them as one single Docker machine. So I can ask Swarm to start those containers, and Swarm will smartly dispatch them in my cluster.
Now, if I want to get two containers to talk to each other, one approach is to say, "Well, that's Compose's job to realize that we have two containers needing to talk to each other," and to add all the bridges, and tunnels, and proxies, and ambassadors and overlay networks, or whatnot. All the magic required for those two containers to communicate has to be added by Compose. That's one approach.
The other approach is to say no, Compose has to be naive and simple, and it will just ask Swarm, "Start those containers." And it's Swarm's job to say, "Okay, I'm going to put those two containers on separate machines, but to let them communicate anyway, I'm going to do the magic."
We haven't exactly figured out yet if it should be Swarm's job or Compose's job, or a little bit of both. And when I say "we," it's not specifically Docker Inc., it's, as a whole, the community that develops and uses the tools of the ecosystem today.
There are a few different approaches. The simpler one from a design point of view, but which in practice is not so nice, is to have this container running the Splunk agent or running something like pipestash, I think, that can send logs to a Logstash cluster.
For each container that has logs, we start another container, and we tell that container, "And by the way, the logs will be in that directory," and that container will know, "Okay, I have to look in that directory, pick all the files there and send them to my log cluster." It's a little bit of extra work.
Another approach is to say, "Okay, I could have one single, more-or-less magic container that will be linked to everything and will be smart enough to discover that it's linked with those." It's possible, but it's kind of, I mean, there is a lot of magic involved here.
So there is actually a kind of middle-ground solution, which is the one that many people are using, which is to attach to the Docker events API, and basically you get a broadcast of all the things happening on the Docker host, like "A new container has been started; a container has been stopped; a container has been restarted," or this or that is happening.
And then you can listen in particular to container create events, and each time a container is created, you say, "Okay, let's add a logging sidekick for that container." Or you can make a decision depending on the image name or some environment variables in that container to decide whether or not you want to start a logging container.
Obviously, you don't want to start a logging container for logging containers, otherwise you have a kind of recursion issue, but that mechanism, this idea of watching the events, the things happening to the Docker machine and reacting to that is also pretty popular for ambassadors, and in particular for a Web application deployment.
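The decision logic behind such an event watcher can be sketched as a small pure function. This sketch assumes events have already been decoded into dicts with `status`, `id`, and `from` (image name) keys, which is how the Docker events stream looked at the time; the function name `sidekicks_to_start` and the image name `example/log-shipper` are hypothetical, and actually starting the sidekick is left out.

```python
def sidekicks_to_start(events, logger_image="example/log-shipper"):
    """Return the IDs of containers that should get a logging sidekick.

    Only "create" events are considered, and containers created from the
    logging image itself are skipped to avoid the recursion issue.
    """
    wanted = []
    for event in events:
        if event.get("status") != "create":
            continue
        # Naive tag stripping: assumes no registry port in the image name.
        image = event.get("from", "").split(":")[0]
        if image == logger_image:
            continue  # never attach a logger to a logger
        wanted.append(event["id"])
    return wanted


sample_events = [
    {"status": "create", "id": "aaa", "from": "myapp/web:latest"},
    {"status": "start", "id": "aaa", "from": "myapp/web:latest"},
    {"status": "create", "id": "bbb", "from": "example/log-shipper:1.0"},
    {"status": "create", "id": "ccc", "from": "redis"},
]
to_instrument = sidekicks_to_start(sample_events)
```

The same shape works for other criteria: the check on the image name could be replaced by a check on an environment variable of the new container.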
Some people have written a mechanism where if you want a container to be reachable on HTTP, you just add a vhost environment variable in that container, and magically when that container is started, there is that thing observing the event stream that says, "Okay, a container has been started. Oh, I see it has a vhost environment variable, so I will extract that vhost, and I will plumb that container into my load balancer," which could be based on Vulcand or Apache or Nginx, or something like that.
And so magically, when starting containers, those containers get plumbed into my HTTP load balancers.
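To make the vhost idea concrete, here is a rough sketch of the extraction step in Python. The container descriptions (dicts with `env` and `address` keys), the variable name `VHOST`, and the function name `vhost_routes` are all assumptions made for this example; generating the actual load balancer configuration from the resulting table is not shown.

```python
def vhost_routes(containers, variable="VHOST"):
    """Map each virtual host name to the backend addresses serving it.

    `containers` is a list of dicts such as
    {"env": ["VHOST=www.example.com"], "address": "10.0.0.5:8000"}.
    Containers without the variable are simply ignored.
    """
    routes = {}
    for container in containers:
        for entry in container.get("env", []):
            key, _, value = entry.partition("=")
            if key == variable and value:
                routes.setdefault(value, []).append(container["address"])
    return routes


running = [
    {"env": ["VHOST=www.example.com", "DEBUG=1"], "address": "10.0.0.5:8000"},
    {"env": ["VHOST=www.example.com"], "address": "10.0.0.6:8000"},
    {"env": ["VHOST=api.example.com"], "address": "10.0.0.7:9000"},
    {"env": ["DEBUG=1"], "address": "10.0.0.8:5000"},
]
routes = vhost_routes(running)
```

Two containers declaring the same vhost end up as two backends for that host, which is what a load balancer needs in order to spread requests across them.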
I actually have an image called jpetazzo/hamba, which is an HAProxy ambassador, but in its current state, it's also a pretty hackish ambassador. However, I'm about to deliver a series of workshops around how to deploy Docker at scale using Mesos, Swarm, and that kind of thing, and so I will update that image.
It's basically HAProxy, currently, but in a kind of static configuration. The goal is to make it more dynamic to make it able to pick up those changes.
There is another thing that has the same kind of features, Interlock by Evan Hazlett. It's pretty much the same story: it listens to events, and then there is a plugin mechanism to say, okay, when this happens, when a container is created, we want to automatically set up this. Thanks.