Written by: Mark Betz on 06/03/15

If you work in IT and haven’t heard the term “container” over the last two years then you must have been locked in the NOC with no Internet access. Containers are a combination of technologies that have been present in the Linux kernel since 2008, and are used to isolate running processes, file systems, and other resources at runtime. It’s not the purpose of this post to go very deeply into how they work. If you are the kind of person who needs to know and you have a Linux box or VM handy, here is a great presentation that will walk you through spinning up a native Linux container and give you a good sense of what is involved. Containers vastly improve the configuration, deployment, and management of distributed applications in the data center, and they have been in use by major players like Google for several years. However, like a lot of useful technologies (file encryption, say, or the Internet) the underlying stuff was just complicated enough to keep less resource-rich adopters at bay for a while; about half a decade in this case.


In March of 2013 Docker, Inc. released the first version of Docker, which is really the point at which containers began to become a mainstream thing. What Docker did was combine the basic technologies of Linux containers with a union file system (AUFS) and the idea of a repository where images of file system layers could be stored, retrieved, and then used to form the execution environment for a process running inside a container. Suddenly it was possible to do this:

user@host$ sudo apt-get install lxc-docker
user@host$ sudo docker run -h my_container -it ubuntu:14.04 /bin/bash
Docker: pulling ubuntu:14.04 …
root@my_container# do_whatever_you_want.sh
root@my_container# exit
user@host$

In that example lxc-docker is the Docker package in the official Ubuntu repository, and ubuntu:14.04 is an official public Docker image that contains a file system in which the 14.04 version of Ubuntu has been installed. After the image is pulled and the container is started what you have is bash running on Ubuntu 14.04 with its own isolated set of kernel objects, an isolated file system, a virtual network adapter, etc. From the outside it appears to behave much like a VM except that it starts about as fast as you can cat a text file, uses almost no resources, and imposes almost no performance penalties. Any changes made to the file system inside the container are ephemeral unless committed to an image, something we’ll touch on below.
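You can see that ephemerality for yourself in under a minute (prompts and hostnames here are illustrative; this assumes Docker is installed as above):

```
user@host$ sudo docker run -h my_container -it ubuntu:14.04 /bin/bash
root@my_container# touch /tmp/scratch.txt
root@my_container# exit
user@host$ sudo docker run -h my_container -it ubuntu:14.04 /bin/bash
root@my_container# ls /tmp/scratch.txt
ls: cannot access /tmp/scratch.txt: No such file or directory
```

The second run starts a brand new container from the same image, so the file created in the first one is simply gone.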

I first became attracted to the use of Docker and containers when I realized that I could essentially spin up a new OS anytime I wanted to install something and try it out. Need to try something with elasticsearch but don’t want java on your machine? There’s a container for that. Just want to see what it’s like to rm -rf * from root and not hurt anything? Fire up a container and have at it. From a development standpoint the flexibility and modularity are awesome tools to have. My system is a much cleaner place these days. But that is not what made containers into the buzzworthy IT topic they have become. If it were just about developers and our tools nobody would care (trust me), and we already had VMs, vagrant, virtualenv, etc. The process and file system isolation and the ability to move containers between boxes are big wins, but it is really what they enable that gets IT types quivering with excitement, and that is the ability to pre-configure images of services and deploy them rapidly.
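That rm -rf experiment, for instance, is a one-liner (the prompt is illustrative; --rm tells Docker to discard the container when the shell exits):

```
user@host$ sudo docker run --rm -it ubuntu:14.04 /bin/bash
root@container# cd / && rm -rf * 2>/dev/null
root@container# exit
user@host$
```

The container's file system is destroyed, and the host isn't touched.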

The most basic way that Docker makes this happen is by allowing you to start a container, install stuff in it, and then commit the resulting changes to a new image. When that image is later started in a new container all your changes are there ready to go. More usefully, Docker lets you declaratively specify all the steps needed to build a new image from a known starting image using a Dockerfile. For example, at my employer we have a complex service that aggregates data from many external websites. It involves python, redis, elasticsearch, and a number of third party tools and frameworks. The installation and configuration of this service requires more than fifty steps. Thanks to Docker we encode those fifty steps into a Dockerfile which is checked into our git repo. We can recreate the service image at any time by building the Dockerfile, and can then deploy a new instance of it to an EC2 server like this:
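A Dockerfile is just those installation steps written down. Here is a heavily abbreviated, hypothetical sketch of what one might look like for a service like ours (the package list, paths, and script names are made up for illustration):

```
# Start from a known base image
FROM ubuntu:14.04

# Install runtime dependencies (a tiny sample of the real fifty steps)
RUN apt-get update && apt-get install -y python python-pip redis-server

# Copy the service code into the image and install its requirements
ADD . /opt/aggregator
RUN pip install -r /opt/aggregator/requirements.txt

# The command to run when a container is started from this image
CMD ["/opt/aggregator/start.sh"]
```

Running docker build -t mycompany/aggregator . against that file produces an image that can be pushed to a repository and pulled down anywhere.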

user@ec2-host$ sudo docker pull mycompany/aggregator
user@ec2-host$ sudo docker run -h aggregator -p 1234:5678 mycompany/aggregator /usr/local/bin/start-her-up.sh

I’ve omitted a few arguments for clarity, but basically those two commands will pull the image from the repository, start it up in a container, map the port 5678 on the container to port 1234 on the host, and then execute the startup script inside the container. We use this basic technique to manage all of our services. We can deploy a new elasticsearch node, or a new logging node, or a new instance of the web front- or back-end, in exactly the same way. It’s hard to overstate how transformational this can be. For many classes of runtime issues it’s now literally faster to redeploy the service than to diagnose what is wrong with it. In fact you can easily do both: you can stop the existing container, start a new container, then commit the old one to a disk image, pull it down to a dev box and debug it there.
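Sketched out, that swap-and-debug workflow looks something like this (the container names, hosts, and tag are illustrative):

```
user@ec2-host$ sudo docker stop old_aggregator
user@ec2-host$ sudo docker run -h aggregator -p 1234:5678 mycompany/aggregator /usr/local/bin/start-her-up.sh
user@ec2-host$ sudo docker commit old_aggregator mycompany/aggregator:broken
user@ec2-host$ sudo docker push mycompany/aggregator:broken

# then, on a development machine:
user@dev-box$ sudo docker pull mycompany/aggregator:broken
user@dev-box$ sudo docker run -it mycompany/aggregator:broken /bin/bash
```

Production is back up within seconds, and the misbehaving instance is preserved, bit for bit, for a post-mortem at your leisure.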

You would think that would be enough to make us all happy, but the truth is that developers and ops people are a lazy bunch. It wasn’t long after we realized how cool these features were that we also realized we would still have to do things like: remember what host a container is running on; remember what version is running; remember what port it exposes; and so on. That’s still a lot of work, and can put a significant dent in the amount of time available for important things like reading Hacker News. What would be cool would be to simply purchase some cloud servers, define what stuff needs to be running and which other stuff it needs to talk to, and then let the system figure out how to deploy all of it to the available cloud resources and get it all connected. Over the last year a number of fledgling attempts to solve this problem have gotten themselves off the ground. Not too long ago we were suddenly introduced to one that is no fledgling, and in fact is the result of more than five years of development, making it a pretty mature beast in the world of system software.

I mentioned previously that Google was an early adopter of container technology, using it to deploy and manage most of their internal services over the last six years. In the process they developed a layer of system software to manage deploying containers to hardware resources, among many other functions. As container technology has entered mainstream IT the company has decided to open source the platform, naming it Kubernetes. You can take advantage of what it does on Google’s own platform via their native hooks, or on a number of other platforms including AWS and your own metal via community contributed extensions. If you’re interested in the source code and documentation you can browse it all at the project’s git repository.

Any complete solution to the problems I described above is going to involve introducing some new ideas and abstractions, and Kubernetes is no exception. In Kubernetes the most fundamental unit of composability is still the container, but since containers should be kept as simple as possible a single container isn’t typically enough to define a complete service, and it’s ultimately services that we’re interested in. The Kubernetes stack has three “layers” that provide the higher level things we need. First, a set of containers colocated on the same host and sharing the same mounted volumes on the host (or elsewhere) is defined as a “pod.” To give a concrete example, the web side of our application requires one container running nginx, another running an API service in django, and an instance of redis. In Kubernetes these three things are defined as a pod which will run on a single hardware resource, with the containers communicating between themselves in defined ways.
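A pod like that is declared in a manifest. The Kubernetes API schema has evolved across versions, and the image names below are illustrative, but the shape of the idea looks roughly like this:

```
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web          # labels are how other objects find this pod
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
  - name: api
    image: mycompany/django-api   # hypothetical image name
  - name: redis
    image: redis
```

All three containers land on the same host and can talk to each other over localhost, which is exactly the colocated-and-connected arrangement described above.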

The next step up the Kubernetes stack is a “replication controller.” Replication controllers are basically process supervisors on steroids, and Google recommends that you use one for your application even if you need just a single pod. A replication controller will always make sure that the specified number of pods is up and running. If one dies, or you kill it for maintenance, the controller will start another, on whatever computing resource is available and makes sense from a utilization perspective. If you accidentally start too many instances of a pod the controller will kill some to get to the right number. In addition to supervising the number of pods, replication controllers make it possible to run scheduled services, perform rolling updates, easily scale capacity up or down, and maintain multiple live release tracks.
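A replication controller manifest embeds the pod definition as a template and adds a replica count and a label selector; a sketch, with illustrative names again:

```
apiVersion: v1
kind: ReplicationController
metadata:
  name: web-controller
spec:
  replicas: 3          # the controller keeps exactly three of these pods running
  selector:
    app: web           # any pod carrying this label counts toward the total
  template:            # the pod to create whenever one is missing
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
```

Scaling up or down is then just a matter of changing the replicas number and letting the controller converge on it.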

What pulls the whole thing together are services, the highest level building block for constructing complete systems on Kubernetes. A service is basically a stable name and address – an entry point – for some set of pods which are selected from among the set of all available pods according to labels which have been assigned to them. Because of the replication controller the service doesn’t have to be concerned with whether or where the pods that implement the service are running, and because of the service layer the client, whether end user or internal, doesn’t need to be concerned with where the services themselves are running or how they are implemented. The portability and composability are ultimately enabled by the fundamental container technology. But it is the cluster management piece that elevates those desirable capabilities into something on which the architecture of an entire system can be based, and which is flexible and resilient enough to be tailor-made for the cloud computing environment in which more of us are working and running our systems every day.
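A service definition is the simplest of the three: little more than a label selector and a port. A sketch, consistent with the illustrative labels used above:

```
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web         # route traffic to any pod carrying this label
  ports:
  - port: 80         # the stable port clients connect to
    targetPort: 80   # the port the pod's container listens on
```

Clients connect to web-service and never need to know which pods, on which machines, are actually answering.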

Back to the main premise of this post. Containers had already begun ushering in a world where a devops engineer like myself doesn’t have to think much about the configuration of a server instance to run an application. That’s pretty huge. Today I run the exact same system on my dev machine as I run in production. But taking the next step requires something like Kubernetes. With container cluster management hardware resources will eventually be just a mass commodity market, with cycles, disk, and bandwidth acquired and discarded in real time as system needs change. At that point our systems will be declaratively-specified sets of containers that can be recreated from source repos at any time, and then implemented automatically across whatever the available hardware resources are. Our plans, designs, architecture, and implementation will no longer need to consider hardware at all beyond capacity and resource levels. That’s what I call iron-free infrastructure, and it’s poised to spark a serious revolution in the way systems are managed in the data center.

Mark Betz is a system architect and software developer working in the New York City area.  


