If you just woke up from a decade-long slumber and searched Google for the most popular technology trends of 2017, don’t be surprised if you are inundated with websites talking about Docker. In fact, a comparison between Docker and LXC, in terms of Google search trends, will put the whole debate of Docker vs LXC to rest.
To be fair to LXC, the first implementation of Docker was layered on top of LXC, and LXC truly made Linux containers accessible to the masses. Its only folly, much like Solaris Zones and BSD Jails, was that it tried to provide a lightweight VM experience for system administrators. Docker's focus, on the other hand, was from the beginning to bring container benefits to the developer community, primarily on the laptop, and across all distributions of Linux. To realize this goal, Docker, starting with version 0.9, dropped LXC as its default execution environment and replaced it with its own implementation called libcontainer, and eventually with the OCI-specification-compliant runc. While other container alternatives such as rkt, OpenVZ, and Cloud Foundry Garden exist, their use is rather limited. Docker has established a significant lead in the race to bring containerization to market, with a huge install base, ecosystem partners, and advanced tools and facilities custom-built for this solution.
At the core, Docker, LXC and other container technologies depend on the key Linux kernel features of cgroups and namespaces. I highly recommend watching this talk by Jérôme Petazzoni to get more details about these kernel features.
At the outset, the Docker architecture looked quite similar to LXC's: in place of liblxc, Docker implemented its own library, libcontainer, to provide the execution environment across multiple Linux distributions. Over time, though, multiple abstraction layers have been added to better suit the larger open source ecosystem and to comply with industry standards. Currently, the two key Docker engine components are containerd and runC.
Docker is more than an image format and a daemon, though. The complete Docker architecture comprises the following components:
- Docker daemon: runs on a host
- Client: connects to the daemon, and is the primary user interface
- Images: read-only template used to create containers
- Containers: runnable instance of a Docker image
- Registry: private or public registry of Docker images
- Services: Swarm mode, a scheduling service introduced in version 1.12 that enables multi-host, multi-container deployments
For more details, refer to the Docker documentation.
LXC storage management is rather simple. It supports a variety of storage backends such as btrfs, lvm, overlayfs, and zfs, but by default (if no storage backend is defined), LXC simply stores the root filesystem under /var/lib/lxc/[container-name]/rootfs. For databases and other data-heavy applications, you can load data onto the rootfs directly or mount separate external shared storage volumes for both the data and the rootfs, which lets you leverage the features of your SAN or NAS storage array. Creating an image out of an LXC container requires nothing more than tar'ing up the rootfs directory.
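That last step can be sketched in a few lines of shell. The rootfs path below is a stand-in under /tmp, since the real /var/lib/lxc/[container-name]/rootfs requires root access and a working LXC installation:

```shell
# Simulate an LXC rootfs with a hypothetical directory; the real one
# would live under /var/lib/lxc/[container-name]/rootfs
ROOTFS=/tmp/demo-lxc/rootfs
mkdir -p "$ROOTFS/etc"
echo "demo01" > "$ROOTFS/etc/hostname"

# "Creating an image" is simply archiving the rootfs directory
tar -C "$ROOTFS" -czf /tmp/demo-lxc-image.tar.gz .
```

Untarring that archive onto another host's rootfs directory is all it takes to stage the container there.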
On the other hand, Docker provides a more sophisticated solution for container storage and image management.
We first start with image storage. A Docker image references a list of read-only layers that represent differences in the filesystem. These layers are stacked one over the other, as shown in the image above, and form the basis of the container root filesystem. The Docker storage driver stacks and maintains the different layers. The storage driver also manages sharing of layers across images. This makes building, pulling, pushing, and copying of images fast and saves on storage.
When you spawn containers, each gets its own thin, writable container layer, and all changes are stored in that layer. This means multiple containers can share access to the same underlying image and yet maintain their own data state.
Docker, by default, uses copy-on-write (CoW) technology with both images and containers. This CoW strategy optimizes both image disk space usage and the performance of container start times.
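To see where these layers come from, consider a minimal Dockerfile, written here to a hypothetical path for illustration. Each instruction (FROM, RUN, COPY) produces one read-only layer, and the CoW strategy lets `docker build` cache and reuse unchanged layers across images:

```shell
# Write a minimal Dockerfile; the path, base image tag, and file names
# are hypothetical. Each instruction becomes one read-only image layer.
cat > /tmp/Dockerfile <<'EOF'
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y nginx
COPY index.html /usr/share/nginx/html/index.html
CMD ["nginx", "-g", "daemon off;"]
EOF
```

If only the COPY'd file changes between builds, the base and RUN layers are served from cache, which is exactly why builds, pulls, and pushes stay fast.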
When a container is deleted, all data stored in its writable layer is lost. For databases and data-centric apps that require persistent storage, Docker allows mounting the host's filesystem directly into the container. This ensures that the data persists even after the container is deleted, and that it can be shared across multiple containers. Docker also allows mounting data volumes from external storage arrays and storage services such as AWS EBS via its volume plugins.
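For illustration only (these commands assume a running Docker daemon, and the container names, host path, and image tag are hypothetical), a bind mount and a named volume look like this:

```shell
# Bind-mount a host directory so the data outlives the container
docker run -d --name db -v /srv/pgdata:/var/lib/postgresql/data postgres:9.6

# Alternatively, use a named volume managed by Docker (or a volume plugin)
docker volume create dbdata
docker run -d --name db2 -v dbdata:/var/lib/postgresql/data postgres:9.6
```

Removing either container with `docker rm` leaves /srv/pgdata and the dbdata volume intact, so a replacement container can pick up the same data.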
For more details on Docker storage, refer to their documentation.
Client Tools and Onboarding
As we established earlier, BSD Jails and LXC have focused on IT operators, with the goal of providing a lightweight virtualization solution. This means the transition from hypervisor-based virtualization to LXC is rather painless for a system administrator. Everything, from building container templates, to deploying containers, to configuring the OS, networking, mounting storage, and deploying applications, remains the same. In fact, LXC gives you direct SSH access, so all the scripts and automation workflows written for VMs and physical servers apply to LXC containers too. LXC also supports the notion of a template, which is essentially a shell script that installs required packages and creates required configuration files.
Docker has focused primarily on the developer community. As a result, it provides custom solutions and tools to build, version, and distribute images; deploy and manage containers; and package applications and all their dependencies into images. The three key Docker client tools are:
- Dockerfile – A text file that contains all the commands a user could call on the command line to assemble an image.
- Docker CLI – This is the primary interface for using all Docker features.
- Docker Compose – A tool for defining and running multi-container Docker applications using a simple YAML file.
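As a hedged sketch of what Compose buys you, here is a hypothetical docker-compose.yml for a two-service app, written to /tmp purely for illustration (the service names, images, ports, and volume are all invented); running `docker-compose up` against such a file would start both containers and wire them together:

```shell
# Hypothetical Compose file for a web tier plus a database tier;
# every name and value in it is an invented example
cat > /tmp/docker-compose.yml <<'EOF'
version: "2"
services:
  web:
    image: nginx:latest
    ports:
      - "8080:80"
    depends_on:
      - db
  db:
    image: postgres:9.6
    volumes:
      - dbdata:/var/lib/postgresql/data
volumes:
  dbdata:
EOF
```

The whole multi-container topology, including the persistent volume from the storage section above, lives in one declarative YAML file instead of a series of hand-run commands.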
It is important to note that while Docker brings ease of use via a slew of custom tooling, this comes at the cost of a steeper learning curve. Developers are used to tools such as VirtualBox, VMware Workstation/Player, and Vagrant for creating quick development environments, while administrators have built their own scripts and automation workflows for managing test and production environments. Both groups have become accustomed to this arrangement, given the industry-accepted norm that the development environment != the production environment. Docker is challenging this notion and trying to get both groups to use standard tools and technology across the entire product pipeline. While developers find Docker intuitive and easy to use, especially given how significantly it boosts their productivity, IT administrators are still warming up to the idea and learning to work in a world where containers and VMs co-exist. The Docker learning curve for IT admins remains steep: their existing scripts need to change, SSH access is not available by default, and the security considerations are new. The new microservices architecture also challenges the processes they have built around typical 3-tier traditional applications.
One of the key components of the Docker architecture is the image registry, which stores and lets you distribute Docker images. Docker provides both a private image registry and a publicly hosted version called Docker Hub, which is accessible to all Docker users. The Docker client is directly integrated with Docker Hub, so when you run `docker run ubuntu` on your terminal, the daemon pulls the required Docker image from the public registry. If you are just starting out with Docker, it is best to pay a visit to Docker Hub and explore the hundreds of thousands of container images available for you to use.
Docker Hub was launched in March of 2013, but according to Docker Inc, as of October 2016 there had already been more than 6 billion pulls from it. Beyond Docker Hub, many other vendors provide API-compatible Docker registries, to name a few: Quay, AWS, and JFrog.
LXC, on the other hand, given its rather simple storage management of both container filesystems and images, does not come with any special registry. Most vendors supporting LXC provide their own mechanism for storing LXC images and staging them to different servers. The linuxcontainers.org website does provide a list of base images, built using community-supported LXC image templates. Similar to Docker, LXC provides a download template that can be used to search for images from the above source and then dynamically create containers. The command looks like `sudo lxc-create -t download -n <container-name>`.
The application space can be roughly categorized into two groups: modern, microservices-based applications and traditional enterprise applications.
Microservices architecture has gained popularity among new web-scale companies like Netflix, Google, and Twitter. Applications with a microservices architecture consist of a set of narrowly focused, independently deployable services, each of which is expected to fail at some point. The advantages are increased agility and resilience: agility, because individual services can be updated and redeployed in isolation; resilience, because the distributed nature of microservices lets them be deployed across different platforms and infrastructures, and forces developers to think about failure from the ground up instead of as an afterthought.
Microservices architecture and containers, together, make applications that are faster to build and easier to maintain while having overall higher quality. This makes Docker a perfect fit. Docker has designed its container solution around the microservices philosophy and recommends that each container deal with a single concern. So applications now can span 100s or 1000s of containers.
Microservices architecture is fairly new, and hence the applications based on it are still limited in number. A large portion of the enterprise data center is dominated by typical 3-tier applications: web, app, and database. These applications are written in Java, Ruby, Python, etc., have the notion of a single logical application server and database, and require large CPU and memory allocations, since the majority of their components communicate via in-memory function calls. In terms of management, these applications require administrators to bring the application services down, apply patches and upgrades, make configuration changes, and then restart the services. All this assumes that you have complete control over the app and can change state without losing access to the underlying infrastructure.
Given the nature of existing, traditional enterprise apps, LXC seems like a more natural fit. Sysadmins can easily 'lift and shift' their existing apps running on bare-metal servers or VMs to LXC containers. While Docker is promoting its technology for traditional applications as well, this currently requires significant effort. Of course, the experience is only going to get simpler, as most vendors now provide their software as Docker images, which will help jump-start the deployment of new applications.
Bottom line: if you are writing new applications, whether microservices-based or 3-tier, Docker is the best platform to pursue. But if you want to gain all the benefits of containers without significantly changing your operational processes, then LXC will be a better fit.
Vendor Support & Ecosystem
Both Docker and LXC are open source projects. Docker is backed by Docker Inc, while LXC and LXD (dubbed a container hypervisor) are now backed by Canonical, the company behind the Ubuntu OS. While Docker Inc provides an enterprise distribution of the Docker solution called Docker DataCenter, many other vendors provide official distributions as well. In contrast, there are very few LXC-only vendors; most support LXC as an additional container technology.
Unlike the VM space, the container space is rather young, and the solutions are still quite immature, with many feature gaps. This has created an explosion of companies providing various solutions around containers, and consequently an enormous ecosystem. The image below lists some of the partners in the Docker ecosystem. LXC does not have such a rich, dedicated ecosystem, but this is primarily because of its native VM-like experience: most of the tools that work with VMs should naturally work with LXC containers as well.
Finally, in terms of platform support, Docker has now been ported to Windows as well, and all major cloud vendors (AWS, Azure, Google, and IBM) now provide native Docker support. This is huge for containers and only underscores the growing trend.
As with any blog of this nature, comparing two very similar technologies, the answer to which is best is really: it depends! Both Docker and LXC have tremendous potential and provide the same set of benefits in terms of performance, consolidation, speed of deployment, etc. But given Docker's focus on the developer community, its popularity has skyrocketed and continues to grow. LXC, on the other hand, has seen limited adoption but remains a viable alternative for existing traditional applications. VM administrators will find transitioning to LXC easier than transitioning to Docker, but will almost certainly have to support both container technologies.
Robin Value Add in the LXC vs Docker World
Robin brings to the table an application-centric approach that simplifies application lifecycle management across environments for Big Data applications such as Hadoop and Elasticsearch, distributed databases such as MongoDB and Cassandra, Oracle databases, and other enterprise applications. Robin supports both Docker and LXC.
Robin is the next-generation virtualized infrastructure platform that runs applications with bare-metal performance, guaranteed QoS, and application-aware infrastructure management. Robin software pools your existing commodity hardware into a scalable, elastic, fluid pool of compute and storage resources that can be dynamically allocated to applications based on business needs or QoS requirements. Robin enables you to:
- Virtualize databases or Big Data without the hypervisor overhead
- Consolidate workload with guaranteed QoS
- Simplify application lifecycle management