The 100-year Archive and the Data Preservation Explosion—Part Two: Containers, containers everywhere, Nor any bit to store

Containers are amazing tools. Anyone in product development, quality assurance, or software development will sing the praises of containers and of the continuous integration and continuous deployment (CI/CD) cycle they have enabled. So, what is the big deal?

Similar to how VMware and Citrix virtualized the application and operating system and disaggregated them from the underlying hardware, containers virtualize at the application level, disaggregating the application from both the operating system and the underlying hardware dependencies. Put simply, a container consists of an entire runtime environment: an application, plus all its dependencies, libraries and other binaries, and the configuration files needed to run it, bundled into one self-contained package. Abstracting the operating system and hardware dependencies eliminates the problems that differences between the development, test, and production environments cause in application development, as the container is like a space capsule in which the application has everything it needs to survive. Could the idea also be applied to storage?

The monolithic architecture is the original integrated system, where the hardware, operating system, and applications are all designed together. A change in any one of the three directly affects the other two, which slows development and advances to a crawl. Think of the compatibility matrix we all had to manage in the past. Abstracting the hardware from the operating systems and applications delivered the ability to run multiple operating systems on the same hardware, which turned the hardware into a large fungible server pool that could be tasked to run Unix, Microsoft, or other operating systems on demand. The benefits were groundbreaking; however, the cost was that it required a primary operating system as well as an operating system for each virtual machine. That overhead consumed CPU cycles for management and large amounts of redundant storage space. In come containers, the new kid on the block. Containers create the abstraction between the primary operating system and the individual application. They eliminate the need for an operating system for every application, making each application lighter and smaller in management and storage footprint. This streamlined approach requires the application to carry with it, inside the container, all the dependencies, libraries, other binaries, and configuration files it needs to run. Below is a simplified visual representation of the monolithic, virtual machine, and container architectures.

Containers look fantastic, and we haven’t even covered how to do development in layers within a container architecture, which is a game-changer. We won’t cover DevOps here, but it is so cool. Look up my former colleagues and fellow alumni Dr. Dinesh Subhraveti and Dr. Shaya Potter, who did their dissertations on containers and container layers (i.e., invented them) and helped build several companies on those technologies.

A container is usually tens of megabytes in size, whereas a virtual machine with its own operating system can be several gigabytes in size. With a difference of an order of magnitude or two in size, a single server can host many more containers than virtual machines. Another benefit is startup time. A virtual machine has to go through the operating system boot sequence before the application can be initiated, which can take several minutes. A container spins up almost instantly. This capability allows a containerized application to be started when needed and then disappear when no longer needed, freeing resources for other processes. Containers can also be used in conjunction with other containers, delivering clean and efficient modularity. Modularity allows the application to be split into individual functions, such as database, front end, etc.
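The size argument above can be sketched as back-of-the-envelope arithmetic. The figures below (a 50 MB container image, a 4 GB virtual machine image, 1 TB of server storage) are illustrative assumptions picked to match "tens of megabytes" and "several gigabytes", not measurements, and storage is only one of several limits on how many instances a real server can host.

```python
# Illustrative assumptions only, not measurements.
CONTAINER_IMAGE_MB = 50          # "tens of megabytes"
VM_IMAGE_MB = 4 * 1024           # "several gigabytes"
SERVER_STORAGE_MB = 1024 * 1024  # hypothetical 1 TB of local storage


def instances_that_fit(storage_mb: int, image_mb: int) -> int:
    """Return how many whole images of a given size fit in the storage."""
    return storage_mb // image_mb


containers = instances_that_fit(SERVER_STORAGE_MB, CONTAINER_IMAGE_MB)
vms = instances_that_fit(SERVER_STORAGE_MB, VM_IMAGE_MB)

print(f"containers that fit: {containers}")
print(f"virtual machines that fit: {vms}")
print(f"ratio: roughly {containers // vms}x")
```

Under these assumed sizes the gap is a little under two orders of magnitude, which is consistent with the "order of magnitude or two" described above.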

A container’s ability to execute applications and programs by itself is another powerful feature with numerous implications and benefits. Lastly, the self-contained nature of containers delivers unmatched portability across compute environments, from laptop to cloud.

So, what is the catch with containers and storage? Containers are stateless…storage is stateful.
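To make the catch concrete, here is a minimal toy model in Python. It is not the API of any real container runtime: the `Container` and `Volume` classes and the `run_and_replace` helper are hypothetical names invented for this sketch. It assumes that whatever a container writes to its own scratch space vanishes when the container is replaced, while data written to external storage survives.

```python
class Volume:
    """Hypothetical stand-in for external, stateful storage."""

    def __init__(self):
        self.data = {}


class Container:
    """Toy container: its scratch space is lost when the container goes away."""

    def __init__(self, volume=None):
        self.scratch = {}     # lives and dies with this container
        self.volume = volume  # optional external storage that outlives it

    def write(self, key, value, persistent=False):
        if persistent:
            if self.volume is None:
                raise RuntimeError("no external storage attached")
            self.volume.data[key] = value
        else:
            self.scratch[key] = value

    def read(self, key):
        if key in self.scratch:
            return self.scratch[key]
        if self.volume is not None:
            return self.volume.data.get(key)
        return None


def run_and_replace(volume):
    """Write from one container, discard it, then read from a fresh one."""
    first = Container(volume)
    first.write("session", "temporary")                # scratch only
    first.write("order-42", "paid", persistent=True)   # external storage
    del first                                          # the container disappears

    second = Container(volume)                         # a new instance spins up
    return second.read("session"), second.read("order-42")


print(run_and_replace(Volume()))
```

In this sketch the replacement container no longer sees the scratch data, but it does see what was written to the external volume. That asymmetry is the tension between stateless containers and stateful storage.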