Backup for containers – what, when and why?

While container backup may still seem a rather exotic concept to many firms, vendors of data protection software are adding this functionality to their products with increasing frequency.

In the last 18 months or so, containerization has become one of the hottest topics in the IT industry. Its most enthusiastic proponents claim that it will soon consign virtualization to the trash can. And while many people may disagree with this claim, one thing is sure – containerization is not just a passing trend.

Differences between containerization and virtualization

Containers and virtual machines are probably destined to co-exist for a long time to come. Virtualization has of course brought a great many benefits to the world of new technologies – it allows you to make better use of physical server resources, simplifies system scaling, and reduces costs related to purchasing and maintaining equipment. However, the world is racing forward and is constantly on the lookout for new, better solutions, and what better example of this is there than the IT industry?

Specialists are beginning to identify drawbacks in a technology that, until very recently, was considered a perfect solution. The most common complaint is that virtualization requires several layers: the hardware on which the operating system runs, a hypervisor, and virtual machines that each carry their own operating system, followed by libraries, binaries and applications. Containerization reduces this overhead by removing the hypervisor and the duplicated operating systems. As a result, containers not only take up less space but are also more efficient. Containerized applications start within seconds, and many more applications fit on one machine than is possible with virtual machines. What is more, sharing one operating system across all applications lightens the workload of installing patches and updates.

Microservices and containers

Most software in use today has a monolithic architecture. This is a model in which all functions are located in a single code base and are deployed as a single unit. However, for some time now there has been a noticeable shift toward software based on microservices and containers. Gartner predicts:

By 2022, more than 75% of global organizations will be running containerized applications in production, up from less than 30% today.

Software development is driven by, among other things, new trends such as the internet of things, artificial intelligence and self-driving vehicles. Analysts estimate that there are around 300 million applications in existence around the world, and that in four years there will be almost 800 million. Such a significant increase would not be possible without progress in software development. It also explains the growing interest in microservices. Containers fit ideally into the new architecture, which is based on small independent services that each carry out a single task. The same cannot be said of virtual machines. Running microservices directly in virtual machines is generally inefficient, because a relatively large proportion of processor resources, RAM and disk space is wasted.

Do containers need backup? 

The expansion of containers presents another challenge to IT departments, who must solve issues related to management, monitoring and data protection. While the first two tasks can be successfully completed using the open-source Kubernetes platform, in the case of data protection the situation is more complex.

Those with a poor knowledge of containers assume that it is unnecessary to back them up due to their exceptionally short lifespan. Research conducted by Sysdig showed that in 2018 only five percent of containers lived longer than one week. For 2019, Sysdig reported:

Comparing container lifespans year over year, we found that the number of containers that are alive for 10 seconds or less has doubled to 22%. In fact, the number of containers that live for 5 minutes or less grew by 2X as well.

Container Lifespans

  • <10 seconds – 22%
  • <1 minute – 17%
  • <5 minutes – 15%
  • <10 minutes – 9%
  • <30 minutes – 10%
  • <1 hour – 4%
  • <6 hours – 6%
  • <1 day – 3%
  • <1 week – 8%
  • <2 weeks – 4%
  • >2 weeks – 4%
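
The buckets above can be added up to check the figures quoted from Sysdig. The following is a minimal Python sketch, assuming that each bucket is exclusive (a container is counted once, in the narrowest bucket it fits), which the near-100% total suggests. The function names are illustrative only.

```python
# Shares copied from the list above: (bucket label, percent of containers).
LIFESPAN_BUCKETS = [
    ("<10 seconds", 22),
    ("<1 minute", 17),
    ("<5 minutes", 15),
    ("<10 minutes", 9),
    ("<30 minutes", 10),
    ("<1 hour", 4),
    ("<6 hours", 6),
    ("<1 day", 3),
    ("<1 week", 8),
    ("<2 weeks", 4),
    (">2 weeks", 4),
]


def share_up_to(label):
    """Sum the shares of every bucket up to and including `label`."""
    total = 0
    for bucket_label, percent in LIFESPAN_BUCKETS:
        total += percent
        if bucket_label == label:
            return total
    raise ValueError(f"unknown bucket: {label}")


def share_after(label):
    """Sum the shares of every bucket that comes after `label`."""
    labels = [bucket_label for bucket_label, _ in LIFESPAN_BUCKETS]
    start = labels.index(label) + 1  # raises ValueError for an unknown label
    return sum(percent for _, percent in LIFESPAN_BUCKETS[start:])


print(share_up_to("<5 minutes"))  # 54
print(share_after("<1 week"))     # 8
print(sum(percent for _, percent in LIFESPAN_BUCKETS))  # 102
```

On these figures, roughly half of all containers live for five minutes or less, while about 8% run for longer than a week. The published percentages are rounded, so they add up to 102 rather than 100.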

However, experts have drawn attention to a few crucial issues. Stateful applications hold information about the system in which they operate, such as user session files or temporary data. The risk of losing this data is only too real and can cause problems for the user, which is why backup is recommended in such cases.

It’s also worth asking yourself the question: What would happen if someone removed the entire Kubernetes cluster, the container nodes and the associated persistent storage? As you might expect, creating a backup allows you to recover resources after an outage or cyber-attack. What’s more, environment replication can be extremely useful when moving from a test environment to a production environment, and can help in the migration of a Kubernetes cluster. However, you should remember that replicating the whole container environment after an outage requires several components: the container images, the attached storage and databases, the persistent volumes, and etcd.

Unfortunately, traditional backup and DR software is not effective in a container environment. Products of this type usually focus on protecting individual servers and the applications running on them. Meanwhile, in a Kubernetes environment, applications are often widely distributed and sometimes span multiple clouds and data centers. In addition, containers are usually short-lived, which is a big challenge for backup applications. The only way to provide appropriate protection for such applications is to use backup tools that have been specially designed to work with Kubernetes.

Finally, it’s worth mentioning that backup is unnecessary in two cases. The first is stateless applications, which do not retain information between individual client requests. The second is pods, that is, groups of running containers, which are not backed up themselves.
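
One rough way to tell stateful workloads from stateless ones is to check whether a Kubernetes manifest claims persistent storage. The sketch below assumes the manifest has already been loaded into a Python dictionary. The `persistentVolumeClaim` and `volumeClaimTemplates` fields are part of the Kubernetes API, while the function name and the rule itself are assumptions made for illustration. An application that keeps its state in an external database would not be detected this way.

```python
from typing import Any, Dict


def claims_persistent_storage(workload: Dict[str, Any]) -> bool:
    """Heuristic: True if the manifest requests persistent storage."""
    spec = workload.get("spec") or {}

    # StatefulSets declare per-replica claims via volumeClaimTemplates.
    if spec.get("volumeClaimTemplates"):
        return True

    # Deployments and StatefulSets nest the pod spec under
    # spec.template.spec; a bare Pod carries it directly in spec.
    pod_spec = (spec.get("template") or {}).get("spec") or spec
    volumes = pod_spec.get("volumes") or []
    return any("persistentVolumeClaim" in volume for volume in volumes)


# Hypothetical stateless web front end using only scratch space.
web = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"volumes": [{"name": "tmp", "emptyDir": {}}]}}},
}

# Hypothetical database with one persistent volume claim per replica.
database = {
    "kind": "StatefulSet",
    "spec": {
        "volumeClaimTemplates": [{"metadata": {"name": "data"}}],
        "template": {"spec": {"containers": []}},
    },
}

print(claims_persistent_storage(web))       # False
print(claims_persistent_storage(database))  # True
```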

Container technology is expanding. Business is booming.

Proof that containerization is not simply another overblown trend is best provided by the example of VMware. At the last two VMworld conferences, the Palo Alto firm dedicated more time to containers than to virtual machines. The market leader in virtualization is focusing heavily on promoting VMware Tanzu – a suite of products and services used for creating, launching and supporting the latest applications in Kubernetes containers. And they’re not the only ones.

At the end of last year, the storage industry was buzzing with news of two acquisitions. Pure Storage took over Portworx – a startup developing software for container data management. Meanwhile, Veeam bought the firm Kasten, which specializes in backup and DR solutions for Kubernetes. In addition to this, it is worth noting that in 2020 NetApp expanded its portfolio to include a platform for storing data in Kubernetes environments, while Commvault implemented comprehensive Kubernetes support for its Hedvig software-defined storage environment. Niraj Tolia, the CEO of Kasten, even went so far as to say that Kubernetes will be the next VMware.

Storware holds its own in such company. Version 3.7 of Storware vProtect, released as early as 2018, was compatible with the Kubernetes platform, and from version 3.9 onwards the software has also supported OpenShift, Red Hat’s container platform built on Kubernetes. Paweł Mączka, Storware CTO, admits that:

When containers first appeared, clients did not ask about data protection, but with time their awareness of the issue is growing. Additionally, more and more stateful applications are appearing because DevOps teams are putting their databases in containers. I don’t think that in the short term containers will take over the world of IT, but for us it is important for Storware vProtect to make copies of data in a container environment.

text written by:

Angelika Jeżewska, CMO at Storware