Where does the data go?

The popularity of running containers in development is increasing day by day, and the advantages of containers over traditional virtualization are widely known. But where does the data in a running container go? Let's find out!

Consider the image below:

There is a local machine with a container running in its memory, and the container wants to access data somewhere on the disk.

There are two ways in which Docker allows us to do this:

  • Persisting state option 1: Bind mounts
  • Persisting state option 2: Volumes

Let us have a look at both methods.

Persisting State Option-1: Bind Mounts

The left side of the diagram above shows a container in RAM that is accessing a mount on the disk. This is a quick way to access data on the machine's disk.
A container can therefore directly access a directory that resides on the host computer's file system. This is called a bind mount.

We can make use of the following command, where IMAGE stands for the name of the image to run:
docker run --volume LOCAL-PATH:CONTAINER-PATH IMAGE

This command maps the local directory into the container. We can also specify permissions, such as ro for read-only.

E.g.:
docker run -v /home/username/project:/app:ro IMAGE
docker run -v "$(pwd)":/opt/project IMAGE
-----------------------------------------------------------------------
docker run --mount type=bind,source="$(pwd)",target=/opt/project IMAGE

This method works well in development mode.
If we are in a local development phase and not running Docker containers in a production system, it can be fast and efficient. It is not desirable to use a bind mount in a production system, however.
The simple reason is that our local path would not be present in the cloud when we deploy our project there.
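This dependency on a host path can be made visible with a small shell sketch. It assumes a POSIX shell; the helper name bind_mount_spec and the paths are hypothetical and only illustrate the idea. The function builds the value for --mount and refuses to do so when the host directory is missing, which is the situation we would face on a cloud host that does not have our local path.

```shell
#!/bin/sh
# Hypothetical helper: build a --mount spec for a bind mount.
# Prints the spec on success; fails if the host directory does not exist.
bind_mount_spec() {
    src=$1          # directory on the host
    target=$2       # path inside the container
    mode=${3:-rw}   # "ro" for read-only, anything else means read-write

    if [ ! -d "$src" ]; then
        echo "host path not found: $src" >&2
        return 1
    fi

    spec="type=bind,source=$src,target=$target"
    if [ "$mode" = "ro" ]; then
        spec="$spec,readonly"
    fi
    echo "$spec"
}

# Usage (IMAGE is a placeholder for an image name):
#   docker run --mount "$(bind_mount_spec "$(pwd)" /opt/project ro)" IMAGE
```

Checking the path up front is a design choice for this sketch: the command only makes sense on a machine that really has the source directory.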

This is why Docker provides another mechanism, called volumes.

Persisting State Option-2: Volumes

When we install the Docker runtime on our local machine, we give it access to the machine's computing resources, for example the CPU, the memory, and some space on disk. As a result, Docker is able to manage a part of the local machine's hard disk, controlling the allocation of that disk space by itself.
This method is more robust than the previous one. It can be used in both development and production.

The following command uses the volume method (IMAGE stands for the name of the image to run):
docker run --volume VOLUME-NAME:CONTAINER-PATH IMAGE

For example:
docker run -v project-data:/app IMAGE
-------------------------------------------------------------------
docker run --mount source=mysql-data,target=/var/lib/mysql IMAGE
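Named volumes can also be created and inspected explicitly before a container uses them. The sketch below assumes a POSIX shell; the run wrapper, the DRY_RUN variable, the volume name project-data and the IMAGE placeholder are assumptions made for illustration. By default it only prints the Docker commands; with DRY_RUN=0 it executes them.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Walk through the life cycle of a named volume (names are hypothetical).
volume_demo() {
    volume=$1
    image=$2
    run docker volume create "$volume"               # let Docker allocate the storage
    run docker volume inspect "$volume"              # show where Docker keeps the data
    run docker run --rm -v "$volume:/app" "$image"   # mount the volume at /app
    run docker volume rm "$volume"                   # delete the volume and its data
}

volume_demo project-data IMAGE
```

Note that no host path appears anywhere in these commands: Docker decides where the data lives, which is what makes the approach portable between machines.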

Concept of Volume in Docker and Kubernetes

Volumes are a concept in Docker, and they are also an important concept in Kubernetes.

How Docker represents a volume

When an image is built from a Dockerfile without any volume, a temporary container is created during the image building process, and a snapshot of the files is copied into that container. The snapshot is locked in time and is not updated by default when the code changes.

To get the changes picked up, we can either:

  • build the image all over again and do the straight copy, or
  • abandon the approach of doing the straight copy

With a Docker volume, rather than doing a straight copy, we adjust the docker run command that starts the container so that it uses a Docker feature called volumes.

The volume sets up a reference inside the container that points back to the local machine and gives access to the files inside the referenced folders on the local machine.

So a Docker volume can be thought of as something like port mapping. In port mapping, we map a port inside the container to a port outside the container. Similarly, with a Docker volume we set up a mapping from a folder inside the container to a folder outside the container.
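The analogy can be seen directly on the command line, where both mappings are written as OUTSIDE:INSIDE. The sketch below only prints the resulting command; the function name mapped_run_command, the port numbers, the paths and the IMAGE placeholder are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a docker run command that maps one port
# and one folder, both written as OUTSIDE:INSIDE.
mapped_run_command() {
    host_port=$1
    container_port=$2
    host_dir=$3
    container_dir=$4
    image=$5
    echo "docker run -p $host_port:$container_port -v $host_dir:$container_dir $image"
}

# Port 8080 on the machine reaches port 80 in the container;
# /home/username/project on the machine appears as /app in the container.
mapped_run_command 8080 80 /home/username/project /app IMAGE
```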

To learn more about Docker volumes, refer here.

The term volume appears very similar in the contexts of both Docker and Kubernetes. Let's take a brief look at what a volume is in Kubernetes.

What is a volume in Kubernetes

A Kubernetes volume is a directory containing data that is accessible to the containers in a given Pod, which resides on the orchestration and scheduling platform. Volumes in Kubernetes provide a plug-in mechanism to connect ephemeral containers with persistent data stores elsewhere.

Docker has a concept of volumes, but it is less managed and its functionality is somewhat limited. Kubernetes, on the other hand, supports many types of volumes. At its core, a volume is a directory, which may or may not have data in it, and which is accessible to the containers in a Pod.
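As a rough illustration of a directory made accessible to the containers in a Pod, the sketch below prints a minimal Pod manifest that uses the emptyDir volume type, one of the volume types Kubernetes supports. The shell function pod_manifest, the Pod, container and volume names, the mount path and the IMAGE placeholder are all assumptions; replace IMAGE with a real image name before using the manifest.

```shell
#!/bin/sh
# Hypothetical helper: print a Pod manifest with one volume.
# The volume is declared under spec.volumes and mounted into the
# container under volumeMounts; the two are linked by the volume name.
pod_manifest() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
    - name: app
      image: IMAGE
      volumeMounts:
        - name: scratch
          mountPath: /data
  volumes:
    - name: scratch
      emptyDir: {}
EOF
}

pod_manifest
# To try it on a cluster: pod_manifest | kubectl apply -f -
```

An emptyDir volume starts out empty and lives only as long as the Pod does; other volume types connect the same mount point to storage that outlives the Pod.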

To learn more about Kubernetes volumes, refer here.

Additional Resources:

Also read: Beginner's guide to Docker, for a brief introduction to what Docker is and how it works.
