In this post we look at how to persist data from a Docker container so that it is still there after the container has been stopped and removed.

Docker does this with a concept called volumes.

What this post will cover

  • What is a volume
  • Why one should use Volumes
  • Working with Volumes

What this post will not cover

  • More specific examples such as databases
  • Complex and advanced scenarios with networks and multiple volumes
    (see my later posts on this topic)

Why do you want to use a Docker Volume?

Any non-trivial application needs to store the data that users create or bring into the system. In most cases this data must not be lost when the Docker container is stopped, changed or upgraded.

What is a Docker volume?

In brief, a Docker volume allows data that should be persisted to live outside a Docker container. This means that you can replace a container without losing the data it created.

A common use case is a database that runs in a container and keeps its data in a volume, so that the data is not lost when the container is replaced.

Any application works with two kinds of files: those that are needed to run the application (binaries, config files, resources/assets and so on) and those that the application generates while it is running. Docker handles these two kinds of files differently.

The files that are needed to run the application are part of the Docker image that the container is created from. The files that the application generates at runtime are, by default, written to the container's own file system and disappear when the container is removed.

Before we go into detail on how to work with Docker volumes, I want to demonstrate this with an example.

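  1. Create a file named Dockerfile.volumes with the following content:
    FROM alpine:latest
    WORKDIR /mydata
    ENTRYPOINT (test -e message.txt && echo "this file already exists" \
    || (echo "Creating the messageFile..." \
    && echo Hello, Docker $(date '+%X') > message.txt)) && cat message.txt
    The image is based on Alpine, a minimal Linux distribution. On start, the container checks whether message.txt exists. If it does not, the file is created with the text "Hello, Docker" plus the current time. In both cases the content of the file is printed.
  2. Change to the directory that contains Dockerfile.volumes, then build the image and run a container: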

    docker build . -t mydocker/volumes -f Dockerfile.volumes
    docker run --name volume_test mydocker/volumes
    


    The output will be (the time is when I ran the test):

    Creating the messageFile...
    Hello, Docker 20:13:47
    

If you restart the container with

docker start -a volume_test

The output is

this file already exists
Hello, Docker 20:13:47
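
The file is still there because a stopped container keeps its own writable file system. One way to check this is to ask Docker which paths the container changed compared to its image. This is only a small sketch: it assumes the container name volume_test from above, and the function name show_container_changes is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the paths that were added (A), changed (C)
# or deleted (D) in a container's file system compared to its image.
show_container_changes() {
    docker diff "$1"
}

# Example call (needs the container created above):
# show_container_changes volume_test
```

For the container from this example, message.txt should be listed as an added entry below /mydata.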

So the file system of the container survives a restart.
But if you now delete the container

docker rm -f volume_test

and create it again with the same docker run command as before, the output is again

Creating the messageFile...
Hello, Docker 20:15:26
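
The whole experiment can be replayed in one go. The sketch below only strings together the commands already shown in this section; the function name demo_without_volume is an assumption for illustration, and the image mydocker/volumes must have been built beforehand.

```shell
#!/bin/sh
# Hypothetical helper: run, restart, remove and re-create the container
# to show that its data does not survive removal.
demo_without_volume() {
    image="mydocker/volumes"
    name="volume_test"

    # First run: message.txt does not exist yet and is created.
    docker run --name "$name" "$image" || return 1
    # Restart of the same container: the file is still there.
    docker start -a "$name" || return 1
    # Removing the container also removes its file system.
    docker rm -f "$name" || return 1
    # A new container starts from the image again: the file is re-created.
    docker run --name "$name" "$image" || return 1
}

# Example call:
# demo_without_volume
```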

Simply going around and deleting containers is hardly the way to go in production environments, yet mistakes happen and machines fail.

So how do we prevent this with volumes? Read ahead for that.

How do we work with Volumes?

Docker solves the data file problem described above by keeping data files outside of the container and its file system while still making them available to the application that runs inside the container.

To use a volume you need three steps:

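  1. Update the Dockerfile
    Add the following line between the FROM and WORKDIR instructions, and change WORKDIR from /mydata to /data so that message.txt is written into the volume directory: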

    VOLUME /data

    This tells Docker that any file stored in the directory named by the VOLUME instruction should be stored in a volume, i.e. outside the container. The important thing is that this is completely transparent to the application in the container. It still reads and writes files in this directory as if they were in its own file system.

    Also do not forget to update the image:

    docker build . -t mydocker/volumes -f Dockerfile.volumes
  2. Create the volume
    To create a volume where the data can actually be stored, run the command below (older Docker releases expect the name via the --name option instead)

    docker volume create test_storage
    

     

  3. Create the container
    Last but not least, create the container again:

    docker run --name volume_test_2 -v test_storage:/data mydocker/volumes

    The -v option tells Docker that the data inside the directory /data should be stored in the volume test_storage that we just created.
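
For reference, the Dockerfile from step 1 could look like the sketch below after the update. It assumes that WORKDIR points to /data, the same directory as the VOLUME instruction, because message.txt is written to the working directory. The helper write_dockerfile is hypothetical and only writes the file.

```shell
#!/bin/sh
# Hypothetical helper: write the updated Dockerfile.volumes into the
# given directory (defaults to the current directory).
write_dockerfile() {
    target_dir="${1:-.}"
    # The quoted 'EOF' keeps $(date ...) and the backslashes literal.
    cat > "$target_dir/Dockerfile.volumes" <<'EOF'
FROM alpine:latest
VOLUME /data
WORKDIR /data
ENTRYPOINT (test -e message.txt && echo "this file already exists" \
    || (echo "Creating the messageFile..." \
    && echo Hello, Docker $(date '+%X') > message.txt)) && cat message.txt
EOF
}

# Example call, followed by the build command from step 1:
# write_dockerfile . && docker build . -t mydocker/volumes -f Dockerfile.volumes
```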

The first docker run with the volume yields the same result as before: the file is created.
Yet if you remove the container and then run it again, it still prints the line "this file already exists" from our example. (This holds as long as you do not remove the volume from your machine.)
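
The persistence check can be sketched as a short script. It assumes the image was rebuilt from a Dockerfile in which message.txt ends up in /data, and that test_storage is a fresh volume. The function names demo_with_volume and cleanup_volume_demo are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show that data in the named volume outlives
# the container that wrote it.
demo_with_volume() {
    image="mydocker/volumes"
    volume="test_storage"
    name="volume_test_2"

    # Create the volume (does nothing harmful if it already exists).
    docker volume create "$volume" || return 1
    # On a fresh volume this run creates message.txt inside the volume.
    docker run --name "$name" -v "${volume}:/data" "$image" || return 1
    # Remove the container; the volume is left untouched.
    docker rm -f "$name" || return 1
    # A new container with the same volume finds the existing file.
    docker run --name "$name" -v "${volume}:/data" "$image" || return 1
}

# Hypothetical helper: remove the container and then the volume.
# Only the second command deletes the stored data.
cleanup_volume_demo() {
    docker rm -f volume_test_2 || return 1
    docker volume rm test_storage || return 1
}

# Example calls:
# demo_with_volume
# cleanup_volume_demo
```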

How do I find out if my container uses Volumes?

This sounds trivial, but in complex applications keeping track of volumes can become tedious very quickly. This is where the docker inspect command comes in handy.

You use it like so

docker inspect mydocker/volumes

The result will be a JSON description of the image with a section on Volumes:

"Volumes":
{
"/data":{}
},
...
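
If you only want the volume information instead of the full JSON, the --format option of docker inspect can narrow the output down. The following is a minimal sketch; the function names image_volumes and container_mounts are assumptions for illustration, and the container name in the example call is the one used earlier in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the volumes an image declares through
# VOLUME instructions.
image_volumes() {
    docker inspect --format '{{json .Config.Volumes}}' "$1"
}

# Hypothetical helper: print the volumes and bind mounts that are
# actually attached to a container.
container_mounts() {
    docker inspect --format '{{json .Mounts}}' "$1"
}

# Example calls:
# image_volumes mydocker/volumes
# container_mounts volume_test_2
```

The first call looks at what the image declares, the second at what a concrete container really uses, which is where the name of the volume (here test_storage) shows up.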

Summary

This post gave an overview of the problem of data that should live longer than a Docker container. We explored the why, what and how of this problem, which Docker solves with the concept of volumes.

Where to go from here:
Run a Database in a Docker Volume.

Categories: Docker
