Docker is for ephemeral data and shouldn’t have data that needs to be backed up. Or at least that’s what I was told for a while. And then I met the head of development for a large bank in Australia who said that this idea was nonsense. He said one of the biggest trends today was increasing the speed with which we move internally developed software from development into testing and eventually into production. This is the essence behind what is known as agile development. He felt that this means that if you’re using containers in development, you should use them in production. And if you’re using containers in development and production, some of them are bound to create data that is going to need to be backed up.
First Idea: Don’t Do It
One idea is that even though your app may be in a container, the database it writes to and reads from doesn’t have to be. It could run directly on the host, at the same level where Docker itself runs. The app would simply be given the appropriate information to connect to the database remotely. This way, the database lives where you can more easily protect it, and you don’t have to worry about backing up containers.
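For example, a containerized app might receive its connection details through environment variables injected at `docker run` time. A minimal sketch, in which all the variable names (DB_HOST, DB_PORT, and so on) and the DSN format are hypothetical, not a prescribed convention:

```python
import os

def build_db_dsn(env=os.environ):
    """Assemble a connection string for a database that runs outside
    the container. The variable names below are hypothetical; use
    whatever your app and database driver expect."""
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "3306")
    user = env.get("DB_USER", "app")
    name = env.get("DB_NAME", "appdb")
    return f"mysql://{user}@{host}:{port}/{name}"

# The container would be started with these values injected, e.g.:
#   docker run -e DB_HOST=db.example.internal -e DB_PORT=3306 myapp
dsn = build_db_dsn({"DB_HOST": "db.example.internal", "DB_PORT": "3306",
                    "DB_USER": "app", "DB_NAME": "appdb"})
```

The point of the design is that nothing stateful lives inside the container; the container can be destroyed and recreated freely while the database stays where your existing backup tools can reach it.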
Second Idea: The Old Way
If you have a database inside a container and you want to back it up, you can use the same tools backup people have used for years: run mysqldump or the appropriate command for your database. This, of course, requires two things. The first is that you give your database the ability to back up to some persistent storage. The second is that you automate the process, including exception reporting and notification. This is the method most shops are likely to adopt, but it’s also the most expensive and time consuming, because you need enough space to hold an entire database backup.
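That automation can be sketched as follows, assuming MySQL, a persistent /backups directory, and a notify() stand-in for whatever alerting you actually use; every name here is illustrative:

```python
import datetime
import subprocess

def notify(message):
    """Stand-in for real exception reporting (email, Slack, pager...)."""
    print(message)

def dump_command(db, user, outdir="/backups"):
    """Build a mysqldump invocation that writes a dated dump file to
    persistent storage. Returns the command and the output path."""
    stamp = datetime.date.today().isoformat()
    outfile = f"{outdir}/{db}-{stamp}.sql"
    cmd = ["mysqldump", "--single-transaction",
           f"--result-file={outfile}", "-u", user, db]
    return cmd, outfile

def run_backup(db, user):
    """Run the dump and report any failure rather than failing silently."""
    cmd, outfile = dump_command(db, user)
    try:
        subprocess.run(cmd, check=True)
    except (subprocess.CalledProcessError, FileNotFoundError) as exc:
        notify(f"backup of {db} failed: {exc}")
        raise
    return outfile
```

A cron job (or similar scheduler) would call run_backup() nightly; the key point is that failures surface through notify() instead of disappearing into a log nobody reads.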
Third Idea: Docker Volumes and Snapshots
If you can place your database on a Docker Volume, you may be able to take a snapshot of that volume and then replicate that snapshot somewhere. If this is your plan, put the database into backup mode while you take the snapshot; otherwise, you will create a crash-consistent backup, not an application-consistent one. Put Oracle and PostgreSQL into backup mode long enough to create the snapshot or file system backup, and then take them out of backup mode. You can lock MySQL with a FLUSH TABLES WITH READ LOCK command. Just make sure that you synchronize the database lock with the creation of the snapshot. Then, once the database is in the proper mode, make a snapshot of the Docker Volume it’s on and replicate that snapshot. (Related products include Portworx, Hedvig and ClusterHQ. StorageSwiss also has an on-demand webinar that covers some of the storage challenges related to Docker: “The Top 4 Requirements of a Docker Storage Architecture.” The first commercial product designed to back up containers is from Asigra.)
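The lock-snapshot-unlock sequence for MySQL can be sketched like this, assuming a DB-API-style connection and a take_snapshot() callback standing in for whatever your storage layer actually provides; the important detail is that the same connection holds the lock for the entire snapshot:

```python
def consistent_snapshot(conn, take_snapshot):
    """Take an application-consistent snapshot of a MySQL database.

    conn          -- an open DB-API connection to the database
    take_snapshot -- callable that snapshots the underlying Docker
                     Volume and returns a snapshot identifier
                     (placeholder for your storage product's API)
    """
    cur = conn.cursor()
    cur.execute("FLUSH TABLES WITH READ LOCK")  # quiesce writes
    try:
        snap_id = take_snapshot()               # snapshot while locked
    finally:
        cur.execute("UNLOCK TABLES")            # always release the lock
    return snap_id
```

Note that FLUSH TABLES WITH READ LOCK is released if the connection that issued it closes, which is exactly why the lock and the snapshot must be coordinated from one place rather than two independent scripts.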
Automating this from outside your container is the next challenge. Today, this will most likely be a homegrown scripted solution: a single script running at the host level that connects remotely to the database running inside the container to issue the lock commands, then tells the storage layer to snapshot the Docker Volume.
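One building block such a host-level script needs is the on-disk location of the volume, which `docker volume inspect` reports as JSON. A small sketch, where the volume name dbdata is hypothetical and the snapshot step itself still depends on your storage product:

```python
import json
import subprocess

def parse_mountpoint(inspect_output):
    """Extract the host path from `docker volume inspect` JSON output."""
    return json.loads(inspect_output)[0]["Mountpoint"]

def volume_mountpoint(volume_name):
    """Ask the Docker daemon where a named volume lives on the host.
    This runs at the host level, outside any container."""
    result = subprocess.run(
        ["docker", "volume", "inspect", volume_name],
        capture_output=True, check=True, text=True)
    return parse_mountpoint(result.stdout)

# Usage (on a host with Docker running):
#   path = volume_mountpoint("dbdata")
# The script would then lock the database over its published TCP port
# (it is "remote" from the host's point of view, even though the
# container is local), snapshot that path, and unlock.
```

From the host’s perspective the database is just another network service, so the same script can drive both halves: the SQL lock commands over TCP and the snapshot through the storage layer.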
I’d love to hear from those of you running Docker Containers in production. What are you doing for backups? Do you wish you could do better?