Docker Volumes and Persistent Storage: The Complete Guide
Containers are ephemeral by design. When a container is removed, any data written to its writable layer is lost with it. This is fine for stateless applications, but databases, file uploads, configuration, and logs all need to survive container restarts, updates, and redeployments. Docker volumes are the mechanism for persisting data beyond the container lifecycle.
This guide covers every aspect of Docker storage: the three types of mounts, when to use each, how to configure volume drivers for network storage, backup and restore procedures, the dreaded permission problems, and performance-oriented tmpfs mounts.
The Three Types of Docker Storage
Docker provides three mount types, each with a different relationship to persistence:
1. Named Volumes
Named volumes are managed entirely by Docker. They are stored in Docker's storage directory (/var/lib/docker/volumes/ by default) and are the recommended approach for most use cases.
# Create a named volume
docker volume create mydata
# Use a named volume in a container
docker run -v mydata:/app/data myapp
# Using the --mount syntax (more explicit)
docker run --mount type=volume,source=mydata,target=/app/data myapp
# List all volumes
docker volume ls
# Inspect a volume
docker volume inspect mydata
# Returns: Mountpoint, Driver, Labels, etc.
Advantages of named volumes:
- Docker manages the storage location and cleanup
- Work on both Linux and macOS/Windows
- Can be shared between multiple containers
- Support volume drivers for remote/network storage
- Can be pre-populated with data from the container image
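The pre-population behavior in the last point is worth seeing in action. A minimal sketch (the volume and container names here are arbitrary): when an empty named volume is mounted over a path that already contains files in the image, Docker copies those files into the volume on first use.

```shell
docker volume create webcontent

# nginx ships default pages at this path; mounting an empty named volume
# there triggers a one-time copy of the image's contents into the volume
docker run -d --name web -v webcontent:/usr/share/nginx/html nginx:alpine

# The image's default files are visible through the volume
docker exec web ls /usr/share/nginx/html
```

A bind mount at the same path would instead hide the image's files, which is what the comparison table later in this guide means by "Overrides image contents".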
2. Bind Mounts
Bind mounts map a specific host directory or file into the container. With the --mount syntax the host path must exist before the mount; with -v, Docker creates a missing host directory (owned by root), which is a frequent source of permission problems.
# Bind mount a host directory
docker run -v /host/path/data:/container/data myapp
# Using --mount syntax (recommended for clarity)
docker run --mount type=bind,source=/host/path/data,target=/container/data myapp
# Read-only bind mount
docker run -v /host/config:/app/config:ro myapp
docker run --mount type=bind,source=/host/config,target=/app/config,readonly myapp
# Bind mount a single file
docker run -v /host/path/config.json:/app/config.json:ro myapp
Bind mounts are useful for:
- Development workflows where you want live code reloading
- Sharing configuration files from the host
- Accessing host directories for log aggregation
- When you need to control the exact host filesystem location
Warning: Bind mounts give the container access to the host filesystem. A container with a bind mount to / has full access to the host. Always mount the most specific path possible and use :ro when the container only needs to read.
3. tmpfs Mounts
tmpfs mounts store data in the host's memory. The data is never written to disk and disappears when the container stops.
# Create a tmpfs mount
docker run --tmpfs /app/cache myapp
# With size limit and options
docker run --mount type=tmpfs,target=/app/cache,tmpfs-size=100m,tmpfs-mode=1777 myapp
# Multiple tmpfs mounts
docker run --tmpfs /tmp:size=50m --tmpfs /app/cache:size=100m myapp
Use tmpfs for:
- Temporary files that should never touch disk (security-sensitive data)
- Cache directories where performance matters more than persistence
- Working directories in read-only filesystem containers
- Session files, temp uploads before processing, scratch space
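The read-only-filesystem case above can be sketched as follows (a minimal, hypothetical example): the container's root filesystem is locked down, and tmpfs supplies the only writable paths.

```shell
# Root filesystem is read-only; only the tmpfs mounts accept writes
docker run --rm \
  --read-only \
  --tmpfs /tmp:size=50m \
  --tmpfs /run \
  alpine sh -c 'echo scratch > /tmp/work.txt && cat /tmp/work.txt'
```

A write anywhere outside /tmp or /run in this container fails with a read-only filesystem error, which is exactly the point.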
Named Volumes vs Bind Mounts: When to Use Each
| Aspect | Named Volumes | Bind Mounts |
|---|---|---|
| Management | Docker manages location | You manage the host path |
| Pre-population | Copies data from image | Overrides image contents |
| Portability | Works everywhere Docker runs | Depends on host path structure |
| Performance (macOS) | Fast (native) | Slower (file system translation) |
| Backup | Need container or Docker path | Direct host filesystem access |
| Use case | Database storage, app data | Development, config files |
Rule of thumb: Use named volumes for production data and bind mounts for development and configuration.
Volumes in Docker Compose
Docker Compose makes volume configuration declarative and reproducible:
services:
  postgres:
    image: postgres:16
    volumes:
      # Named volume for database data
      - db_data:/var/lib/postgresql/data
      # Bind mount for initialization scripts
      - ./init-scripts:/docker-entrypoint-initdb.d:ro

  app:
    image: myapp:latest
    volumes:
      # Named volume shared with worker
      - uploads:/app/uploads
      # Bind mount for config
      - ./config/app.json:/app/config.json:ro
    # tmpfs for temporary files
    tmpfs:
      - /tmp:size=100m

  worker:
    image: myworker:latest
    volumes:
      # Same named volume as app — shared storage
      - uploads:/app/uploads

volumes:
  db_data:
    # Docker manages this volume
  uploads:
    # External volume (must exist before docker compose up)
    # external: true

  # Volume with specific driver options
  nfs_data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.100,rw,nfsvers=4
      device: ":/exports/data"
Volume Drivers and Network Storage
Volume drivers extend Docker's storage capabilities beyond the local filesystem. They let you mount remote storage systems as Docker volumes.
NFS Volumes
NFS is the most common network storage option for Docker. It allows multiple Docker hosts to share the same storage:
# Create an NFS volume using the local driver
docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,rw,nfsvers=4.1 \
  --opt device=:/exports/shared \
  nfs_shared
# Use it like any other volume
docker run -v nfs_shared:/app/data myapp
In Docker Compose:
volumes:
  shared_data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.100,nolock,soft,rw,nfsvers=4.1"
      device: ":/exports/shared"
Tip: For NFS to work, the NFS client utilities must be installed on the Docker host (apt install nfs-common on Debian/Ubuntu, yum install nfs-utils on RHEL/CentOS). The NFS server must export the directory with appropriate permissions.
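Before wiring an NFS export into a Docker volume, it can save debugging time to verify the export from the host first. A sketch, assuming the same server address as above and that the NFS client utilities are installed:

```shell
# List the directories the server actually exports
showmount -e 192.168.1.100

# Manually test-mount the export, then clean up
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs -o nfsvers=4.1 192.168.1.100:/exports/shared /mnt/nfs-test
ls /mnt/nfs-test
sudo umount /mnt/nfs-test
```

If the manual mount fails, the Docker volume will fail the same way at container start, but with a less helpful error message.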
CIFS/SMB Volumes
For Windows file shares or Samba servers:
docker volume create --driver local \
  --opt type=cifs \
  --opt device=//192.168.1.50/share \
  --opt o=addr=192.168.1.50,username=user,password=pass,file_mode=0777,dir_mode=0777 \
  smb_volume
Third-Party Volume Drivers
The Docker plugin ecosystem includes volume drivers for enterprise storage systems:
- REX-Ray: Supports AWS EBS, Google Persistent Disk, Azure Managed Disk, and more
- Portworx: Software-defined storage for containers with replication and snapshots
- GlusterFS: Distributed filesystem for high-availability storage
- Convoy: Docker volume driver for backup and restore
# Install a volume plugin
docker plugin install rexray/ebs
# Create a volume using the plugin
docker volume create --driver rexray/ebs --opt size=50 my_ebs_volume
Backing Up Docker Volumes
Volume backup is critical for any production deployment. There are several approaches depending on your needs.
Method 1: Backup Using a Temporary Container
The standard Docker approach creates a temporary container that mounts both the volume and a host directory:
# Backup a volume to a tar archive
docker run --rm \
  -v mydata:/source:ro \
  -v $(pwd)/backups:/backup \
  alpine tar czf /backup/mydata-$(date +%Y%m%d-%H%M%S).tar.gz -C /source .
# Restore from a tar archive
docker run --rm \
  -v mydata:/target \
  -v $(pwd)/backups:/backup \
  alpine sh -c "cd /target && tar xzf /backup/mydata-20250218-120000.tar.gz"
Method 2: Direct Filesystem Backup
Since named volumes live in /var/lib/docker/volumes/, you can back them up directly (with the container stopped):
# Find the volume's mount point
docker volume inspect mydata --format '{{.Mountpoint}}'
# /var/lib/docker/volumes/mydata/_data
# Stop the container first for consistency
docker stop mycontainer
# Backup the directory
sudo tar czf mydata-backup.tar.gz -C /var/lib/docker/volumes/mydata/_data .
# Start the container again
docker start mycontainer
Method 3: Database-Specific Backups
For databases, use the application's native backup tools for consistency:
# PostgreSQL backup
docker exec postgres pg_dumpall -U postgres > pg_backup.sql
# MySQL backup
docker exec mysql mysqldump -u root --password=secret --all-databases > mysql_backup.sql
# MongoDB backup
docker exec mongo mongodump --archive=/backup/mongo.archive
docker cp mongo:/backup/mongo.archive ./mongo.archive
# Redis backup (trigger RDB save, then copy)
docker exec redis redis-cli BGSAVE
docker cp redis:/data/dump.rdb ./redis-backup.rdb
Automated Backup Script
#!/bin/bash
# backup-volumes.sh — Automated Docker volume backup
BACKUP_DIR="/backups/docker-volumes"
RETENTION_DAYS=30
DATE=$(date +%Y%m%d-%H%M%S)
mkdir -p "$BACKUP_DIR"
# Backup all named volumes
for VOLUME in $(docker volume ls -q); do
  echo "Backing up volume: $VOLUME"
  docker run --rm \
    -v "$VOLUME":/source:ro \
    -v "$BACKUP_DIR":/backup \
    alpine tar czf "/backup/${VOLUME}-${DATE}.tar.gz" -C /source .
done
# Clean up old backups
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete
echo "Backup complete. Files in $BACKUP_DIR"
Schedule this with cron for automated backups:
# Run backup daily at 2 AM
0 2 * * * /usr/local/bin/backup-volumes.sh >> /var/log/docker-backup.log 2>&1
Troubleshooting Permission Issues
Permission problems are the most common source of frustration with Docker volumes. They occur because of mismatches between the user inside the container and the owner of the files on the host.
The Root Cause
When a container runs as a non-root user (say UID 1000), it needs the volume files to be owned by UID 1000. But if the volume was created by a different container running as root (UID 0), the files are owned by root and the non-root container cannot write to them.
# Check file ownership inside the volume
docker run --rm -v mydata:/data alpine ls -la /data
# Check from the host
sudo ls -la /var/lib/docker/volumes/mydata/_data
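The mismatch is easy to reproduce with a throwaway volume (names here are arbitrary):

```shell
docker volume create permdemo

# Default: the container runs as root, so the volume's files end up owned by UID 0
docker run --rm -v permdemo:/data alpine touch /data/owned-by-root

# A container running as UID 1000 typically cannot write to that volume,
# because the volume directory itself is root-owned with mode 755
docker run --rm --user 1000:1000 -v permdemo:/data alpine touch /data/mine

docker volume rm permdemo
```

The second touch fails with a permission-denied error, which is the same failure mode you see when a real application container switches to a non-root user.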
Solutions
1. Match the container user to the volume owner:
# In Dockerfile, create a user with a specific UID
FROM node:20-alpine
RUN addgroup -g 1000 appgroup && adduser -u 1000 -G appgroup -D appuser
USER appuser
2. Fix ownership in an entrypoint script:
#!/bin/sh
# entrypoint.sh
# Fix volume permissions (runs as root initially)
chown -R appuser:appgroup /app/data
# Drop to non-root user
exec su-exec appuser "$@"
3. Use an init container to set permissions:
services:
  init-permissions:
    image: alpine
    volumes:
      - app_data:/data
    command: chown -R 1000:1000 /data
    # This container exits after fixing permissions

  app:
    image: myapp
    user: "1000:1000"
    volumes:
      - app_data:/data
    depends_on:
      init-permissions:
        condition: service_completed_successfully

volumes:
  app_data:
4. Use user namespace remapping:
# In /etc/docker/daemon.json
{
  "userns-remap": "default"
}
# This maps container root (0) to a high unprivileged UID on the host
Common Permission Scenarios
| Scenario | Symptom | Fix |
|---|---|---|
| Container runs as UID 1000, volume owned by root | Permission denied errors | chown 1000:1000 on volume data |
| New bind mount, directory doesn't exist | Docker creates it as root | Create directory before starting container |
| PostgreSQL data volume | initdb permission failure | Ensure volume is owned by UID 999 (postgres) |
| Nginx config bind mount | Cannot read configuration | Ensure files are readable by UID 101 (nginx) |
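For the PostgreSQL row above, one way to pre-own the volume before first start (volume and container names are hypothetical; UID 999 matches the postgres user in the Debian-based official image):

```shell
docker volume create pgdata

# Hand the volume to the postgres user before the database ever starts
docker run --rm -v pgdata:/var/lib/postgresql/data alpine \
  chown -R 999:999 /var/lib/postgresql/data

docker run -d --name db \
  -v pgdata:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=secret \
  postgres:16
```

The official image's entrypoint also fixes ownership itself when the container starts as root, so explicit pre-owning matters mainly when you run the container with user: "999:999" from the outset.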
Managing Volumes at Scale
Cleaning Up Unused Volumes
Orphaned volumes accumulate over time and consume disk space:
# List dangling (unused) volumes
docker volume ls -f dangling=true
# Remove all unused volumes (careful!)
docker volume prune
# Remove specific volume
docker volume rm myoldvolume
# See how much space volumes are using
docker system df -v
Monitoring Volume Usage
# Check disk usage of all volumes
sudo du -sh /var/lib/docker/volumes/*/
# Monitor in real time
watch -n 60 'sudo du -sh /var/lib/docker/volumes/*/ | sort -rh | head -20'
Tip: usulnet's volume management interface lets you browse volume contents, see usage statistics, and manage volumes across all connected Docker hosts. This is significantly easier than SSH-ing into each host and running du commands manually, especially when managing storage across multiple servers.
Volume Performance Considerations
Storage Drivers
Docker's storage driver governs the performance of image layers and each container's writable layer; volume I/O bypasses the storage driver and goes straight to the backing filesystem, which is one reason volumes outperform writes to the container layer. The overlay2 driver (default on modern Linux) is the best choice for most workloads. Avoid the deprecated aufs and devicemapper drivers in production.
# Check your current storage driver
docker info | grep "Storage Driver"
# Configure in /etc/docker/daemon.json
{
  "storage-driver": "overlay2"
}
I/O Performance Tips
- Use named volumes over bind mounts on macOS: Docker Desktop's file sharing for bind mounts has significant performance overhead. Named volumes are stored in the Linux VM and are much faster.
- NFS tuning: For NFS volumes, tune the read/write sizes (rsize=1048576, wsize=1048576) and use async for write-heavy workloads where durability guarantees from the application layer are sufficient.
- Use tmpfs for temp data: Writing temporary files to tmpfs (RAM) is orders of magnitude faster than disk I/O and prevents unnecessary wear on SSDs.
- Avoid storing databases on NFS: Database engines are designed for local block storage. NFS adds latency and can cause locking issues. Use local volumes for database containers and replicate at the application level.
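A rough way to feel the volume-vs-tmpfs difference is a quick dd comparison inside one container (non-rigorous — page caches and hardware skew the numbers; the volume name is arbitrary):

```shell
# Write 256 MB to a disk-backed named volume, then to a tmpfs mount
docker run --rm -v bench:/vol --tmpfs /ram alpine sh -c '
  time dd if=/dev/zero of=/vol/test bs=1M count=256
  sync
  time dd if=/dev/zero of=/ram/test bs=1M count=256
'
```

Treat the output as a sanity check, not a benchmark; proper I/O measurement needs a tool like fio with cache control.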
Docker volumes are the foundation of stateful container workloads. Understanding the differences between named volumes, bind mounts, and tmpfs, knowing how to configure network storage, implementing reliable backups, and troubleshooting permission issues are essential skills for anyone running containers in production. With proper volume management, your data remains safe, accessible, and performant regardless of container lifecycle events.