Storage is the Achilles' heel of Docker Swarm. Containers are ephemeral and can be rescheduled to any node at any time, but their data volumes are node-local by default. When Swarm reschedules a database container from Node A to Node B, the data stays behind on Node A. The container starts fresh on Node B with an empty volume, and your database is gone.

This is not a bug. It is a fundamental architectural decision. Docker volumes are local to the Docker daemon that creates them. Solving this requires either constraining stateful services to specific nodes, or using shared storage systems that make data available across all nodes. This guide covers every practical approach.
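
To see the failure mode concretely, here is a minimal sketch (the service and volume names are illustrative):

# Create a single-replica service with a named volume
docker service create --name demo \
  --mount type=volume,source=demo-data,target=/data \
  alpine sleep 1d

# Write something into the volume, then drain the node the task landed on
docker node update --availability drain <node-running-demo>

# The task restarts on another node with a brand-new, empty demo-data volume
docker service ps demo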

The Storage Problem in Swarm

Scenario | What Happens | Data Outcome
Service rescheduled to another node | New container starts on a different node | Local volume data is not available (data loss)
Node failure with local volumes | Tasks rescheduled to healthy nodes | Data on failed node is inaccessible until the node recovers
Rolling update on constrained service | New task starts on the same node | Data preserved (same named volume)
Scale-up with shared storage | New replicas mount the same volume | Data shared (requires a filesystem that supports concurrent access)

Strategy 1: Pin Stateful Services to Nodes

The simplest approach: constrain your stateful services to specific nodes using placement constraints. The data never needs to move because the service always runs on the same node.

version: "3.8"

services:
  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    deploy:
      replicas: 1
      placement:
        constraints:
          # constraints are ANDed; the label or the hostname alone would suffice
          - node.labels.db == true
          - node.hostname == db-node-01
      resources:
        limits:
          memory: 8G
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password

volumes:
  pgdata:
    driver: local

secrets:
  db_password:
    external: true

# Label the node
docker node update --label-add db=true db-node-01

This works, but it means your database cannot survive a node failure. It is the "good enough" approach for many deployments, especially when combined with application-level replication (PostgreSQL streaming replication, MySQL replication, etc.).

Strategy 2: NFS Volumes

NFS (Network File System) is the most common shared storage solution for Swarm. Every node mounts the same NFS share, so volumes are accessible from any node.

Setting Up NFS Server

# On the NFS server
apt-get install nfs-kernel-server

# Create the export directory
mkdir -p /srv/nfs/swarm-volumes
chown nobody:nogroup /srv/nfs/swarm-volumes

# Configure exports
cat >> /etc/exports << 'EOF'
/srv/nfs/swarm-volumes 10.0.0.0/16(rw,sync,no_subtree_check,no_root_squash)
EOF

# Apply and start
exportfs -ra
systemctl enable --now nfs-server
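
The built-in local driver's NFS support (used in the stack file below) relies on the NFS client utilities being installed on every Swarm node. A quick sanity check, assuming the server address 10.0.1.100 from the examples:

# On every Swarm node
apt-get install nfs-common

# Verify the export is visible from each node
showmount -e 10.0.1.100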

Using NFS in Swarm Stack Files

version: "3.8"

services:
  app:
    image: myapp/web:latest
    volumes:
      - uploads:/app/uploads
    deploy:
      replicas: 4

  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    deploy:
      replicas: 1

volumes:
  uploads:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.0.1.100,rw,nfsvers=4.1,hard
      device: ":/srv/nfs/swarm-volumes/uploads"

  pgdata:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.0.1.100,rw,nfsvers=4.1,hard
      device: ":/srv/nfs/swarm-volumes/pgdata"

Warning: NFS is not suitable for all database workloads. PostgreSQL and MySQL can run on NFS, but performance will be significantly lower than on local SSD storage. The hard mount option makes the client retry indefinitely during an NFS server outage rather than returning I/O errors, which means containers will hang until the server comes back. (The old intr option has been a no-op since Linux 2.6.25 and is omitted here.) For databases with high I/O requirements, prefer node pinning with application-level replication.
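
If hanging is unacceptable for a particular share, a soft mount returns I/O errors after a bounded number of retries instead. A sketch for a read-mostly assets volume (the volume name is illustrative; avoid soft mounts for anything that writes):

volumes:
  assets:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.0.1.100,ro,nfsvers=4.1,soft,timeo=150,retrans=3
      device: ":/srv/nfs/swarm-volumes/assets"
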
NFS Workload | Suitability | Notes
File uploads / media | Excellent | Read-heavy, large sequential I/O
Static assets | Excellent | Read-only in production
Application configs | Good | Small files, infrequent access
PostgreSQL / MySQL | Acceptable | Works, but 2-5x slower than local SSD
MongoDB / Elasticsearch | Poor | Random I/O patterns suffer greatly on NFS
Redis / high-IOPS cache | Not recommended | Latency-sensitive; use local storage only

Strategy 3: GlusterFS

GlusterFS is a distributed filesystem that replicates data across multiple nodes. Unlike NFS, there is no single point of failure for the storage server.

# Install GlusterFS on all storage nodes
apt-get install glusterfs-server
systemctl enable --now glusterd

# Create a replicated volume across 3 nodes
gluster peer probe node-02
gluster peer probe node-03

gluster volume create swarm-vol replica 3 \
  node-01:/data/gluster/brick1 \
  node-02:/data/gluster/brick2 \
  node-03:/data/gluster/brick3

gluster volume start swarm-vol

# Mount on all Swarm nodes
mkdir -p /mnt/gluster
mount -t glusterfs node-01:/swarm-vol /mnt/gluster
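
To survive node reboots and the loss of node-01, persist the mount in /etc/fstab with backup volfile servers; services can then reach the data through plain bind mounts. A sketch:

# /etc/fstab entry on each Swarm node
node-01:/swarm-vol /mnt/gluster glusterfs defaults,_netdev,backup-volfile-servers=node-02:node-03 0 0

# Services can then bind-mount subdirectories:
#   volumes:
#     - /mnt/gluster/app_data:/app/data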

GlusterFS Volume Plugin

# Install the GlusterFS volume plugin
docker plugin install --grant-all-permissions \
  trajano/glusterfs-volume-plugin

# Use in a stack file
version: "3.8"

services:
  app:
    image: myapp/web:latest
    volumes:
      - app_data:/app/data

volumes:
  app_data:
    driver: trajano/glusterfs-volume-plugin
    driver_opts:
      glusteropts: "--volfile-server=node-01 --volfile-id=swarm-vol"
      subdir: "app_data"

Strategy 4: Ceph / RBD

Ceph provides block storage (RBD), object storage, and filesystem (CephFS) from a single cluster. It is the most capable storage backend but also the most complex to operate.

# Using CephFS volumes in Swarm
version: "3.8"

services:
  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
    driver: local
    driver_opts:
      type: ceph
      o: name=admin,secret=AQBxxxxxxxxx,mds_namespace=swarm
      device: "10.0.1.10,10.0.1.11,10.0.1.12:/pgdata"

For block-level storage with better database performance, use Ceph RBD via the REX-Ray RBD plugin:

# Install the REX-Ray RBD plugin. It reads monitor addresses and credentials
# from /etc/ceph/ceph.conf and the keyring present on each node.
docker plugin install rexray/rbd \
  RBD_DEFAULTPOOL=docker

# Use in stack file
volumes:
  pgdata:
    driver: rexray/rbd
    driver_opts:
      size: 50  # GB

Strategy 5: REX-Ray Volume Plugin

REX-Ray is a Docker volume plugin that supports multiple storage backends: AWS EBS, Azure Disk, GCE PD, Ceph, ScaleIO, and more. It provides portable volume management across cloud providers. Note that REX-Ray has not seen active development for several years, so test your exact plugin and backend combination before relying on it.

# Install REX-Ray for AWS EBS
docker plugin install rexray/ebs \
  EBS_ACCESSKEY=AKIA... \
  EBS_SECRETKEY=xxx \
  EBS_REGION=us-east-1

# Create a volume backed by EBS
docker volume create --driver rexray/ebs \
  --opt size=100 \
  --opt volumeType=gp3 \
  --opt iops=3000 \
  db-volume

# Use in a stack
version: "3.8"

services:
  postgres:
    image: postgres:16
    volumes:
      - db-volume:/var/lib/postgresql/data
    deploy:
      replicas: 1

volumes:
  db-volume:
    driver: rexray/ebs
    driver_opts:
      size: 100
      volumeType: gp3
Tip: When using cloud block storage (EBS, Azure Disk), the volume can only be attached to one node at a time. This means your stateful service can still only run on one node, but the volume follows the service automatically when it is rescheduled. This is a significant improvement over local volumes: a node failure causes the volume to be detached and reattached to the new node within a few minutes.

Strategy 6: local-persist Plugin

The local-persist plugin creates local volumes at specific host paths, ensuring the data directory exists and persists across container restarts. It is simpler than NFS but still node-local.

# Install local-persist on each node where you will use it.
# It is a legacy plugin that runs as a host service (see the
# CWSpear/local-persist repository for the install script);
# it is not installable via `docker plugin install`.

# Use in stack file
volumes:
  pgdata:
    driver: local-persist
    driver_opts:
      mountpoint: /data/postgres
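
The same volume can also be created ahead of time from the CLI. Because the data still lives on a single node, pair it with a placement constraint as in Strategy 1:

docker volume create -d local-persist -o mountpoint=/data/postgres pgdata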

Database Deployments in Swarm

Databases are the most critical stateful services. Here are proven patterns for two of the most common: PostgreSQL and Redis.

PostgreSQL with Streaming Replication

version: "3.8"

services:
  postgres-primary:
    image: postgres:16
    volumes:
      - pg_primary:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    # Note: the official postgres image has no replication env vars; create
    # the replicator role with an init script in /docker-entrypoint-initdb.d/.
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.pg-role == primary
    secrets:
      - db_password
      - repl_password
    networks:
      - db

  postgres-replica:
    image: postgres:16
    environment:
      PGDATA: /var/lib/postgresql/data
    # On first start (empty data directory) take a base backup from the
    # primary, then hand off to postgres running as the postgres user.
    command: >
      bash -c '
        if [ -z "$$(ls -A "$$PGDATA")" ]; then
          export PGPASSWORD="$$(cat /run/secrets/repl_password)";
          pg_basebackup -h postgres-primary -U replicator -D "$$PGDATA" -Fp -Xs -P -R;
          chown -R postgres:postgres "$$PGDATA"; chmod 700 "$$PGDATA";
        fi;
        exec gosu postgres postgres'
    volumes:
      - pg_replica:/var/lib/postgresql/data
    deploy:
      replicas: 2
      placement:
        constraints:
          - node.labels.pg-role == replica
        preferences:
          - spread: node.labels.zone
        # keep replicas on separate nodes so they don't share a local volume
        max_replicas_per_node: 1
    secrets:
      - repl_password
    networks:
      - db

volumes:
  pg_primary:
    driver: local
  pg_replica:
    driver: local

secrets:
  db_password:
    external: true
  repl_password:
    external: true

networks:
  db:
    driver: overlay
    driver_opts:
      encrypted: "true"
    internal: true
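
The pg-role constraints assume the nodes were labeled beforehand (hostnames are illustrative):

docker node update --label-add pg-role=primary db-node-01
docker node update --label-add pg-role=replica db-node-02
docker node update --label-add pg-role=replica db-node-03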

Redis with Sentinel

version: "3.8"

services:
  redis-master:
    image: redis:7
    command: redis-server --appendonly yes --requirepass "${REDIS_PASSWORD}"
    volumes:
      - redis_master:/data
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.redis == master
    networks:
      - cache

  redis-replica:
    image: redis:7
    command: >
      redis-server --appendonly yes
      --replicaof redis-master 6379
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
    volumes:
      - redis_replica:/data
    deploy:
      replicas: 2
      placement:
        # keep replicas on separate nodes so they don't share a local volume
        max_replicas_per_node: 1
    networks:
      - cache

  redis-sentinel:
    image: redis:7
    # Sentinel rewrites its config file at runtime, so copy the read-only
    # Swarm config to a writable path before starting
    command: >
      sh -c 'cp /etc/redis/sentinel.conf /tmp/sentinel.conf &&
             exec redis-sentinel /tmp/sentinel.conf'
    configs:
      - source: sentinel_conf
        target: /etc/redis/sentinel.conf
    deploy:
      replicas: 3
      placement:
        preferences:
          - spread: node.labels.zone
    networks:
      - cache

volumes:
  redis_master:
    driver: local
  redis_replica:
    driver: local

networks:
  cache:
    driver: overlay

configs:
  sentinel_conf:
    external: true
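
The sentinel_conf config must exist before the stack is deployed. A minimal example, assuming a quorum of 2 and the service names above (Sentinel only resolves the redis-master hostname when resolve-hostnames is enabled, available since Redis 6.2):

# sentinel.conf
sentinel resolve-hostnames yes
sentinel monitor mymaster redis-master 6379 2
sentinel auth-pass mymaster <REDIS_PASSWORD>
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000

# Register it with Swarm
docker config create sentinel_conf sentinel.conf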

Choosing a Storage Strategy

Strategy | Complexity | Performance | HA | Best For
Node pinning | Low | Native | No | Dev/test, single-node databases
NFS | Low-Medium | Moderate | SPOF on NFS server | File shares, media, configs
GlusterFS | Medium | Moderate | Yes (replicated) | General shared storage
Ceph | High | High (RBD) | Yes | Large clusters, enterprise
Cloud block (EBS/Azure) | Low | High | Volume follows task | Cloud deployments, databases
App-level replication | Medium | Native | Yes | Databases (PG, MySQL, Redis)

For most teams, the pragmatic approach is to use NFS for shared application data (uploads, media) and node pinning with application-level replication for databases. Platforms like usulnet can help you visualize which volumes are on which nodes and track the relationship between services and their storage, making it easier to manage stateful deployments across a Swarm cluster.

Conclusion

Persistent storage in Docker Swarm requires deliberate architecture decisions. There is no single solution that works for all workloads. Match your storage strategy to your data's performance requirements, availability needs, and operational complexity budget. Start simple (node pinning + NFS), add distributed storage when you outgrow it, and always combine your storage strategy with proper backups regardless of how replicated your filesystem is.