Docker Swarm Storage: Persistent Data Across Nodes with Volume Plugins
Storage is the Achilles' heel of Docker Swarm. Containers are ephemeral and can be rescheduled to any node at any time, but their data volumes are node-local by default. When Swarm reschedules a database container from Node A to Node B, the data stays behind on Node A. The container starts fresh on Node B with an empty volume, and your database is gone.
This is not a bug. It is a fundamental architectural decision. Docker volumes are local to the Docker daemon that creates them. Solving this requires either constraining stateful services to specific nodes, or using shared storage systems that make data available across all nodes. This guide covers every practical approach.
The Storage Problem in Swarm
| Scenario | What Happens | Data Outcome |
|---|---|---|
| Service rescheduled to another node | New container starts on different node | Local volume data stays stranded on the original node (effective data loss for the service) |
| Node failure with local volumes | Tasks rescheduled to healthy nodes | Data on failed node is inaccessible until node recovers |
| Rolling update on constrained service | New task starts on same node | Data preserved (same named volume) |
| Scale up with shared storage | New replicas mount same volume | Data shared (requires filesystem that supports concurrent access) |
Strategy 1: Pin Stateful Services to Nodes
The simplest approach: constrain your stateful services to specific nodes using placement constraints. The data never needs to move because the service always runs on the same node.
```yaml
version: "3.8"

services:
  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    deploy:
      replicas: 1
      placement:
        constraints:
          # Either constraint alone pins the task; using both is belt-and-braces
          - node.labels.db == true
          - node.hostname == db-node-01
      resources:
        limits:
          memory: 8G
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password

secrets:
  db_password:
    external: true

volumes:
  pgdata:
    driver: local
```

```shell
# Label the node
docker node update --label-add db=true db-node-01
```
This works, but it means your database cannot survive a node failure. It is the "good enough" approach for many deployments, especially when combined with application-level replication (PostgreSQL streaming replication, MySQL replication, etc.).
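When a pinned node does fail, recovery is a manual relabeling step. A minimal sketch, assuming the label-pinned `postgres` service above was deployed in a stack named `myapp` and a standby node `db-node-02` already holds restored data or a promoted replica:

```shell
# Hypothetical node names for illustration
OLD_NODE=db-node-01
NEW_NODE=db-node-02

# Move the db label: Swarm reschedules the pinned task to the newly labeled node
docker node update --label-rm db "$OLD_NODE"
docker node update --label-add db=true "$NEW_NODE"

# Confirm where the task landed
docker service ps myapp_postgres
```

This is deliberate, operator-driven failover; the trade-off versus shared storage is that nothing moves without a human (or external automation) deciding it should.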
Strategy 2: NFS Volumes
NFS (Network File System) is the most common shared storage solution for Swarm. Every node mounts the same NFS share, so volumes are accessible from any node.
Setting Up NFS Server
```shell
# On the NFS server
apt-get install nfs-kernel-server

# Create the export directory
mkdir -p /srv/nfs/swarm-volumes
chown nobody:nogroup /srv/nfs/swarm-volumes

# Configure exports
cat >> /etc/exports << 'EOF'
/srv/nfs/swarm-volumes 10.0.0.0/16(rw,sync,no_subtree_check,no_root_squash)
EOF

# Apply and start
exportfs -ra
systemctl enable --now nfs-server
```
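Before wiring NFS into stack files, verify from a Swarm node that the export is visible and writable. A quick check, assuming the server address `10.0.1.100` used in the examples below:

```shell
# Server address from this guide's examples; substitute your own
NFS_SERVER=10.0.1.100

# List what the server exports
showmount -e "$NFS_SERVER"

# Test-mount and confirm write access
mkdir -p /mnt/nfs-test
mount -t nfs -o nfsvers=4.1 "$NFS_SERVER:/srv/nfs/swarm-volumes" /mnt/nfs-test
touch /mnt/nfs-test/.write-test && rm /mnt/nfs-test/.write-test
umount /mnt/nfs-test
```

Run this on every node: a single node with a firewall rule blocking port 2049 will produce tasks that hang at startup only when scheduled there.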
Using NFS in Swarm Stack Files
```yaml
version: "3.8"

services:
  app:
    image: myapp/web:latest
    volumes:
      - uploads:/app/uploads
    deploy:
      replicas: 4

  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    deploy:
      replicas: 1

# The uploads/ and pgdata/ subdirectories must already exist on the NFS server
volumes:
  uploads:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.0.1.100,rw,nfsvers=4.1,hard,intr
      device: ":/srv/nfs/swarm-volumes/uploads"
  pgdata:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.0.1.100,rw,nfsvers=4.1,hard,intr
      device: ":/srv/nfs/swarm-volumes/pgdata"
```
The `hard` mount option makes the client retry indefinitely during NFS server outages instead of returning I/O errors, which means containers hang rather than fail during NFS downtime (`intr` has been a no-op since Linux 2.6.25 and can be omitted). For databases with high I/O requirements, prefer node pinning with application-level replication.
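Volumes declared in a stack file are created lazily on each node the first time a task mounts them. The same NFS volume can also be created ad hoc with `docker volume create` for testing; a sketch using the server and export paths from the stack file above (note this creates the volume only on the node where you run it):

```shell
# Mirror the stack file's driver_opts on the command line
VOL=uploads
docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=10.0.1.100,rw,nfsvers=4.1,hard \
  --opt device=:/srv/nfs/swarm-volumes/uploads \
  "$VOL"

# Smoke test: write through the volume from a throwaway container
docker run --rm -v "$VOL:/mnt" alpine sh -c 'echo ok > /mnt/.probe && cat /mnt/.probe'
```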
| NFS Workload | Suitability | Notes |
|---|---|---|
| File uploads / media | Excellent | Read-heavy, large sequential I/O |
| Static assets | Excellent | Read-only in production |
| Application configs | Good | Small files, infrequent access |
| PostgreSQL / MySQL | Acceptable | Works but 2-5x slower than local SSD |
| MongoDB / Elasticsearch | Poor | Random I/O patterns suffer greatly on NFS |
| Redis / high-IOPS cache | Not recommended | Latency-sensitive; use local storage only |
Strategy 3: GlusterFS
GlusterFS is a distributed filesystem that replicates data across multiple nodes. Unlike NFS, there is no single point of failure for the storage server.
```shell
# Install GlusterFS on all storage nodes
apt-get install glusterfs-server
systemctl enable --now glusterd

# Create a replicated volume across 3 nodes
# (run from node-01; the brick parent directories must exist on each node)
gluster peer probe node-02
gluster peer probe node-03
gluster volume create swarm-vol replica 3 \
  node-01:/data/gluster/brick1 \
  node-02:/data/gluster/brick2 \
  node-03:/data/gluster/brick3
gluster volume start swarm-vol

# Mount on all Swarm nodes
mkdir -p /mnt/gluster
mount -t glusterfs node-01:/swarm-vol /mnt/gluster
```
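After creating the volume, verify peer and brick health, and make the mount survive reboots. A sketch using the node and mount names from the commands above:

```shell
# Check that all bricks are online and peers are connected
gluster volume info swarm-vol
gluster volume status swarm-vol
gluster peer status

# Persist the mount across reboots; _netdev delays mounting until networking is up
FSTAB_LINE='node-01:/swarm-vol /mnt/gluster glusterfs defaults,_netdev 0 0'
echo "$FSTAB_LINE" >> /etc/fstab
```

Listing `node-01` as the volfile server is itself a soft single point of failure for the initial mount; the `backup-volfile-servers=node-02:node-03` mount option mitigates this.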
GlusterFS Volume Plugin
```shell
# Install the GlusterFS volume plugin (on every node)
docker plugin install --grant-all-permissions \
  trajano/glusterfs-volume-plugin
```

```yaml
# Use in a stack file
version: "3.8"

services:
  app:
    image: myapp/web:latest
    volumes:
      - app_data:/app/data

volumes:
  app_data:
    driver: trajano/glusterfs-volume-plugin
    driver_opts:
      glusteropts: "--volfile-server=node-01 --volfile-id=swarm-vol"
      subdir: "app_data"
```
Strategy 4: Ceph / RBD
Ceph provides block storage (RBD), object storage, and filesystem (CephFS) from a single cluster. It is the most capable storage backend but also the most complex to operate.
```yaml
# Using CephFS volumes in Swarm
version: "3.8"

services:
  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
    driver: local
    driver_opts:
      type: ceph
      o: name=admin,secret=AQBxxxxxxxxx,mds_namespace=swarm
      device: "10.0.1.10,10.0.1.11,10.0.1.12:/pgdata"
```
For block-level storage with better database performance, use Ceph RBD with the rexray plugin:
```shell
# Install REX-Ray's RBD plugin; it reads monitor and auth settings
# from /etc/ceph/ceph.conf and the keyring on each node
docker plugin install rexray/rbd \
  RBD_DEFAULTPOOL=docker
```

```yaml
# Use in stack file
volumes:
  pgdata:
    driver: rexray/rbd
    driver_opts:
      size: 50  # GB
```
Strategy 5: REX-Ray Volume Plugin
REX-Ray is a Docker volume plugin that supports multiple storage backends: AWS EBS, Azure Disk, GCE PD, Ceph, ScaleIO, and more. It provides portable volume management across cloud providers.
```shell
# Install REX-Ray for AWS EBS
docker plugin install rexray/ebs \
  EBS_ACCESSKEY=AKIA... \
  EBS_SECRETKEY=xxx \
  EBS_REGION=us-east-1

# Create a volume backed by EBS
docker volume create --driver rexray/ebs \
  --opt size=100 \
  --opt volumeType=gp3 \
  --opt iops=3000 \
  db-volume
```

```yaml
# Use in a stack
version: "3.8"

services:
  postgres:
    image: postgres:16
    volumes:
      - db-volume:/var/lib/postgresql/data
    deploy:
      replicas: 1

volumes:
  db-volume:
    driver: rexray/ebs
    driver_opts:
      size: 100
      volumeType: gp3
```
Strategy 6: local-persist Plugin
The local-persist plugin creates local volumes at specific host paths, ensuring the data directory exists and persists across container restarts. It is simpler than NFS but still node-local.
```shell
# Install local-persist
docker plugin install cwspear/docker-local-persist-volume-plugin
```

```yaml
# Use in stack file
volumes:
  pgdata:
    driver: cwspear/docker-local-persist-volume-plugin
    driver_opts:
      mountpoint: /data/postgres
```
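A quick check that the plugin is wiring container writes to the configured host path. This sketch assumes the plugin is installed as shown above and uses `/data/postgres` as the host mountpoint:

```shell
# Host path configured in driver_opts above
HOST_PATH=/data/postgres

# Create a volume backed by that path and write through it
docker volume create -d cwspear/docker-local-persist-volume-plugin \
  --opt "mountpoint=$HOST_PATH" pgdata
docker run --rm -v pgdata:/var/lib/postgresql/data alpine \
  sh -c 'echo probe > /var/lib/postgresql/data/.probe'

# The file should now be visible directly on the host
ls -la "$HOST_PATH/.probe"
```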
Database Deployments in Swarm
Databases are the most critical stateful services. The patterns below combine node pinning with application-level replication for two common cases:
PostgreSQL with Streaming Replication
```yaml
version: "3.8"

services:
  postgres-primary:
    image: postgres:16
    volumes:
      - pg_primary:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
      # The official postgres image does not read these two variables itself;
      # create the replication user via an init script in /docker-entrypoint-initdb.d
      POSTGRES_REPLICATION_USER: replicator
      POSTGRES_REPLICATION_PASSWORD_FILE: /run/secrets/repl_password
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.pg-role == primary
    secrets:
      - db_password
      - repl_password
    networks:
      - db

  postgres-replica:
    image: postgres:16
    environment:
      PGDATA: /var/lib/postgresql/data
    # Base-backup only on first start (empty data dir), then hand off to the
    # normal entrypoint so postgres runs as the postgres user, not root.
    # pg_basebackup needs credentials via PGPASSWORD or a .pgpass file.
    command: >
      bash -c '
      if [ -z "$$(ls -A /var/lib/postgresql/data)" ];
      then pg_basebackup -h postgres-primary -U replicator -D /var/lib/postgresql/data -Fp -Xs -P -R
      && chown -R postgres:postgres /var/lib/postgresql/data;
      fi;
      exec docker-entrypoint.sh postgres'
    volumes:
      - pg_replica:/var/lib/postgresql/data
    deploy:
      replicas: 2
      placement:
        constraints:
          - node.labels.pg-role == replica
        preferences:
          - spread: node.labels.zone
    networks:
      - db

secrets:
  db_password:
    external: true
  repl_password:
    external: true

volumes:
  pg_primary:
    driver: local
  pg_replica:
    driver: local

networks:
  db:
    driver: overlay
    internal: true
    driver_opts:
      encrypted: "true"
```
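Once the stack is up, confirm the replicas are actually attached to the primary. A sketch assuming the stack was deployed as `db` (so the primary's tasks are named `db_postgres-primary.*`), run on the node hosting the primary:

```shell
# Locate the primary's container on this node
SERVICE=db_postgres-primary
CID=$(docker ps -q -f "name=$SERVICE")

# One row per connected standby; state should be "streaming"
docker exec "$CID" psql -U postgres -c \
  "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
```

An empty result means the replicas failed to connect: check the replication user, pg_hba.conf, and that all tasks share the `db` overlay network.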
Redis with Sentinel
```yaml
version: "3.8"

services:
  redis-master:
    image: redis:7
    # ${REDIS_PASSWORD} is substituted from the deploying shell's environment
    command: redis-server --appendonly yes --requirepass "${REDIS_PASSWORD}"
    volumes:
      - redis_master:/data
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.redis == master
    networks:
      - cache

  redis-replica:
    image: redis:7
    command: >
      redis-server --appendonly yes
      --replicaof redis-master 6379
      --requirepass "${REDIS_PASSWORD}"
      --masterauth "${REDIS_PASSWORD}"
    volumes:
      - redis_replica:/data
    deploy:
      replicas: 2
    networks:
      - cache

  redis-sentinel:
    image: redis:7
    # Sentinel rewrites its config file at runtime, but Swarm configs are
    # mounted read-only, so copy it to a writable path before starting
    command: >
      bash -c "cp /etc/redis/sentinel.conf /tmp/sentinel.conf
      && exec redis-sentinel /tmp/sentinel.conf"
    configs:
      - source: sentinel_conf
        target: /etc/redis/sentinel.conf
    deploy:
      replicas: 3
      placement:
        preferences:
          - spread: node.labels.zone
    networks:
      - cache

configs:
  sentinel_conf:
    external: true

volumes:
  redis_master:
    driver: local
  redis_replica:
    driver: local

networks:
  cache:
    driver: overlay
```
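To verify the Sentinel quorum after deployment, query any sentinel container. This sketch assumes the stack was deployed as `redis` and that `sentinel.conf` registers the master under the name `mymaster` (both are illustrative; substitute your own):

```shell
# Locate a sentinel container on this node
CID=$(docker ps -q -f name=redis_redis-sentinel)
MASTER_NAME=mymaster

# Show the monitored master's state and confirm quorum is reachable
docker exec "$CID" redis-cli -p 26379 sentinel master "$MASTER_NAME"
docker exec "$CID" redis-cli -p 26379 sentinel ckquorum "$MASTER_NAME"
```

`num-other-sentinels` in the first command's output should be 2 with three sentinel replicas; if it is 0, the sentinels cannot see each other over the overlay network.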
Choosing a Storage Strategy
| Strategy | Complexity | Performance | HA | Best For |
|---|---|---|---|---|
| Node pinning | Low | Native | No | Dev/test, single-node databases |
| NFS | Low-Medium | Moderate | SPOF on NFS server | File shares, media, configs |
| GlusterFS | Medium | Moderate | Yes (replicated) | General shared storage |
| Ceph | High | High (RBD) | Yes | Large clusters, enterprise |
| Cloud block (EBS/Azure) | Low | High | Volume follows task | Cloud deployments, databases |
| App-level replication | Medium | Native | Yes | Databases (PG, MySQL, Redis) |
For most teams, the pragmatic approach is to use NFS for shared application data (uploads, media) and node pinning with application-level replication for databases. Platforms like usulnet can help you visualize which volumes are on which nodes and track the relationship between services and their storage, making it easier to manage stateful deployments across a Swarm cluster.
Conclusion
Persistent storage in Docker Swarm requires deliberate architecture decisions. There is no single solution that works for all workloads. Match your storage strategy to your data's performance requirements, availability needs, and operational complexity budget. Start simple (node pinning + NFS), add distributed storage when you outgrow it, and always combine your storage strategy with proper backups regardless of how replicated your filesystem is.
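Whatever the backend, volume-level backups can be taken with a throwaway container. A minimal sketch, assuming a volume named `myapp_pgdata` (stack-name prefix included) and that the database is stopped or quiesced first; for a live PostgreSQL instance, prefer `pg_dump` for a consistent snapshot:

```shell
# Hypothetical volume name; list yours with: docker volume ls
VOLUME=myapp_pgdata
BACKUP="pgdata-$(date +%F).tar.gz"

# Mount the volume read-only and stream its contents into a tarball on the host
docker run --rm \
  -v "${VOLUME}:/data:ro" \
  -v "$(pwd):/backup" \
  alpine tar czf "/backup/${BACKUP}" -C /data .

ls -lh "$BACKUP"
```

Run backups on the node that currently hosts the volume (for node-local drivers) and test restores regularly; an unrestorable backup is no backup.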