Caching is one of the most effective techniques for improving application performance. A well-implemented caching strategy can cut database load dramatically (reductions around 90% are realistic for read-heavy workloads), bring response times down from seconds to milliseconds, and let the same infrastructure handle many times the traffic without scaling hardware. But caching done poorly creates consistency bugs, stale data, and debugging nightmares that are harder to fix than the performance problem it was meant to solve.

This guide covers caching at every layer of a modern web application: Redis for data caching, Nginx for reverse proxy caching, CDNs for edge caching, application-level in-memory caching, and the critical cache invalidation strategies that keep everything consistent.

The Caching Layers

A request from a user's browser can hit multiple cache layers before reaching your database:

| Layer | Location | Typical latency | Best for |
|-------|----------|-----------------|----------|
| Browser cache | User's device | ~0 ms (local) | Static assets (JS, CSS, images) |
| CDN | Edge servers worldwide | 5-50 ms | Static content, public API responses |
| Reverse proxy (Nginx/Varnish) | In front of application | <1 ms | Full-page cache, API response cache |
| Application in-memory | Application process | <1 ms | Hot data, computed values, config |
| Distributed cache (Redis) | Separate server/container | 0.5-2 ms | Session data, query results, shared state |
| Database query cache | Database server | 1-5 ms | Repeated identical queries (limited use) |

Each layer reduces load on the layer below it. The goal is to serve as many requests as possible from the fastest layer.

Redis as a Cache

Redis is the most common application-level cache. It sits between your application and your database, storing frequently accessed data in memory.

Cache-Aside Pattern (Lazy Loading)

The most common pattern: the application checks the cache first, and only queries the database on a cache miss.

// Go example: cache-aside pattern with Redis
func (s *UserService) GetUser(ctx context.Context, userID string) (*User, error) {
    // 1. Check cache first
    cacheKey := fmt.Sprintf("user:%s", userID)
    cached, err := s.redis.Get(ctx, cacheKey).Result()
    if err == nil {
        var user User
        if err := json.Unmarshal([]byte(cached), &user); err == nil {
            return &user, nil // Cache hit
        }
    }

    // 2. Cache miss: query database
    user, err := s.repo.FindByID(ctx, userID)
    if err != nil {
        return nil, fmt.Errorf("find user: %w", err)
    }

    // 3. Store in cache with TTL (a failed cache write is non-fatal:
    //    the next read simply misses and repopulates)
    if data, err := json.Marshal(user); err == nil {
        s.redis.Set(ctx, cacheKey, data, 15*time.Minute)
    }

    return user, nil
}

// Invalidate on update
func (s *UserService) UpdateUser(ctx context.Context, user *User) error {
    if err := s.repo.Update(ctx, user); err != nil {
        return err
    }
    // Delete cache entry (write-invalidate)
    s.redis.Del(ctx, fmt.Sprintf("user:%s", user.ID))
    return nil
}

Write-Through Pattern

Write-through updates the cache as part of every write to the database, so recently written data is always warm in the cache, at the cost of an extra write on every update:

func (s *ProductService) UpdateProduct(ctx context.Context, product *Product) error {
    // Write to database
    if err := s.repo.Update(ctx, product); err != nil {
        return err
    }

    // Write to cache immediately (again, a failed cache write is non-fatal)
    cacheKey := fmt.Sprintf("product:%s", product.ID)
    if data, err := json.Marshal(product); err == nil {
        s.redis.Set(ctx, cacheKey, data, 1*time.Hour)
    }

    return nil
}

Redis Cache Configuration

# redis.conf for caching workload
maxmemory 2gb
maxmemory-policy allkeys-lfu    # Evict least frequently used keys

# Disable persistence for pure cache (data can be regenerated)
save ""
appendonly no

# Tune encoding thresholds for a caching workload: many small values
# (Redis 7+ renames these to hash-max-listpack-entries etc.;
# the ziplist names still work as aliases)
hash-max-ziplist-entries 128
hash-max-ziplist-value 64
list-max-ziplist-size -2
set-max-intset-entries 512

Tip: Use allkeys-lfu (Least Frequently Used) instead of allkeys-lru (Least Recently Used) for caching workloads. LFU considers access frequency, not just recency, so a key that was accessed 1000 times yesterday is kept over a key accessed once today. This produces better hit ratios for most workloads.
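To check whether a policy change actually improves your hit ratio, read the counters Redis already keeps: INFO stats exposes keyspace_hits and keyspace_misses. Here is a minimal sketch that computes the ratio with go-redis (the client wiring is assumed; imports: context, strconv, strings, github.com/redis/go-redis/v9):

// hitRatio reads Redis's INFO stats section and returns hits / (hits + misses).
func hitRatio(ctx context.Context, rdb *redis.Client) (float64, error) {
    info, err := rdb.Info(ctx, "stats").Result()
    if err != nil {
        return 0, err
    }
    var hits, misses float64
    for _, line := range strings.Split(info, "\r\n") {
        if v, ok := strings.CutPrefix(line, "keyspace_hits:"); ok {
            hits, _ = strconv.ParseFloat(v, 64)
        } else if v, ok := strings.CutPrefix(line, "keyspace_misses:"); ok {
            misses, _ = strconv.ParseFloat(v, 64)
        }
    }
    if hits+misses == 0 {
        return 0, nil // no reads yet
    }
    return hits / (hits + misses), nil
}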

Nginx Proxy Cache

Nginx can cache entire HTTP responses, serving them directly without hitting your application server:

# nginx.conf - Proxy cache configuration
http {
    # Define cache zone: 10MB for keys, 1GB for cached responses
    proxy_cache_path /var/cache/nginx levels=1:2
        keys_zone=app_cache:10m
        max_size=1g
        inactive=60m
        use_temp_path=off;

    server {
        listen 80;
        server_name myapp.com;

        # Cache static assets aggressively
        location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff2)$ {
            proxy_pass http://app:8080;
            proxy_cache app_cache;
            proxy_cache_valid 200 30d;
            proxy_cache_valid 404 1m;
            add_header X-Cache-Status $upstream_cache_status;
            add_header Cache-Control "public, max-age=2592000, immutable";
        }

        # Cache API responses selectively
        location /api/public/ {
            proxy_pass http://app:8080;
            proxy_cache app_cache;
            proxy_cache_valid 200 5m;
            proxy_cache_valid 404 1m;
            proxy_cache_key "$request_uri";
            proxy_cache_use_stale error timeout updating
                                  http_500 http_502 http_503 http_504;
            add_header X-Cache-Status $upstream_cache_status;
        }

        # Never cache authenticated endpoints
        location /api/user/ {
            proxy_pass http://app:8080;
            proxy_no_cache 1;
            proxy_cache_bypass 1;
        }

        # Purge cache endpoint (restricted)
        # Note: proxy_cache_purge requires NGINX Plus or the third-party
        # ngx_cache_purge module; it is not built into open-source Nginx.
        location /purge/ {
            allow 127.0.0.1;
            deny all;
            proxy_cache_purge app_cache "$scheme$request_method$host$request_uri";
        }
    }
}

The X-Cache-Status header is invaluable for debugging. It reports: HIT (served from cache), MISS (fetched from upstream), EXPIRED (entry had expired, fetched fresh), STALE (stale entry served because the upstream is failing, per proxy_cache_use_stale), UPDATING (stale entry served while a fresh one is being fetched), REVALIDATED (entry refreshed with a conditional request), or BYPASS (cache explicitly bypassed).
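A quick end-to-end check: fetch the same URL twice and compare the header. You should see MISS on the first request and HIT on the second (while the entry is still valid). A minimal sketch in Go (the URL and cacheable endpoint are up to you):

// checkCache fetches a URL twice and prints X-Cache-Status for each response.
func checkCache(url string) error {
    for i := 1; i <= 2; i++ {
        resp, err := http.Get(url)
        if err != nil {
            return err
        }
        io.Copy(io.Discard, resp.Body) // drain body so the connection is reusable
        resp.Body.Close()
        fmt.Printf("request %d: X-Cache-Status=%s\n", i, resp.Header.Get("X-Cache-Status"))
    }
    return nil
}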

Varnish Cache

Varnish is a proxy built solely for HTTP caching; for cache-heavy workloads it often outperforms Nginx and offers a far more programmable configuration language (VCL):

# docker-compose.yml - Varnish in front of your app
services:
  varnish:
    image: varnish:7.4
    container_name: varnish
    ports:
      - "80:80"
    volumes:
      - ./varnish/default.vcl:/etc/varnish/default.vcl:ro
    command: >
      -a :80
      -f /etc/varnish/default.vcl
      -s malloc,1G
    depends_on:
      - app

# default.vcl - Varnish Configuration Language
vcl 4.1;

backend default {
    .host = "app";
    .port = "8080";
    .probe = {
        .url = "/health";
        .interval = 5s;
        .timeout = 2s;
        .window = 5;
        .threshold = 3;
    }
}

sub vcl_recv {
    # Don't cache POST requests
    if (req.method == "POST") {
        return (pass);
    }

    # Don't cache authenticated requests
    if (req.http.Authorization || req.http.Cookie ~ "session_id") {
        return (pass);
    }

    # Strip tracking cookies from cacheable requests
    if (req.url ~ "\.(css|js|png|jpg|gif|ico|svg|woff2)$") {
        unset req.http.Cookie;
    }
}

sub vcl_backend_response {
    # Cache static assets for 30 days
    if (bereq.url ~ "\.(css|js|png|jpg|gif|ico|svg|woff2)$") {
        set beresp.ttl = 30d;
        unset beresp.http.Set-Cookie;
    }

    # Cache HTML pages for 5 minutes
    if (beresp.http.Content-Type ~ "text/html") {
        set beresp.ttl = 5m;
        set beresp.grace = 1h;  # Serve stale for 1h while revalidating
    }
}

CDN Caching (Cloudflare)

CDNs cache content at edge servers worldwide, serving users from the geographically closest location:

# Set cache headers in your application
# Static assets: cache for 1 year, immutable
Cache-Control: public, max-age=31536000, immutable

# API responses: cache for 5 minutes, revalidate
Cache-Control: public, max-age=300, s-maxage=600, stale-while-revalidate=60

# Authenticated content: never cache
Cache-Control: private, no-cache, no-store, must-revalidate

# Cloudflare-specific: cache at the edge for 2 hours (Cloudflare honors
# s-maxage; broader "cache everything" behavior is configured separately
# via Page Rules or Cache Rules)
Cache-Control: public, s-maxage=7200

Cache-Control header directives explained:

| Directive | Meaning |
|-----------|---------|
| public | Any cache (CDN, proxy) can store the response |
| private | Only the browser may cache it (no CDN/proxy) |
| max-age=N | Cache for N seconds (browser and CDN) |
| s-maxage=N | Cache for N seconds (CDN/proxy only; overrides max-age) |
| no-cache | Revalidate with the origin before serving a cached copy |
| no-store | Never store any copy (for sensitive data) |
| immutable | Content will never change (perfect for versioned assets) |
| stale-while-revalidate=N | Serve stale for up to N seconds while fetching a fresh copy |
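In application code, these headers go on each response before the body is written. A minimal sketch of Go handlers matching the three policies shown above (handler names and paths are illustrative):

// Versioned static assets: safe to cache for a year because the
// filename changes on every deploy.
func assetHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Cache-Control", "public, max-age=31536000, immutable")
    http.ServeFile(w, r, "./static"+r.URL.Path) // ServeFile rejects ".." paths
}

// Public API responses: short browser TTL, longer CDN TTL,
// stale-while-revalidate to smooth refreshes.
func publicAPIHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Cache-Control", "public, max-age=300, s-maxage=600, stale-while-revalidate=60")
    w.Header().Set("Content-Type", "application/json")
    w.Write([]byte(`{"status":"ok"}`))
}

// Authenticated content: no shared cache may store it.
func userHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Cache-Control", "private, no-cache, no-store, must-revalidate")
    w.Header().Set("Content-Type", "application/json")
    w.Write([]byte(`{"user":"..."}`))
}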

Application-Level In-Memory Caching

For the fastest possible access, cache data directly in your application's memory. This avoids even the network hop to Redis:

// Go: in-memory cache with TTL using sync.Map + expiration
type MemoryCache struct {
    items sync.Map
}

type cacheItem struct {
    value     interface{}
    expiresAt time.Time
}

func (c *MemoryCache) Get(key string) (interface{}, bool) {
    item, ok := c.items.Load(key)
    if !ok {
        return nil, false
    }
    ci := item.(*cacheItem)
    if time.Now().After(ci.expiresAt) {
        c.items.Delete(key) // lazy eviction: expired entries are removed on access
        return nil, false
    }
    return ci.value, true
}

func (c *MemoryCache) Set(key string, value interface{}, ttl time.Duration) {
    c.items.Store(key, &cacheItem{
        value:     value,
        expiresAt: time.Now().Add(ttl),
    })
}

// For production, use established libraries:
// - github.com/dgraph-io/ristretto (concurrent, admission policy)
// - github.com/allegro/bigcache (GC-friendly, []byte values)
// - github.com/hashicorp/golang-lru (simple LRU)

Warning: In-memory caches are per-process. If you run multiple instances of your application (which you should for availability), each instance has its own cache. This means cache invalidation only affects one instance. Use in-memory caching for data that is either immutable (config, lookup tables) or where staleness within the TTL window is acceptable. For data that must be consistent across instances, use Redis.

Cache Invalidation Strategies

"There are only two hard things in Computer Science: cache invalidation and naming things," as Phil Karlton put it. Here are the practical strategies:

1. TTL-Based Expiration

Set a time-to-live and accept that data may be stale for that duration. Simplest approach, appropriate when eventual consistency is acceptable.
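One refinement: add jitter to your TTLs so that entries written in the same burst don't all expire at the same instant, which would otherwise feed the stampede problem covered below. A minimal sketch (the 20% spread is an arbitrary choice; imports math/rand and time):

// ttlWithJitter adds up to 20% random jitter to a base TTL so keys
// cached together do not expire together.
func ttlWithJitter(base time.Duration) time.Duration {
    return base + time.Duration(rand.Int63n(int64(base/5)))
}

// Usage: s.redis.Set(ctx, cacheKey, data, ttlWithJitter(15*time.Minute))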

2. Event-Driven Invalidation

When data changes, explicitly invalidate the cache. More complex but provides stronger consistency.

// Event-driven invalidation with Redis pub/sub
func (s *ProductService) UpdateProduct(ctx context.Context, product *Product) error {
    if err := s.repo.Update(ctx, product); err != nil {
        return err
    }

    // Invalidate specific cache entries
    s.redis.Del(ctx, fmt.Sprintf("product:%s", product.ID))
    s.redis.Del(ctx, "products:list")
    s.redis.Del(ctx, fmt.Sprintf("category:%s:products", product.CategoryID))

    // Notify other instances via pub/sub
    s.redis.Publish(ctx, "cache:invalidate", fmt.Sprintf("product:%s", product.ID))

    return nil
}
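The Publish call is only half the pattern: every instance also runs a subscriber that evicts the named key from its own local cache. A minimal sketch with go-redis pub/sub (it assumes the service also holds a local memCache; the MemoryCache from the in-memory section would need a one-line Delete method wrapping sync.Map's Delete):

// Run once per instance, in a goroutine. Whenever any instance publishes
// to cache:invalidate, every instance drops its local copy of that key.
func (s *ProductService) listenForInvalidations(ctx context.Context) {
    sub := s.redis.Subscribe(ctx, "cache:invalidate")
    defer sub.Close()

    for msg := range sub.Channel() {
        s.memCache.Delete(msg.Payload) // payload is the cache key, e.g. "product:123"
    }
}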

3. Cache Stampede Prevention

When a popular cache entry expires, hundreds of concurrent requests all hit the database simultaneously. Solutions:

// Mutex/lock approach: only one request recomputes
func (s *Service) GetPopularData(ctx context.Context) (*Data, error) {
    cacheKey := "popular:data"
    lockKey := cacheKey + ":lock"

    // Try cache first (marshal/unmarshal are app-specific JSON helpers)
    cached, err := s.redis.Get(ctx, cacheKey).Result()
    if err == nil {
        return unmarshal(cached)
    }

    // Try to acquire lock
    acquired, _ := s.redis.SetNX(ctx, lockKey, "1", 10*time.Second).Result()
    if acquired {
        // This request recomputes
        defer s.redis.Del(ctx, lockKey)
        data, err := s.computeExpensiveData(ctx)
        if err != nil {
            return nil, err
        }
        s.redis.Set(ctx, cacheKey, marshal(data), 5*time.Minute)
        return data, nil
    }

    // Other requests wait and retry
    time.Sleep(100 * time.Millisecond)
    return s.GetPopularData(ctx) // Retry (add max retries in production)
}
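If the contention is within a single process, the Redis lock can be complemented (or, for a single instance, replaced) by golang.org/x/sync/singleflight, which collapses concurrent calls for the same key into one execution. A minimal sketch reusing computeExpensiveData from above:

import "golang.org/x/sync/singleflight"

var flightGroup singleflight.Group

func (s *Service) getPopularDataOnce(ctx context.Context) (*Data, error) {
    // Concurrent callers passing the same key share a single execution;
    // all waiters receive the same result (or the same error).
    v, err, _ := flightGroup.Do("popular:data", func() (interface{}, error) {
        return s.computeExpensiveData(ctx)
    })
    if err != nil {
        return nil, err
    }
    return v.(*Data), nil
}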

Distributed Caching Patterns

When your application runs across multiple servers, cache consistency requires coordination:

  • Centralized cache (Redis) - All instances read/write the same cache. Simple, but Redis becomes a single point of failure.
  • Cache replication - Each instance has a local cache that is synchronized via pub/sub or a message queue.
  • Two-tier caching - Local in-memory cache (L1) backed by Redis (L2). Check L1 first, then L2, then database.

// Two-tier cache: local memory + Redis
func (s *Service) Get(ctx context.Context, key string) (*Data, error) {
    // L1: check local memory cache
    if data, ok := s.memCache.Get(key); ok {
        return data.(*Data), nil
    }

    // L2: check Redis
    cached, err := s.redis.Get(ctx, key).Result()
    if err == nil {
        data := unmarshal(cached)
        s.memCache.Set(key, data, 1*time.Minute) // Populate L1
        return data, nil
    }

    // Cache miss: query database
    data, err := s.repo.Find(ctx, key)
    if err != nil {
        return nil, err
    }

    // Populate both caches
    s.redis.Set(ctx, key, marshal(data), 15*time.Minute) // L2
    s.memCache.Set(key, data, 1*time.Minute)             // L1

    return data, nil
}

When running containerized applications with caching layers, tools like usulnet help monitor Redis container health, memory usage, and eviction rates alongside your application containers, providing a unified view of your caching infrastructure's performance.

Caching Strategy Decision Matrix

| Data type | Where to cache | TTL | Invalidation |
|-----------|----------------|-----|--------------|
| Static assets (JS, CSS) | CDN + browser | 1 year (versioned filenames) | Deploy new version |
| User sessions | Redis | 24 hours | On logout |
| Database query results | Redis | 5-15 minutes | On write to related tables |
| API rate limit counters | Redis | Window size (1 min / 1 hour) | Expires naturally |
| Config / feature flags | In-memory | 30-60 seconds | TTL expiry |
| Full HTML pages | Nginx / Varnish | 1-5 minutes | Purge on content change |
| Computed aggregations | Redis + in-memory | 5-30 minutes | Scheduled recomputation |