Cache Invalidation Patterns

Strategies for maintaining cache consistency in modern web applications

The Cache Invalidation Challenge

In our previous lectures, we explored how caching can dramatically improve application performance by storing frequently accessed data in fast-access storage like Redis. However, caching introduces a critical challenge: ensuring that cached data remains consistent with the source of truth.

Phil Karlton, a computer scientist at Netscape, famously said: "There are only two hard things in Computer Science: cache invalidation and naming things." Today, we'll tackle the first of these challenges.

flowchart TD
    A[Application] --> B[Cache]
    A --> C[Database]
    B -.-> A
    C -.-> A
    
    subgraph "Cache Invalidation Challenge"
        D[When do we update the cache?]
        E[How do we update the cache?]
        F[How do we prevent stale data?]
        G[What happens during failures?]
    end
                

Why Cache Invalidation Is Important

Cache invalidation is the process of removing or updating cached data when the source data changes. Without a proper invalidation strategy, the cache keeps serving stale data that no longer matches the source of truth.

The Consistency Spectrum

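Different applications have different consistency requirements, ranging from strong consistency, where every read must reflect the latest write, to eventual consistency, where briefly stale reads are acceptable.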

Cache Invalidation vs. Cache Expiration

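Before diving into invalidation patterns, it's important to understand the difference: invalidation actively removes or updates a cached entry when the source data changes, while expiration removes an entry automatically once a preset lifetime has elapsed.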

Both approaches have their place in a comprehensive caching strategy, and we'll explore how they can be combined effectively.
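To make the distinction concrete, here is a minimal in-memory sketch (it does not use Redis). The `TinyCache` class, its method names, and the injectable `now` clock are hypothetical, introduced only for illustration. Expiration happens passively once an entry's TTL has elapsed, while invalidation is an explicit delete triggered by a data change. Using both gives event-driven deletes a TTL safety net.

```javascript
// Hypothetical in-memory cache illustrating expiration vs. invalidation.
// `now` is injectable so the behaviour can be checked without waiting.
class TinyCache {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    // Expiration: the entry ages out on its own once the TTL has elapsed
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Invalidation: the application removes the entry because the source changed
  invalidate(key) {
    return this.entries.delete(key);
  }
}

// Walk through both mechanisms with a fake clock
let fakeTime = 0;
const cache = new TinyCache(() => fakeTime);

cache.set('user:1', { name: 'Ada' }, 1000);
const beforeExpiry = cache.get('user:1'); // still cached
fakeTime = 1000;
const afterExpiry = cache.get('user:1'); // expired by TTL

cache.set('user:2', { name: 'Lin' }, 1000);
cache.invalidate('user:2'); // source changed, drop immediately
const afterInvalidate = cache.get('user:2');
```

In this sketch `user:1` disappears only because time passed, whereas `user:2` disappears immediately because the application asked for it.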

Common Cache Invalidation Patterns

graph TD
    A[Cache Invalidation Patterns] --> B[Time-Based]
    A --> C[Event-Based]
    A --> D[Query-Based]
    A --> E[Hybrid Approaches]
    
    B --> B1[TTL Expiration]
    B --> B2[Scheduled Refresh]
    
    C --> C1[Write-Through]
    C --> C2[Write-Behind]
    C --> C3[Write-Around]
    C --> C4[Cache-Aside]
    
    D --> D1[Content-Based]
    D --> D2[Query Invalidation]
    
    E --> E1[Lease-Based]
    E --> E2[Versioned Cache Keys]
    E --> E3[Cache Stamping]
                

Time-Based Invalidation

Time-based invalidation relies on setting expiration timeouts for cached data. After the specified time has elapsed, the data is considered invalid and is removed from the cache.

TTL (Time-To-Live) Expiration

The simplest approach to cache invalidation is to set an expiration time on each cached item.


// Using Redis with TTL expiration
const redis = require('redis');

async function getCachedData(key, fetchFunction, ttl = 3600) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Try to get data from cache
    const cachedData = await client.get(key);
    
    if (cachedData) {
      return JSON.parse(cachedData);
    }
    
    // If no cached data, fetch fresh data
    const freshData = await fetchFunction();
    
    // Store in cache with TTL
    await client.set(key, JSON.stringify(freshData), {
      EX: ttl // Expires in ttl seconds
    });
    
    return freshData;
  } finally {
    // Always release the connection, even if the fetch or the cache write fails
    await client.quit();
  }
}

// Usage example
async function getUserProfile(userId) {
  return getCachedData(
    `user:${userId}:profile`,
    async () => {
      // Simulate database query
      return db.query('SELECT * FROM users WHERE id = ?', [userId]);
    },
    1800 // Cache for 30 minutes
  );
}
            

Advantages:

Disadvantages:

Scheduled Refresh

Instead of waiting for a cache miss, proactively refresh cached data on a schedule.


// Using a background job scheduler (like node-cron)
const cron = require('node-cron');
const redis = require('redis');

async function setupCacheRefresh() {
  const client = redis.createClient();
  await client.connect();
  
  // Schedule cache refresh for popular items every 15 minutes
  cron.schedule('*/15 * * * *', async () => {
    console.log('Refreshing popular item cache...');
    
    try {
      // Get list of popular items to refresh
      const popularItemIds = await getPopularItemIds();
      
      // Refresh each item's cache
      await Promise.all(popularItemIds.map(async (itemId) => {
        const freshData = await fetchItemData(itemId);
        await client.set(`item:${itemId}`, JSON.stringify(freshData), {
          EX: 3600 // Set TTL to 1 hour
        });
      }));
      
      console.log(`Refreshed cache for ${popularItemIds.length} items`);
    } catch (error) {
      console.error('Cache refresh failed:', error);
    }
  });
}

async function getPopularItemIds() {
  // Implement logic to determine which items are accessed frequently
  // This could be based on analytics, recent access patterns, etc.
  return db.query('SELECT id FROM items ORDER BY view_count DESC LIMIT 100');
}

async function fetchItemData(itemId) {
  // Fetch fresh data from the database
  return db.query('SELECT * FROM items WHERE id = ?', [itemId]);
}
            

Advantages:

Disadvantages:

Event-Based Invalidation

Event-based invalidation reacts to data changes by updating or invalidating the cache when the underlying data changes.

Write-Through Cache

In a write-through cache, every write updates both the cache and the backend storage as part of the same operation, and the write is acknowledged only after both have succeeded.

sequenceDiagram
    participant App as Application
    participant Cache as Cache Layer
    participant DB as Database
    
    App->>App: Update operation
    App->>Cache: Update cache
    App->>DB: Update database
    DB-->>App: Confirm update
    Cache-->>App: Confirm update
    App->>App: Operation complete
                

// Write-through cache implementation
async function updateUserProfile(userId, profileData) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Begin transaction
    const transaction = await db.beginTransaction();
    
    // Use one timestamp so the database row and the cache entry agree
    const updatedAt = new Date();
    const cacheKey = `user:${userId}:profile`;
    
    try {
      // Update database
      await db.query(
        'UPDATE users SET name = ?, email = ?, updated_at = ? WHERE id = ?',
        [profileData.name, profileData.email, updatedAt, userId],
        { transaction }
      );
      
      // Update cache
      await client.set(
        cacheKey,
        JSON.stringify({
          id: userId,
          ...profileData,
          updated_at: updatedAt
        }),
        { EX: 3600 } // Cache for 1 hour
      );
      
      // Commit transaction
      await transaction.commit();
      
      return { success: true };
    } catch (error) {
      // Rollback transaction and drop the cache entry so it cannot hold uncommitted data
      await transaction.rollback();
      await client.del(cacheKey).catch(() => {});
      throw error;
    }
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  } finally {
    await client.quit();
  }
}
            

Advantages:

Disadvantages:

Write-Behind Cache

In a write-behind (or write-back) cache, data is written to the cache first, and then asynchronously written to the backend storage.

sequenceDiagram
    participant App as Application
    participant Cache as Cache Layer
    participant Queue as Message Queue
    participant Worker as Background Worker
    participant DB as Database
    
    App->>App: Update operation
    App->>Cache: Update cache
    App->>Queue: Queue database update
    Queue-->>App: Acknowledge
    App->>App: Operation complete
    
    Queue->>Worker: Process update
    Worker->>DB: Update database
    DB-->>Worker: Confirm update
    Worker->>Cache: Update cache TTL
                

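// Write-behind cache implementation using a message queue
const redis = require('redis');
const { Queue } = require('bullmq');

// Connection settings shared by the queue and the worker (adjust for your environment)
const redisConnection = { host: 'localhost', port: 6379 };
const writeQueue = new Queue('database-writes', { connection: redisConnection });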

async function updateUserProfile(userId, profileData) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Update cache immediately
    await client.set(
      `user:${userId}:profile`,
      JSON.stringify({
        id: userId,
        ...profileData,
        updated_at: new Date(),
        pending: true // Mark as pending database update
      }),
      { EX: 3600 } // Cache for 1 hour
    );
    
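    // Queue database update; failed jobs are retried only when attempts is greater than 1
    await writeQueue.add('update-user', {
      userId,
      profileData,
      timestamp: Date.now()
    }, {
      attempts: 5,
      backoff: { type: 'exponential', delay: 1000 }
    });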
    
    return { success: true, message: 'Update queued' };
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  } finally {
    await client.quit();
  }
}

// Worker process to handle database writes
const { Worker } = require('bullmq');

const dbWorker = new Worker('database-writes', async (job) => {
  const { userId, profileData, timestamp } = job.data;
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Update database
    await db.query(
      'UPDATE users SET name = ?, email = ?, updated_at = ? WHERE id = ?',
      [profileData.name, profileData.email, new Date(timestamp), userId]
    );
    
    // Update cache status (no longer pending)
    const cachedData = await client.get(`user:${userId}:profile`);
    
    if (cachedData) {
      const parsedData = JSON.parse(cachedData);
      delete parsedData.pending;
      
      await client.set(
        `user:${userId}:profile`,
        JSON.stringify(parsedData),
        { EX: 3600 } // Reset TTL
      );
    }
    
    return { success: true };
  } catch (error) {
    console.error('Database update failed:', error);
    throw error; // Rethrow so BullMQ marks the job as failed (and retries it if attempts are configured)
  } finally {
    await client.quit();
  }
}, { connection: redisConnection });
            

Advantages:

Disadvantages:

Write-Around Cache

In a write-around cache, write operations bypass the cache and go directly to the database. The cache is only updated on read misses.

sequenceDiagram
    participant App as Application
    participant Cache as Cache Layer
    participant DB as Database
    
    Note over App: Write Operation
    App->>DB: Write directly to database
    DB-->>App: Confirm write
    
    Note over App: Read Operation
    App->>Cache: Check cache
    alt Cache Hit
        Cache-->>App: Return cached data
    else Cache Miss
        App->>DB: Read from database
        DB-->>App: Return data
        App->>Cache: Update cache
    end
                

// Write-around cache implementation
async function updateUserProfile(userId, profileData) {
  try {
    // Update database directly, bypassing cache
    await db.query(
      'UPDATE users SET name = ?, email = ?, updated_at = ? WHERE id = ?',
      [profileData.name, profileData.email, new Date(), userId]
    );
    
    // Invalidate the cached copy; without this, readers see stale data until the TTL expires
    const client = redis.createClient();
    await client.connect();
    try {
      await client.del(`user:${userId}:profile`);
    } finally {
      await client.quit();
    }
    
    return { success: true };
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  }
}

// Read operation with cache-aside pattern
async function getUserProfile(userId) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Try to get from cache
    const cachedData = await client.get(`user:${userId}:profile`);
    
    if (cachedData) {
      return JSON.parse(cachedData);
    }
    
    // Cache miss, read from database
    const userData = await db.query(
      'SELECT * FROM users WHERE id = ?',
      [userId]
    );
    
    if (!userData) {
      return null;
    }
    
    // Update cache
    await client.set(
      `user:${userId}:profile`,
      JSON.stringify(userData),
      { EX: 3600 } // Cache for 1 hour
    );
    
    return userData;
  } catch (error) {
    console.error('Get user failed:', error);
    throw error;
  } finally {
    await client.quit();
  }
}
            

Advantages:

Disadvantages:

Cache-Aside (Lazy Loading)

In the cache-aside pattern, the application is responsible for reading from and writing to both the cache and the database.

sequenceDiagram
    participant App as Application
    participant Cache as Cache Layer
    participant DB as Database
    
    Note over App: Read Operation
    App->>Cache: Check cache
    alt Cache Hit
        Cache-->>App: Return cached data
    else Cache Miss
        App->>DB: Read from database
        DB-->>App: Return data
        App->>Cache: Update cache
    end
    
    Note over App: Write Operation
    App->>DB: Update database
    App->>Cache: Invalidate cache
                

// Cache-aside pattern implementation
class UserRepository {
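  constructor() {
    this.redisClient = redis.createClient();
    // connect() returns a promise; handle failures instead of leaving them unhandled
    this.redisClient.connect().catch((err) => {
      console.error('Redis connection error:', err);
    });
  }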
  
  async getUser(userId) {
    const cacheKey = `user:${userId}`;
    
    // Try to get from cache
    const cachedUser = await this.redisClient.get(cacheKey);
    
    if (cachedUser) {
      console.log('Cache hit for user:', userId);
      return JSON.parse(cachedUser);
    }
    
    console.log('Cache miss for user:', userId);
    
    // Get from database
    const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
    
    if (!user) {
      return null;
    }
    
    // Update cache
    await this.redisClient.set(cacheKey, JSON.stringify(user), {
      EX: 3600 // Cache for 1 hour
    });
    
    return user;
  }
  
  async updateUser(userId, userData) {
    // Update database
    await db.query(
      'UPDATE users SET name = ?, email = ? WHERE id = ?',
      [userData.name, userData.email, userId]
    );
    
    // Invalidate cache
    await this.redisClient.del(`user:${userId}`);
    
    return { success: true };
  }
  
  async close() {
    await this.redisClient.quit();
  }
}
            

Advantages:

Disadvantages:

Query-Based Invalidation

Query-based invalidation focuses on invalidating cached data based on the content of the data or the structure of the queries.

Content-Based Invalidation

Invalidate cache entries based on the content or relationships in the data.


// Content-based invalidation example
async function updateProduct(productId, productData) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Begin transaction
    const transaction = await db.beginTransaction();
    
    try {
      // Get previous category ID (if changed)
      let previousCategoryId = null;
      if (productData.categoryId) {
        const existingProduct = await db.query(
          'SELECT category_id FROM products WHERE id = ?',
          [productId],
          { transaction }
        );
        
        if (existingProduct && existingProduct.category_id !== productData.categoryId) {
          previousCategoryId = existingProduct.category_id;
        }
      }
      
      // Update database
      await db.query(
        'UPDATE products SET name = ?, price = ?, category_id = ? WHERE id = ?',
        [productData.name, productData.price, productData.categoryId, productId],
        { transaction }
      );
      
      // Invalidate direct product cache
      await client.del(`product:${productId}`);
      
      // Invalidate category-related caches
      if (previousCategoryId) {
        await client.del(`category:${previousCategoryId}:products`);
      }
      
      if (productData.categoryId) {
        await client.del(`category:${productData.categoryId}:products`);
      }
      
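      // Invalidate any cached search results or listings.
      // SCAN walks the keyspace incrementally; KEYS would block Redis on large datasets.
      // [].concat() handles clients that yield single keys as well as batches of keys.
      for await (const found of client.scanIterator({ MATCH: 'product:list:*', COUNT: 100 })) {
        const productListKeys = [].concat(found);
        if (productListKeys.length > 0) {
          await client.del(productListKeys);
        }
      }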
      
      // Commit transaction
      await transaction.commit();
      
      return { success: true };
    } catch (error) {
      // Rollback transaction
      await transaction.rollback();
      throw error;
    }
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  } finally {
    await client.quit();
  }
}
            

Advantages:

Disadvantages:

Query Invalidation

Track which database tables or objects are involved in each query, and invalidate cached results when those tables change.


// Query invalidation system
const crypto = require('crypto');

class CacheInvalidator {
  constructor(redisClient) {
    this.client = redisClient;
  }
  
  // Track which tables are accessed by a query
  async trackQuery(queryId, tables) {
    // Store mapping from tables to queries
    for (const table of tables) {
      await this.client.sAdd(`table:${table}:queries`, queryId);
    }
    
    // Store mapping from query to tables (sAdd accepts an array of members)
    await this.client.sAdd(`query:${queryId}:tables`, tables);
  }
  
  // Invalidate all queries affected by a table update
  async invalidateTable(table) {
    // Get all queries that depend on this table
    const affectedQueries = await this.client.sMembers(`table:${table}:queries`);
    
    if (affectedQueries.length === 0) {
      return { invalidated: 0 };
    }
    
    // Delete all affected query results
    const pipeline = this.client.multi();
    
    for (const queryId of affectedQueries) {
      pipeline.del(`cache:${queryId}`);
    }
    
    await pipeline.exec();
    
    return { invalidated: affectedQueries.length };
  }
  
  // Helper to generate a deterministic query ID
  static generateQueryId(query, params) {
    const hash = crypto.createHash('md5');
    hash.update(query);
    hash.update(JSON.stringify(params || {}));
    return hash.digest('hex');
  }
}

// Usage example
async function getProductsByCategory(categoryId, options = {}) {
  const client = redis.createClient();
  await client.connect();
  
  const invalidator = new CacheInvalidator(client);
  
  // Generate a deterministic query ID
  const query = 'SELECT * FROM products WHERE category_id = ? ORDER BY created_at DESC LIMIT ? OFFSET ?';
  const params = [categoryId, options.limit || 20, options.offset || 0];
  const queryId = CacheInvalidator.generateQueryId(query, params);
  
  try {
    // Try to get from cache
    const cachedResult = await client.get(`cache:${queryId}`);
    
    if (cachedResult) {
      return JSON.parse(cachedResult);
    }
    
    // Cache miss, execute query
    const results = await db.query(query, params);
    
    // Store in cache
    await client.set(`cache:${queryId}`, JSON.stringify(results), {
      EX: 3600 // Cache for 1 hour
    });
    
    // Track which tables this query depends on
    await invalidator.trackQuery(queryId, ['products', 'categories']);
    
    return results;
  } finally {
    await client.quit();
  }
}

// When updating a product
async function updateProduct(productId, data) {
  // Update database
  await db.query('UPDATE products SET ... WHERE id = ?', [...Object.values(data), productId]);
  
  // Invalidate all queries that depend on the products table
  const client = redis.createClient();
  await client.connect();
  
  const invalidator = new CacheInvalidator(client);
  const result = await invalidator.invalidateTable('products');
  
  console.log(`Invalidated ${result.invalidated} cached queries`);
  
  await client.quit();
  
  return { success: true };
}
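The scheme above only works if the query ID is deterministic. The following stand-alone sketch mirrors the `generateQueryId` helper as a plain function and shows that identical SQL text and parameters map to the same ID, while a different page maps to a different one. The sample SQL string and parameter values are made up for illustration.

```javascript
const crypto = require('crypto');

// Stand-alone version of the query ID helper: the same SQL text and
// parameters always hash to the same ID, so they share one cache entry.
function generateQueryId(query, params) {
  const hash = crypto.createHash('md5');
  hash.update(query);
  hash.update(JSON.stringify(params || {}));
  return hash.digest('hex');
}

const sql = 'SELECT * FROM products WHERE category_id = ? LIMIT ? OFFSET ?';
const pageOne = generateQueryId(sql, [7, 20, 0]);
const pageOneAgain = generateQueryId(sql, [7, 20, 0]);
const pageTwo = generateQueryId(sql, [7, 20, 20]);

console.log(pageOne === pageOneAgain); // true
console.log(pageOne === pageTwo);      // false
```

Because the hash covers the raw SQL text, two queries that differ only in whitespace or letter case get different IDs and therefore separate cache entries.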
            

Advantages:

Disadvantages:

Hybrid Approaches

Most real-world applications use a combination of the above strategies, tailored to their specific requirements.

Versioned Cache Keys

Instead of invalidating cache entries, use versioned keys that change when the data changes.


// Versioned cache keys implementation
class VersionedCache {
  constructor(redisClient) {
    this.client = redisClient;
  }
  
  // Get a versioned key
  async getVersionedKey(baseKey) {
    // Get the current version for this key. A missing counter counts as 0,
    // so the first INCR (which yields 1) produces a different key.
    const version = await this.client.get(`version:${baseKey}`) || '0';
    return `${baseKey}:v${version}`;
  }
  
  // Increment the version
  async incrementVersion(baseKey) {
    // Increment the version number
    return this.client.incr(`version:${baseKey}`);
  }
  
  // Get cache data with versioned key
  async get(baseKey) {
    const versionedKey = await this.getVersionedKey(baseKey);
    const data = await this.client.get(versionedKey);
    
    return data ? JSON.parse(data) : null;
  }
  
  // Set cache data with versioned key
  async set(baseKey, data, ttl = 3600) {
    const versionedKey = await this.getVersionedKey(baseKey);
    
    await this.client.set(versionedKey, JSON.stringify(data), {
      EX: ttl
    });
    
    return versionedKey;
  }
  
  // Invalidate by incrementing version
  async invalidate(baseKey) {
    return this.incrementVersion(baseKey);
  }
}

// Usage example
async function getUserProfile(userId) {
  const client = redis.createClient();
  await client.connect();
  
  const cache = new VersionedCache(client);
  
  try {
    // Try to get from cache
    const baseKey = `user:${userId}:profile`;
    const cachedData = await cache.get(baseKey);
    
    if (cachedData) {
      return cachedData;
    }
    
    // Cache miss, get from database
    const userData = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
    
    if (!userData) {
      return null;
    }
    
    // Store in cache
    await cache.set(baseKey, userData);
    
    return userData;
  } finally {
    await client.quit();
  }
}

// When updating a user
async function updateUserProfile(userId, data) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Update database
    await db.query(
      'UPDATE users SET name = ?, email = ? WHERE id = ?',
      [data.name, data.email, userId]
    );
    
    // Invalidate by incrementing version
    const cache = new VersionedCache(client);
    const newVersion = await cache.invalidate(`user:${userId}:profile`);
    
    console.log(`Invalidated user ${userId} profile, new version: ${newVersion}`);
    
    return { success: true };
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  } finally {
    await client.quit();
  }
}
            

Advantages:

Disadvantages:

Cache Stamping

Store a timestamp or version with each cached item and verify it's still valid before using it.


// Cache stamping implementation
async function getUserWithStamp(userId) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Get latest timestamp from database
    const latestStamp = await db.query(
      'SELECT updated_at FROM users WHERE id = ? LIMIT 1',
      [userId]
    ).then(result => result ? result.updated_at.getTime() : 0);
    
    // Try to get from cache
    const cacheKey = `user:${userId}`;
    const cachedData = await client.get(cacheKey);
    
    if (cachedData) {
      const parsed = JSON.parse(cachedData);
      
      // Check if cache is still valid
      if (parsed._timestamp >= latestStamp) {
        return parsed;
      }
      
      console.log('Cache stamp outdated, refreshing');
    }
    
    // Cache miss or outdated, get from database
    const userData = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
    
    if (!userData) {
      return null;
    }
    
    // Add timestamp and store in cache
    userData._timestamp = latestStamp;
    
    await client.set(cacheKey, JSON.stringify(userData), {
      EX: 3600 // Cache for 1 hour
    });
    
    return userData;
  } finally {
    await client.quit();
  }
}

// When updating a user
async function updateUserWithStamp(userId, data) {
  try {
    // Update database with new timestamp
    const now = new Date();
    await db.query(
      'UPDATE users SET name = ?, email = ?, updated_at = ? WHERE id = ?',
      [data.name, data.email, now, userId]
    );
    
    return { success: true, timestamp: now.getTime() };
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  }
}
            

Advantages:

Disadvantages:

Real-World Cache Invalidation Scenarios

Microservice Architectures

In microservice architectures, cache invalidation becomes more challenging because different services may update data independently.

graph TD
    A[User Service] --> B[User Cache]
    C[Order Service] --> D[Order Cache]
    E[Product Service] --> F[Product Cache]
    G[API Gateway] --> B
    G --> D
    G --> F
    H[Event Bus] --> A
    H --> C
    H --> E
                

Event-Driven Invalidation

Use an event bus to publish data change events that trigger cache invalidation.


// Using an event bus (e.g., RabbitMQ, Kafka) for cache invalidation
const amqp = require('amqplib');
const redis = require('redis');

// Publisher service
async function updateUser(userId, userData) {
  try {
    // Update database
    await db.query(
      'UPDATE users SET name = ?, email = ? WHERE id = ?',
      [userData.name, userData.email, userId]
    );
    
    // Publish event
    const connection = await amqp.connect('amqp://localhost');
    const channel = await connection.createChannel();
    
    await channel.assertExchange('data-changes', 'topic', { durable: true });
    
    const event = {
      type: 'user.updated',
      userId,
      timestamp: Date.now()
    };
    
    channel.publish(
      'data-changes',
      'user.updated',
      Buffer.from(JSON.stringify(event))
    );
    
    await channel.close();
    await connection.close();
    
    return { success: true };
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  }
}

// Subscriber service (Cache Manager)
async function startCacheInvalidator() {
  const connection = await amqp.connect('amqp://localhost');
  const channel = await connection.createChannel();
  
  await channel.assertExchange('data-changes', 'topic', { durable: true });
  
  const { queue } = await channel.assertQueue('cache-invalidator', {
    exclusive: false
  });
  
  // Bind to relevant events
  await channel.bindQueue(queue, 'data-changes', 'user.#');
  await channel.bindQueue(queue, 'data-changes', 'order.#');
  await channel.bindQueue(queue, 'data-changes', 'product.#');
  
  console.log('Cache invalidator listening for events');
  
  // Reuse one Redis connection for all messages instead of reconnecting per event
  const client = redis.createClient();
  await client.connect();
  
  channel.consume(queue, async (msg) => {
    if (!msg) return;
    
    try {
      const event = JSON.parse(msg.content.toString());
      console.log('Received event:', event);
      
      // Handle different event types
      switch (event.type) {
        case 'user.updated':
          await client.del(`user:${event.userId}`);
          await client.del(`user:${event.userId}:profile`);
          console.log(`Invalidated cache for user ${event.userId}`);
          break;
          
        case 'order.created':
        case 'order.updated':
          await client.del(`order:${event.orderId}`);
          // Also invalidate user's orders list
          await client.del(`user:${event.userId}:orders`);
          console.log(`Invalidated cache for order ${event.orderId}`);
          break;
          
        case 'product.updated':
          await client.del(`product:${event.productId}`);
          // Also invalidate category product lists
          await client.del(`category:${event.categoryId}:products`);
          console.log(`Invalidated cache for product ${event.productId}`);
          break;
      }
      
      // Acknowledge message
      channel.ack(msg);
    } catch (error) {
      console.error('Error processing event:', error);
      // Nack message for requeue (nack requeues by default, so a message
      // that can never be processed will be redelivered repeatedly)
      channel.nack(msg);
    }
  });
}

startCacheInvalidator().catch(console.error);
            

Multi-Region Deployment

In multi-region deployments, cache invalidation must be coordinated across different geographical locations.

Global Cache Invalidation Service

Use a centralized service to coordinate cache invalidation across regions.


// Using Redis Pub/Sub for cross-region invalidation
const redis = require('redis');

class GlobalCacheInvalidator {
  constructor(options = {}) {
    // Create publisher client
    this.publisher = redis.createClient({
      url: options.redisUrl || 'redis://localhost:6379'
    });
    
    // Create subscriber client
    this.subscriber = this.publisher.duplicate();
    
    // Connect both clients
    Promise.all([
      this.publisher.connect(),
      this.subscriber.connect()
    ]).then(() => {
      console.log('Connected to Redis');
    }).catch(err => {
      console.error('Redis connection error:', err);
    });
    
    // The channel for invalidation events
    this.channel = options.channel || 'cache:invalidations';
    
    // Region identifier
    this.region = options.region || 'default';
    
    // Local cache client
    this.cacheClient = options.cacheClient || redis.createClient({
      url: options.cacheRedisUrl || 'redis://localhost:6379'
    });
    this.cacheClient.connect().catch(console.error);
  }
  
  // Start listening for invalidation events
  async startListening() {
    await this.subscriber.subscribe(this.channel, (message, channel) => {
      try {
        const event = JSON.parse(message);
        
        // Skip events from this region (we already handled them)
        if (event.sourceRegion === this.region) {
          return;
        }
        
        console.log(`Received invalidation from ${event.sourceRegion}:`, event);
        
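        // Process the invalidation; catch async failures, which the surrounding try cannot see
        this.processInvalidation(event).catch((error) => {
          console.error('Error processing invalidation event:', error);
        });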
      } catch (error) {
        console.error('Error processing invalidation event:', error);
      }
    });
    
    console.log(`Listening for invalidations on channel: ${this.channel}`);
  }
  
  // Process an invalidation event
  async processInvalidation(event) {
    switch (event.type) {
      case 'key':
        // Invalidate a specific key
        await this.cacheClient.del(event.key);
        console.log(`Invalidated key: ${event.key}`);
        break;
        
      case 'pattern': {
        // Invalidate keys matching a pattern.
        // Note: KEYS scans the whole keyspace and blocks Redis; prefer SCAN in production.
        const keys = await this.cacheClient.keys(event.pattern);
        
        if (keys.length > 0) {
          await this.cacheClient.del(keys);
          console.log(`Invalidated ${keys.length} keys matching pattern: ${event.pattern}`);
        }
        break;
      }
        
      case 'tag':
        // Invalidate keys with a specific tag
        // Requires an implementation of tagged cache keys
        console.log(`Tag invalidation not implemented yet: ${event.tag}`);
        break;
    }
  }
  
  // Publish an invalidation event
  async invalidate(type, target) {
    const event = {
      type,
      sourceRegion: this.region,
      timestamp: Date.now()
    };
    
    // Add the appropriate target property
    switch (type) {
      case 'key':
        event.key = target;
        break;
      case 'pattern':
        event.pattern = target;
        break;
      case 'tag':
        event.tag = target;
        break;
    }
    
    // Publish the event
    await this.publisher.publish(this.channel, JSON.stringify(event));
    
    // Also process locally
    await this.processInvalidation(event);
    
    return true;
  }
  
  // Convenience methods
  async invalidateKey(key) {
    return this.invalidate('key', key);
  }
  
  async invalidatePattern(pattern) {
    return this.invalidate('pattern', pattern);
  }
  
  async invalidateTag(tag) {
    return this.invalidate('tag', tag);
  }
  
  // Clean up resources
  async close() {
    await this.publisher.quit();
    await this.subscriber.quit();
    await this.cacheClient.quit();
  }
}

// Usage example
async function updateUserGlobally(userId, userData) {
  try {
    // Update database
    await db.query(
      'UPDATE users SET name = ?, email = ? WHERE id = ?',
      [userData.name, userData.email, userId]
    );
    
    // Invalidate cache across all regions
    const invalidator = new GlobalCacheInvalidator({
      region: 'us-east-1'
    });
    
    await invalidator.invalidatePattern(`user:${userId}:*`);
    
    await invalidator.close();
    
    return { success: true };
  } catch (error) {
    console.error('Update failed:', error);
    return { success: false, error: error.message };
  }
}
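The `tag` branch of `processInvalidation` is left unimplemented in the code above. One way it could work is a reverse index from each tag to the keys that carry it. The sketch below is in-memory only, and the `TaggedCache` class and its methods are hypothetical. With Redis, the same index could plausibly be kept in sets using `SADD` and `SMEMBERS`.

```javascript
// Hypothetical in-memory tagged cache: each tag maps to the set of keys
// that carry it, so one tag invalidation removes every related entry.
class TaggedCache {
  constructor() {
    this.values = new Map();   // key -> cached value
    this.tagIndex = new Map(); // tag -> Set of keys
  }

  set(key, value, tags = []) {
    this.values.set(key, value);
    for (const tag of tags) {
      if (!this.tagIndex.has(tag)) this.tagIndex.set(tag, new Set());
      this.tagIndex.get(tag).add(key);
    }
  }

  get(key) {
    return this.values.get(key);
  }

  // Delete every key registered under the tag and return how many were removed
  invalidateTag(tag) {
    const keys = this.tagIndex.get(tag);
    if (!keys) return 0;
    let removed = 0;
    for (const key of keys) {
      if (this.values.delete(key)) removed += 1;
    }
    this.tagIndex.delete(tag);
    return removed;
  }
}

const tagged = new TaggedCache();
tagged.set('user:42:profile', { name: 'Ada' }, ['user:42']);
tagged.set('user:42:orders', [101, 102], ['user:42', 'orders']);
tagged.set('user:7:profile', { name: 'Lin' }, ['user:7']);

// Removes both user:42 entries and leaves user:7 untouched
const removedCount = tagged.invalidateTag('user:42');
```

The trade-off is extra bookkeeping on every write in exchange for invalidating a group of related keys without scanning the keyspace.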
            

High-Traffic Applications

For high-traffic applications, inefficient cache invalidation can lead to "thundering herd" problems, where many requests miss the cache at the same moment and all hit the database together.

Staggered Invalidation

Invalidate cache entries gradually to prevent all clients from hitting the database simultaneously.
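The core decision can be sketched as a pure function, shown here under stated assumptions. The name `shouldServeFromCache`, its parameters, and the linear probability rule are hypothetical, and an actual implementation may use a different curve. The idea is that early in the invalidation window most readers keep using the cached entry, and the share that refreshes grows as the window progresses, so refreshes are spread out over time.

```javascript
// Hypothetical helper: decide whether a reader should keep using a cached
// entry while it is being phased out between windowStart and windowEnd.
// `roll` is a number in [0, 1), normally Math.random(); injectable for testing.
function shouldServeFromCache(now, windowStart, windowEnd, roll = Math.random()) {
  // Outside the window the entry is treated as a normal cache entry
  if (now <= windowStart || now >= windowEnd) return true;
  // Fraction of the window that has elapsed, between 0 and 1
  const progress = (now - windowStart) / (windowEnd - windowStart);
  // The further into the window, the less likely a reader keeps the cached copy
  return roll >= progress;
}

// Halfway through a 100 ms window, a low roll refreshes and a high roll keeps the cache
const refreshes = shouldServeFromCache(50, 0, 100, 0.25); // false
const keepsCache = shouldServeFromCache(50, 0, 100, 0.75); // true
```

With a uniform random roll, roughly 10% of readers refresh when 10% of the window has elapsed and roughly 90% when it is nearly over.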


// Staggered invalidation implementation
class StaggeredInvalidator {
  constructor(redisClient) {
    this.client = redisClient;
  }
  
  // Add a soft invalidation (probabilistic expiration)
  async softInvalidate(pattern, durationMs = 60000) {
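    // Find keys matching the pattern, skipping the ":meta" companion hashes
    // that earlier invalidations may have left behind.
    // Note: KEYS walks the whole keyspace and blocks Redis while it runs;
    // on a large production dataset, iterate with SCAN instead.
    const keys = (await this.client.keys(pattern))
      .filter((key) => !key.endsWith(':meta'));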
    
    if (keys.length === 0) {
      return { invalidated: 0 };
    }
    
    const startTime = Date.now();
    const endTime = startTime + durationMs;
    
    // For each key, record the start and end of the invalidation window
    // in a companion "<key>:meta" hash
    const pipeline = this.client.multi();
    
    for (const key of keys) {
      pipeline.hSet(`${key}:meta`, {
        invalidating: endTime.toString(),
        invalidatingFrom: startTime.toString()
      });
    }
    
    await pipeline.exec();
    
    return { invalidated: keys.length, endTime };
  }
  
  // Check if a key is being invalidated
  async isInvalidating(key) {
    const invalidatingUntil = await this.client.hGet(`${key}:meta`, 'invalidating');
    
    if (!invalidatingUntil) {
      return false;
    }
    
    const endTime = parseInt(invalidatingUntil, 10);
    const now = Date.now();
    
    // Check if we're still in the invalidation window
    return now < endTime;
  }
  
  // Decide whether to use cache based on how far we are in the invalidation window
  async shouldUseCache(key) {
    const invalidatingUntil = await this.client.hGet(`${key}:meta`, 'invalidating');
    
    if (!invalidatingUntil) {
      return true; // Not being invalidated, use cache normally
    }
    
    const endTime = parseInt(invalidatingUntil, 10);
    const now = Date.now();
    
    // If invalidation period is over, clear the flag and use cache
    if (now >= endTime) {
      await this.client.hDel(`${key}:meta`, 'invalidating');
      return true;
    }
    
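    // Calculate how far we are in the invalidation window (0.0 to 1.0).
    // Use the recorded start time if one exists; otherwise assume a 1-minute window.
    const storedStart = await this.client.hGet(`${key}:meta`, 'invalidatingFrom');
    const startTime = storedStart ? parseInt(storedStart, 10) : endTime - 60000;
    const progress = Math.min(1, Math.max(0, (now - startTime) / (endTime - startTime)));
    
    // The chance of bypassing the cache grows linearly from 0 at the start of
    // the window to 1 at the end, so refreshes ramp up gradually
    return Math.random() > progress;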
  }
}

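// Usage example with cache-aside pattern.
// Assumes `redis` is the node-redis module and `db` is the application's database client.
async function getProductWithStaggered(productId) {
  // A client per call keeps the example self-contained; a real service would share one
  const client = redis.createClient();
  await client.connect();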
  
  const invalidator = new StaggeredInvalidator(client);
  
  try {
    const cacheKey = `product:${productId}`;
    
    // Check if we should use the cache
    const useCache = await invalidator.shouldUseCache(cacheKey);
    
    if (useCache) {
      // Try to get from cache
      const cachedData = await client.get(cacheKey);
      
      if (cachedData) {
        return JSON.parse(cachedData);
      }
    }
    
    // Cache miss or bypassing cache during invalidation
    console.log('Getting fresh data from database');
    
    const productData = await db.query(
      'SELECT * FROM products WHERE id = ?',
      [productId]
    );
    
    if (!productData) {
      return null;
    }
    
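    // Update cache
    await client.set(cacheKey, JSON.stringify(productData), {
      EX: 3600 // Cache for 1 hour
    });
    
    // The entry is fresh again, so end its soft invalidation; otherwise later
    // requests would keep bypassing the cache until the window closes
    await client.hDel(`${cacheKey}:meta`, ['invalidating', 'invalidatingFrom']);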
    
    return productData;
  } finally {
    await client.quit();
  }
}

// When updating a product (e.g., during a flash sale)
async function prepareProductUpdate(productId) {
  const client = redis.createClient();
  await client.connect();
  
  try {
    // Start a 1-minute staggered invalidation before the update
    const invalidator = new StaggeredInvalidator(client);
    await invalidator.softInvalidate(`product:${productId}`, 60000);
    
    console.log(`Started staggered invalidation for product ${productId}`);
    
    return { success: true };
  } finally {
    await client.quit();
  }
}
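
The ramp inside shouldUseCache can also be looked at in isolation, without Redis. The following is a minimal sketch, assuming two hypothetical helper names (windowProgress and useCacheDuringWindow) that are not part of the StaggeredInvalidator class. It only illustrates how the share of requests that bypass the cache grows across the invalidation window.

```javascript
// Sketch only: these helpers are illustrative and not part of StaggeredInvalidator.

// Fraction of the invalidation window that has elapsed, clamped to [0, 1].
function windowProgress(now, startTime, endTime) {
  if (endTime <= startTime) return 1; // degenerate window: treat as finished
  const raw = (now - startTime) / (endTime - startTime);
  return Math.min(1, Math.max(0, raw));
}

// Same decision rule as shouldUseCache: use the cache with probability
// (1 - progress). `random` is injectable so the rule can be tested.
function useCacheDuringWindow(now, startTime, endTime, random = Math.random) {
  return random() > windowProgress(now, startTime, endTime);
}

// Share of requests expected to bypass the cache at points in a 60-second window
const windowStart = 0;
const windowEnd = 60000;
for (const t of [0, 15000, 30000, 45000, 60000]) {
  const bypassShare = windowProgress(t, windowStart, windowEnd);
  console.log(`${t / 1000}s into the window: bypass share ${bypassShare}`);
}
```

Because the bypass share starts near zero, only a few requests reach the database early in the window, which is what spreads the refresh load out.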
            

Best Practices for Cache Invalidation

Design Principles

Implementation Tips

Common Pitfalls

Operational Considerations

Conclusion

Cache invalidation is one of the most challenging aspects of distributed systems, but with the right patterns and strategies, it can be managed effectively. The key is to understand your application's consistency requirements and to choose the appropriate invalidation approach for each type of data.

Remember that there's no one-size-fits-all solution. Most applications will combine several of the approaches covered in this lecture.

By applying the principles and patterns we've covered in this lecture, you'll be better equipped to build high-performance, consistent, and reliable applications that leverage the power of caching without falling into its pitfalls.

Further Reading

Practical Exercises

Exercise 1: Implement Cache-Aside Pattern

Implement a cache-aside pattern for a product catalog API. Create endpoints for:

Ensure that the cache is properly invalidated when products are updated.

Exercise 2: Event-Based Invalidation

Create a simple event bus using Redis Pub/Sub. Implement a system where multiple services can publish data change events, and a cache invalidation service subscribes to these events to keep the cache consistent.

Exercise 3: Versioned Cache Keys

Implement the versioned cache keys pattern for a user profile system. Ensure that profile views always see the latest data without explicitly deleting cache entries.
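
As a possible starting point, the core idea can be sketched in memory before wiring it to Redis. The sketch below is only an illustration under stated assumptions: a Map stands in for the cache server, the class name VersionedProfileCache and its methods are hypothetical, and calls are synchronous for brevity.

```javascript
// Sketch only: a Map stands in for Redis and every name here is hypothetical.
// Calls are synchronous for brevity; a Redis-backed version would be async.
class VersionedProfileCache {
  constructor(loadProfile) {
    this.store = new Map();          // stand-in for the cache server
    this.loadProfile = loadProfile;  // (userId) => profile, e.g. a database read
  }

  // Current version number for a user (starts at 1)
  currentVersion(userId) {
    return this.store.get(`user:${userId}:version`) ?? 1;
  }

  // The version is part of the data key, so a bump makes old entries unreachable
  dataKey(userId) {
    return `user:${userId}:profile:v${this.currentVersion(userId)}`;
  }

  getProfile(userId) {
    const key = this.dataKey(userId);
    if (!this.store.has(key)) {
      this.store.set(key, this.loadProfile(userId)); // miss: load and cache
    }
    return this.store.get(key);
  }

  // Call after writing the profile to the source of truth
  bumpVersion(userId) {
    this.store.set(`user:${userId}:version`, this.currentVersion(userId) + 1);
  }
}

// Usage
const profiles = { 7: { name: 'Ada' } };
const cache = new VersionedProfileCache((id) => ({ ...profiles[id] }));

cache.getProfile(7);             // loads and caches under ...:v1
profiles[7] = { name: 'Grace' }; // the source of truth changes
cache.bumpVersion(7);            // no delete: readers now build the ...:v2 key
console.log(cache.getProfile(7).name);
```

Note that the old v1 entry is never deleted here. In a real cache it would need a TTL so that unreachable versions eventually expire.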

Exercise 4: Benchmark Different Strategies

Create a benchmark to compare the performance of different cache invalidation strategies under various workloads. Consider factors like:

Exercise 5: Cache Invalidation Dashboard

Create a simple dashboard that monitors cache invalidation events and displays metrics like: