Scheduled Tasks with Cron

Week 10: Full Stack Application Development - Friday Lecture

Introduction to Scheduled Tasks

In modern web applications, certain operations need to run automatically at specific times without user intervention. These scheduled tasks (also called cron jobs, background jobs, or scheduled jobs) are crucial for maintaining application health, processing data regularly, and automating routine operations.

Think of scheduled tasks like the automatic systems in your home. Your heating system runs on a schedule, your dishwasher might run at night, and your smartphone backs up while you sleep. Similarly, web applications need automated processes that run at predetermined times to handle various maintenance and operational tasks.

graph TD
    A[Application] -->|Creates| B[Scheduled Tasks]
    B -->|Can Execute| C[Database Maintenance]
    B -->|Can Execute| D[Data Processing]
    B -->|Can Execute| E[Notifications]
    B -->|Can Execute| F[Reports Generation]
    B -->|Can Execute| G[External API Calls]
    style A fill:#f9d5e5,stroke:#333,stroke-width:2px
    style B fill:#eeeeee,stroke:#333,stroke-width:2px
    style C fill:#c3e5e7,stroke:#333,stroke-width:2px
    style D fill:#d5f5e3,stroke:#333,stroke-width:2px
    style E fill:#fdebd0,stroke:#333,stroke-width:2px
    style F fill:#ebdef0,stroke:#333,stroke-width:2px
    style G fill:#eaeded,stroke:#333,stroke-width:2px

Common Use Cases for Scheduled Tasks

Database Maintenance

Data Processing

User Engagement

Business Operations

System Health

Real-World Example: E-commerce Platform

In an e-commerce application, you might schedule the following tasks:

Understanding Cron

What is Cron?

Cron is a time-based job scheduler in Unix-like operating systems. It enables users to schedule commands or scripts to run automatically at specified dates and times. The name "cron" comes from the Greek word for time, "chronos."

In modern web development, we often use cron-like systems that follow similar principles but are implemented within our application stack rather than at the operating system level.

Cron Syntax

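A cron expression is a string representing a schedule. Some implementations, including node-cron and node-schedule, accept an optional leading field for seconds; the standard form consists of five fields separated by spaces: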

┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌───────────── month (1 - 12 or JAN-DEC)
│ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)
│ │ │ │ │
* * * * * [command to execute]

Cron Expression Examples

Expression      Meaning
* * * * *       Every minute
0 * * * *       Every hour (at minute 0)
0 0 * * *       Every day at midnight (00:00)
0 0 * * 0       Every Sunday at midnight
0 9 * * 1-5     Every weekday at 9 AM
0 0 1 * *       First day of every month at midnight
*/15 * * * *    Every 15 minutes
0 12 * * MON    Every Monday at noon
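
To make the field semantics concrete, here is a minimal sketch of a matcher in plain JavaScript. It is for illustration only, not a replacement for a real cron library. The names `expandField` and `matchesCron` are made up for this example. The sketch handles only numeric five-field expressions (no `MON` or `JAN` names) and requires every field to match, whereas traditional cron treats day-of-month and day-of-week as alternatives when both are restricted.

```javascript
// Expand one cron field (e.g. "*/15", "1-5", "0,30") into its matching values.
function expandField(field, min, max) {
  const values = new Set();
  for (const part of field.split(',')) {
    const [range, stepText] = part.split('/');
    const step = stepText ? Number(stepText) : 1;
    let start = min;
    let end = max;
    if (range !== '*') {
      const [lo, hi] = range.split('-').map(Number);
      start = lo;
      // "5" means just 5; "5/10" means 5, 15, 25, ... up to the field maximum
      end = hi === undefined ? (stepText ? max : lo) : hi;
    }
    for (let v = start; v <= end; v += step) values.add(v);
  }
  return [...values].sort((a, b) => a - b);
}

// Check whether a Date (in the process's local time) matches a five-field expression.
function matchesCron(expression, date) {
  const [minute, hour, dayOfMonth, month, dayOfWeek] = expression.trim().split(/\s+/);
  return (
    expandField(minute, 0, 59).includes(date.getMinutes()) &&
    expandField(hour, 0, 23).includes(date.getHours()) &&
    expandField(dayOfMonth, 1, 31).includes(date.getDate()) &&
    expandField(month, 1, 12).includes(date.getMonth() + 1) &&
    expandField(dayOfWeek, 0, 6).includes(date.getDay())
  );
}

console.log(expandField('*/15', 0, 59)); // [ 0, 15, 30, 45 ]
// 6 January 2025 is a Monday
console.log(matchesCron('0 9 * * 1-5', new Date(2025, 0, 6, 9, 0))); // true
```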

Special Characters in Cron Expressions

Common Cron Variations

Different implementations of cron may have slight variations:

Implementing Scheduled Tasks in Node.js

Node.js Libraries for Scheduled Tasks

Several libraries can help implement cron-like functionality in Node.js applications:

Library          Description                                           Use Case
node-cron        Pure JavaScript implementation of cron for Node.js    Simple in-process scheduling
node-schedule    Flexible job scheduler with cron-like syntax          More complex scheduling patterns
Bull/Agenda      Job queue libraries with scheduling capabilities      Distributed, persistent job scheduling
Bree             Modern job scheduler with sandboxed worker threads    High-performance, isolated job execution

Using node-cron

node-cron is a simple and lightweight cron-like scheduler for Node.js:

// Install with: npm install node-cron

const cron = require('node-cron');

// Schedule a task to run every day at midnight
cron.schedule('0 0 * * *', () => {
  console.log('Running a task every day at midnight');
  // Your job logic here
  performDailyBackup();
});

// Schedule a task to run every hour
cron.schedule('0 * * * *', () => {
  console.log('Running a task every hour');
  // Your job logic here
  checkSystemHealth();
});

Using node-schedule

node-schedule provides more flexible scheduling options:

// Install with: npm install node-schedule

const schedule = require('node-schedule');

// Schedule a job at minute 5 of every hour (use '*/5 * * * *' for every 5 minutes)
const job1 = schedule.scheduleJob('5 * * * *', function() {
  console.log('Running job at 5 minutes past the hour');
  // Your job logic here
});

// Schedule using a Date object (months are zero-based, so 4 means May)
const date = new Date(2025, 4, 10, 15, 30, 0);
const job2 = schedule.scheduleJob(date, function() {
  console.log('Job ran at the specified date');
  // One-time job logic here
});

// More complex recurrence rule
const rule = new schedule.RecurrenceRule();
rule.dayOfWeek = [0, new schedule.Range(4, 6)]; // Sunday and Thursday to Saturday
rule.hour = 17;
rule.minute = 0;

const job3 = schedule.scheduleJob(rule, function() {
  console.log('Running at 5:00 PM on Sunday, Thursday, Friday, and Saturday');
  // Your job logic here
});

Scheduling with Bull

Bull is a Redis-based queue system that also supports scheduled jobs:

// Install with: npm install bull

const Queue = require('bull');
const reportQueue = new Queue('report-generation');

// Schedule a job to run after a delay
reportQueue.add(
  { userId: 123, reportType: 'monthly' },
  { delay: 60 * 1000 } // Run after 1 minute
);

// Schedule a recurring job (every day at 3 AM)
reportQueue.add(
  { reportType: 'daily-summary' },
  { 
    repeat: { 
      cron: '0 3 * * *' // Every day at 3 AM
    } 
  }
);

// Process the jobs
reportQueue.process(async (job) => {
  console.log(`Processing job: ${job.id}`);
  const { reportType } = job.data;
  
  switch(reportType) {
    case 'monthly':
      await generateMonthlyReport(job.data.userId);
      break;
    case 'daily-summary':
      await generateDailySummary();
      break;
  }
  
  return { success: true };
});

Using Bree

Bree is a modern job scheduler that runs tasks in separate threads for better performance:

// Install with: npm install bree

const Bree = require('bree');

// Initialize the scheduler
const bree = new Bree({
  jobs: [
    // Cron job (runs at midnight every day)
    {
      name: 'daily-cleanup',
      cron: '0 0 * * *',
      path: './jobs/cleanup.js'
    },
    
    // Interval job (runs every 5 minutes)
    {
      name: 'check-health',
      interval: '5m',
      path: './jobs/health-check.js'
    },
    
    // One-time job (runs after 30 seconds)
    {
      name: 'welcome-email',
      timeout: '30s',
      path: './jobs/welcome-email.js'
    }
  ]
});

// Start all jobs
bree.start();

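// You can also control individual jobs by name, for example instead of
// starting everything at once (starting an already-started job is an error):
// bree.start('daily-cleanup');
// bree.stop('check-health');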

Example job file (./jobs/cleanup.js):

// This runs in its own worker thread
console.log('Starting daily cleanup task');

// Your cleanup logic here
async function performCleanup() {
  // Delete temporary files
  // Archive old data
  // etc.
}

// Run the task and handle errors
(async () => {
  try {
    await performCleanup();
    console.log('Cleanup completed successfully');
  } catch (error) {
    console.error('Cleanup failed:', error);
    process.exit(1); // Exit with error
  }
})();

Best Practices for Scheduled Tasks

Separate Task Logic from Scheduling

Keep your task logic separate from the scheduling mechanism. This makes your code more maintainable and easier to test.

// Good practice
// tasks.js - Task logic
const tasks = {
  async dailyReport() {
    // Report generation logic
  },
  
  async cleanupTempFiles() {
    // Cleanup logic
  }
};

module.exports = tasks;

// scheduler.js - Scheduling
const cron = require('node-cron');
const tasks = require('./tasks');

cron.schedule('0 0 * * *', tasks.dailyReport);
cron.schedule('0 2 * * *', tasks.cleanupTempFiles);
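
One payoff of this separation is that task logic can be exercised without any scheduler running. The sketch below is one possible way to write such a task; `listFiles` and `removeFile` are hypothetical helpers passed in by the caller (they are not part of the examples above), which lets a test substitute fakes for the real filesystem.

```javascript
// tasks.js (sketch): the task receives its collaborators as arguments.
// `listFiles` and `removeFile` are hypothetical helpers supplied by the caller.
async function cleanupTempFiles({ listFiles, removeFile, maxAgeMs, now = Date.now() }) {
  const files = await listFiles();
  const expired = files.filter((file) => now - file.modifiedAt > maxAgeMs);
  for (const file of expired) {
    await removeFile(file.path);
  }
  return expired.length;
}

// In a test, fakes stand in for the filesystem and no scheduler is involved.
async function demo() {
  const removed = [];
  const count = await cleanupTempFiles({
    listFiles: async () => [
      { path: '/tmp/old.txt', modifiedAt: 0 },
      { path: '/tmp/new.txt', modifiedAt: 9000 },
    ],
    removeFile: async (path) => removed.push(path),
    maxAgeMs: 5000,
    now: 10000,
  });
  return { count, removed };
}
```

The scheduler file then only wires a cron expression to a call such as `cleanupTempFiles({ ...realHelpers })`.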

Error Handling

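Always handle errors inside your scheduled tasks. Schedulers typically do not await the callback, so an exception that escapes an async task can surface as an unhandled promise rejection, which terminates the process by default in Node.js 15 and later.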

cron.schedule('0 0 * * *', async () => {
  try {
    await generateDailyReport();
    console.log('Daily report generated successfully');
  } catch (error) {
    console.error('Failed to generate daily report:', error);
    // Notify administrators or log to monitoring system
    await notifyAdmins('Daily report generation failed', error);
  }
});
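
Rather than repeating the try/catch in every callback, the pattern can be factored into a small wrapper. This is a sketch under assumptions: `safeTask` and its `onError` hook are names invented for this example, and the hook stands in for whatever alerting the application uses.

```javascript
// Wrap a task so that a thrown error or rejected promise is reported
// instead of escaping into the scheduler.
// `onError` is a hypothetical reporting hook; it defaults to console.error.
function safeTask(name, task, onError = console.error) {
  return async (...args) => {
    try {
      return await task(...args);
    } catch (error) {
      await onError(`Task "${name}" failed:`, error);
      return undefined;
    }
  };
}

// Possible usage with node-cron (not executed here):
// cron.schedule('0 0 * * *', safeTask('daily-report', generateDailyReport));
```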

Logging and Monitoring

Implement comprehensive logging to track execution and diagnose issues:

cron.schedule('0 0 * * *', async () => {
  console.log(`[${new Date().toISOString()}] Starting daily backup`);
  
  try {
    const startTime = Date.now();
    await performBackup();
    const duration = Date.now() - startTime;
    
    console.log(`[${new Date().toISOString()}] Backup completed successfully in ${duration}ms`);
    await metrics.recordTaskExecution('daily-backup', {
      success: true,
      duration
    });
  } catch (error) {
    console.error(`[${new Date().toISOString()}] Backup failed:`, error);
    await metrics.recordTaskExecution('daily-backup', {
      success: false,
      error: error.message
    });
  }
});

Task Idempotence

Design tasks to be idempotent (can be run multiple times without causing problems):

// Non-idempotent example (problematic)
async function sendDailyNewsletter() {
  const users = await User.findAll({ where: { subscribed: true } });
  
  for (const user of users) {
    await emailService.send(user.email, 'Daily Newsletter', template);
    // If this crashes halfway through, some users get duplicate emails on retry
  }
}

// Idempotent example (better)
async function sendDailyNewsletter() {
  const today = new Date().toISOString().split('T')[0];
  const users = await User.findAll({ 
    where: { 
      subscribed: true,
      // Keep only users with no matching record for today: the LEFT JOIN below
      // yields NULL columns for them (a `!= today` test would drop those rows)
      '$emailRecords.id$': null
    },
    include: [{
      model: EmailRecord,
      where: { type: 'daily-newsletter', date: today },
      required: false
    }]
  });
  
  for (const user of users) {
    await emailService.send(user.email, 'Daily Newsletter', template);
    await EmailRecord.create({
      userId: user.id,
      type: 'daily-newsletter',
      date: today
    });
  }
}
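
The same idea can be tried out without a database. In the sketch below an in-memory `Set` stands in for the `EmailRecord` table; `sendNewsletterOnce` and `sentRecords` are illustrative names, and `sendEmail` is whatever sending function the caller passes in.

```javascript
// In-memory stand-in for the EmailRecord table: one key per (user, type, date).
const sentRecords = new Set();

async function sendNewsletterOnce(users, today, sendEmail) {
  let sent = 0;
  for (const user of users) {
    const key = `${user.id}:daily-newsletter:${today}`;
    if (sentRecords.has(key)) continue; // already delivered today, skip
    await sendEmail(user.email);
    sentRecords.add(key); // record only after a successful send
    sent++;
  }
  return sent;
}
```

If a run fails halfway, running it again for the same date only reaches the users without a record. A crash between the send and the record can still produce one duplicate, so this gives at-least-once delivery rather than exactly-once.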

Avoid Overlapping Executions

Ensure long-running tasks don't overlap:

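// Using a locking mechanism (an ioredis client is assumed, as in the later examples)
const Redis = require('ioredis');
const redis = new Redis();

const lockKey = 'daily-report-lock';
const lockExpiry = 60 * 60; // 1 hour in seconds; should exceed the task's worst-case runtime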

cron.schedule('0 0 * * *', async () => {
  // Try to acquire a lock
  const locked = await redis.set(lockKey, 'locked', 'EX', lockExpiry, 'NX');
  
  if (!locked) {
    console.log('Another instance of this task is already running');
    return;
  }
  
  try {
    await generateDailyReport();
  } catch (error) {
    console.error('Error generating report:', error);
  } finally {
    // Release the lock
    await redis.del(lockKey);
  }
});
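
When the scheduler runs in a single process, a plain in-memory flag is enough to prevent overlap. The sketch below shows that simpler variant; `preventOverlap` is a name made up for this example, and it offers no protection across multiple processes or servers, which is where the Redis lock is needed.

```javascript
// Return a wrapped task that skips a run while the previous one is still in flight.
function preventOverlap(task) {
  let running = false;
  return async (...args) => {
    if (running) return { skipped: true };
    running = true;
    try {
      return { skipped: false, result: await task(...args) };
    } finally {
      running = false; // always clear the flag, even if the task throws
    }
  };
}

// Possible usage with node-cron (not executed here):
// cron.schedule('*/5 * * * *', preventOverlap(generateDailyReport));
```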

Time Zone Considerations

Be aware of time zone issues when scheduling tasks:

// Schedule a task to run at midnight in a specific timezone
cron.schedule('0 0 * * *', () => {
  console.log('Running at midnight in the specified timezone');
}, {
  scheduled: true,
  timezone: "America/New_York"
});
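
To see why the time zone matters, it helps to check what a given instant looks like on the wall clock in another zone. The helper below is a small sketch using the built-in `Intl.DateTimeFormat`; `hourInTimeZone` is a name invented for this example.

```javascript
// Return the wall-clock hour (0-23) that `date` corresponds to in `timeZone`.
function hourInTimeZone(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour: 'numeric',
    hourCycle: 'h23',
  }).formatToParts(date);
  const hour = Number(parts.find((part) => part.type === 'hour').value);
  return hour % 24; // guard in case an engine prints midnight as 24
}

// Midnight in New York falls at a different UTC time in winter and in summer:
console.log(hourInTimeZone(new Date('2025-01-15T05:00:00Z'), 'America/New_York')); // 0
console.log(hourInTimeZone(new Date('2025-07-15T04:00:00Z'), 'America/New_York')); // 0
```

A schedule fixed to a UTC hour would therefore drift by an hour relative to local midnight when daylight saving time starts or ends.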

Deployment Considerations

In-Process vs Out-of-Process Scheduling

graph TB
    subgraph "In-Process"
        A[Web Server] --- B[Scheduler]
        B --- C[Task 1]
        B --- D[Task 2]
    end
    subgraph "Out-of-Process"
        E[Web Server]
        F[Dedicated Worker]
        G[Queue/Database]
        E -->|"Schedule Task"| G
        G -->|"Pull Tasks"| F
        F -->|"Process Tasks"| F
    end

In-Process Scheduling

Pros:

Cons:

Out-of-Process Scheduling

Pros:

Cons:

Choosing the Right Approach

Use In-Process When:                 Use Out-of-Process When:
Tasks are lightweight                Tasks are resource-intensive
Application has low traffic          Application has high traffic
Tasks are tied to app state          Task reliability is critical
Simple development is prioritized    System is deployed across multiple servers

Implementation Options for Production

Option 1: Process Manager (PM2)

Use PM2 to run a separate worker process for scheduled tasks:

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'web-server',
    script: 'server.js',
    instances: 'max',
    exec_mode: 'cluster'
  }, {
    name: 'scheduler',
    script: 'scheduler.js',
    instances: 1, // Only run one instance of the scheduler
    exec_mode: 'fork'
  }]
};

Option 2: Dedicated Cron Container in Docker

For containerized applications, use a separate container for scheduled tasks:

# docker-compose.yml
version: '3'

services:
  web:
    build: .
    ports:
      - "3000:3000"
    depends_on:
      - db
      - redis
    environment:
      - NODE_ENV=production
      
  worker:
    build: .
    command: node scheduler.js
    depends_on:
      - db
      - redis
    environment:
      - NODE_ENV=production
      
  db:
    image: postgres:13
    
  redis:
    image: redis:6

Option 3: Specialized Job Systems

Example AWS SAM template (CloudFormation with the Serverless transform) for a scheduled Lambda function:

Transform: AWS::Serverless-2016-10-31
Resources:
  ScheduledFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: ./functions/
      Handler: scheduled-task.handler
      Runtime: nodejs22.x
      Events:
        DailyAt3AM:
          Type: Schedule
          Properties:
            Schedule: cron(0 3 * * ? *)
            Enabled: true
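
The template points at a handler named `scheduled-task.handler`. A minimal sketch of what that file might contain follows; `runNightlyJob` is a hypothetical placeholder for the real work, and the returned object shape is an assumption for illustration. Note that the `cron(...)` expression in the template is evaluated in UTC.

```javascript
// functions/scheduled-task.js (sketch)
// `runNightlyJob` is a hypothetical placeholder for the real work.
async function runNightlyJob(triggeredAt) {
  return `nightly job ran for ${triggeredAt}`;
}

async function handler(event) {
  // Scheduled invocations carry the trigger time as an ISO 8601 string in `event.time`.
  const triggeredAt = event.time || new Date().toISOString();
  try {
    const message = await runNightlyJob(triggeredAt);
    return { ok: true, message };
  } catch (error) {
    console.error('Scheduled task failed:', error);
    throw error; // rethrow so the invocation is recorded as failed
  }
}

module.exports = { handler };
```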

Advanced Patterns

Distributed Scheduling

When running multiple instances of your application, you need to ensure scheduled tasks aren't duplicated.

One approach is using a distributed lock with Redis:

const schedule = require('node-schedule');
const Redis = require('ioredis');
const redis = new Redis();

async function scheduleTasks() {
  // Try to acquire a leadership lock
  const result = await redis.set('scheduler-leader', process.env.HOSTNAME, 'NX', 'EX', 30);
  const isLeader = result === 'OK';
  
  if (isLeader) {
    console.log('This instance is the scheduling leader');
    
    // Refresh the lock periodically
    const interval = setInterval(async () => {
      await redis.expire('scheduler-leader', 30);
    }, 10000);
    
    // Set up schedules
    const dailyReport = schedule.scheduleJob('0 0 * * *', generateDailyReport);
    
    // Handle shutdown
    process.on('SIGTERM', async () => {
      clearInterval(interval);
      await redis.del('scheduler-leader');
      dailyReport.cancel();
      process.exit(0);
    });
  } else {
    console.log('Another instance is the scheduling leader');
    
    // Check periodically if the leader has failed
    const watcher = setInterval(async () => {
      const leaderExists = await redis.exists('scheduler-leader');
      if (!leaderExists) {
        // Leader has failed: stop this watcher first so repeated attempts
        // don't stack up timers, then try to become the new leader
        clearInterval(watcher);
        scheduleTasks();
      }
    }, 15000);
  }
}

scheduleTasks();
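
The Redis snippet above refreshes and deletes the `scheduler-leader` key without checking who currently owns it, so a stalled ex-leader could extend or remove a lock that now belongs to another instance. The sketch below illustrates the ownership check with an in-memory stand-in; `LeaderLock` is a made-up class, not a Redis API. With real Redis, the compare-and-delete step has to be atomic, which is usually done with a small Lua script.

```javascript
// In-memory stand-in for a Redis key with a TTL, to illustrate owner checks.
class LeaderLock {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.owner = null;
    this.expiresAt = 0;
  }

  // Similar in spirit to SET key owner NX PX ttl: succeeds only if free or expired.
  acquire(owner, now = Date.now()) {
    if (this.owner !== null && now < this.expiresAt) return false;
    this.owner = owner;
    this.expiresAt = now + this.ttlMs;
    return true;
  }

  // Only the current, unexpired owner may extend the lease.
  renew(owner, now = Date.now()) {
    if (this.owner !== owner || now >= this.expiresAt) return false;
    this.expiresAt = now + this.ttlMs;
    return true;
  }

  // Only the current owner may release the lock.
  release(owner) {
    if (this.owner !== owner) return false;
    this.owner = null;
    this.expiresAt = 0;
    return true;
  }
}
```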

Dynamic Scheduling

Sometimes you need to create or modify schedules dynamically based on user preferences or business rules:

// Store jobs in memory for easy access
const activeJobs = new Map();

// Cancel and reschedule a job
function updateJobSchedule(jobId, newCronExpression) {
  // Cancel existing job if it exists
  if (activeJobs.has(jobId)) {
    activeJobs.get(jobId).cancel();
  }
  
  // Create new job with updated schedule
  const job = schedule.scheduleJob(newCronExpression, async () => {
    await executeJob(jobId);
  });
  
  // Store the new job
  activeJobs.set(jobId, job);
  
  return job;
}

// Example: User updates notification preferences
app.post('/api/notification-settings', async (req, res) => {
  const { userId, frequency, time } = req.body;
  
  // Translate user preferences to cron expression
  // Parse the time (e.g., "09:00") once, before the switch, so every case can
  // use it; a const declared inside one case is uninitialized in the others
  const [hour, minute] = time.split(':').map(Number);
  let cronExpression;
  
  switch (frequency) {
    case 'daily':
      cronExpression = `${minute} ${hour} * * *`;
      break;
    case 'weekly':
      cronExpression = `${minute} ${hour} * * 1`; // Mondays
      break;
    // other cases...
  }
  
  // Update the job schedule
  updateJobSchedule(`notification-${userId}`, cronExpression);
  
  // Save to database
  await User.update({ 
    notificationSchedule: cronExpression 
  }, { 
    where: { id: userId } 
  });
  
  res.json({ success: true });
});
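
Because the cron string here is built from user input, it is worth validating the input before it reaches the scheduler. The following is one possible sketch; `preferenceToCron` is a hypothetical helper, and it supports only the two frequencies shown in the handler.

```javascript
// Convert a frequency and an "HH:MM" time into a five-field cron expression.
// Throws on anything it does not recognize instead of producing a bad schedule.
function preferenceToCron(frequency, time) {
  const match = /^([01]?\d|2[0-3]):([0-5]\d)$/.exec(time);
  if (!match) throw new Error(`Invalid time: ${time}`);
  const hour = Number(match[1]);
  const minute = Number(match[2]);

  switch (frequency) {
    case 'daily':
      return `${minute} ${hour} * * *`;
    case 'weekly':
      return `${minute} ${hour} * * 1`; // Mondays
    default:
      throw new Error(`Unsupported frequency: ${frequency}`);
  }
}

console.log(preferenceToCron('daily', '09:00'));  // 0 9 * * *
console.log(preferenceToCron('weekly', '18:30')); // 30 18 * * 1
```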

Task Dependencies and Workflows

For complex sequences of tasks where one depends on another, consider a workflow approach:

// Define a workflow with dependent tasks
const DailyReportingWorkflow = {
  name: 'daily-reporting',
  schedule: '0 1 * * *', // Every day at 1 AM
  tasks: [
    {
      name: 'collect-data',
      handler: collectData,
      retryCount: 3,
      retryDelay: 5 * 60 * 1000 // 5 minutes
    },
    {
      name: 'generate-reports',
      handler: generateReports,
      dependencies: ['collect-data'] // This task depends on the first one
    },
    {
      name: 'send-emails',
      handler: sendEmailReports,
      dependencies: ['generate-reports'] // This task depends on the second one
    }
  ]
};

// Workflow execution engine
async function executeWorkflow(workflow) {
  console.log(`Starting workflow: ${workflow.name}`);
  
  // Track task completion
  const completedTasks = new Set();
  const failedTasks = new Map();
  
  // Process tasks until all are complete or max retries exceeded
  while (completedTasks.size < workflow.tasks.length) {
    for (const task of workflow.tasks) {
      // Skip if task is already completed
      if (completedTasks.has(task.name)) continue;
      
      // Skip if dependencies aren't satisfied
      if (task.dependencies && task.dependencies.some(dep => !completedTasks.has(dep))) {
        continue;
      }
      
      // Skip if the task has used its first attempt plus all allowed retries
      // (without the "+ 1", a task with no retryCount would never run at all)
      const maxAttempts = (task.retryCount || 0) + 1;
      const failCount = failedTasks.get(task.name) || 0;
      if (failCount >= maxAttempts) continue;
      
      // Execute the task
      try {
        console.log(`Executing task: ${task.name}`);
        await task.handler();
        completedTasks.add(task.name);
        console.log(`Task completed: ${task.name}`);
      } catch (error) {
        console.error(`Task failed: ${task.name}`, error);
        failedTasks.set(task.name, (failedTasks.get(task.name) || 0) + 1);
        
        // Wait before retry if specified
        if (task.retryDelay) {
          await new Promise(resolve => setTimeout(resolve, task.retryDelay));
        }
      }
    }
    
    // If no progress is being made, abort
    const canProgress = workflow.tasks.some(task => {
      // Task is not completed
      if (completedTasks.has(task.name)) return false;
      
      // Dependencies are satisfied
      if (task.dependencies && task.dependencies.some(dep => !completedTasks.has(dep))) {
        return false;
      }
      
      // Has attempts left (first attempt plus retries)
      const failCount = failedTasks.get(task.name) || 0;
      return failCount < (task.retryCount || 0) + 1;
    });
    
    if (!canProgress) break;
  }
  
  // Report workflow completion status
  const allCompleted = completedTasks.size === workflow.tasks.length;
  console.log(`Workflow ${workflow.name} ${allCompleted ? 'completed' : 'failed'}`);
  return allCompleted;
}

// Schedule the workflow
cron.schedule(DailyReportingWorkflow.schedule, () => {
  executeWorkflow(DailyReportingWorkflow);
});
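
A workflow definition with a typo in a dependency name, or with a dependency cycle, can never complete. A check run before scheduling catches this early. The sketch below is one way to do it; `findWorkflowProblems` is a name invented for this example, and it assumes the same `{ tasks: [{ name, dependencies }] }` shape used above.

```javascript
// Return a list of human-readable problems; an empty list means the workflow is runnable.
function findWorkflowProblems(workflow) {
  const problems = [];
  const names = new Set(workflow.tasks.map((task) => task.name));
  const deps = new Map(workflow.tasks.map((task) => [task.name, task.dependencies || []]));

  // Every dependency must refer to a task that exists.
  for (const [name, list] of deps) {
    for (const dep of list) {
      if (!names.has(dep)) problems.push(`${name} depends on unknown task ${dep}`);
    }
  }

  // Depth-first search: reaching a task that is still "visiting" means a cycle.
  const state = new Map(); // name -> 'visiting' | 'done'
  const visit = (name) => {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      problems.push(`dependency cycle involving ${name}`);
      return;
    }
    state.set(name, 'visiting');
    for (const dep of deps.get(name) || []) visit(dep);
    state.set(name, 'done');
  };
  for (const name of names) visit(name);

  return problems;
}
```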

Case Study: Building a Newsletter System

Let's build a practical example of a scheduled task system for sending newsletters to subscribers. This system includes:

Project Structure

newsletter-system/
├── server.js             # Express server for UI and API
├── scheduler.js          # Main scheduler process
├── tasks/
│   ├── dailyDigest.js    # Daily email task
│   ├── weeklyNewsletter.js # Weekly newsletter task
│   └── utils.js          # Shared utilities
├── models/
│   ├── user.js           # User model
│   └── emailLog.js       # Email delivery logging
├── services/
│   ├── emailService.js   # Email sending service
│   └── contentService.js # Content generation service
└── monitoring/
    ├── logger.js         # Logging utility
    └── metrics.js        # Metrics collection

Initialize the Scheduler

// scheduler.js
const cron = require('node-cron');
const Redis = require('ioredis');
const redis = new Redis();
const logger = require('./monitoring/logger');
const metrics = require('./monitoring/metrics');
const dailyDigest = require('./tasks/dailyDigest');
const weeklyNewsletter = require('./tasks/weeklyNewsletter');

// Acquire a lock to prevent duplicate scheduling
async function initialize() {
  const lockAcquired = await redis.set('newsletter-scheduler-lock', process.pid, 'NX', 'EX', 60);
  
  if (!lockAcquired) {
    logger.info('Another scheduler is already running');
    return;
  }
  
  // Maintain the lock
  const lockInterval = setInterval(async () => {
    await redis.expire('newsletter-scheduler-lock', 60);
  }, 30000);
  
  // Handle graceful shutdown
  process.on('SIGTERM', async () => {
    clearInterval(lockInterval);
    await redis.del('newsletter-scheduler-lock');
    process.exit(0);
  });
  
  // Schedule tasks
  setupSchedules();
}

function setupSchedules() {
  // Daily digest at 8 AM in each time zone
  const timeZones = ['America/New_York', 'Europe/London', 'Asia/Tokyo'];
  
  timeZones.forEach(timeZone => {
    cron.schedule('0 8 * * *', async () => {
      logger.info(`Starting daily digest for ${timeZone}`);
      
      const task = `daily-digest-${timeZone}`;
      metrics.taskStarted(task);
      
      try {
        await dailyDigest.sendToTimeZone(timeZone);
        metrics.taskCompleted(task);
      } catch (error) {
        logger.error(`Daily digest for ${timeZone} failed`, error);
        metrics.taskFailed(task, error);
      }
    }, {
      scheduled: true,
      timezone: timeZone
    });
  });
  
  // Weekly newsletter every Monday at 10 AM UTC
  cron.schedule('0 10 * * 1', async () => {
    logger.info('Starting weekly newsletter');
    
    const task = 'weekly-newsletter';
    metrics.taskStarted(task);
    
    try {
      await weeklyNewsletter.sendToAllSubscribers();
      metrics.taskCompleted(task);
    } catch (error) {
      logger.error('Weekly newsletter failed', error);
      metrics.taskFailed(task, error);
    }
  }, {
    timezone: 'Etc/UTC' // Without this option the job runs in the server's local time
  });
  
  // Process failed emails every hour
  cron.schedule('0 * * * *', async () => {
    logger.info('Retrying failed emails');
    
    const task = 'retry-failed-emails';
    metrics.taskStarted(task);
    
    try {
      const retryCount = await retryFailedEmails();
      logger.info(`Retried ${retryCount} failed emails`);
      metrics.taskCompleted(task, { retryCount });
    } catch (error) {
      logger.error('Failed to retry emails', error);
      metrics.taskFailed(task, error);
    }
  });
  
  logger.info('All schedules have been set up');
}

// Retry mechanism for failed emails
async function retryFailedEmails() {
  const { Op } = require('sequelize');
  const emailService = require('./services/emailService');
  const EmailLog = require('./models/emailLog');
  
  // Find failed emails with retry count < 3
  // (current Sequelize versions expect Op symbols rather than string aliases such as $lt)
  const failedEmails = await EmailLog.findAll({
    where: {
      status: 'failed',
      retryCount: { [Op.lt]: 3 },
      updatedAt: { [Op.lt]: new Date(Date.now() - 30 * 60 * 1000) } // 30 minutes ago
    }
  });
  
  let retryCount = 0;
  
  for (const email of failedEmails) {
    try {
      await emailService.sendEmail({
        to: email.recipient,
        subject: email.subject,
        template: email.template,
        data: JSON.parse(email.data)
      });
      
      // Update status
      await email.update({
        status: 'sent',
        sentAt: new Date()
      });
      
      retryCount++;
    } catch (error) {
      // Update retry count
      await email.update({
        retryCount: email.retryCount + 1,
        lastError: error.message
      });
      
      // If max retries reached, mark as permanently failed
      if (email.retryCount + 1 >= 3) {
        await email.update({ status: 'permanently-failed' });
      }
    }
  }
  
  return retryCount;
}

// Start the scheduler
initialize().catch(err => {
  logger.error('Failed to initialize scheduler', err);
  process.exit(1);
});

Daily Digest Task

// tasks/dailyDigest.js
const User = require('../models/user');
const EmailLog = require('../models/emailLog');
const emailService = require('../services/emailService');
const contentService = require('../services/contentService');
const logger = require('../monitoring/logger');

// Send daily digest to users in a specific time zone
async function sendToTimeZone(timeZone) {
  logger.info(`Preparing daily digest for users in ${timeZone}`);
  
  // Get content for today's digest
  const content = await contentService.getDailyDigestContent();
  
  // Find users in the specified time zone who are subscribed to daily digests
  const users = await User.findAll({
    where: {
      timeZone,
      emailPreferences: {
        dailyDigest: true
      },
      active: true
    }
  });
  
  logger.info(`Sending daily digest to ${users.length} users in ${timeZone}`);
  
  // Process in batches to limit how many sends are in flight at once
  const batchSize = 100;
  
  for (let i = 0; i < users.length; i += batchSize) {
    const batch = users.slice(i, i + batchSize);
    
    await Promise.all(batch.map(async (user) => {
      try {
        // Personalize content
        const personalizedContent = contentService.personalizeContent(content, user);
        
        // Send email
        await emailService.sendEmail({
          to: user.email,
          subject: `Your Daily Digest for ${new Date().toLocaleDateString()}`,
          template: 'daily-digest',
          data: {
            firstName: user.firstName,
            content: personalizedContent,
            unsubscribeUrl: `https://example.com/unsubscribe?token=${user.unsubscribeToken}`
          }
        });
        
        // Log successful send
        await EmailLog.create({
          userId: user.id,
          type: 'daily-digest',
          recipient: user.email,
          subject: `Your Daily Digest for ${new Date().toLocaleDateString()}`,
          template: 'daily-digest',
          data: JSON.stringify(personalizedContent),
          status: 'sent',
          sentAt: new Date()
        });
      } catch (error) {
        logger.error(`Failed to send daily digest to ${user.email}`, error);
        
        // Log failed send for retry
        await EmailLog.create({
          userId: user.id,
          type: 'daily-digest',
          recipient: user.email,
          subject: `Your Daily Digest for ${new Date().toLocaleDateString()}`,
          template: 'daily-digest',
          data: JSON.stringify(content),
          status: 'failed',
          retryCount: 0,
          lastError: error.message
        });
      }
    }));
    
    logger.info(`Processed batch ${i / batchSize + 1} of ${Math.ceil(users.length / batchSize)}`);
  }
  
  return users.length;
}

module.exports = {
  sendToTimeZone
};
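
The batching loop in this task is a reusable pattern. A generic version might look like the sketch below, where `processInBatches` is a hypothetical helper and `worker` is any async function. It uses `Promise.allSettled` so that one failed item does not reject the whole batch.

```javascript
// Run `worker` over `items`, at most `batchSize` at a time, and count outcomes.
async function processInBatches(items, batchSize, worker) {
  const summary = { succeeded: 0, failed: 0 };
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // The async arrow also turns synchronous throws from `worker` into rejections.
    const results = await Promise.allSettled(batch.map(async (item) => worker(item)));
    for (const result of results) {
      if (result.status === 'fulfilled') summary.succeeded++;
      else summary.failed++;
    }
  }
  return summary;
}
```

The next batch starts only after every item in the current one has settled, so a single slow send holds up the rest of its batch. That is a simple trade-off compared with a sliding-window concurrency limit.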

Monitoring and Metrics

// monitoring/metrics.js
const Prometheus = require('prom-client');

// Create metrics
const taskCounter = new Prometheus.Counter({
  name: 'scheduler_task_total',
  help: 'Count of scheduler tasks',
  labelNames: ['task', 'status']
});

const taskDuration = new Prometheus.Histogram({
  name: 'scheduler_task_duration_seconds',
  help: 'Duration of scheduler tasks in seconds',
  labelNames: ['task'],
  buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60, 120, 300, 600]
});

const emailCounter = new Prometheus.Counter({
  name: 'email_sent_total',
  help: 'Count of emails sent',
  labelNames: ['type', 'status']
});

// Track task execution. Start times are keyed by task name so that
// concurrently running tasks don't overwrite each other's timestamps.
const startTimes = new Map();

function taskStarted(taskName) {
  startTimes.set(taskName, Date.now());
  taskCounter.inc({ task: taskName, status: 'started' });
}

function taskCompleted(taskName, data = {}) {
  const startTime = startTimes.get(taskName);
  startTimes.delete(taskName);
  taskCounter.inc({ task: taskName, status: 'completed' });
  if (startTime !== undefined) {
    taskDuration.observe({ task: taskName }, (Date.now() - startTime) / 1000);
  }
  
  // Task names look like 'daily-digest-America/New_York', so match on the prefix
  if (taskName.startsWith('daily-digest') && data.emailsSent) {
    emailCounter.inc({ type: 'daily-digest', status: 'sent' }, data.emailsSent);
  }
}

function taskFailed(taskName, error) {
  startTimes.delete(taskName);
  taskCounter.inc({ task: taskName, status: 'failed' });
}

// Setup metrics endpoint for Prometheus
function setupMetricsEndpoint(app) {
  app.get('/metrics', async (req, res) => {
    res.set('Content-Type', Prometheus.register.contentType);
    // register.metrics() returns a Promise in prom-client v13 and later
    res.end(await Prometheus.register.metrics());
  });
}

module.exports = {
  taskStarted,
  taskCompleted,
  taskFailed,
  setupMetricsEndpoint
};

Running the System

To run this newsletter system in production, you would set up:

  1. A PM2 configuration to run the scheduler as a separate process
  2. Redis for distributed locking
  3. A database for user data and email logs
  4. Prometheus for metrics collection
  5. Grafana for monitoring and alerting

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'newsletter-api',
    script: 'server.js',
    instances: 'max',
    exec_mode: 'cluster',
    env: {
      NODE_ENV: 'production',
      PORT: 3000
    }
  }, {
    name: 'newsletter-scheduler',
    script: 'scheduler.js',
    instances: 1,
    exec_mode: 'fork',
    env: {
      NODE_ENV: 'production'
    },
    restart_delay: 10000, // Wait 10s before restart
    max_memory_restart: '300M'
  }]
};

Conclusion and Best Practices Summary

mindmap
  root((Scheduled Tasks))
    Use Cases
      Database Maintenance
      Data Processing
      User Engagement
      Business Operations
      System Health
    Implementation
      In-Process
        node-cron
        node-schedule
      Queue-Based
        Bull
        Agenda
      Dedicated
        Bree
        PM2
    Best Practices
      Error Handling
      Logging
      Monitoring
      Idempotence
      Prevent Overlapping
      Timezone Awareness
      Separate Task Logic
    Deployment
      Docker
      Kubernetes
      Serverless
    Advanced
      Distributed Scheduling
      Dynamic Scheduling
      Workflows

Key Takeaways

Further Learning

Practice Exercises

Exercise 1: Basic Scheduled Task

Create a simple Node.js application that uses node-cron to schedule and execute a task every minute. The task should write the current timestamp to a file. Add proper error handling and logging.

Exercise 2: Database Cleanup

Build a scheduled task that connects to a database (MongoDB or PostgreSQL) and removes records older than 30 days from a "temporary_data" collection/table. Schedule it to run daily at midnight.

Exercise 3: User Engagement Email

Create a scheduled task that sends a "We miss you" email to users who haven't logged in for 14 days. Use Bull for job scheduling and processing, and implement a retry mechanism for failed email deliveries.

Exercise 4: Dynamically Scheduled Tasks

Build a small application with an API that allows creating, updating, and deleting scheduled tasks. Store task definitions in a database and implement a scheduler that loads and executes these tasks according to their schedules.

Exercise 5: Multi-stage Workflow

Implement a workflow system as described in the "Task Dependencies and Workflows" section. Create a workflow with at least three dependent tasks and schedule it to run daily.