Introduction to Multistage Builds
Docker multistage builds are a powerful feature that allows you to use multiple temporary build stages in a single Dockerfile, ultimately producing a single, optimized final image. This approach helps solve one of the most challenging aspects of Docker: creating efficient, lightweight images that don't include unnecessary build tools or dependencies.
Real-world Analogy: Constructing a House
Think of building a Docker image like constructing a house:
- Traditional single-stage build: After building your house, all construction equipment, materials, blueprints, and worker supplies remain on your property permanently. Your "house image" includes everything used to build it.
- Multistage build: After construction, you keep only the finished house. All scaffolding, construction equipment, and temporary work materials are removed. Your "house image" contains only what's needed for living in it.
Why Use Multistage Builds?
The Problems with Single-Stage Builds
Before multistage builds, developers faced a difficult choice:
- Option 1: Create one Dockerfile for development (with all build tools) that produces large, inefficient images
- Option 2: Maintain separate Dockerfiles for development and production, leading to inconsistencies
- Option 3: Create complex shell scripts to "clean up" after building, which was error-prone
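To make Option 3 concrete, here is a minimal sketch of the clean-up approach in a single-stage Dockerfile. It assumes a Node.js project whose `npm run build` script writes to `dist/`, which is an illustrative layout rather than a requirement. The fragility comes from how layers work: every instruction creates a layer, and deleting a file in a later instruction does not remove it from the layer that added it.

```dockerfile
# Single-stage build that tries to clean up after itself (illustrative sketch)
FROM node:18
WORKDIR /app
COPY . .

# Install, build, and clean up must share ONE RUN instruction. Files removed in a
# separate, later instruction would still be stored in the layer that created them.
RUN npm install \
    && npm run build \
    && npm prune --omit=dev \
    && npm cache clean --force

# Limits of this approach: the source tree added by COPY above stays in its own
# layer, and the full node:18 base image (with its build tooling) is still shipped.
EXPOSE 3000
CMD ["node", "dist/server.js"]
```

Forgetting a single `&&`, or splitting the chain into two `RUN` instructions, silently brings the removed files back into the image, which is why this pattern was hard to maintain.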
Key Benefits of Multistage Builds
- Smaller final images: Reduces image size by including only what's necessary for runtime
- Improved security: Fewer packages means reduced attack surface
- Simplified build process: Single Dockerfile handles the entire build pipeline
- Better layer caching: Stages whose inputs have not changed are served from Docker's build cache, which shortens rebuilds
- DRY (Don't Repeat Yourself): Eliminates need for multiple Dockerfiles or external scripts
Traditional vs. Multistage Build Comparison
[Diagram: the traditional build produces a final image that contains everything. In the multistage build, the build stage (base image, install build tools, copy source code, build application) passes only the built artifacts to the runtime stage (base image, configure runtime), which produces a final image containing the runtime only.]
Multistage Build Syntax
Basic Structure
The key to multistage builds is using multiple FROM statements in your Dockerfile. Each FROM statement begins a new build stage.
# Build stage
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
# Runtime stage
FROM node:18-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/package*.json ./
# --omit=dev replaces the deprecated --production flag in current npm releases
RUN npm install --omit=dev
EXPOSE 3000
CMD ["node", "dist/server.js"]
Key Syntax Elements
- FROM ... AS <name>: Names a build stage for reference
- COPY --from=<name>: Copies files from a previous stage
- Multiple FROM statements: Each creates a new independent build environment
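Besides a stage name, `--from` also accepts a stage index or the name of an image. The sketch below shows both forms; the project layout (`dist/` as build output) and the destination paths are assumptions made for illustration.

```dockerfile
# Stage 0 has no name, so later stages can only refer to it by its index
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

FROM node:18-alpine
WORKDIR /app
# --from=0 refers to the first stage in the file by position
COPY --from=0 /app/dist ./dist
# --from can also name an image that is not a stage in this Dockerfile
COPY --from=nginx:latest /etc/nginx/nginx.conf ./reference/nginx.conf
CMD ["node", "dist/server.js"]
```

Named stages are usually the safer choice, because numeric indexes shift whenever a stage is added or reordered.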
Practical Multistage Build Examples
Example 1: Node.js Application
This example shows a typical React frontend with a Node.js backend:
# Build stage for frontend
FROM node:18 AS frontend-build
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm install
COPY frontend/ ./
RUN npm run build
# Build stage for backend
FROM node:18 AS backend-build
WORKDIR /app/backend
COPY backend/package*.json ./
RUN npm install
COPY backend/ ./
RUN npm run build
# Final stage
FROM node:18-alpine
WORKDIR /app
COPY --from=backend-build /app/backend/dist ./
COPY --from=frontend-build /app/frontend/build ./public
COPY --from=backend-build /app/backend/package*.json ./
RUN npm install --omit=dev
EXPOSE 3000
CMD ["node", "server.js"]
Example 2: TypeScript Application
Here's an example for a TypeScript Node.js application:
# Build stage
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build
# Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/package*.json ./
RUN npm install --omit=dev
EXPOSE 3000
CMD ["node", "dist/index.js"]
Advanced Multistage Build Techniques
Using Different Base Images
One powerful technique is using different base images for build and runtime stages:
# Build stage - uses a larger image with build tools
FROM node:18 AS build
WORKDIR /app
COPY . .
RUN npm install && npm run build
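# Runtime stage - uses a minimal image
FROM alpine:3.18
# Alpine's nodejs package provides the Node.js runtime without npm
RUN apk add --no-cache nodejs
WORKDIR /app
# Only dist is copied, so the build output must not need node_modules at runtime
COPY --from=build /app/dist ./dist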
EXPOSE 3000
CMD ["node", "dist/index.js"]
Builder Pattern with Multiple Specialized Stages
For complex applications, you can use multiple specialized build stages:
# Base builder with common tools
FROM node:18 AS base
WORKDIR /app
COPY package*.json ./
# Dependencies builder
FROM base AS dependencies
RUN npm install
# Asset builder for frontend
FROM base AS builder
COPY --from=dependencies /app/node_modules ./node_modules
COPY . .
RUN npm run build
# Test stage
FROM dependencies AS test
COPY . .
RUN npm test
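# Production image
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
# Install production dependencies in this stage instead of copying node_modules
# from the dependencies stage: that tree includes devDependencies, and any native
# modules in it were compiled against Debian's glibc rather than Alpine's musl
COPY package*.json ./
RUN npm install --omit=dev
CMD ["node", "dist/server.js"]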
Using Buildkit Features
Docker BuildKit offers additional features for multistage builds:
- Parallel building: BuildKit can build independent stages in parallel
- Build secrets: Securely use secrets without leaving them in the final image
- Mount options: Use temporary mounts during build time
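# syntax=docker/dockerfile:1
# The syntax directive above must be the first line of the Dockerfile; it selects
# the BuildKit Dockerfile frontend that supports RUN --mount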
# Example using build secrets
FROM node:18 AS build
WORKDIR /app
COPY . .
RUN --mount=type=secret,id=npm_token \
NPM_TOKEN=$(cat /run/secrets/npm_token) npm install
# Example using cache mounts for faster builds
FROM node:18 AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
npm install
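The secret mount only helps if the secret is supplied when the build is started. The sketch below combines both mounts and shows the matching build command in a comment. It rests on several assumptions: the project has an `.npmrc` that reads the token from the `NPM_TOKEN` environment variable, the token lives in a local file (the path shown is a placeholder), `my-app` is a placeholder tag, and the build output in `dist/` is self-contained at runtime.

```dockerfile
# syntax=docker/dockerfile:1
# Build with (token file path and image tag are placeholders):
#   docker build --secret id=npm_token,src=$HOME/.npm_token -t my-app .
FROM node:18 AS deps
WORKDIR /app
# .npmrc only references ${NPM_TOKEN}; it does not contain the token itself
COPY package*.json .npmrc ./
# The secret is mounted at /run/secrets/npm_token for this RUN only and is never
# written to a layer. The cache mount keeps npm's download cache between builds.
RUN --mount=type=secret,id=npm_token \
    --mount=type=cache,target=/root/.npm \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm install

FROM deps AS build
COPY . .
RUN npm run build

FROM node:18-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]
```

Passing the token through `ARG` or `ENV` instead would record it in the image metadata, which is the problem secret mounts are meant to avoid.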
Best Practices for Multistage Builds
- Order dependencies for efficient caching: Place infrequently changing items earlier in the Dockerfile
- Keep build stages focused: Each stage should have a single, clear responsibility
- Use appropriate base images: Use lightweight images for the final stage
- Name your stages meaningfully: Use descriptive names that indicate the stage's purpose
- Only copy what you need: Be specific about which artifacts to copy between stages
- Use .dockerignore: Prevent unnecessary files from being copied to the build context
- Consider development experience: Create targets that work well for both development and production
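The last point, serving development and production from one file, can be sketched with named targets. The stage names, the `dev` script, and the `my-app` tags below are hypothetical; adapt them to the project at hand.

```dockerfile
# Build a specific target with, for example:
#   docker build --target development -t my-app:dev .
#   docker build --target production  -t my-app:prod .
FROM node:18 AS base
WORKDIR /app
# Dependencies first, so this layer is reused until package*.json changes
COPY package*.json ./
RUN npm install

# Development target: full toolchain, including devDependencies
FROM base AS development
COPY . .
# "dev" is a hypothetical script name in package.json
CMD ["npm", "run", "dev"]

# Build stage: single responsibility, produces /app/dist
FROM base AS build
COPY . .
RUN npm run build

# Production target: the last stage, so a plain `docker build .` also produces it
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY package*.json ./
RUN npm install --omit=dev
CMD ["node", "dist/server.js"]
```

Because both targets share the `base` stage, a dependency change invalidates the cache once rather than once per Dockerfile.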
Optimizing Layer Caching Example
This example demonstrates proper ordering of operations for efficient caching:
# Good caching - Dependencies installed first, separated from code
FROM node:18 AS build
WORKDIR /app
# Copy only package files first (changes less frequently)
COPY package*.json ./
RUN npm install
# Then copy code (changes more frequently)
COPY . .
RUN npm run build
# Bad caching - Everything copied at once, breaking caching for dependencies
FROM node:18 AS build-inefficient
WORKDIR /app
# Copying everything at once forces npm install to run on every code change
COPY . .
RUN npm install
RUN npm run build
Real-world Applications
Case Study: Full Stack JavaScript Application
Consider a typical full-stack JavaScript application with:
- React frontend written in TypeScript
- Node.js/Express backend API
- Testing framework and linting tools
- Production optimization requirements
Complete Multistage Dockerfile Example
# Base development dependencies stage
FROM node:18 AS base
WORKDIR /app
COPY package*.json ./
RUN npm install
# Frontend build stage
FROM base AS frontend-build
WORKDIR /app/client
COPY client/package*.json ./
RUN npm install
COPY client/ ./
RUN npm run build
# Backend build stage
FROM base AS backend-build
WORKDIR /app/server
COPY server/package*.json ./
RUN npm install
COPY server/ ./
RUN npm run build
# Test stage
FROM base AS test
WORKDIR /app
COPY --from=frontend-build /app/client ./client
COPY --from=backend-build /app/server ./server
RUN npm test
# Production stage
FROM node:18-alpine
WORKDIR /app
# Copy backend build artifacts
COPY --from=backend-build /app/server/dist ./
# Copy frontend static assets
COPY --from=frontend-build /app/client/build ./public
# Copy package files and install only production dependencies
COPY --from=backend-build /app/server/package*.json ./
RUN npm install --omit=dev
EXPOSE 3000
CMD ["node", "index.js"]
Cost Savings and Performance Improvements
Real-world benefits from this approach:
- Size reduction: Final image ~120MB vs 800MB+ for a single-stage build
- Cost savings: Smaller images mean faster deployments and reduced storage costs
- Improved security posture: Fewer components mean fewer vulnerabilities
- Faster startup times: Smaller images pull faster, so new containers start and scale sooner on hosts that do not already have the image cached
- CI/CD improvements: Cleaner pipelines with built-in testing
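One caveat on built-in testing: with BuildKit, the default builder in current Docker releases, stages that the requested target does not depend on are skipped. A standalone test stage therefore runs only when it is targeted explicitly or when a later stage copies something from it. The sketch below shows both options; the marker file `/tmp/tests-passed` is a hypothetical name chosen for illustration.

```dockerfile
# Option A: run the tests explicitly in CI before building the release image:
#   docker build --target test .
FROM node:18 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm install

FROM deps AS test
COPY . .
# The marker file is written only if the tests pass
RUN npm test && touch /tmp/tests-passed

FROM deps AS build
COPY . .
RUN npm run build

FROM node:18-alpine
WORKDIR /app
# Option B: copying the marker makes this stage depend on the test stage,
# so a failing test suite fails the production build as well
COPY --from=test /tmp/tests-passed /tmp/tests-passed
COPY --from=build /app/dist ./dist
COPY package*.json ./
RUN npm install --omit=dev
CMD ["node", "dist/server.js"]
```

Option A keeps the production image free of test artifacts, while Option B guarantees that no image can be produced without passing tests.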
Hands-on Exercises
Exercise 1: Convert a Single-stage Dockerfile
Take the following single-stage Dockerfile and convert it to a multistage build:
# Single-stage Dockerfile
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["node", "dist/server.js"]
Exercise 2: Optimize a React Application
Create a multistage Dockerfile for a React application that:
- Uses Node.js to build the application
- Uses Nginx to serve the static files
- Includes a testing stage
- Optimizes for caching and minimal final size
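One possible starting point for this exercise is sketched below. It assumes a Create React App style project whose `npm run build` writes to `build/`; other toolchains use different output directories, so treat the paths as placeholders and refine the result yourself.

```dockerfile
FROM node:18 AS deps
WORKDIR /app
# Dependencies are installed before the source is copied, for better caching
COPY package*.json ./
RUN npm install

FROM deps AS test
COPY . .
# CI=true makes Create React App's test runner run once instead of watching
RUN CI=true npm test

FROM deps AS build
COPY . .
RUN npm run build

FROM nginx:alpine
# /usr/share/nginx/html is the default document root of the official nginx image
COPY --from=build /app/build /usr/share/nginx/html
EXPOSE 80
```

The final stage contains no Node.js at all; the official nginx image already starts the server in the foreground, so no `CMD` is needed.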
Exercise 3: Implement BuildKit Features
Enhance a multistage Dockerfile to use BuildKit features:
- Add a build cache for npm packages
- Use a build secret for accessing a private npm registry
- Implement parallel build stages where appropriate
Summary and Next Steps
Key Takeaways
- Multistage builds allow you to keep build tools separate from runtime environments
- They produce significantly smaller, more secure Docker images
- The approach simplifies your CI/CD pipeline with a single Dockerfile
- Proper structuring of multistage builds improves build performance through caching
- Different stages can use different base images optimized for their specific tasks
Further Learning
Additional Practice Activities
Activity 1: Image Size Comparison
Create both a single-stage and multistage Dockerfile for the same application, then compare:
- Final image size
- Build time (with and without cache)
- Number of layers
- Security scan results using docker scout cves (the successor to the retired docker scan command)
Activity 2: Real-world Application Conversion
Take an existing application from your projects and convert its Dockerfile to a multistage build. Document the before and after metrics to demonstrate the improvements.
Activity 3: Advanced Multistage Pipeline
Create a sophisticated multistage Dockerfile that includes:
- Linting stage
- Unit testing stage
- Building stage
- Security scanning stage
- Final production stage
Ensure the stages are ordered to maximize caching efficiency.
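A possible skeleton for this activity is sketched below. It makes several assumptions: the project has a `package-lock.json` (required by `npm ci`), a hypothetical `lint` script exists in `package.json`, and `npm audit` stands in for a dedicated image scanner. The marker files are illustrative names.

```dockerfile
FROM node:18 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

# Shared source stage, so the code is copied once for lint, test, and build
FROM deps AS source
COPY . .

FROM source AS lint
RUN npm run lint && touch /tmp/lint-ok

FROM source AS test
RUN npm test && touch /tmp/test-ok

FROM source AS build
RUN npm run build

# Depends only on package files, so it stays cached when only source code changes
FROM deps AS audit
RUN npm audit --audit-level=high && touch /tmp/audit-ok

FROM node:18-alpine AS production
WORKDIR /app
# Copying the markers forces lint, test, and audit to succeed before release
COPY --from=lint /tmp/lint-ok /tmp/
COPY --from=test /tmp/test-ok /tmp/
COPY --from=audit /tmp/audit-ok /tmp/
COPY --from=build /app/dist ./dist
COPY package*.json ./
RUN npm ci --omit=dev
CMD ["node", "dist/server.js"]
```

Since the lint, test, build, and audit stages do not depend on one another, BuildKit can run them in parallel.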