ARG to Rescue: Reuse Variables in Multistage Dockerfile
Introduction
The Docker ecosystem is rich with tools and best practices that streamline containerization. One of these practices is using multistage builds to create lean, efficient containers. However, as your Dockerfiles grow more complex, managing variables and maintaining readability can become a challenge. Enter the ARG instruction: your key to sharing variables across stages in a multistage Dockerfile. In this blog post, we’ll explore how ARG can simplify your Dockerfiles, enhance reusability, and keep your code cleaner.
Why Use Multistage Builds?
Multistage builds in Docker allow you to use multiple FROM statements in a single Dockerfile, creating separate stages that can be used to build a final image. This method is particularly useful for creating lightweight production images, as you can copy only the necessary artifacts from earlier stages. Here’s a quick example to illustrate:
# Stage 1: Build stage
FROM golang:1.16 AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .
# Stage 2: Production stage
FROM alpine:latest
COPY --from=builder /app/myapp /usr/local/bin/myapp
CMD ["myapp"]
In this simple example, the build stage compiles a Go application, and the production stage creates a minimal image containing only the compiled binary.
The Challenge: Sharing Variables Across Stages
While multistage builds are powerful, they can introduce a common challenge: variable reuse. Suppose you need to use a specific version of an application or a common path across multiple stages. Without a way to share variables, you might end up duplicating code, leading to maintenance headaches and potential errors.
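To make the problem concrete, here is a minimal sketch of the duplication, assuming you want to record the Go version in the final image. The label key go.version is a hypothetical name chosen for illustration.

```dockerfile
# Stage 1: the Go version is hard-coded in the image tag
FROM golang:1.16 AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .

# Stage 2: the same version is repeated by hand in a label
FROM alpine:latest
LABEL go.version="1.16"
COPY --from=builder /app/myapp /usr/local/bin/myapp
CMD ["myapp"]
```

Upgrading Go now means editing two places, and nothing warns you if you forget one of them.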
Here’s where ARG (argument) comes into play. The ARG instruction lets you define a variable once and reuse it throughout your Dockerfile, including across stages, as long as each stage that needs it redeclares it.
Introducing ARG: The Basics
The ARG instruction defines a variable that users can pass at build time to customize the build process. Unlike environment variables (ENV), which are persisted in the image, ARG variables are only available during the build and are not set in containers started from the final image (although their values can still show up in the image history).
Let’s start with a basic example:
# Define the argument with a default value
ARG BASE_IMAGE=alpine:3.12
# Use the argument in the FROM instruction
FROM ${BASE_IMAGE}
# Redeclare the argument: one defined before FROM is out of scope inside the stage
ARG BASE_IMAGE
RUN echo "This image is based on ${BASE_IMAGE}"
In this example, BASE_IMAGE is an argument that can be overridden when building the image. The default value is alpine:3.12, but you could specify a different base image at build time:
docker build --build-arg BASE_IMAGE=ubuntu:20.04 -t custom-image .
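The difference between ARG and ENV is easiest to see side by side. The following is a minimal sketch, not a definitive recipe; BUILD_LABEL and APP_MODE are hypothetical names used only for illustration.

```dockerfile
FROM alpine:3.12
# Build-time only: visible to RUN steps in this stage, not to running containers
ARG BUILD_LABEL=dev
# Persisted: stored in the image configuration and set in every container
ENV APP_MODE=production
# Both values are visible while the image is being built
RUN echo "build: label=${BUILD_LABEL} mode=${APP_MODE}"
# At runtime only APP_MODE is set, so the fallback text is used for BUILD_LABEL
CMD echo "run: label=${BUILD_LABEL:-unset} mode=${APP_MODE}"
```

During the build, the RUN step sees both values. When you start a container from the image, BUILD_LABEL is no longer defined, so the CMD falls back to the text unset.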
Sharing ARG Variables Across Stages
The real power of ARG comes into play with multistage builds. An ARG declared before the first FROM is global, but it can only be used in FROM lines. To use it inside a stage, you need to redeclare the ARG (without a value) in that stage. Let’s look at a more advanced example:
# Define an argument for the Go version
ARG GO_VERSION=1.16
# Stage 1: Build stage
FROM golang:${GO_VERSION} AS builder
ARG GO_VERSION
WORKDIR /app
COPY . .
RUN go build -o myapp .
# Stage 2: Production stage
FROM alpine:latest
ARG GO_VERSION
RUN echo "Built with Go version ${GO_VERSION}"
COPY --from=builder /app/myapp /usr/local/bin/myapp
CMD ["myapp"]
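In this Dockerfile, we define GO_VERSION as an argument at the top. By repeating ARG GO_VERSION in each stage, we make the argument available inside that stage. Notice how the build stage uses GO_VERSION to select the Go image, and the production stage echoes the Go version used. The FROM line reads the global value directly; the redeclaration is what the RUN instruction in the production stage relies on.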
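To see why the redeclaration matters, here is a small sketch that leaves it out of one stage on purpose. The file /scope.txt is a hypothetical scratch file; it only exists so that the final stage depends on the builder stage and both stages are actually built.

```dockerfile
ARG GO_VERSION=1.16

FROM golang:${GO_VERSION} AS builder
# No ARG line in this stage, so ${GO_VERSION} expands to an empty string here
RUN echo "builder sees: '${GO_VERSION}'" > /scope.txt

FROM alpine:latest
# Redeclared without a value: picks up the global default or a --build-arg override
ARG GO_VERSION
COPY --from=builder /scope.txt /scope.txt
RUN cat /scope.txt && echo "production sees: '${GO_VERSION}'"
```

With the default value, the last step should show an empty value for the builder stage and 1.16 for the production stage, even though both stages sit under the same top-level ARG.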
Advanced Usage: Combining ARG with Environment Variables
You might find it useful to combine ARG with ENV to set environment variables conditionally based on build arguments. This can further enhance your Dockerfile’s flexibility.
The following example uses ARG and ENV to build a generic Dockerfile for a Turborepo application:
ARG APP_NAME="web"
ARG PNPM_HOME="/root/.local/share/pnpm"
FROM node:20-alpine AS base
FROM base AS builder
# Set working directory
WORKDIR /app
ARG APP_NAME
ARG PNPM_HOME
ENV PNPM_HOME=${PNPM_HOME}
ENV PATH="${PATH}:${PNPM_HOME}"
RUN corepack enable
# Install turbo globally; append @<version> to pin a specific release
RUN pnpm add -g turbo
COPY . .
# Collect all the necessary dependencies for the project
RUN turbo prune ${APP_NAME} --docker
# Add lockfile and package.json's of isolated subworkspace
FROM base AS installer
WORKDIR /app
ARG APP_NAME
ARG PNPM_HOME
ENV PNPM_HOME=${PNPM_HOME}
ENV PATH="${PATH}:${PNPM_HOME}"
RUN corepack enable
# First install dependencies (as they change less often)
COPY .gitignore .gitignore
COPY --from=builder /app/out/json/ .
COPY --from=builder /app/out/pnpm-lock.yaml ./pnpm-lock.yaml
RUN pnpm install
# Build the project and its dependencies
COPY --from=builder /app/out/full/ .
COPY turbo.json turbo.json
# Build the app
RUN pnpm turbo build --filter=${APP_NAME}
FROM base AS production
WORKDIR /app
ARG APP_NAME
ARG PNPM_HOME
ENV PNPM_HOME=${PNPM_HOME}
ENV PATH="${PATH}:${PNPM_HOME}"
ENV NODE_ENV="production"
RUN corepack enable
COPY --from=installer /app .
USER node
WORKDIR /app/apps/${APP_NAME}
CMD pnpm start
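Stripped of the Turborepo details, the ARG-to-ENV hand-off in the example above comes down to two lines. This is a minimal sketch of that pattern; the override value docs is a hypothetical app name.

```dockerfile
FROM node:20-alpine
# Build-time input with a default; override with:
#   docker build --build-arg APP_NAME=docs -t my-app .
ARG APP_NAME="web"
# Copy the build argument into an environment variable so the value
# is still available in the running container
ENV APP_NAME=${APP_NAME}
CMD echo "Running ${APP_NAME}"
```

The ARG supplies the value at build time, and the ENV line is what makes it persist into the image.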
Key Points on Environment Variables in Multistage Builds
- Environment Variables (ENV):
  - Belong to the stage where they are defined, plus any stage that builds FROM that stage.
  - Do not carry over to stages that start from a different image.
  - If you need an environment variable in such a stage, you have to redefine it or pass it via ARG.
- Build Arguments (ARG):
  - Are defined once, before the first FROM, and can be made available in any stage by redeclaring them.
  - Provide a way to share configuration details like versions or paths between stages.
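One nuance worth spelling out: a stage that starts FROM an earlier stage inherits that stage’s ENV values, while a stage that starts from a separate image does not. The sketch below illustrates this; SHARED_DIR and the stage names are hypothetical, and /derived.txt only exists so that all stages get built.

```dockerfile
FROM alpine:3.12 AS base
ENV SHARED_DIR=/opt/shared

# Built on top of base: SHARED_DIR is inherited
FROM base AS derived
RUN echo "derived: '${SHARED_DIR}'" > /derived.txt

# Starts from a fresh image: SHARED_DIR is not set here
FROM alpine:3.12 AS unrelated
COPY --from=derived /derived.txt /derived.txt
RUN cat /derived.txt && echo "unrelated: '${SHARED_DIR}'"
```

This is why the Turborepo example sets PNPM_HOME again in every stage: anything set with ENV inside builder would not reach installer or production, which branch off base separately.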
Debugging ARG Variables
When working with ARG variables, you might run into issues where arguments aren’t passed correctly or variables aren’t set as expected. Here are some tips to help you debug:
- Check Build Logs: Use docker build with the --progress=plain flag to get more detailed logs that can help identify where arguments are being used or missed.
  docker build --progress=plain -t debug-image .
- Echo Variables: Add RUN echo statements to print the values of your ARG variables during the build process.
  RUN echo "ARG BASE_IMAGE=${BASE_IMAGE}"
- Use Default Values: Define sensible default values for your ARG variables to ensure that your build doesn’t fail if arguments are not provided.
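Put together, the tips above might look like the following sketch. It assumes the BASE_IMAGE argument from earlier in the post; adding --no-cache is a suggestion so that a cached layer does not hide the echo output.

```dockerfile
# Default value, so the build still works without --build-arg
ARG BASE_IMAGE=alpine:3.12
FROM ${BASE_IMAGE}
# Redeclare so the value is visible to RUN instructions in this stage
ARG BASE_IMAGE
# Temporary debug line; remove it once the value is confirmed
RUN echo "ARG BASE_IMAGE=${BASE_IMAGE}"
# Build with plain progress output so the echo appears in the log:
#   docker build --progress=plain --no-cache -t debug-image .
```

If the echoed value comes out empty, the most likely cause is a missing ARG redeclaration inside the stage.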
Best Practices for Using ARG
- Define ARG Variables Early: Place shared ARG instructions at the top of your Dockerfile, before the first FROM, and redeclare them in each stage that uses them.
- Use Descriptive Names: Choose meaningful names for your arguments to make the Dockerfile easier to understand and maintain.
- Avoid Secrets in ARG: Never use ARG to pass sensitive data like passwords or API keys, as they can be exposed in the Docker image history.
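If you do need a secret during the build, one alternative to ARG is a BuildKit secret mount. The following is only a sketch under assumptions: the secret id api_key and the file api_key.txt are hypothetical, and it requires a Docker version with BuildKit enabled.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.12
# The secret is mounted only for this RUN step and is not written to an image layer
RUN --mount=type=secret,id=api_key \
    test -s /run/secrets/api_key && echo "secret is available to this step"
# Build with:
#   docker build --secret id=api_key,src=./api_key.txt -t secret-demo .
```

The step fails if the secret is not supplied, which makes a missing --secret flag easy to spot.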
Conclusion
Using ARG to share variables across stages in a multistage Dockerfile can significantly improve your Docker builds’ maintainability and flexibility. Whether you’re building lightweight production images or dynamically configuring your builds, ARG provides a powerful tool to streamline and enhance your Dockerfile.