Dockerfile: The RUN Gotcha Explained
Docker is a popular tool for packaging applications into containers, but it can be tricky to get right. One gotcha to be aware of is what happens when you split work across separate RUN instructions while installing tools like nvm. In this blog post, we'll explore why commands can fail when they are spread across separate RUN layers and share best practices to help you build efficient Docker images.
Understanding Docker Layers
Before diving into the problem, let's first understand what Docker layers are. When you build a Docker image, the instructions in your Dockerfile that modify the file system (RUN, COPY, and ADD) each create a new layer. Each layer represents a change to the file system, such as installing a package or copying a file. The layers are cached, so if you change your Dockerfile and rebuild your image, Docker reuses the cached layers up to the first changed instruction and rebuilds only that instruction and the ones after it. This caching can speed up your builds considerably.
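To make the layer model concrete, here is a minimal annotated sketch. The file app.jar and the /opt/app path are hypothetical names chosen for illustration and do not come from the example in this post.

```dockerfile
# Base image layers are pulled from the registry, not rebuilt
FROM openjdk:11-jdk

# One filesystem layer: package index refresh plus install
RUN apt-get update && apt-get install -y curl unzip

# Another filesystem layer (app.jar is a hypothetical file).
# Changing app.jar invalidates the cache from this instruction down,
# while the RUN layer above is still reused.
COPY app.jar /opt/app/app.jar

# Metadata only: records the default command, no filesystem change
CMD ["java", "-jar", "/opt/app/app.jar"]
```

Putting instructions that change rarely (package installs) before ones that change often (copying application files) lets more of the build come from cache.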
The Problem with Using RUN in Separate Layers
One common issue when building Docker images is splitting commands across separate RUN instructions when installing dependencies like nvm. For example, consider the following Dockerfile:
FROM openjdk:11-jdk
RUN apt-get update && \
    apt-get install -y curl unzip libglu1-mesa libjaxb-api-java && \
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
ENV NODE_VERSION=v14.16.0
RUN nvm install $NODE_VERSION
RUN nvm use $NODE_VERSION
The RUN nvm install $NODE_VERSION command fails with the error message:
[3/4] RUN nvm install v14.16.0:
0.437 /bin/sh: 1: nvm: not found
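The error occurs because nvm is not an executable on the PATH. It is a shell function that exists only after a shell has sourced ~/.nvm/nvm.sh. The install script adds that sourcing line to a shell profile file (such as ~/.bashrc), but each RUN instruction starts a fresh, non-interactive /bin/sh that does not read profile files, so nvm is undefined by the time RUN nvm install $NODE_VERSION executes. To fix this, source the nvm setup script in every RUN instruction that needs it, or do the installation and the nvm commands in a single RUN instruction. In Docker, each RUN instruction creates a new layer representing a distinct filesystem snapshot. Files written in one RUN instruction persist into the next, but shell state does not: exported variables, shell functions, and the working directory are gone once the instruction finishes, unless you set them with instructions such as ENV or WORKDIR.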
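The difference between shell state and image state can be seen in a small experiment. This is a sketch only: the GREETING variable is a made-up name for illustration, and the base image is simply the one used elsewhere in this post.

```dockerfile
FROM openjdk:11-jdk

# The variable exists only inside this one shell process
RUN export GREETING=hello && echo "same RUN: $GREETING"

# A new shell starts here, so the exported variable is gone
# and the fallback text after ":-" is printed instead
RUN echo "next RUN: ${GREETING:-<unset>}"

# ENV is recorded in the image metadata and applies to every later instruction
ENV GREETING=hello
RUN echo "after ENV: $GREETING"
```

Building with plain progress output (for example, docker build --progress=plain .) shows the echoed lines. The nvm function behaves like the exported variable in the first RUN: it lives in the shell that sourced nvm.sh and nowhere else.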
Best Practices for Using RUN in Separate Layers
To avoid issues with RUN in separate layers, install nvm and run the nvm commands that depend on it in a single RUN instruction, sourcing nvm.sh before the first nvm call. For example, you could rewrite the Dockerfile as follows:
FROM openjdk:11-jdk
ENV NODE_VERSION=v14.16.0
RUN apt-get update && \
    apt-get install -y curl unzip libglu1-mesa libjaxb-api-java && \
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash && \
    . ~/.nvm/nvm.sh && \
    nvm install $NODE_VERSION && \
    nvm use $NODE_VERSION
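One caveat with the combined RUN above: nvm use only adjusts the PATH of the shell it runs in, so later instructions and the running container still will not find node unless they source nvm.sh again. A possible way around this, sketched below, is to put the installed version's bin directory on the PATH with ENV. The sketch assumes the build runs as root (so nvm lands in /root/.nvm) and relies on nvm's versions/node/<version>/bin layout. The package list is trimmed to curl for brevity.

```dockerfile
FROM openjdk:11-jdk
ENV NODE_VERSION=v14.16.0

# Install nvm and Node.js within one shell session
RUN apt-get update && \
    apt-get install -y curl && \
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash && \
    . ~/.nvm/nvm.sh && \
    nvm install "$NODE_VERSION"

# Assumption: root user, so nvm was installed under /root/.nvm.
# ENV persists across layers and into the running container.
ENV PATH=/root/.nvm/versions/node/$NODE_VERSION/bin:$PATH

# node now resolves without sourcing nvm.sh
RUN node --version
```

If the image runs as a different user, the path in the ENV line needs to point at that user's nvm directory instead.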
Conclusion
When building Docker images, it’s important to be mindful of the layers that you are creating with each RUN instruction. By following best practices and combining dependent commands into a single RUN instruction, you can avoid issues like the one we saw with nvm. This not only helps you build efficient Docker images but also ensures that your images are reliable and consistent.