Introduction to Dockerfiles

Sep 25, 2022

A Dockerfile is simply a plain text file that contains a set of user-defined instructions. When it is passed to the docker image build command, which we will look at next, those instructions are used to assemble a container image.

A Dockerfile looks as follows:

FROM node:alpine
ENV CI=true
WORKDIR /app
COPY package.json ./
RUN npm install
COPY ./ ./
CMD ["npm", "start"]

As you can see, even with no explanation, it is quite easy to get an idea of what each step of the Dockerfile instructs the build command to do. Before we move on and work our way through the previous file, we should quickly touch upon Alpine Linux.

Alpine Linux is a small, independently developed, non-commercial Linux distribution designed for security, efficiency, and ease of use.
Despite being small, it provides a solid base for container images because of its extensive package repository. Older Alpine releases also shipped a kernel patched with an unofficial port of grsecurity/PaX, which offered proactive protection against many zero-day and other vulnerabilities. Keep in mind that containers share the kernel of the host they run on, so this hardening mattered for Alpine hosts rather than for Alpine-based images. Thanks to its small size and its capabilities, Alpine Linux has become a popular base for many of the official container images supplied by Docker.

Reviewing Dockerfiles in depth

Let’s take a closer look at the instructions used in Dockerfiles. The examples in this section come from a Dockerfile that builds a small NGINX image on top of Alpine Linux. We will cover the following instructions in turn:

FROM
LABEL
RUN
COPY and ADD
EXPOSE
ENTRYPOINT and CMD
Other Dockerfile instructions

FROM

The FROM instruction tells Docker which base image you want to use.
Since we are using Alpine Linux, all we need to do is specify the image’s name and the release tag we want. In our case, FROM alpine:latest is enough to use the most recent official Alpine Linux image.
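As a minimal sketch, the lines below contrast following the moving latest tag with pinning a specific release. The 3.19 tag is only an illustrative assumption, so substitute whichever release you have verified. A Dockerfile that builds a single image would normally contain just one of these FROM lines.

```dockerfile
# Follows whatever release the "latest" tag currently points at
FROM alpine:latest

# Alternative: pin a specific release for repeatable builds
# (3.19 is an illustrative tag, not one taken from the original Dockerfile)
# FROM alpine:3.19
```

Pinning trades automatic updates for builds that behave the same way each time they run.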

LABEL

Use the LABEL instruction to attach additional information to the image.
This information can be anything from a version number to a description. It is also recommended that you limit the number of labels you use.
A well-structured set of labels will help others who use our image in the future.
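One way to keep labels tidy is to group several key/value pairs in a single LABEL instruction. The following is a sketch only; the keys and values are hypothetical placeholders, not taken from the original image.

```dockerfile
# Several key/value pairs can share one LABEL instruction.
# All keys and values below are hypothetical placeholders.
LABEL maintainer="Jane Doe <jane@example.com>" \
      description="A minimal NGINX image built on Alpine Linux" \
      version="1.0"
```

The labels can later be read back with docker image inspect.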

RUN

The RUN instruction is where we interact with our image to install software and run scripts, commands, and other tasks. As you can see from the following RUN instruction, we are actually running three commands:

RUN apk add --update nginx && \
rm -rf /var/cache/apk/* && \
mkdir -p /tmp/nginx/

The first of our three commands is the equivalent of running the following command if we had a shell on an Alpine Linux host:

$ apk add --update nginx

This command installs NGINX using Alpine Linux’s package manager.

The second command in our chain removes any temporary files to keep the size of our image to a minimum:

$ rm -rf /var/cache/apk/*

The final command in our chain creates a folder with a path of /tmp/nginx/ so that NGINX will start correctly when we run the container:

$ mkdir -p /tmp/nginx/

We could have also used the following in our Dockerfile to achieve the same results:

RUN apk add --update nginx
RUN rm -rf /var/cache/apk/*
RUN mkdir -p /tmp/nginx/

Much like adding many separate labels, this is considered inefficient. Each RUN instruction creates its own layer, so files deleted by a later RUN still take up space in the earlier layer, which increases the overall size of the image. There are some valid use cases for separate instructions, since some commands do not work well when they are chained together using &&.
However, for the most part, you should avoid this approach when building your image.
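As an alternative sketch (not the form used in the original Dockerfile), apk's --no-cache option avoids writing the package index cache in the first place, which removes the need for the cleanup command while keeping everything in one layer.

```dockerfile
# --no-cache makes apk fetch the package index on the fly and skip
# storing it under /var/cache/apk, so no separate cleanup step is needed.
RUN apk add --no-cache nginx && \
    mkdir -p /tmp/nginx/
```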

COPY and ADD

At first glance, COPY and ADD look like they are doing the same task in that they are both used to transfer files to the image. However, there are some important differences, which we will discuss here.
The COPY instruction is the more straightforward of the two:

COPY files/nginx.conf /etc/nginx/nginx.conf
COPY files/default.conf /etc/nginx/conf.d/default.conf

As you have probably guessed, we are copying two files from the files folder on the host we are building our image on. The first file is nginx.conf, which is a minimal NGINX configuration file. This will overwrite the NGINX configuration that was installed by apk in the RUN instruction.
The next file, default.conf, is the simplest virtual host that we can configure. Again, this will overwrite any existing file.

The ADD instruction looks as follows:

ADD files/html.tar.gz /usr/share/nginx/

As you can see, we are adding a file called html.tar.gz, but we are not actually doing anything in our Dockerfile to uncompress the archive. This is because ADD automatically unpacks a local tar archive and adds the resulting folders and files to the path we specify, which in our case is /usr/share/nginx/. This gives us our web root of /usr/share/nginx/html/, as we defined in the virtual host block in the default.conf file that we copied to the image.

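The ADD instruction can also be used to add content from remote sources by giving it a URL. Note that archives fetched from a URL are downloaded as-is and are not unpacked.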
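The two behaviours of ADD can be sketched side by side. The URL below is a hypothetical address used purely for illustration, so this fragment would need a real location before it could build.

```dockerfile
# Local tar archive: ADD unpacks it into the destination directory.
ADD files/html.tar.gz /usr/share/nginx/

# Remote URL (hypothetical address): the file is downloaded but NOT unpacked,
# so it lands in the image as /tmp/html.tar.gz.
ADD https://example.com/downloads/html.tar.gz /tmp/
```

When neither unpacking nor a remote source is needed, COPY is the more predictable choice.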

EXPOSE

The EXPOSE instruction lets Docker know that a container started from the image will listen on the given port and protocol at runtime. This instruction does not map the port to the host machine and does not publish it by itself; it acts as documentation for whoever runs the image, and the port still has to be published when the container is started.

For example, in our Dockerfile, we are telling Docker that the service listens on port 80 every time the image runs:

EXPOSE 80/tcp
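To make the split between declaring and publishing a port concrete, here is a small sketch. The second port and the host port 8080 are illustrative assumptions, and <image> stands for whatever name you gave your image.

```dockerfile
# Declare the ports the service listens on; TCP is assumed when no protocol is given.
EXPOSE 80/tcp
# A second port, shown purely as an illustration (not part of the original image).
EXPOSE 443/tcp

# Publishing happens at run time, for example:
#   docker run -d -p 8080:80 <image>
```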

ENTRYPOINT and CMD

The benefit of using ENTRYPOINT over CMD is that you can use them in conjunction with each other. ENTRYPOINT can be used by itself, but remember that you would only want to use ENTRYPOINT by itself if you wanted your container to behave like an executable.

For reference, think of the CLI commands you use day to day: you usually have to specify more than just the command itself, adding extra parameters that you want the command to interpret. This is the use case for ENTRYPOINT on its own, where the parameters are supplied when the container is launched. If, on the other hand, you want default parameters for the command that runs inside the container, you could do something similar to the following example. Be sure to use a command that keeps the container alive.

ENTRYPOINT ["nginx"]
CMD ["-g", "daemon off;"]

What this means is that whenever we launch a container from our image, the NGINX binary is executed, since we have defined it as our entry point. Whatever we have as CMD is then passed to it as arguments, giving us the equivalent of running the following command:

$ nginx -g "daemon off;"
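A practical consequence of this pairing is that arguments given at launch replace the CMD defaults while the entry point stays in place. The comments in the sketch below illustrate this; <image> is a placeholder for your own image name.

```dockerfile
ENTRYPOINT ["nginx"]
CMD ["-g", "daemon off;"]

# docker run <image>       runs: nginx -g "daemon off;"
# docker run <image> -v    runs: nginx -v   (the CMD defaults are replaced)
```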

Other Dockerfile instructions

USER: The USER instruction lets you specify the username to be used when a command is run. It applies to the RUN, CMD, and ENTRYPOINT instructions that follow it in the Dockerfile. A user referred to by name must already exist in the image, or later build steps or the container itself will fail to run. Using the USER instruction can also introduce permission issues, not only in the container itself, but also if you mount volumes.
WORKDIR: The WORKDIR instruction sets the working directory for the same set of instructions that the USER instruction affects (RUN, CMD, and ENTRYPOINT). It also applies to the COPY and ADD instructions.
ONBUILD: The ONBUILD instruction lets you stash a set of commands to be run later, when the image is used as a base image for another container image. For example, if you want to give an image to developers and they all have a different code base that they want to test, you can use the ONBUILD instruction to lay the groundwork before the actual code exists. Then, the developers simply add their code to the directory you ask them to, and when they run a new docker build command, their code is added to the new image.
The ONBUILD instruction can be used in conjunction with the ADD and RUN instructions, such as in the following example:

This would run an update and package upgrade every time our image is used as a base for another container image:

ONBUILD RUN apk update && apk upgrade && rm -rf /var/cache/apk/*

ENV: The ENV instruction sets environment variables within the image, both when it is built and when it is executed. These variables can be overridden when you launch a container from your image.
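The instructions above can be combined in one short sketch. It is an illustration under stated assumptions rather than part of the NGINX image: the appuser name, the APP_HOME variable, and the /app path are all hypothetical.

```dockerfile
FROM alpine:latest

# ENV: available to later build steps and to the running container
ENV APP_HOME=/app

# WORKDIR: created if missing; later RUN, CMD, ENTRYPOINT, COPY and ADD
# instructions resolve relative paths against it
WORKDIR $APP_HOME

# Create an unprivileged user (hypothetical name) before switching to it
RUN adduser -D appuser

# USER: every following RUN, CMD and ENTRYPOINT runs as appuser
USER appuser

# ONBUILD: deferred until another Dockerfile names this image in its FROM line
ONBUILD COPY . /app/

# Prints the current user and working directory when the container starts
CMD ["sh", "-c", "whoami && pwd"]
```

Creating the user before the USER line matters, because a named user has to exist in the image by the time a command runs as that user.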