How a Dockerfile Works

Docker has emerged as a vital tool for creating, deploying, and managing containerized applications. At the heart of Docker is the Dockerfile, a simple text file that holds the instructions for building a Docker image. If you're new to Docker or want to dive deeper into how a Dockerfile works, this guide will walk you through the essentials.

What is a Dockerfile?


A Dockerfile is essentially a script of instructions that tell Docker how to build a custom image. These instructions are executed step-by-step, ensuring that the image is configured exactly as specified. The Dockerfile allows you to automate the image creation process, making it easy to replicate environments consistently.

Docker images built from a Dockerfile are the foundation of Docker containers. When you run a container, you're running an instance of a Docker image, which includes everything your application needs to operate: code, libraries, dependencies, and even the runtime.
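
To make the difference between building an image and running a container concrete, here is a minimal sketch. The base image tag and the image name hello-demo are arbitrary choices for illustration, not something the rest of this guide depends on.

```dockerfile
# Hypothetical minimal Dockerfile, for illustration only.
FROM alpine:3.19

# Runs once, at build time; the resulting file is stored in the image.
RUN echo "built by docker build" > /built.txt

# Runs every time a container is started from the image.
CMD ["cat", "/built.txt"]

# Build the image:    docker build -t hello-demo .
# Start a container:  docker run --rm hello-demo
```

The RUN step leaves its result in the image, while CMD only records what a container should do when it starts.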

Basic Structure of a Dockerfile


A Dockerfile is composed of various instructions, each serving a specific purpose. Let’s look at some of the most common Dockerfile instructions and how they work:

  1. FROM: This instruction is mandatory and is usually the first line in a Dockerfile (only comments, parser directives, and ARG may precede it). It specifies the base image from which your image will be built.

    FROM ubuntu:20.04

    Here, Docker is instructed to start with the official Ubuntu 20.04 image as the base.

  2. RUN: This instruction runs commands inside the image during the build process. It is often used for installing software packages or setting up the environment.

    RUN apt-get update && apt-get install -y python3

    This example updates the package list and installs Python 3 in the image.

  3. COPY/ADD: Both COPY and ADD copy files from the build context on your local machine into the Docker image. ADD can also fetch files from a URL and unpack local tar archives, so COPY is the clearer choice for plain file copies.

    COPY . /app

    This command copies the contents of the current directory (.) into the /app directory inside the image.

  4. CMD: This defines the default command that runs when a container starts. It is replaced if you pass a different command to docker run.

    CMD ["python3", "app.py"]

    This will run the app.py file with Python 3 when the container is started.

  5. EXPOSE: This documents which port your application listens on. EXPOSE does not publish the port by itself; to reach it from the host, publish it with the -p flag of docker run.

    EXPOSE 8080

    This indicates that the application inside the container listens on port 8080.

  6. WORKDIR: This instruction sets the working directory inside the image.

    WORKDIR /app

    This sets /app as the working directory for subsequent instructions like RUN, CMD, COPY, etc.

  7. ENTRYPOINT: Similar to CMD, but with one key difference: arguments passed to docker run are appended to ENTRYPOINT instead of replacing it. ENTRYPOINT can only be overridden explicitly, with the --entrypoint flag of docker run.

    ENTRYPOINT ["python3", "app.py"]

    This ensures that app.py is executed whenever the container starts, unless --entrypoint is used.

  8. ENV: This instruction sets environment variables.

    ENV APP_ENV=production

    This sets APP_ENV to production, which can be used by the application within the container.

  9. VOLUME: This declares a mount point inside the container whose contents are stored in a volume, outside the container's writable layer.

    VOLUME /app/data

    This marks /app/data as a volume. Docker creates an anonymous volume for it unless you mount a named volume or a host directory there at run time with the -v flag of docker run, which is the reliable way to make the data persist.
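
CMD and ENTRYPOINT are often combined, and a small sketch may make their interplay easier to see. It assumes an app.py in the build context and a hypothetical image name, my-image; treat it as an illustration rather than a recommended layout.

```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY app.py .

# Fixed part of the command: the container always starts the Python interpreter.
ENTRYPOINT ["python3"]

# Default arguments: used only when docker run is given no arguments of its own.
CMD ["app.py"]

# docker run my-image                       runs: python3 app.py
# docker run my-image -c "print('hi')"      runs: python3 -c "print('hi')"
# docker run -it --entrypoint sh my-image   starts a shell instead of python3
```

Putting the executable in ENTRYPOINT and the default arguments in CMD lets users change the arguments without having to repeat the executable.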

How a Dockerfile Works in Action


Let’s put all the pieces together and build a basic Dockerfile to containerize a simple Python application.

Here’s the structure of our Python project:

/myapp
   ├── app.py
   └── Dockerfile
    

The content of the app.py file:

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello():
    return "Hello, Docker!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
    

Now, here’s a basic Dockerfile to containerize this Flask app:

# Step 1: Use an official Python runtime as the base image
FROM python:3.9-slim

# Step 2: Set the working directory inside the container
WORKDIR /app

# Step 3: Copy the local project files into the container
COPY . /app

# Step 4: Install required Python packages
RUN pip install Flask

# Step 5: Set environment variables
# (Flask 2.3 and later ignore FLASK_ENV; it is kept here to show how ENV is used)
ENV FLASK_ENV=production

# Step 6: Expose the port Flask runs on
EXPOSE 8080

# Step 7: Define the default command to run the app
CMD ["python", "app.py"]
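
Because Docker caches each layer, the order of instructions affects rebuild times. The variant below is a sketch that assumes a requirements.txt file listing Flask; the project above does not include one, so that file name is an assumption, as is the image name myapp in the comments.

```dockerfile
FROM python:3.9-slim
WORKDIR /app

# Copy only the dependency list first (requirements.txt is assumed to exist).
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the source code last, so editing app.py does not invalidate the pip layer.
COPY . .

EXPOSE 8080
CMD ["python", "app.py"]

# Build: docker build -t myapp .
# Run:   docker run --rm -p 8080:8080 myapp
```

With this ordering, a change to app.py reuses the cached dependency layer, and the -p flag publishes the port that EXPOSE only documents.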
    

Best Practices for Writing Dockerfiles


  1. Use Official Images: Prefer official, well-maintained base images like python, ubuntu, or alpine to ensure security and reliability.
  2. Minimize Layers: Each RUN, COPY, and ADD instruction creates a new layer. Combining related instructions can help minimize the number of layers and reduce the final image size.

    RUN apt-get update && apt-get install -y package1 package2

  3. Use Multi-Stage Builds: For large projects, multi-stage builds let you compile or install dependencies in one stage and copy only the results into the final stage, producing smaller and more efficient images.
  4. Leverage .dockerignore: The .dockerignore file works similarly to .gitignore. It excludes files from the build context, so they are not sent to the Docker daemon or copied into the image, which keeps builds fast and images small.
  5. Security: Always scan your images for vulnerabilities, and avoid running containers as the root user (the USER instruction switches to an unprivileged user).
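
Two of these practices, multi-stage builds and running as a non-root user, can be sketched together for the Flask app from earlier. The stage name builder, the /opt/venv path, and the user name appuser are arbitrary choices made for this example.

```dockerfile
# Stage 1: install dependencies into a virtual environment.
FROM python:3.9-slim AS builder
RUN python -m venv /opt/venv
RUN /opt/venv/bin/pip install --no-cache-dir Flask

# Stage 2: the final image receives only the prepared virtual environment.
FROM python:3.9-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Create an unprivileged user and hand the application files to it.
RUN useradd --create-home appuser
WORKDIR /app
COPY --chown=appuser:appuser . .
USER appuser

EXPOSE 8080
CMD ["python", "app.py"]
```

For a pure-Python app like this one the size savings are modest. The pattern pays off mainly when the first stage needs compilers or build tools that the final image can leave behind.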

Conclusion


A Dockerfile is a powerful tool that simplifies the process of building Docker images. By understanding the core instructions and best practices, you can create efficient and maintainable images for your applications. Docker's ability to provide consistent environments across various systems makes writing Dockerfiles an essential skill for developers and system administrators alike.