Stop Learning Docker for Dummies. Learn It Like a DevOps Engineer

DEV Community
Akhilesh Mishra

Most Docker tutorials teach you commands, not how Docker actually works in real systems. That's why people can run `docker run` but freeze when asked to debug a broken container in production.

This post walks through the concepts in order: layer caching, multi-stage builds, networking, volumes, ENTRYPOINT vs CMD, `docker inspect`. The concepts most DevOps engineers fake their way through until they get burned. Let's go.

The Docker story began with the problems people had 20-30 years ago. You had bare hardware. You installed an OS on top. You compiled your app and sorted every dependency by hand. Need to run another application? Buy another server. Spend days setting it up from scratch.

Then virtualization came. Hypervisors let you spin up 10-20 VMs on the same hardware. Better. But the dependency problem stayed: you still had to install and configure everything on every single VM. Apps worked on one machine. Failed on another. "Works on my machine" became the most dreaded phrase in software engineering.

Docker killed that phrase.

Install software directly on the machine and it works fine, until your app needs Python 3.8 and another app on the same machine needs Python 3.11. They conflict. One of them breaks. You start managing dependencies manually. It becomes a full-time job.

So Docker introduced the image. A Docker image packages everything your app needs: code, runtime, libraries, environment variables, config files. All of it, in one artifact. You build it once. Run it anywhere. No more "works on my machine." The machine is inside the image.

But an image sitting on disk does nothing. You need to run it. A container is a running process: a live instance of your image. One image can spin up dozens of containers at the same time. On any machine. On any cloud.

```shell
docker run -d -t --name Thor alpine
docker run -d -t busybox
```

This spins up two containers. Both are minimalist Linux images pulled from Docker Hub.
- `-d` runs the container in the background
- `-t` allocates a terminal (a pseudo-TTY)
- `--name` gives it a name; skip it and Docker invents a random one for you

Images sitting idle are useless. Containers made them run.

You have containers running, but you don't know which ones:

```shell
docker ps          # running containers only
docker ps -a       # all containers, including stopped ones
docker image ls    # images on your machine
```

Notice the size. Alpine is around 7MB. A full Ubuntu VM is gigabytes. That's why you can run 50 containers where you might fit only 5 VMs.

To interact with a running container, exec into it:

```shell
# Run a single command inside the container
docker exec -t Thor ls
docker exec -t Thor ps

# Open an interactive shell
docker exec -it Thor sh
```

`-it` opens an interactive terminal session. From here you can inspect the filesystem, check processes, read logs, and debug your app live. Type `exit` to come back out.

Container lifecycle commands you'll use every day:

```shell
docker stop Thor     # gracefully stop a running container
docker start Thor    # start a stopped container
docker rm Thor       # remove a stopped container
docker rm -f Thor    # force remove a running container
```

Containers are ephemeral by design. Stop them. Delete them. Spin up new ones from the same image. Repeat. The image never changes. Only the containers do.

Running other people's images only takes you so far. You need to package your own app. That's what the Dockerfile does. A blueprint. A plain text file of instructions.

```dockerfile
FROM python:3.11
WORKDIR /app
COPY requirements.txt /app
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py /app
EXPOSE 5000
CMD ["python", "app.py"]
```

What each instruction does:

- `FROM` sets the base image
- `WORKDIR` sets the working directory inside the container
- `COPY` copies files from your machine into the image
- `RUN` executes commands during the build
- `EXPOSE` documents which port your app listens on
- `CMD` sets the command that runs when the container starts

Build it:

```shell
docker build -t flask-image .
```

The `.` tells Docker to use the current directory as the build context and to look for the Dockerfile there.

But your builds keep taking 3 minutes even when you changed one line of code. That's a layer caching problem.

Docker builds images in layers. Every instruction in your Dockerfile is a layer. When you rebuild, Docker checks each layer top to bottom. If nothing changed in that layer, Docker reuses the cached version and moves on. If something changed, Docker rebuilds that layer and every layer after it. This is why order matters in your Dockerfile.

❌ Wrong order, where the cache breaks every time:

```dockerfile
COPY . /app                           # copies everything, including app code
RUN pip install -r requirements.txt   # installs dependencies
```

You change one line in app.py. Docker sees the COPY changed. It invalidates the cache. It reinstalls all your dependencies from scratch. Every single time.

✅ Right order, where the cache works for you:

```dockerfile
COPY requirements.txt /app            # copy only the dependency file first
RUN pip install -r requirements.txt   # cached until requirements change
COPY . /app                           # copy app code last
```

You change app.py. Docker sees requirements.txt hasn't changed. It uses the cached pip install. Only the final COPY reruns. The build goes from 3 minutes to 10 seconds.

The rule: copy what changes least, first. Copy what changes most, last. The cache was there all along. You just had to stop fighting it.

Your image is 1.2GB. That's a problem. A large image means slow pulls across environments. More storage cost. More installed packages means more CVEs to patch and a bigger attack surface for anyone who gets inside.

The culprit is usually your base image. python:3.11 is convenient. It is also built on Debian and ships with compilers, build tools, and packages your running app never needs. Switch to python:3.11-slim and the image shrinks a lot. Switch to python:3.11-alpine and it shrinks further.

But sometimes you need the full build environment to compile dependencies. You just don't need it at runtime. That's what multi-stage builds solve. Build in one image.
Run in another.

```dockerfile
# Stage 1: Build
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Run
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY app.py .
EXPOSE 5000
CMD ["python", "app.py"]
```

The first stage uses the full image to install everything. The second stage starts clean from a slim base. It copies only what it needs from the builder: the installed packages and your app code. Nothing else. Your final image has no compiler. No build tools. No leftover cache. Just your app and what it needs to run.

The image goes from 1.2GB to under 150MB. Fewer packages. Fewer CVEs. Faster pulls. Smaller attack surface. Your build environment and your runtime environment are finally separate.

Most tutorials treat ENTRYPOINT and CMD as the same thing. They are not.

CMD sets the default command. It can be completely overridden:

```shell
docker run flask-image python debug.py
```

That `python debug.py` replaces your CMD entirely. The container runs your debug script instead.

ENTRYPOINT sets the command that always runs. Arguments passed to `docker run` do not replace it, they are appended to it (only the `--entrypoint` flag can swap it out).

```dockerfile
ENTRYPOINT ["python"]
CMD ["app.py"]
```

Now python always runs. app.py is just the default argument.

```shell
docker run flask-image              # runs: python app.py
docker run flask-image debug.py     # runs: python debug.py
docker run flask-image --version    # runs: python --version
```

The pattern in production: use ENTRYPOINT for the executable that never changes. Use CMD for the default arguments that might. Treat them as the same and you hit confusing behavior when passing arguments to containers. Now you know why.

Docker Hub is the public registry for images. Like GitHub, but for Docker images. Tag your image first. The tag tells Docker where to push it.
```shell
docker tag flask-image yourusername/flask-demo:1.0
docker login
docker push yourusername/flask-demo:1.0
```

Now anyone, on any machine, anywhere in the world, can pull and run your app:

```shell
docker pull yourusername/flask-demo:1.0
docker run -td -p 8080:5000 yourusername/flask-demo:1.0
```

The image is portable. The environment comes with it. Same behavior everywhere.

But now you hit a new problem. Your container is running. You can't reach it.

By default, Docker attaches containers to a bridge network. The container gets its own IP on that network. Your host machine is on a different network. They cannot talk directly. You try to curl your Nginx container from your laptop. It fails. The container is running. But it is isolated.

Port publishing solves this:

```shell
docker run -t -d -p 5000:80 --name nginx-container nginx:latest
```

This maps port 80 inside the container to port 5000 on your host. Now localhost:5000 reaches your container.

But here's the other annoying thing about the default bridge network: containers on it cannot reach each other by name. Only by IP. And IPs change every time a container restarts. You cannot hardcode them.

```shell
docker network create my-network
```

A user-defined network gives you two things the default bridge doesn't:

- Isolation: your containers live in their own network, separate from containers outside it
- Name resolution: containers can reach each other by name, not by IP

```shell
docker run -itd --network my-network --name web-app nginx
docker run -itd --network my-network --name api-app busybox
```

Now web-app can ping api-app by name. Restart either one with a new IP. Name resolution still works. In production, your app container talks to your database container by name. This is how.

You can also inspect networks to see who's connected:

```shell
docker network ls
docker network inspect my-network
```

Containers could not find each other reliably. User-defined networks fixed that.

You remove a container. You spin up a replacement from the same image. The data the old one wrote is gone. Logs. Database writes. Uploaded files. All wiped. (Stopping and restarting the same container keeps its writable layer; removing the container destroys it.) Containers are ephemeral.
That is by design. But data cannot be. Volumes solve this.

A volume is storage managed by Docker that lives outside the lifecycle of any container.

```shell
docker volume create mydata
docker run -d --mount source=mydata,target=/app nginx:latest
```

The container writes to /app. Docker stores that data in the volume. Delete the container. Create a new one. Mount the same volume. The data is still there.

You can also mount the same volume into multiple containers:

```shell
docker run -td --mount source=mydata,target=/app/log --name container-1 busybox
docker run -td --mount source=mydata,target=/app/log --name container-2 busybox
```

Anything container-1 writes to /app/log, container-2 can read. And vice versa. Your log aggregator reads what your app writes. Your backup container reads what your database writes.

Volume commands you'll use regularly:

```shell
docker volume create mydata    # create a volume
docker volume ls               # list volumes
docker volume inspect mydata   # see details, including the mount path on the host
docker volume rm mydata        # delete a volume
```

Containers were ephemeral. Volumes made data survive them.

This is where most engineers get stuck. They can start containers. They cannot debug them. `docker inspect` is the command that changes that.

```shell
docker inspect flask-container
```

It returns everything Docker knows about a container. In JSON. The things you actually use it for in production:

```shell
# What network is this container on? What IP did it get?
docker inspect flask-container | grep -A 20 "Networks"

# What volumes are mounted? Where do they point on the host?
docker inspect flask-container | grep -A 10 "Mounts"

# What environment variables were actually injected at runtime?
docker inspect flask-container | grep -A 20 "Env"

# Is the health check passing or failing?
docker inspect flask-container | grep -A 10 "Health"
```

The difference between what you think is running and what is actually running often lives in this output.

Container behaving differently in staging than local?
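One way to answer questions like that: `docker inspect` emits structured JSON, so instead of grepping you can load it into a short script and pull out exactly the fields you care about. A minimal sketch, assuming you saved the output with `docker inspect flask-container > inspect.json`; the `SAMPLE` string below is an abbreviated, hypothetical stand-in for that file:

```python
import json

# Abbreviated, hypothetical sample of `docker inspect` output.
# In practice: docker inspect flask-container > inspect.json
SAMPLE = """
[{
  "Config": {"Env": ["FLASK_ENV=production", "PORT=5000"]},
  "NetworkSettings": {"Networks": {"my-network": {"IPAddress": "172.18.0.2"}}},
  "Mounts": [{"Type": "volume", "Name": "mydata", "Destination": "/app"}]
}]
"""

def summarize(inspect_json: str) -> dict:
    """Extract the fields people usually grep for from `docker inspect` JSON."""
    container = json.loads(inspect_json)[0]  # inspect returns a list of objects
    return {
        # Env entries come as "KEY=VALUE" strings; split on the first "="
        "env": dict(e.split("=", 1) for e in container["Config"]["Env"]),
        # Network name -> IP address actually assigned
        "networks": {
            name: net["IPAddress"]
            for name, net in container["NetworkSettings"]["Networks"].items()
        },
        # Mount destination inside the container -> volume name (if any)
        "mounts": {m["Destination"]: m.get("Name") for m in container["Mounts"]},
    }

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Running the same script against the inspect output from staging and from local makes the difference between the two environments a simple dict comparison. (Docker can also extract single fields directly via `docker inspect --format` with Go templates; the script approach just makes diffing easier.)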
Check the environment variables that were actually injected. Volume not persisting? Check where it is actually mounted. Network connectivity failing? Check which network the container actually joined.

`docker inspect` shows you reality. Everything else is assumption.

These are the advanced Docker concepts that separate engineers who run containers from engineers who ship them to production:

- Layer caching: copy what changes least, first
- Multi-stage builds: separate the build environment from the runtime
- CMD vs ENTRYPOINT: defaults vs always-runs
- User-defined networks: name resolution between containers
- Volumes: make data survive containers
- docker inspect: see what's actually running, not what you think is running

Every one of these exists because something broke in production. Now you know the story behind each.

If you are serious about production-grade DevOps, not tutorial DevOps, I run a 25-week live bootcamp covering AWS, Kubernetes, MLOps, and AIOps. Real projects. Real troubleshooting. The kind of skills that actually show up in interviews.

[25-Week AWS DevOps + MLOps + AIOps Bootcamp →]