I’m getting into the habit of building everything inside containers.

Initially, Docker’s most attractive feature was removing the ever-present fear of hosing my entire setup with a misplaced make install. More recently I’ve come to appreciate the time it saves when revisiting old projects.

Without containers, running old code means a hold-your-breath moment as you wait to see what’s broken. You wrote this against Elastic 6? Well, too bad you’ve since installed 7. An hour of today will be figuring out how to roll it back and an hour of tomorrow will be rolling it forward.

That’s not a productive use of your time.

Containers fulfil the promise of entirely reproducible environments and the end of “works on my machine”, even when the “my machine” in question is “this machine, but last January”.

Which brings me to rebooting this site on Jekyll, but containerized.

Bootstrapping Jekyll without installing Jekyll

First off, we install Jekyll in a throwaway container to create a skeleton site.

$ docker run --rm -it -v ${PWD}:/app ruby:2.5.8 /bin/bash
# we're now executing inside the container
root@836b9b7e5fa4: cd /app
root@836b9b7e5fa4: gem install jekyll
root@836b9b7e5fa4: jekyll new jekyll --force
root@836b9b7e5fa4: exit
$ ls
jekyll
  • --rm deletes the container once it stops; we’re not going to use it again.
  • -it, broadly speaking, makes the container respond to interactive commands. Any deeper explanation starts with “When mainframes had physical terminals…”
  • -v mounts the current directory as /app in the container, so any changes we make under /app inside the container show up on our machine.
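
If you’d rather skip the interactive shell, the same bootstrap can probably be collapsed into a single command. This is a sketch rather than part of the session above: the bootstrap_jekyll name is made up for illustration, -w sets the working directory in place of the cd, and sh -c runs both steps inside one throwaway container.

```shell
# Hypothetical helper: a one-shot version of the interactive session above.
# Assumes Docker is installed and the current directory is where the site should live.
bootstrap_jekyll() {
  docker run --rm \
    -v "${PWD}:/app" \
    -w /app \
    ruby:2.5.8 \
    sh -c 'gem install jekyll && jekyll new jekyll --force'
}
```

Calling bootstrap_jekyll should leave the same jekyll directory behind as the interactive version.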

A simple Dockerfile for development

With the basic site created, we add a Dockerfile to make the whole thing reproducible.

FROM ruby:2.6.3
COPY ./ /app/
WORKDIR /app/jekyll
RUN bundle install
EXPOSE 4000
CMD ["jekyll", "serve", "--host=0.0.0.0"]

That’s a decent first pass, but there’s a problem applicable to any Ruby container. Let’s time how long it takes to build our image:

$ time docker build -t leftbrained/leftbrained-jekyll .
Sending build context to Docker daemon  39.49MB
Step 1/6 : FROM ruby:2.6.3
Step 2/6 : COPY ./ /app/
Step 3/6 : WORKDIR /app/jekyll
Step 4/6 : RUN bundle install
Step 5/6 : EXPOSE 4000
Step 6/6 : CMD ["jekyll", "serve", "--host=0.0.0.0"]
Successfully tagged leftbrained/leftbrained-jekyll:latest
2:34.90 total

About two and a half minutes, with the majority of the time spent on step 4/6, installing gems.

A second build of the image is instantaneous because Docker has cached each step as an intermediate layer. These are reused on subsequent builds until an altered, uncached step is encountered.

$ time docker build -t leftbrained/leftbrained-jekyll .
Sending build context to Docker daemon  39.49MB
   # ... trimmed for brevity...
0.863 total
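
To see the layers that the cache is built from, docker history lists one row per step, with the most recent step at the top. A small sketch, assuming the image tag used above; the show_layers name is hypothetical.

```shell
# Hypothetical helper: list the layers of an image, one row per build step.
# Defaults to the tag used in this post; pass another image name to inspect it instead.
show_layers() {
  image="${1:-leftbrained/leftbrained-jekyll}"
  docker history "$image"
}
```

Running show_layers after a build should show which step created each layer and how much space it takes up.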

But if we make the slightest change to our site, we’re back to several minutes:

$ touch ./jekyll/index.md
$ time docker build -t leftbrained/leftbrained-jekyll .
Sending build context to Docker daemon  39.49MB
   # ... trimmed for brevity...
2:54.20 total

For COPY, the contents of each file are considered: changing them will bust the cache. If we add a post, change the stylesheet or delete an image from Jekyll, the next build has to rewind to the COPY step and subsequently rebundle from scratch two steps later.

This wasted time will quickly mount up, especially when using gems with native compilation, like sassc or nokogiri.

Better Gem management

The solution is to COPY the files that determine which gems get installed, Gemfile and Gemfile.lock, in a separate step ahead of the rest of the site.

Docker’s diff is smart. It will identify which step individual files are copied on and preserve the cache up to that step. Editing a post only invalidates the final COPY, so the bundle install layer before it is reused. With this Dockerfile most builds will be near-instant.

FROM ruby:2.6.3
# Altering Gemfile or Gemfile.lock will cache-bust from here
COPY ./jekyll/Gemfile* /app/jekyll/
WORKDIR /app/jekyll
RUN bundle install
# Altering anything else will cache-bust from here
COPY ./ /app/
WORKDIR /app/jekyll
EXPOSE 4000
CMD ["jekyll", "serve", "--host=0.0.0.0"]

And with that, we have a working Jekyll image that’s only dependent on Docker.
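
For completeness, building and serving the site might look like the sketch below. The serve_site name is hypothetical; -p 4000:4000 publishes the port that the Dockerfile EXPOSEs, so the site should be reachable at http://localhost:4000 once Jekyll has started.

```shell
# Hypothetical helper: build the image, then serve the site on localhost:4000.
# Run it from the directory that contains the Dockerfile.
serve_site() {
  image="leftbrained/leftbrained-jekyll"
  # Only start the container if the build succeeded.
  docker build -t "$image" . &&
    docker run --rm -p 4000:4000 "$image"
}
```

As with the bootstrap container, --rm throws the container away when it stops. Adding -v "${PWD}:/app" to the docker run line should, I’d expect, let Jekyll pick up edits without a rebuild, though I haven’t covered that setup here.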