So let’s say that you have a shiny new Ruby program that you wrote. But how will you ensure every environment has the right version of Ruby installed and every user knows how to install all of its dependencies? Before you answer that yourself, ask the lead maintainer of Homebrew how much fun he’s having.

I think there is a simpler method to this madness. This tutorial shows how to automatically build and push container images for your project, which saves time and cuts down on tedious tasks. Doing this by hand only seems reasonable until you realize how much time you can save by using DevOps principles to automate the work. The pattern in this post will work for a variety of languages, but you’ll have to adapt the language-specific parts.

Author’s Note: I think that the term “DevOps” gets used incorrectly in a lot of situations today. In my opinion, this tutorial is the most basic application of the term. I am applying automation to the code since I am familiar with both deployment strategies and application code. One foot in operations, the other in development. This was the original inspiration for this blog post, since I have spoken about automating things at internal Test Double events and Justin had not yet built any sort of workflow to perform these tasks.

Today we will be modifying a side-project of our very own Justin Searls called feed2gram. I will be able to point out specific commits, too. This isn’t a hypothetical exercise. It’s what we actually implemented so a Docker container on his Synology could continuously cross-post photos from his blog to his Instagram account.

In the beginning…

First, you run tests. The application has tests, right? Of course! So this entire process only runs if the tests pass.

# the default task runs rake test standard
bundle exec rake

[Justin’s note: I actually didn’t write any tests for this gem. Don’t tell Andrew.]

Ensure that tests run in GitHub Actions

This is our first step to continuous integration. We will create a file .github/workflows/main.yml. The exact filename isn’t important so long as it’s contained in .github/workflows and ends in .yml or .yaml.

# the name can be anything you want, but try not to change it once you push it to GitHub
name: Ruby

on:
  # this runs on each push...
  push:
    # ...to this list of branches, which is just main
    branches:
      - main

  # also runs on each pull request
  pull_request:

jobs:
  # this is the "pipeline" that does the business
  build:
    # this is the name of the kind of runner that GitHub hosts
    runs-on: ubuntu-latest
    # each job has a name, too. the ${{ matrix.ruby }} expression fills in the current value, since a matrix runs one job for each combination of things in the strategy.matrix structure
    name: Ruby ${{ matrix.ruby }}
    strategy:
      matrix:
        ruby:
          # we are only testing Ruby 3.2.2, but it could be easily changed to support more versions
          - '3.2.2'

    steps:
    # you have to check out the code first, this is almost always the first step in a GHA job
    - uses: actions/checkout@v3
    # this installs the Ruby version we ask for, so we don't depend on whatever the base runner ships with
    - name: Set up Ruby
      uses: ruby/setup-ruby@v1
      with:
        # magic sauce for the matrix jobs to substitute the values in the combinations
        ruby-version: ${{ matrix.ruby }}
        # use a cache. it saves a ton of time.
        bundler-cache: true
    # this runs the tests
    - name: Run the default task
      run: bundle exec rake

Add a Dockerfile

Docker containers are ubiquitous in the self-hosted world. They are one of the most common methods of distributing software for self-hosting. Since feed2gram is all about POSSE, self-hosting is going to be our primary design objective.

To get started with building a docker container, you need a Dockerfile:

# we make an image from the default Ruby version tag for our version of Ruby
FROM ruby:3.2.2

# this directory comes from the base image and is where we will deploy the app
# it is common practice to use a simple directory in the root
WORKDIR /srv

# important! we do this specifically for caching. if these files do not change,
# then later builds will re-use the cache! it can sometimes take minutes to
# install dependencies, so this is an excellent optimization
COPY Gemfile Gemfile.lock feed2gram.gemspec ./

# this app requires this file to calculate the version
COPY lib/feed2gram/version.rb lib/feed2gram/

# install dependencies
# so long as the files in the prior COPY statements do not change, then this
# step will be cached on subsequent builds and save potentially _minutes_
RUN bundle install

# just copy everything, even the files we already copied
ADD . .

# this makes every invocation of the container work like a command-line tool
# any arguments to `docker run` will be passed to the application just
# like you would on the command line
ENTRYPOINT ["/srv/exe/feed2gram"]

We have an excellent guide that deep-dives into optimizing Docker layer caching that is written by our own Mavrick Laakso.
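
To make the caching idea concrete, here is a minimal sketch in bash. It is only an illustration: Docker computes its cache keys internally, and `deps_fingerprint` is a hypothetical helper that checksums just the files the Dockerfile copies before `bundle install`. The point is that editing application code leaves that checksum alone, while editing the Gemfile changes it.

```shell
#!/usr/bin/env bash
# Illustration only: this is NOT how Docker computes layer cache keys.
# deps_fingerprint is a hypothetical helper that checksums the files
# copied into the image before `bundle install` runs.
deps_fingerprint() {
  local dir="$1"
  # cksum is used because it is available on both Linux and macOS
  cat "$dir/Gemfile" "$dir/Gemfile.lock" "$dir/feed2gram.gemspec" \
      "$dir/lib/feed2gram/version.rb" | cksum
}

# build a throwaway project layout to demonstrate with
demo="$(mktemp -d)"
mkdir -p "$demo/lib/feed2gram"
echo 'source "https://rubygems.org"' > "$demo/Gemfile"
echo 'GEM' > "$demo/Gemfile.lock"
echo '# gemspec placeholder' > "$demo/feed2gram.gemspec"
echo 'VERSION = "0.0.1"' > "$demo/lib/feed2gram/version.rb"

before="$(deps_fingerprint "$demo")"

# editing application code does not touch the fingerprint...
echo 'puts "hello"' > "$demo/lib/feed2gram.rb"
after_code_change="$(deps_fingerprint "$demo")"

# ...but editing a dependency manifest does
echo 'gem "rake"' >> "$demo/Gemfile"
after_gemfile_change="$(deps_fingerprint "$demo")"

[ "$before" = "$after_code_change" ] && echo "code change: install layer would be reused"
[ "$before" != "$after_gemfile_change" ] && echo "Gemfile change: install layer would be rebuilt"

rm -rf "$demo"
```

This is why the Dockerfile copies the dependency files first and everything else last: the slow step only depends on the files that rarely change.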

Using the image

Let’s take a small detour and actually use this setup now.

git clone https://github.com/searls/feed2gram
cd feed2gram
docker build -t feed2gram .
# wait for the build to finish
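# --rm removes the container when it exits; -it lets it behave like a CLI tool
# -v bind-mounts the config you have made into the container
#    (use an absolute host path, otherwise docker treats it as a named volume)
# feed2gram is the name of the docker image; everything after it is passed
# to the entrypoint as arguments
docker run --rm -it \
  -v "$(pwd)/my-config.yml:/srv/config.yml" \
  feed2gram \
  --config /srv/config.yml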

Assuming you followed the directions in the README.md to get all of the tokens ready, then you should see something work.

Building the container automatically

Now we’ll modify our workflow to build the container. Because the new job requires the build job to succeed, tests must pass to build the container! While not a perfect method, it will minimize the possibility of releasing buggy software to our end users.

name: Ruby

on:
  push:
    branches:
      - main

  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    name: Ruby ${{ matrix.ruby }}
    strategy:
      matrix:
        ruby:
          - '3.2.2'

    steps:
    - uses: actions/checkout@v3
    - name: Set up Ruby
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: ${{ matrix.ruby }}
        bundler-cache: true
    - name: Run the default task
      run: bundle exec rake

  # 👇 New job is here!
  docker:
    runs-on: ubuntu-latest
    name: Build Docker Container
    # this job will only run if all of the "needs" are successful
    # hence, tests must pass!
    needs: [build]
    steps:
      - uses: actions/checkout@v4
      # we install qemu to build arm images
      - uses: docker/setup-qemu-action@v3
      # configure buildkit
      - uses: docker/setup-buildx-action@v3
      # log into github container registry if not a pull request
      - uses: docker/login-action@v3
        if: github.event_name != 'pull_request'
        with:
          registry: ghcr.io
          username: searls
          # 👇 GitHub container registry requires a personal access token
          # Don't commit this to code, ever.
          password: ${{ secrets.CH_PAT }}
      # this does all the work for us
      - uses: docker/build-push-action@v5
        with:
          # 👇 Important! you want to use a cache to save CI minutes
          cache-from: type=gha
          # mode=max means "also cache intermediate layers and not just the resulting image"
          # which is the part that makes the prior optimizations worthwhile
          cache-to: type=gha,mode=max
          context: .
          # build for both amd64 and arm64, for the mac/raspberrypi folks!
          platforms: linux/amd64,linux/arm64
          # 👻 we don't push the resulting image if this is a pull request
          push: ${{ github.event_name != 'pull_request' }}

And there you have it. A fully functional workflow that will automatically build and push container images for your project on each push to the default branch. You could stop here… but we’re going to keep going. There are some rough edges and small improvements that will still carry us further toward our goal.

Perpetual processing

If you looked at how we ran the resulting container previously, it is a one-shot execution. So every time you want to sync your posts, you’ll have to run the program again. This is tedious and runs against our “automate all the things” goal, so let’s investigate how to run this program continuously.

There are a few possible choices:

  1. Add the daemon gem and implement a start / stop / restart workflow with Ruby.
  2. Run busybox cron and then write a crontab file.
  3. Add a very small shell script that just runs a loop.

We’re going to go with the last of those here. Let’s add another file called bin/daemon that will accomplish this task for us.

#!/usr/bin/env bash
# ☝️ this is the preferred method of running bash.
# not everyone has /bin/bash
# and not every /bin/bash is the same (macos 🤢)

echo "Starting feed2gram daemon..."

# run the app once
# the "$@" will pass all of the arguments given to the container to the
# original script, so it's 💯 compatible with the old implementation!
exe/feed2gram --verbose "$@"

# sleep by default 60 seconds
# or define env var SLEEP_TIME=XXX to pick a different interval
while sleep "${SLEEP_TIME:-60}" ; do

  # run it again, with a timestamp output
  echo "[$(date -R)] Re-running feed2gram..."
  exe/feed2gram --verbose "$@"
done

This solution happily bypasses a lot of gotchas. If we used cron, then you would have to write a crontab file and not even system administrators want to have to remember the format of those files. Also, when you use cron, you would have to know that the previous invocation had finished before you start the next one. You really don’t want to run the script twice and have them do the same thing at the same time. The shell script is less code than adding the gem and wrapper, plus you can still use the container as a one-shot if you want!
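
To see why the loop cannot overlap with itself, here is a small sketch in bash. `run_loop` and `fake_sync` are hypothetical stand-ins (for bin/daemon and exe/feed2gram respectively), and the loop is bounded so the demo terminates. It uses the same `${SLEEP_TIME:-60}` default-with-override expansion as the script above.

```shell
#!/usr/bin/env bash
# run_loop is a hypothetical, bounded stand-in for bin/daemon: it runs a
# command once, then sleeps and re-runs it until the run count is used up.
run_loop() {
  local remaining="$1"; shift
  # first run happens immediately
  "$@"
  # each later run only starts after the previous one returned and the
  # sleep finished, so two runs can never overlap
  while [ "$remaining" -gt 1 ] && sleep "${SLEEP_TIME:-60}"; do
    "$@"
    remaining=$((remaining - 1))
  done
}

# stand-in for exe/feed2gram that just reports when it ran
fake_sync() {
  echo "sync started"
  echo "sync finished"
}

# override the interval so the demo does not wait a minute between runs
SLEEP_TIME=0 run_loop 2 fake_sync
```

With cron you would need a lock file or similar to get the same guarantee. Here it falls out of the shell running one command at a time.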

Using the new image

# 🛏️ -e SLEEP_TIME=600 is how you change the sleep interval
docker run --rm -it \
  -e SLEEP_TIME=600 \
  -v "$(pwd)/my-config.yml:/srv/my-config.yml" \
  feed2gram \
  --config /srv/my-config.yml

🥳 Now you have a long-running process that will continually monitor your feed and syndicate content to Instagram.

How to get the old behavior

You just change the entrypoint back to the old one!

# 👇 overriding the entrypoint is all you need
docker run --rm -it \
  --entrypoint /srv/exe/feed2gram \
  -v "$(pwd)/my-config.yml:/srv/my-config.yml" \
  feed2gram \
  --config /srv/my-config.yml

Container image optimization

By now we have a really convenient image for running our new fancy application. 🤠

andrew@potassium:~/feed2gram$ docker images feed2gram
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
feed2gram    latest    881fe940ab29   45 seconds ago   1.04GB

Oh. Okay. That’s … really big for a little application like this. We can use dive to examine each layer, but maybe it’s not our fault that the resulting image is so big.

andrew@potassium:~/feed2gram$ docker images ruby
REPOSITORY   TAG            IMAGE ID       CREATED       SIZE
ruby         3.2.2          e1ebac6c7119   6 days ago    988MB

Right. Looks like our application is taking up around 75 MB, so that’s more in line with what I expected. So let’s find a different base image. While we are at it, let’s add a VOLUME into the Dockerfile for the configuration files. This will give the end users a clear place to mount their configuration files and be an obvious location for persistent data.

FROM ruby:3.2.2-alpine
# 👆 The alpine image is much more spartan,
# but it's more than enough for our little application

WORKDIR /srv
COPY Gemfile Gemfile.lock feed2gram.gemspec ./
COPY lib/feed2gram/version.rb lib/feed2gram/

# 👇 the dependencies are different in alpine
RUN apk update && \
    apk add autoconf bash git gcc make musl-dev && \
    bundle install && \
    apk del --purge --rdepends git gcc autoconf make musl-dev
ADD . .

# 👇 this is where persistent data lives
VOLUME /config

# 👇 the volume subtly changes the default arguments
CMD ["--config", "/config/feed2gram.yml"]

ENTRYPOINT ["/srv/bin/daemon"]
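
The interplay of CMD and ENTRYPOINT can be sketched in bash. This is a simulation, not Docker itself: `effective_command` is a hypothetical helper that mimics how Docker assembles the final command line. Any arguments given after the image name on `docker run` replace the CMD, and the result is appended to the ENTRYPOINT.

```shell
#!/usr/bin/env bash
# effective_command is a hypothetical helper that mimics how Docker combines
# the ENTRYPOINT and CMD from the Dockerfile above with `docker run` arguments.
effective_command() {
  local entrypoint="/srv/bin/daemon"
  local default_cmd=(--config /config/feed2gram.yml)
  if [ "$#" -gt 0 ]; then
    # arguments after the image name replace CMD entirely
    echo "$entrypoint $*"
  else
    # with no arguments, CMD supplies the defaults
    echo "$entrypoint ${default_cmd[*]}"
  fi
}

# like: docker run feed2gram
effective_command
# like: docker run feed2gram --config /srv/my-config.yml
effective_command --config /srv/my-config.yml
```

So a user who mounts their configuration directory at /config does not need to pass any arguments, while everyone else can still pass `--config` explicitly.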

The end result

andrew@potassium:~/feed2gram$ docker images feed2gram
REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
feed2gram    latest    dc054fbec1a7   5 seconds ago   132MB

That’s much better. I’m sure all of the self-hosters will appreciate the smaller download. There is one more step to go before we call it done. If we want to pull the containers, we have to know the git SHA of the revision. It is irritating for the end user to cross-reference what is in the repository, and it will be a pain to remove old versions of the image since you can’t sort a SHA.

Image tags

There are a lot of conventions around version numbers and docker containers. We’ll use a (new to me) action called docker/metadata-action that will generate any combination of tags for the resulting build. The action will also generate labels that will help GitHub associate the image with the source repository.

  1. On each build, it will tag with the git commit SHA
  2. On a PR, it will generate a tag like pr-2
  3. On the default (main) branch it will tag with latest
  4. For a tag like v1.2.3 it will tag with 1.2.3 and 1.2
  5. For a tag like v2.0.0 it will also tag with 2
  6. For early development tags like v0.0.4, it will not tag with 0

name: Ruby

on:
  push:
    branches:
      - main
    # 👇 we run this workflow on tags that start with a "v"
    tags:
      - 'v*'

  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    name: Ruby ${{ matrix.ruby }}
    strategy:
      matrix:
        ruby:
          - '3.2.2'

    steps:
    - uses: actions/checkout@v3
    - name: Set up Ruby
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: ${{ matrix.ruby }}
        bundler-cache: true
    - name: Run the default task
      run: bundle exec rake

  docker:
    runs-on: ubuntu-latest
    name: Build Docker Container
    needs: [build]
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3
      # 👇 here is the new action
      - uses: docker/metadata-action@v5
        id: metadata
        with:
          images: |
            ghcr.io/searls/feed2gram
          tags: |
            type=raw,value=latest,enable={{is_default_branch}}
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}},enable=${{ !startsWith(github.ref, 'refs/tags/v0.') }}
            type=sha,prefix=,format=long
      - uses: docker/login-action@v3
        if: github.event_name != 'pull_request'
        with:
          registry: ghcr.io
          username: searls
          password: ${{ secrets.CH_PAT }}
      - uses: docker/build-push-action@v5
        with:
          cache-from: type=gha
          cache-to: type=gha,mode=max
          context: .
          platforms: linux/amd64,linux/arm64
          push: ${{ github.event_name != 'pull_request' }}
          # 👇 here we use the outputs to magic all of this work away
          tags: ${{ steps.metadata.outputs.tags }}
          labels: ${{ steps.metadata.outputs.labels }}
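
The semver rules in the `tags` list can be sketched in bash. `tags_for_version` is a hypothetical helper that mirrors the three `type=semver` lines in the workflow. It is not how docker/metadata-action is implemented, and it assumes a plain vMAJOR.MINOR.PATCH git tag with no pre-release suffix.

```shell
#!/usr/bin/env bash
# tags_for_version is a hypothetical helper that mirrors the semver tag rules
# from the workflow; it is not the implementation of docker/metadata-action.
# Assumes a plain vMAJOR.MINOR.PATCH tag with no pre-release suffix.
tags_for_version() {
  local version="${1#v}"          # strip the leading "v": v1.2.3 -> 1.2.3
  local major minor patch
  IFS=. read -r major minor patch <<< "$version"

  echo "$version"                 # pattern={{version}}
  echo "$major.$minor"            # pattern={{major}}.{{minor}}
  # pattern={{major}}, skipped for 0.x releases
  if [ "$major" != "0" ]; then
    echo "$major"
  fi
}

echo "tags for v1.2.3:"
tags_for_version v1.2.3
echo "tags for v0.0.4:"
tags_for_version v0.0.4
```

The floating `1.2` and `1` tags let a self-hoster follow a release line and pick up patches without editing their configuration for every release.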

Conclusion

It seems like a ton of work, but this gets easier the more you are exposed to it. The patterns laid out here apply to a lot of different projects with only minor modifications. This work is a natural process for DevOps development. Take a nice thing and then automate it. Try to use it and then shave off the rough spots. Iterate and improve. At every step along the way, evaluate what would benefit your end users and optimize for those features.

  1. We added a Dockerfile in 6201ff5.
  2. We added a job to build a multi-arch container when tests pass in 66cbbfb.
  3. We added a tiny shell script to perpetually run the program in 9b0970c.
  4. We minimized the resulting size of the container image in 12f582a.
  5. We added conventional tags to the container image in 3fc2298.

To celebrate our automation, feed2gram has been officially blessed as version 1.0! 🎉 🥳

Andrew Coleman

Status: Double Agent
Code Name: Agent 00164
Location: Chattanooga, TN