Make slim Rails Docker images

With a Kubernetes deployment in view, production images should be small. Documentation on how to produce slim Rails production images is not easy to find, so we present a solution here, together with a development image that supports hot reload.

TL;DR

A development-mode, Debian-based "slim-buster" image (using the apt package manager) can weigh around 1000 MB. To get down to a slim image of approximately 200 MB, you may:

  1. use a two-staged Dockerfile
  2. use a Linux Alpine base (with the apk package manager)
  3. use only Webpack and remove Sprockets (this also shortens the build time considerably)
  4. bundle the production gems locally, in the folder 'vendor/bundle' (as opposed to the "global" host gems)
  5. make the container stateless by removing the logs from the container and using a Redis cache store.

We also present the development Dockerfile and show how to use the images in a micro-services context to run a Rails monolith app with docker-compose.

Production mode

We have a two-step building process starting from a ruby:alpine image.

First stage

In the first stage, size is not of utmost importance: its main task is to compile the gems and the static assets. We install the tools needed for:

  • Bundler to compile the declared gems and save them in a local folder "vendor/bundle"
  • Webpacker to compile the static assets to the "/public" folder.

Bundler

We want Bundler to install the gems the app needs (those listed in "Gemfile.lock") into the app's local subfolder "vendor/bundle", so that only these gems are shipped with the app code. We thus set the environment variable:

BUNDLE_PATH='vendor/bundle'

This also indicates to Bundler where the gems are located, so it is repeated in the second stage.

We want to bundle only the production gems:

bundle config set without 'development test'

Webpacker

We want to use a Webpack-only version of Rails, without Sprockets.

rails new <app-name> --skip-sprockets ...

In particular, this removes the sass-rails gem, which saves space and greatly reduces the build time.

In this stage, we want Webpack to compile and minimise the static assets into the "/public" folder.

RUN bundle exec rails webpacker:compile

Minimize the number of layers

We chain the RUN commands together where possible, since every additional layer can increase the final size: the fewer, the better.

Stateless container

We also want to redirect the logs from the container to STDOUT and STDERR. They are no longer kept in the container, but a log collector can pick them up from there. Just declare the logger in the file "/config/application.rb" and set:

RAILS_LOG_TO_STDOUT=true
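
As a rough illustration of what "declaring the logger" amounts to, the sketch below picks the log destination from the environment. It uses Ruby's standard Logger instead of the Rails logger so that it runs on its own; the helper names log_destination and build_logger and the fallback path are assumptions made for this example, not code from the app.

```ruby
require "logger"

# Hypothetical helper: choose where the logs go.
# An unset or empty RAILS_LOG_TO_STDOUT keeps the file-based default.
def log_destination(env, out: $stdout, fallback: "log/production.log")
  env["RAILS_LOG_TO_STDOUT"].to_s.empty? ? fallback : out
end

# Hypothetical helper: build a logger for the chosen destination.
# In a Rails app, the result would be assigned to config.logger.
def build_logger(env = ENV, out: $stdout)
  Logger.new(log_destination(env, out: out))
end

logger = build_logger({ "RAILS_LOG_TO_STDOUT" => "true" })
logger.info("written to STDOUT, nothing is stored in the container")
```

With the variable set, nothing is written inside the container, which is what keeps it stateless.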

Remark: in production mode, if you use caching, you may take advantage of the built-in Redis cache store. Redis is set up as an LRU cache with, say, maxmemory 100mb (set in "redis.conf", circa line 566) and the eviction policy maxmemory-policy allkeys-lru. We load the hiredis gem, which relies on a C client library designed for speed, and configure:

#/config/environments/production.rb
config.cache_store = :redis_cache_store, { driver: :hiredis, url: ENV['REDIS_CACHE'] }

where ENV['REDIS_CACHE'] is a reference to the dedicated Redis cache server URL (different from the one Sidekiq uses).

Serving the static assets

There are different strategies: use a CDN, use a reverse proxy such as Nginx, or serve them with the default app server, Puma.
Since the app will most probably run on Kubernetes behind an Ingress controller, we will most probably serve the static assets from a CDN, on a separate port, rather than integrate a stand-alone Nginx layer. By default, Rails does not serve static files in production mode, so until we migrate to Kubernetes we need the app server (Puma here) to serve them. This can be adjusted with the build-time argument RAILS_SERVE_STATIC_FILES.

Second stage

The second build stage first creates a user so that the container does not run with root privileges.
It then simply copies the app code, the compiled static files and the bundled gems from the first stage into the final image.
Finally, we set the needed ENV vars to the ARG values.

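The tzdata package is mandatory: the Alpine base image ships without time zone data, which the tzinfo gem used by Rails needs (the tzinfo-data gem is an alternative source of this data, typically bundled only on Windows).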

Since we use Postgres, we still need the libpq package so Rails can communicate with Postgres.

The production Dockerfile

#alpine.prod.Dockerfile

ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION} AS builder

ARG BUNDLER_VERSION
ARG NODE_ENV
ARG RAILS_ENV
ARG RAILS_LOG_TO_STDOUT

RUN apk -U upgrade && apk add --no-cache \
   postgresql-dev nodejs yarn build-base

WORKDIR /app

COPY Gemfile Gemfile.lock package.json yarn.lock ./

ENV LANG=C.UTF-8 \
   BUNDLE_JOBS=4 \
   BUNDLE_RETRY=3 \
   BUNDLE_PATH='vendor/bundle'

RUN gem install bundler:${BUNDLER_VERSION} --no-document \
   && bundle config set without 'development test' \
   && bundle install --quiet \
   && rm -rf $GEM_HOME/cache/* \
   && yarn --check-files --silent --production && yarn cache clean

COPY . ./

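# remove node_modules here: deleting it in the final stage would not shrink the copied layer
RUN bundle exec rails webpacker:compile assets:clean \
   && rm -rf node_modules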

############################################################
FROM ruby:${RUBY_VERSION}

ARG RAILS_ENV
ARG RAILS_LOG_TO_STDOUT

RUN apk -U upgrade && apk add --no-cache libpq netcat-openbsd tzdata \
   && rm -rf /var/cache/apk/* \
   && adduser --disabled-password app-user
# --disabled-password: don't assign a pwd, so cannot login
USER app-user

COPY --from=builder --chown=app-user /app /app

ENV RAILS_ENV=$RAILS_ENV \
   BUNDLE_PATH='vendor/bundle' \
   RAILS_LOG_TO_STDOUT=$RAILS_LOG_TO_STDOUT

WORKDIR /app
RUN rm -rf node_modules

To build the image, tag it and push it to the Docker registry, we would run:

docker build -t usr/appname --build-arg RUBY_VERSION=3.0.1-alpine --build-arg NODE_ENV=production --build-arg RAILS_ENV=production --build-arg BUNDLER_VERSION=2.2.21 --build-arg RAILS_SERVE_STATIC_FILES=true -f alpine.prod.Dockerfile .

docker push usr/appname

ARG and ENV

We use "args" to pass variables at build time, with:
docker build --build-arg RAILS_ENV=production ... or from the "docker-compose" file. They are simply declared in the Dockerfile with ARG RAILS_ENV (one variable per ARG instruction).
Whenever the code needs an "env" variable that is already an "arg", such as RAILS_ENV=production, we can pass the "arg" value to the "env" variable with:

ENV RAILS_ENV=$RAILS_ENV

If we write ENV RAILS_ENV=${RAILS_ENV:-production} instead, the variable falls back to the default value production when the arg is not supplied.
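
The fallback rule can be mimicked in a few lines of Ruby, purely to illustrate which inputs trigger the default. This follows the shell convention for the ":-" form (default used when the value is unset or empty); the function name with_default and the "staging" value are made up for this sketch.

```ruby
# Mimics the ${NAME:-default} substitution: the default is used when the
# variable is unset OR set to an empty string.
def with_default(env, name, default)
  value = env[name]
  value.nil? || value.empty? ? default : value
end

build_args = {} # no --build-arg RAILS_ENV=... was passed
puts with_default(build_args, "RAILS_ENV", "production")

build_args = { "RAILS_ENV" => "staging" } # hypothetical explicit value
puts with_default(build_args, "RAILS_ENV", "production")
```

The first call prints the default and the second prints the supplied value.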

Dev mode

In development mode, we do not need a multi-stage build since hot reload is our priority, not size. We use a separate Webpack service run with webpack-dev-server, and bind mounts of the code so that changes are picked up quickly. Even though size is not the priority, compilation runs faster with Alpine and a Webpack-only setup. The setup is largely inspired by these guys and this guy, but somewhat simplified.

At the time of writing, Alpine provides Node.js LTS 14, Postgres 13 and Yarn 1.22 packages by default, which is perfectly acceptable.

ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION} AS builder

ARG BUNDLER_VERSION
ARG NODE_ENV
ARG RAILS_ENV

ENV RAILS_ENV=${RAILS_ENV:-development} \
   NODE_ENV=${NODE_ENV:-development} \
   BUNDLER_VERSION=${BUNDLER_VERSION:-2.2.21}

RUN apk update && apk add --no-cache \
   build-base postgresql-dev nodejs yarn \
   tzdata netcat-openbsd \
   && rm -rf /var/cache/apk/*

WORKDIR /app

COPY Gemfile Gemfile.lock package.json yarn.lock ./

ENV LANG=C.UTF-8 \
   BUNDLE_JOBS=4 \
   BUNDLE_RETRY=3
# BUNDLE_PATH='vendor/bundle'
# <- to bundle only the gems needed from Gemfile into local folder /vendor/bundle

RUN gem install bundler:${BUNDLER_VERSION} --no-document \
   && bundle install --quiet \
   && rm -rf /usr/local/bundle/cache/*.gem \
   && find /usr/local/bundle/gems/ -name "*.c" -delete \
   && find /usr/local/bundle/gems/ -name "*.o" -delete

RUN yarn --check-files --silent && yarn cache clean

COPY . ./

How to use this with a code example

We built a toy Webpack-only Rails monolith app. On each button click, this simple app increments a counter whose value is saved to a Postgres database and to a Redis database, and in parallel triggers asynchronous background jobs with Sidekiq/Redis.

Local dev mode [branch "master"]

To see how this works, we start by running this "by hand" locally, without containers.

You need to run a Postgres service and (at least one) Redis service. You then launch a Sidekiq worker, the Rails server and finally Webpack in dev mode. We used the process manager Overmind. Once the Postgres service is up, run overmind start with the following "Procfile":

#Procfile
assets:  ./bin/webpack-dev-server
web:     bundle exec rails server
redis-server:   redis-server redis/redis.conf
worker:  bundle exec sidekiq -C config/sidekiq.yml

Containerised prod mode [branch "docker-prod"]

To run the app in containers, you will need to build the images and run the containers with "docker-compose" so you get the containers connected to a local network. The "docker-compose.yml" file holds the state of our deployment. It is a validation step before deploying with Kubernetes.

You will launch a Sidekiq service linked to a Redis server, a Postgres database service, another Redis database and the Rails app served by the app server Puma, which also serves the compiled static files as per our configuration.

In real life, the two Redis services will most probably be managed solutions, such as a remote Redis cluster, so you may not need to create a Redis service. Here, we maintain a custom Redis database. You can test a managed service by commenting out the Redis service and its dependencies in the "docker-compose.yml" file and passing the remote URL of the managed service via ENV['REDIS_URL'] in the code for both Sidekiq and Redis (see note at the end). We used for example a free tier from Redis Labs.
The same remark applies to the Postgres service. You will most probably use a managed service with replicas rather than run your own service on bare metal, so you can comment out this service and its dependencies as well (such as the depends_on: keys) and use the remote URL. For example, we can use the free tier of ElephantSQL and set ENV['ELEPHANT_URL'] to point to the managed service.

The four images are:

  • the Rails image is based on the Dockerfile. It is launched with bundle exec rails server with an open port,
  • the background job processor Sidekiq boots Rails, so it uses the same image as Rails. It is launched with bundle exec sidekiq with a config file that links it to the Redis session used,
  • the custom Redis database uses the official Redis image launched with redis-server and a config file (password, persistence...),
  • the Postgres database uses the official Postgres image with an initialisation script ("init-user.sql").

Therefore, with managed services, this app deployed on Kubernetes would only use one image, shared by the Rails app and Sidekiq.

Initialising, Volumes and bindings

In the compose file, we use two bind mounts:

  • an SQL initialiser for Postgres
  • a Redis config file

The Postgres initialiser "init-user.sql" creates a user with a password. This is done by placing a file in the "/docker-entrypoint-initdb.d" folder (see "Initialisation scripts"). It is run only when the "data" folder is empty, and the script is made idempotent (see code at the end). The same credentials are passed as environment variables to configure Postgres in Rails and are used in the "docker-compose.yml" file.

The two other volumes are named volumes holding the main data of the databases: Postgres (PGDATA, /var/lib/postgresql/data by default) and Redis (/data).

The Redis database may or may not be used by Sidekiq for managing its queue. We can run two distinct Redis sessions (see section on remote services at the end).

  • the "sidekiq" service is initialised in the file "/config/initializers/sidekiq.rb" where we provide the link to a Redis session via ENV['SIDEKIQ_REDIS'].

  • the "app" service initialises the Redis in-memory database in an ad hoc "/config/redis.yml" file where we pass ENV['REDIS_URL'].
    The Redis server itself is configured with a "redis.conf" file that holds the password requirepass secretpwd (circa line 500).

Since "sidekiq" and "app" are built with the same image, the app code holds the environment variables REDIS_URL and SIDEKIQ_REDIS in the ".env" file. This ".env" file is passed to the "app", "sidekiq" and "redis" services.

  • migration. Once the Postgres database has been initialised with a "user|password", the "app" service has an entry point file "manage-db.sh" that performs the idempotent command rake db:prepare to create and migrate the database. Again, we need to pass the same credentials with the environment variable POSTGRES_PASSWORD=<password> in the docker-compose.yml file and save them in the ".env" file in the code (see note at the end).

The "docker-compose" production file

version: "3"

x-app: &common
  env_file: .env
  build:
    context: .
    dockerfile: alpine.prod.Dockerfile
    args:
      - RUBY_VERSION=3.0.1-alpine
      - BUNDLER_VERSION=2.2.21
      - NODE_ENV=production
      - RAILS_ENV=production

services:
  pg:
    image: postgres:13.3-alpine
    ports:
      - 5432
    environment:
      - POSTGRES_PASSWORD=dockerpassword
    volumes:
      - pg_data:/var/lib/postgresql/data
      - ./pg/init-user.sql:/docker-entrypoint-initdb.d/init-user.sql:ro
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s

  redisdb:
    build:
      context: ./redis
    ports:
      - 6379
    env_file: .env
    command: ["redis-server", "/usr/local/etc/redis.conf"]
    volumes:
      - redis_data:/data
      - ./redis/config:/usr/local/etc/redis:ro
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 10s
      retries: 3

  sidekiq:
    <<: *common
    entrypoint: ["./app-pid.sh"]
    command: ["bundle", "exec", "sidekiq", "-C", "config/sidekiq.yml"]
    depends_on:
      redisdb:
        condition: service_healthy


  app:
    <<: *common
    depends_on:
      pg:
        condition: service_healthy
      redisdb:
        condition: service_healthy
    ports:
      - 4000:3000
    entrypoint: ["./manage-db.sh"]
    command: ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]
    tmpfs:
      - /tmp

volumes:
  redis_data: {}
  pg_data: {}

We run:

docker-compose up --build

If we don't use the "manage-db.sh" entrypoint (for example when using a Kubernetes job instead), we would need to run the following to set up the database and get the app up and running.

docker-compose run --rm app bundle exec rails db:prepare

We added the package netcat in the Dockerfile since we use nc for health testing. This is for testing purposes and should be removed when using a managed service.

The "development" "docker-compose.yml" file

For the development mode, we define five services:

  • the Rails app in dev mode for hot reload,
  • the Sidekiq background process,
  • the Webpack static assets manager, run with webpack-dev-server for hot reload,
  • the Redis database,
  • the Postgres database.

We use bind mounts for hot reload and faster loading.

version: "3"

x-app: &common
  env_file: .env
  build:
    context: .
    args:
      - RUBY_VERSION=3.0.1-alpine
      - BUNDLER_VERSION=2.2.21
      #- NODE_VERSION=14 <- for "slim-buster" based image
      - NODE_ENV=development
      - RAILS_ENV=development

services:
  pg:
    image: postgres:13.3-alpine
    ports:
      - 5432
    environment:
      - POSTGRES_PASSWORD=cyberdyne
      - PGDATA=/var/lib/postgresql/data
    volumes:
      - pg_data:/var/lib/postgresql/data
      - ./pg/init-user.sql:/docker-entrypoint-initdb.d/init-user.sql:ro
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  redisdb:
    build:
      context: ./redis
    ports:
      - 6379
    env_file: .env
    command: ["redis-server", "/usr/local/etc/redis.conf"]
    volumes:
      - redis_data:/data
      - ./redis/config:/usr/local/etc/redis:ro
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 10s
      retries: 3

  webpacker:
    <<: *common
    command: ["./bin/webpack-dev-server"]
    ports:
      - "3035:3035"
    volumes:
      - .:/app:cached
      - packs:/app/public/packs
    environment:
      - WEBPACKER_DEV_SERVER_HOST=0.0.0.0

  sidekiq:
    <<: *common
    entrypoint: ["./app-pid.sh"]
    command: ["bundle", "exec", "sidekiq", "-C", "config/sidekiq.yml"]
    depends_on:
      redisdb:
        condition: service_healthy

  app:
    <<: *common
    depends_on:
      pg:
        condition: service_healthy
      redisdb:
        condition: service_healthy
    environment:
      YARN_CACHE_FOLDER: /app/node_modules/.yarn-cache
      WEBPACKER_DEV_SERVER_HOST: webpacker
    ports:
      - 4000:3000
    entrypoint: ["./manage-db.sh"]
    command: ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]
    volumes:
      - bundle_cache:/usr/local/bundle
      - node_modules:/app/node_modules
      - .:/app:cached
      - packs:/app/public/packs
      - rails_cache:/app/tmp/cache
    tmpfs:
      - /tmp

volumes:
  redis_data: {}
  pg_data: {}
  node_modules:
  packs:
  bundle_cache:
  rails_cache:

Misc.

The "slim-buster" Dockerfile version

ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION} AS builder

ARG BUNDLER_VERSION \
   NODE_VERSION \
   NODE_ENV \
   RAILS_ENV

ENV RAILS_ENV=${RAILS_ENV} \
   NODE_ENV=${NODE_ENV} \
   BUNDLER_VERSION=${BUNDLER_VERSION} \
   DEBIAN_FRONTEND=noninteractive

RUN apt-get update \
   && apt-get install -y --no-install-recommends \
   # for gems to be compiled
   build-essential \
   # to get desired node version
   curl \
   && apt-get clean \
   && rm -rf /var/cache/apt/archives/* \
   && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
   && truncate -s 0 /var/log/*log

RUN curl -sL https://deb.nodesource.com/setup_${NODE_VERSION}.x | bash - \
   && curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
   && echo 'deb https://dl.yarnpkg.com/debian/ stable main' > /etc/apt/sources.list.d/yarn.list

RUN apt-get update \
   && apt-get install -y --no-install-recommends \
   # comm with PG with gem 'pg'
   libpq-dev \
   # compile assets
   nodejs \
   yarn \
   && apt-get clean \
   && rm -rf /var/cache/apt/archives/* \
   && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
   && truncate -s 0 /var/log/*log

WORKDIR /app

COPY Gemfile Gemfile.lock package.json yarn.lock ./

ENV LANG=C.UTF-8 \
   BUNDLE_JOBS=4 \
   BUNDLE_RETRY=3 \
   BUNDLE_PATH='vendor/bundle'

RUN gem install bundler:${BUNDLER_VERSION} --no-document \
   && bundle config set without 'development test' \
   && bundle install --quiet \
   && rm -rf /usr/local/bundle/cache/*.gem \
   && find /usr/local/bundle/gems/ -name "*.c" -delete \
   && find /usr/local/bundle/gems/ -name "*.o" -delete

RUN yarn --check-files --silent

COPY . ./

RUN bundle exec rails webpacker:compile
# && rm -rf node_modules tmp/cache app/assets vendor/assets lib/assets spec

###########################################
ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION}

ARG NODE_ENV
ARG RAILS_ENV

RUN apt-get  update  \
   && apt-get install -y --no-install-recommends \
   # detect when services inside containers are up and running
   netcat-openbsd \
   # communicate with PG with gem 'pg'
   libpq-dev \
   && apt-get clean \
   && rm -rf /var/cache/apt/archives/* \
   && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
   && truncate -s 0 /var/log/*log


#<- if we had not set BUNDLE_PATH='vendor/bundle', we would copy the global bundle folder from the builder stage
# COPY --from=builder /usr/local/bundle/ /usr/local/bundle/

RUN adduser --disabled-password app-user
USER app-user

COPY --from=builder  --chown=app-user /app /app

ENTRYPOINT ["./app-pid.sh"]

ENV RAILS_ENV=$RAILS_ENV \
   NODE_ENV=$NODE_ENV \
   RAILS_LOG_TO_STDOUT=true \
   RAILS_SERVE_STATIC_FILES=true \
   BUNDLE_PATH='vendor/bundle'

WORKDIR /app
RUN rm -rf node_modules tmp/cache  lib/assets

EXPOSE 3000

Remote services

If you want to rely on a remote Postgres or Redis service, you need to configure your account and pass the provided URL to the app.

For example, with the remote (free) service ElephantSQL, pass the supplied ENV['ELEPHANT_URL'] into "/config/database.yml" and set the supplied POSTGRES_PASSWORD=xxxx variable in the "docker-compose.yml" file.

We can also use:

  • a local "containerised" Redis server for Sidekiq's queue. Set SIDEKIQ_REDIS=redis://user:password@redisdb:6379 for use in the file "/config/initializers/sidekiq.rb"

  • a remote Redis database for the app. We used a (free) service supplied by Redis Labs. Set url: <%= ENV.fetch('REDIS_URL','') %> with the URL supplied by Redis Labs in the file "/config/redis.yml". This can replace the Redis service.

".env" file

An example of the environment variables used:

#.env
export POSTGRES_URL=postgresql://docker:dockerpassword@pg:5432
export ELEPHANT_URL=postgres://ortkcbqt:fhSBQrF3Dzl9WWA1FfRIjQmU7u3pBtTd@batyr.db.elephantsql.com/ortkcbqt

export SIDEKIQ_REDIS=redis://user:secretpwd@redisdb:6379
export REDIS_URL=redis://user:tq4hBlYvIvq0uU7hYMOYS6ErQKsSA2N8@redis-13424.c258.us-east-1-4.ec2.cloud.redislabs.com:13424

export RAILS_MASTER_KEY=cce3c51968fc41dd85b3d8b5d54f43eb

export RAILS_SERVE_STATIC_FILES=true
export RAILS_LOG_TO_STDOUT=true

Postgres USER initialiser

This code is run only when the "data" directory is empty, to declare a <user> with a <password> (the "<" and ">" signs are just here for emphasis).

#init-user.sql
DO $$
BEGIN
  CREATE ROLE <docker> WITH SUPERUSER CREATEDB LOGIN PASSWORD <'dockerpassword'>;
  EXCEPTION WHEN DUPLICATE_OBJECT THEN
  RAISE NOTICE 'not creating role my_role -- it already exists';
END
$$;

The environment variable POSTGRES_URL=postgresql://<user>:<password>@pg:5432 passed to "database.yml" in the code must match the "<user>|<password>" used in the "init-user.sql" file, and we must pass POSTGRES_PASSWORD=<password> in the "docker-compose.yml" file.
If you compose with non-matching credentials, you then need to run docker volume prune and rebuild.
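
A mismatch of this kind can be caught before composing with a small check. The sketch below is only an illustration: credentials_match? is a hypothetical helper, and the values are the example ones from this article.

```ruby
require "uri"

# Hypothetical consistency check: do the credentials embedded in
# POSTGRES_URL match the role created by the init script?
def credentials_match?(url, user:, password:)
  uri = URI.parse(url)
  uri.user == user && uri.password == password
end

url = "postgresql://docker:dockerpassword@pg:5432"
puts credentials_match?(url, user: "docker", password: "dockerpassword")
```

It only compares strings; it does not connect to the database.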

Entrypoints

Note: the rake task db:prepare (also run as rails db:prepare) is idempotent.

#manage-db.sh <-- "app" service

#!/bin/sh
set -e
if [ -f tmp/pids/server.pid ]; then
  rm tmp/pids/server.pid
fi
echo "Waiting for Postgres to start..."
while ! nc -z pg 5432; do sleep 0.2; done
echo "Postgres is up"

bundle exec rake db:prepare
exec "$@"
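
For readers who prefer not to install netcat, the same "wait until the port answers" loop can be sketched in Ruby with the standard socket library. This is an alternative illustration under stated assumptions, not what the entry point above does; port_open? and wait_for are hypothetical names, and the attempt count is arbitrary.

```ruby
require "socket"

# Returns true if a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

# Poll until the port answers or the attempts run out.
def wait_for(host, port, attempts: 50, delay: 0.2)
  attempts.times do
    return true if port_open?(host, port)
    sleep delay
  end
  false
end

# Possible usage inside the container (the "pg" host name comes from the compose file):
#   abort("Postgres did not start") unless wait_for("pg", 5432)
```

Unlike the shell loop, this version gives up after a bounded number of attempts instead of waiting forever.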

The following script is the entry point of the "sidekiq" service. It "cleans" the "pid" file.

#app-pid.sh <-- "sidekiq" service

#!/bin/sh
set -e
if [ -f tmp/pids/server.pid ]; then
  rm tmp/pids/server.pid
fi
exec "$@"

Redis database initialiser

We create a file "/config/initializers/redis.rb" so that Rails will load it on startup and instantiate a Redis session.

#config/initializers/redis.rb
REDIS = Redis.new(Rails.application.config_for(:redis))

where config_for will parse and fetch the "/config/redis.yml" config so we can use for example REDIS.set("key", 10) in the app.
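
To make the role of config_for more concrete, here is a rough stand-alone approximation using only the standard library: render the ERB tags, parse the YAML, and keep the section of one environment. The YAML layout, the development URL and the load_config name are assumptions for this sketch; the real "redis.yml" of the app may differ.

```ruby
require "erb"
require "yaml"

# Assumed layout of the config file, one section per environment.
REDIS_YML = <<~YML
  production:
    url: <%= ENV.fetch('REDIS_URL', '') %>
  development:
    url: redis://localhost:6379
YML

# Hypothetical stand-in for config_for: ERB first, then YAML, then pick
# the section of the requested environment with symbol keys.
def load_config(text, env_name)
  rendered = ERB.new(text).result
  YAML.safe_load(rendered, aliases: true).fetch(env_name).transform_keys(&:to_sym)
end

# In the app, the resulting hash is what ends up being passed to Redis.new.
p load_config(REDIS_YML, "development")
```

The ERB step is what lets the same file pick up REDIS_URL from the ".env" file at boot time.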

We can test the protection of the Redis database with the command below once the project is up and running:

> docker-compose run --rm redisdb redis-cli -h redisdb -p 6379

The output should be:

redisdb:6379> get compteur
(error) NOAUTH Authentication required.
redisdb:6379> auth secretpwd
OK
redisdb:6379> get compteur
"3"

The Redis persistence is configured in the "redis.conf" file:

  • RDB mode (periodic snapshots): dbfilename "dump.rdb" (line 327)
  • AOF mode: every second, with appendonly yes (line 699), appendfsync everysec (line 729) and appendfilename appendonly.aof (line 703)

Hope this helps!