Make slim Rails Docker images
With deployment on Kubernetes in view, the production images should be as small as possible. Documentation on how to produce slim Rails production images is not easy to find, so we present a solution here, as well as a development image with hot-reload.
A development-mode Debian-based image ("slim-buster", using the apt package manager) can weigh around 1000 MB. To get down to a slim image of approximately 200 MB, you may:
- use a two-stage Dockerfile,
- use a Linux Alpine base (with the apk package manager),
- use only Webpack and remove Sprockets (which also shortens the build time considerably),
- bundle the production gems locally (as opposed to the "global" host gems) into the folder 'vendor/bundle',
- make the container stateless by removing the logs (redirecting them to STDOUT) and using a Redis cache store.
We also present the development Dockerfile and show how to use these images to run a Rails monolith app with docker-compose, in a micro-services fashion.
We have a two-stage build process starting from a ruby:alpine image.
In the first stage, size is not of utmost importance: its main task is to compile the gems and the static assets, so we install the required build tools there.
We want Bundler to install the gems the app needs (as listed in "Gemfile.lock") into the local subfolder "vendor/bundle" of the app's code, so that the final image only carries those gems. We thus set the environment variable:
BUNDLE_PATH='vendor/bundle'
This variable also tells Bundler where the gems are located, so it is repeated in the second stage.
We want to bundle only the production gems:
bundle config set --without 'development test'
We want to use a Webpack-only version of Rails, without Sprockets:
rails new <app-name> --skip-sprockets ...
In particular, this removes the sass-rails gem, which, besides saving space, considerably reduces the build time.
In this stage, we want Webpack to compile and minimise the static assets into the "/public" folder.
RUN bundle exec rails webpacker:compile
We can chain all RUN commands together, since every new layer increases the final size: the fewer layers, the better.
We also want to redirect the logs from the container to STDOUT and STDERR. They are not persisted in the container, but can be picked up by a log collector. Just declare the logger in the file "/config/application.rb" and set:
RAILS_LOG_TO_STDOUT=true
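A minimal sketch of that logger declaration (this is the standard Rails STDOUT logger wiring; place it inside the Application class):
#config/application.rb
# switch the logger to STDOUT when RAILS_LOG_TO_STDOUT is set
if ENV["RAILS_LOG_TO_STDOUT"].present?
  logger           = ActiveSupport::Logger.new($stdout)
  logger.formatter = config.log_formatter
  config.logger    = ActiveSupport::TaggedLogging.new(logger)
end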
Remark: in production mode, if we use caching, we may take advantage of the built-in Redis cache store. Redis is set up as an LRU cache with, say, maxmemory 100mb (set in "redis.conf", circa line 566) and the eviction policy maxmemory-policy allkeys-lru. We also load the gem hiredis, a C client library designed for speed, and configure:
#config/environments/production.rb
config.cache_store = :redis_cache_store, { driver: :hiredis, url: ENV['REDIS_CACHE'] }
where ENV['REDIS_CACHE'] references the URL of the dedicated Redis cache server (different from the one Sidekiq uses).
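For reference, the corresponding "redis.conf" settings mentioned above would look like:
#redis.conf (cache server)
maxmemory 100mb
maxmemory-policy allkeys-lru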
There are different strategies for serving the static assets: use a CDN or a reverse proxy (e.g. Nginx), or serve them with the configured app server, Puma.
Since the app will most probably run on Kubernetes, behind an Ingress controller, we will most probably serve the static assets from a CDN on a separate port, and not integrate a stand-alone Nginx layer. By default Rails does not serve static files in production mode, so until we migrate to Kubernetes we need the app server (Puma here) to serve them. This can be adjusted with the build-time argument RAILS_SERVE_STATIC_FILES.
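On the Rails side, this is the standard public file server switch, typically already present in the generated production config:
#config/environments/production.rb (default Rails wiring, shown for reference)
config.public_file_server.enabled = ENV["RAILS_SERVE_STATIC_FILES"].present?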
The second build stage first creates a user to drop root privileges.
It then simply copies the app code, the compiled static files and the bundled gems from the builder stage.
Finally, we set the needed ENV vars from the ARG values.
The tzdata package is needed because the Alpine base image ships without timezone data, which the tzinfo gem relies on (the tzinfo-data gem only covers platforms such as Windows). Since we use Postgres, we also need the libpq package so Rails can communicate with Postgres.
#alpine.prod.Dockerfile
ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION} AS builder
ARG BUNDLER_VERSION
ARG NODE_ENV
ARG RAILS_ENV
ARG RAILS_LOG_TO_STDOUT
RUN apk -U upgrade && apk add --no-cache \
postgresql-dev nodejs yarn build-base
WORKDIR /app
COPY Gemfile Gemfile.lock package.json yarn.lock ./
ENV LANG=C.UTF-8 \
BUNDLE_JOBS=4 \
BUNDLE_RETRY=3 \
BUNDLE_PATH='vendor/bundle'
RUN gem install bundler:${BUNDLER_VERSION} --no-document \
&& bundle config set --without 'development test' \
&& bundle install --quiet \
&& rm -rf $GEM_HOME/cache/* \
&& yarn --check-files --silent --production && yarn cache clean
COPY . ./
RUN bundle exec rails webpacker:compile assets:clean
############################################################
FROM ruby:${RUBY_VERSION}
ARG RAILS_ENV
ARG RAILS_LOG_TO_STDOUT
RUN apk -U upgrade && apk add --no-cache libpq netcat-openbsd tzdata \
&& rm -rf /var/cache/apk/* \
&& adduser -D app-user
# -D: don't assign a password, so the user cannot log in
USER app-user
COPY --from=builder --chown=app-user /app /app
ENV RAILS_ENV=$RAILS_ENV \
BUNDLE_PATH='vendor/bundle' \
RAILS_LOG_TO_STDOUT=$RAILS_LOG_TO_STDOUT
WORKDIR /app
RUN rm -rf node_modules
To build the image, tag it and push it to the Docker registry, we would run:
docker build -t usr/appname --build-arg RUBY_VERSION=3.0.1-alpine --build-arg NODE_ENV=production --build-arg RAILS_ENV=production --build-arg BUNDLER_VERSION=2.2.21 --build-arg RAILS_SERVE_STATIC_FILES=true -f alpine.prod.Dockerfile .
docker push usr/appname
We use "args" to pass variables at build time, with:
docker build --build-arg RAILS_ENV=production ...
or from the "docker-compose". They are just declared in the Dockerfile with ARG RAILS_ENV
(only supports one arg).
Whenever we need to use some "env" variables for the code, such as RAILS_ENV=production
that are already args, then we can pass the "arg" value to the "env" value with:
ENV RAILS_ENV=$RAILS_ENV
If we do further ${RAILS_ENV:-production}
, then this supplies a default value.
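Put together, the pattern looks like this (a generic sketch, not a specific file from the project):
#(sketch) passing a build arg into a runtime env var, with a default
ARG RAILS_ENV
ENV RAILS_ENV=${RAILS_ENV:-production}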
For development, we do not need a multi-stage build since hot-reload is our priority, not size. We use a separate Webpack service run with webpack-dev-server and bind mounts of the code to pick up changes quickly. Even though size is not the priority, using Alpine and Webpack only still makes compilation faster. This setup is largely inspired by these guys and this guy, but somewhat simplified.
At the time of writing, Alpine ships Node.js LTS 14, Postgres 13 and Yarn 1.22 packages by default, which is largely acceptable.
ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION} AS builder
ARG BUNDLER_VERSION
ARG NODE_ENV
ARG RAILS_ENV
ENV RAILS_ENV=${RAILS_ENV:-development} \
NODE_ENV=${NODE_ENV:-development} \
BUNDLER_VERSION=${BUNDLER_VERSION:-2.2.21}
RUN apk update && apk add --no-cache \
build-base postgresql-dev nodejs yarn \
tzdata netcat-openbsd \
&& rm -rf /var/cache/apk/*
WORKDIR /app
COPY Gemfile Gemfile.lock package.json yarn.lock ./
ENV LANG=C.UTF-8 \
BUNDLE_JOBS=4 \
BUNDLE_RETRY=3
# BUNDLE_PATH='vendor/bundle'
# <- to bundle only the gems needed from Gemfile into local folder /vendor/bundle
RUN gem install bundler:${BUNDLER_VERSION} --no-document \
&& bundle install --quiet \
&& rm -rf /usr/local/bundle/cache/*.gem \
&& find /usr/local/bundle/gems/ -name "*.c" -delete \
&& find /usr/local/bundle/gems/ -name "*.o" -delete
RUN yarn --check-files --silent && yarn cache clean
COPY . ./
We built a toy Webpack-only Rails monolith app. On a button click, this simple app increments a counter whose value is saved to a Postgres database and to a Redis database, and in parallel triggers asynchronous background jobs with Sidekiq/Redis.
To see how this works, we start by running it "by hand" locally, without containers.
You need to run a Postgres service and (at least) one Redis service. You then launch a Sidekiq process, the Rails server, and finally run Webpack in dev mode. We used the process manager Overmind. Once the Postgres service is up, run overmind start with the following "Procfile":
#Procfile
assets: ./bin/webpack-dev-server
web: bundle exec rails server
redis-server: redis-server redis/redis.conf
worker: bundle exec sidekiq -C config/sidekiq.yml
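The Sidekiq options passed with -C live in "config/sidekiq.yml"; a minimal sketch (the concurrency and queue names below are illustrative, not taken from the original app) could be:
#config/sidekiq.yml
:concurrency: 5
:queues:
  - default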
To run the app in containers, you will need to build the images and run the containers with "docker-compose" so you get the containers connected to a local network. The "docker-compose.yml" file holds the state of our deployment. It is a validation step before deploying with Kubernetes.
You will launch a Sidekiq service linked to a Redis server, a Postgres database service, another Redis database, and the Rails app served by the app server Puma, which also serves the compiled static files as per our configuration.
In real life, the two Redis services will most probably be managed solutions (a remote Redis cluster), so you may not need to create a Redis service at all. Here, we maintain a custom Redis database. You can test a managed service by commenting out the Redis service and its dependencies in the "docker-compose.yml" file and passing the remote URL of the managed service via ENV['REDIS_URL'] in the code, for both Sidekiq and Redis (see the note at the end). We used for example a free tier from Redislabs.
The same remark applies to the Postgres service. We will most probably use a managed service with replicas rather than run our own service on bare metal, so you can comment out this service as well and its dependencies (such as the depends_on: keys) and use the remote URL. For example, we can use a free-tier ElephantSQL service and set ENV['ELEPHANT_URL'] to point to it.
The four images are:
- the Rails image is based on the Dockerfile. It is launched with bundle exec rails server with an open port,
- the background job processor Sidekiq boots Rails, so it uses the same image as Rails. It is launched with bundle exec sidekiq with a config file to link to the Redis session used,
- the custom Redis database uses the official Redis image, launched with redis-server and a config file (password, persistence...),
- the Postgres database uses the official Postgres image with an initialisation script ("init-user.sql").
Therefore, with managed services, this app deployed on Kubernetes would only need one image, shared by the Rails app and Sidekiq.
In the compose file, we use two bind mounts:
- an SQL initialiser for Postgres
- a Redis config file
The Postgres initialiser "init-user.sql" creates a user with a password when the data directory is empty. This is done by placing the file in the "/docker-entrypoint-initdb.d" folder (see "Initialization scripts" in the official Postgres image documentation). It runs whenever the "data" folder is empty, and the script is made idempotent (see the code at the end). These credentials are used as environment variables to configure Postgres in Rails and in the "docker-compose.yml" file.
The two other named volumes hold the main data of the Postgres database (PGDATA=/var/lib/postgresql/data) and of the Redis database (/data).
The Redis database may or may not be used by Sidekiq to manage its queue; we can also run two distinct Redis instances (see the section on remote services at the end).
- The "sidekiq" service is configured in the file "/config/initializers/sidekiq.rb", where we provide the link to a Redis instance via ENV['SIDEKIQ_REDIS'] (see the sketch below).
- The "app" service initialises the Redis in-memory database in an ad-hoc "/config/redis.yml" file, where we pass ENV['REDIS_URL'].
- The "redisdb" service is initialised with a "redis.conf" file that holds the password: requirepass secretpwd (circa line 500).
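A minimal sketch of that Sidekiq initializer, assuming the ENV['SIDEKIQ_REDIS'] variable described above:
#config/initializers/sidekiq.rb
# point both the Sidekiq server and client at the dedicated Redis instance
Sidekiq.configure_server do |config|
  config.redis = { url: ENV.fetch('SIDEKIQ_REDIS') }
end
Sidekiq.configure_client do |config|
  config.redis = { url: ENV.fetch('SIDEKIQ_REDIS') }
end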
Since "sidekiq" and "app" are built with the same image, the app code holds the environment variables
REDIS_URL
andSIDEKIQ_REDIS
in the ".env" file. This ".env" file is passed to the "app", "sidekiq" and "redis" services.
- migration. Once the Postgres database has been initialised with a "user|password", the "app" service has an entry point file "manage-db.sh" that performs the idempotent command
rake db:prepare
to create and migrate the database. Again, we need to pass the same credentials with the environment variablePOSTGRES_PASSWORD=<password>
in the docker-compose.yml file and save them in the ".env" file in the code (see note at the end).
version: "3"
x-app: &common
env_file: .env
build:
context: .
dockerfile: alpine.prod.Dockerfile
args:
- RUBY_VERSION=3.0.1-alpine
- BUNDLER_VERSION=2.2.21
- NODE_ENV=production
- RAILS_ENV=production
services:
pg:
image: postgres:13.3-alpine
ports:
- 5432
environment:
- POSTGRES_PASSWORD=dockerpassword
volumes:
- pg_data:/var/lib/postgresql/data
- ./pg/init-user.sql:/docker-entrypoint-initdb.d/init-user.sql:ro
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
timeout: 5s
redisdb:
build:
context: ./redis
ports:
- 6379
env_file: .env
command: ["redis-server", "/usr/local/etc/redis.conf"]
volumes:
- redis_data:/data
- ./redis/config:/usr/local/etc/redis:ro
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 30s
timeout: 10s
retries: 3
sidekiq:
<<: *common
entrypoint: ["./app-pid.sh"]
command: ["bundle", "exec", "sidekiq", "-C", "config/sidekiq.yml"]
depends_on:
redisdb:
condition: service_health
app:
<<: *common
depends_on:
pg:
condition: service_healthy
redisdb:
condition: service_healthy
ports:
- 4000:3000
entrypoint: [./manage-db.sh]
command: ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]
tmpfs:
- /tmp
volumes:
redis_data: {}
pg_data: {}
We run:
docker-compose up --build
If we don't use the "manage-db.sh" entrypoint (for example when a Kubernetes job handles the database setup), we would need to run the following to set up the database and get the app up and running.
docker-compose run --rm app bundle exec rails db:prepare
We added the netcat package in the Dockerfile since we use nc for health testing. This is for testing purposes and should be removed when using a managed service.
For the development mode, we define five processes:
- the Rails app in dev mode, for hot-reload,
- the Sidekiq background process,
- the Webpack static assets manager run with webpack-dev-server, for hot-reload,
- the Redis database,
- the Postgres database.
We use bind mounts for hot-reload and cached volumes for faster loading.
version: "3"
x-app: &common
env_file: .env
build:
context: .
args:
- RUBY_VERSION=3.0.1-alpine
- BUNDLER_VERSION=2.2.21
#- NODE_VERSION=14 <- for "slim-buster" based image
- NODE_ENV=development
- RAILS_ENV=development
services:
pg:
image: postgres:13.3-alpine
ports:
- 5432
environment:
- POSTGRES_PASSWORD=cyberdyne
- PG_DATA=/var/lib/postgresql/data
volumes:
- pg_data:/var/lib/postgresql/data
- ./pg/init-user.sql:/docker-entrypoint-initdb.d/init-user.sql:ro
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
timeout: 5s
retries: 5
redisdb:
build:
context: ./redis
ports:
- 6379
env_file: .env
command: ["redis-server", "/usr/local/etc/redis.conf"]
volumes:
- redis_data:/data
- ./redis/config:/usr/local/etc/redis:ro
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 30s
timeout: 10s
retries: 3
webpacker:
<<: *common
command: ["./bin/webpack-dev-server"]
ports:
- "3035:3035"
volumes:
- .:/app:cached
- packs:/app/public/packs
environment:
- WEBPACKER_DEV_SERVER_HOST=0.0.0.0
sidekiq:
<<: *common
entrypoint: ["./app-pid.sh"]
command: ["bundle", "exec", "sidekiq", "-C", "config/sidekiq.yml"]
depends_on:
redisdb:
condition: service_healthy
app:
<<: *common
depends_on:
pg:
condition: service_healthy
redisdb:
condition: service_healthy
environment:
YARN_CACHE_FOLDER: /app/node_modules/.yarn-cache
WEBPACKER_DEV_SERVER_HOST: webpacker
ports:
- 4000:3000
entrypoint: ["./manage-db.sh"]
command: ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]
volumes:
- bundle_cache:/usr/local/bundle
- node_modules:/app/node_modules
- .:/app:cached
- packs:/app/public/packs
- rails_cache:/app/tmp/cache
tmpfs:
- /tmp
volumes:
redis_data: {}
pg_data: {}
node_modules:
packs:
bundle_cache:
rails_cache:
For reference, here is the equivalent two-stage production Dockerfile for a Debian "slim-buster" based image (using apt and a pinned Node version):
ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION} AS builder
ARG BUNDLER_VERSION \
NODE_VERSION \
NODE_ENV \
RAILS_ENV
ENV RAILS_ENV=${RAILS_ENV} \
NODE_ENV=${NODE_ENV} \
BUNDLER_VERSION=${BUNDLER_VERSION} \
DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
# for gems to be compiled
build-essential \
# to get desired node version
curl \
&& apt-get clean \
&& rm -rf /var/cache/apt/archives/* \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& truncate -s 0 /var/log/*log
RUN curl -sL https://deb.nodesource.com/setup_${NODE_VERSION}.x | bash - \
&& curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
&& echo 'deb https://dl.yarnpkg.com/debian/ stable main' > /etc/apt/sources.list.d/yarn.list
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
# comm with PG with gem 'pg'
libpq-dev \
# compile assets
nodejs \
yarn \
&& apt-get clean \
&& rm -rf /var/cache/apt/archives/* \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& truncate -s 0 /var/log/*log
WORKDIR /app
COPY Gemfile Gemfile.lock package.json yarn.lock ./
ENV LANG=C.UTF-8 \
BUNDLE_JOBS=4 \
BUNDLE_RETRY=3 \
BUNDLE_PATH='vendor/bundle'
RUN gem install bundler:${BUNDLER_VERSION} --no-document \
&& bundle config set --without 'development test' \
&& bundle install --quiet \
&& rm -rf /usr/local/bundle/cache/*.gem \
&& find /usr/local/bundle/gems/ -name "*.c" -delete \
&& find /usr/local/bundle/gems/ -name "*.o" -delete
RUN yarn --check-files --silent
COPY . ./
RUN bundle exec rails webpacker:compile
# && rm -rf node_modules tmp/cache app/assets vendor/assets lib/assets spec
###########################################
ARG RUBY_VERSION
FROM ruby:${RUBY_VERSION}
ARG NODE_ENV
ARG RAILS_ENV
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
# detect when services inside containers are up and running
netcat-openbsd \
# communicate with PG with gem 'pg'
libpq-dev \
&& apt-get clean \
&& rm -rf /var/cache/apt/archives/* \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& truncate -s 0 /var/log/*log
# <- if we had not set BUNDLE_PATH='vendor/bundle', we would copy the global bundle folder:
# COPY --from=builder /usr/local/bundle/ /usr/local/bundle/
RUN adduser --disabled-password app-user
USER app-user
COPY --from=builder --chown=app-user /app /app
ENTRYPOINT ["./app-pid.sh"]
ENV RAILS_ENV=$RAILS_ENV \
NODE_ENV=$NODE_ENV \
RAILS_LOG_TO_STDOUT=true \
RAILS_SERVE_STATIC_FILES=true \
BUNDLE_PATH='vendor/bundle'
WORKDIR /app
RUN rm -rf node_modules tmp/cache lib/assets
EXPOSE 3000
If you want to rely on a remote Postgres or Redis service, you need to configure your account and pass the provided URL to the app.
For example, with the remote (free) service ElephantSQL, pass the supplied ENV['ELEPHANT_URL'] into "/config/database.yml" and set the supplied POSTGRES_PASSWORD=xxxx variable in the "docker-compose.yml" file.
We can also use:
- a local "containerised" Redis server for Sidekiq's queue: set SIDEKIQ_REDIS=redis://user:password@redisdb:6379 in the file "/config/initializers/sidekiq.rb",
- a remote Redis database for the app: we used a (free) service supplied by Redis Labs. Set url: <%= ENV.fetch('REDIS_URL','') %> with the supplied URL from Redislabs in the file "/config/redis.yml". This can replace the Redis service.
An example of the environment variables used:
#.env
export POSTGRES_URL=postgresql://docker:dockerpassword@pg:5432
export ELEPHANT_URL=postgres://ortkcbqt:fhSBQrF3Dzl9WWA1FfRIjQmU7u3pBtTd@batyr.db.elephantsql.com/ortkcbqt
export SIDEKIQ_REDIS=redis://user:secretpwd@redisdb:6379
export REDIS_URL=redis://user:tq4hBlYvIvq0uU7hYMOYS6ErQKsSA2N8@redis-13424.c258.us-east-1-4.ec2.cloud.redislabs.com:13424
export RAILS_MASTER_KEY=cce3c51968fc41dd85b3d8b5d54f43eb
export RAILS_SERVE_STATIC_FILES=true
export RAILS_LOG_TO_STDOUT=true
This code is run whenever the "data" directory is empty, to declare a <user> with a <password> (the "<" and ">" signs are only here for emphasis).
#init-user.sql
DO $$
BEGIN
CREATE ROLE <docker> WITH SUPERUSER CREATEDB LOGIN PASSWORD <'dockerpassword'>;
EXCEPTION WHEN DUPLICATE_OBJECT THEN
RAISE NOTICE 'not creating role docker -- it already exists';
END
$$;
The environment variable POSTGRES_URL=postgresql://<user>:<password>@pg:5432 passed to "database.yml" in the code must match the "user|password" pair used in the "init-user.sql" file, and we must pass POSTGRES_PASSWORD=<password> in the "docker-compose.yml" file.
If you compose with non-matching credentials, you need to run docker volume prune and rebuild.
Note: the rails db:prepare task is idempotent.
#!/bin/sh
#manage-db.sh <-- entrypoint of the "app" service
set -e
# remove a stale server pid left over from a previous run
if [ -f tmp/pids/server.pid ]; then
  rm tmp/pids/server.pid
fi
echo "Waiting for Postgres to start..."
while ! nc -z pg 5432; do sleep 0.2; done
echo "Postgres is up"
bundle exec rake db:prepare
exec "$@"
This code is the entry point of the "sidekiq" service. It "cleans" the "pid" file.
#!/bin/sh
#app-pid.sh <-- entrypoint of the "sidekiq" service
set -e
# remove a stale server pid left over from a previous run
if [ -f tmp/pids/server.pid ]; then
  rm tmp/pids/server.pid
fi
exec "$@"
We create a file "/config/initializers/redis.rb" so that Rails loads it on startup and instantiates a Redis session.
#config/initializers/redis.rb
REDIS = Redis.new(Rails.application.config_for(:redis))
where config_for parses the "/config/redis.yml" config file, so we can use for example REDIS.set("key", 10) in the app.
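A sketch of that "/config/redis.yml" file (one entry per environment, read by config_for(:redis); the development default is an assumption):
#config/redis.yml
development:
  url: <%= ENV.fetch('REDIS_URL', 'redis://localhost:6379') %>
production:
  url: <%= ENV.fetch('REDIS_URL', '') %>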
We can test the password protection of the Redis database with the command below once the project is up and running:
> docker-compose run --rm redisdb redis-cli -h redisdb -p 6379
The output should be:
redisdb:6379> get compteur
(error) NOAUTH Authentication required.
redisdb:6379> auth secretpwd
OK
redisdb:6379> get compteur
"3"
The Redis persistence is configured in the "redis.conf" file:
- RDB mode (periodic snapshots): dbfilename "dump.rdb" (line 327),
- AOF mode (append every second): appendonly yes (line 699), appendfsync everysec (line 729) and appendfilename appendonly.aof (line 703).
Hope this helps!