Converting PDFs to/from images in a Ruby on Rails application

There are quite a few options if you need to convert images to/from PDFs in Ruby (on Rails). I am going to detail two of the most common options I came across in my projects: using ImageMagick and using an external API (ConvertAPI in my case).

ImageMagick

Once installed, you can manipulate images with the command-line interface, e.g. converting an image to a PDF of a certain size and DPI (here 2479x3508 pixels at 300 DPI, which is roughly A4):

`convert -density 300 -units 'PixelsPerInch' image.jpg -extent 2479x3508 -format pdf result.pdf`
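If you want to run that command from Ruby rather than from a terminal, a minimal sketch could look like the following. The method names `image_to_pdf_command` and `convert_image_to_pdf` are made up for this example, and it assumes the `convert` binary is on the `PATH`.

```ruby
require "open3"

# Builds the argument list for the `convert` call shown above.
# Using an array instead of one string avoids shell interpolation of file names.
# (Hypothetical helper; the defaults mirror the flags of the CLI example.)
def image_to_pdf_command(input_path, output_path, density: 300, extent: "2479x3508")
  [
    "convert",
    "-density", density.to_s,
    "-units", "PixelsPerInch",
    input_path,
    "-extent", extent,
    "-format", "pdf",
    output_path
  ]
end

# Runs the command and raises if ImageMagick exits with an error.
def convert_image_to_pdf(input_path, output_path, **options)
  command = image_to_pdf_command(input_path, output_path, **options)
  _stdout, stderr, status = Open3.capture3(*command)
  raise "convert failed: #{stderr}" unless status.success?

  output_path
end
```

Calling `convert_image_to_pdf("image.jpg", "result.pdf")` would then run the same conversion as the CLI example.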

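If you are using Rails Active Storage, ImageMagick (through the MiniMagick gem) was the default processor for image analysis and transformations up to Rails 6.1. Newer Rails versions default to libvips, but you can still select ImageMagick with `config.active_storage.variant_processor = :mini_magick`. This allows you to create different image variants of an uploaded file, without the need to write your own CLI commands: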

<%= image_tag user.avatar.variant(resize_to_limit: [100, 100]) %>

ImageMagick is an extremely powerful tool, but getting it to do exactly what you want can be challenging at times. Check out the authoritative website imagemagick.org for all the different options at your disposal.

A disadvantage of ImageMagick is that managing installations and policies can be difficult if you do not have full access to your system. One problem I ran into when deploying to Heroku was that I could not directly read an image served via HTTPS, because of the restrictive default policy. The Heroku help article 'How can I override ImageMagick settings in a policy.xml file?' didn't solve the problem. Heroku support finally confirmed that:

[...] because we use the pre-packaged Ubuntu ImageMagic which is built with the 'installed' option you cannot override the policy from the system path [...] You can disable/reduce options that are enabled by the system policy but once something is disabled you can't re-enable it.

Instead of having ImageMagick request the file itself and do a transform on the result of that request, download the image to /tmp and work off that, using local transforms.

In my use case, I first had to download the image from AWS S3 (Active Storage) to /tmp before converting it to a PDF document.
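A rough sketch of that download-then-convert flow is shown below. It is not the exact code from my project: `blob_to_pdf` and `run_convert` are hypothetical names, and `blob` is assumed to behave like an `ActiveStorage::Blob`, whose `open` method downloads the file to a tempfile and yields it. The converter is passed in as a callable so the ImageMagick call can be swapped out, for example in tests.

```ruby
require "open3"
require "tmpdir"

# Shells out to ImageMagick with a *local* input path
# (assumes the `convert` binary is installed).
def run_convert(input_path, output_path)
  _stdout, stderr, status = Open3.capture3("convert", input_path, output_path)
  raise "convert failed: #{stderr}" unless status.success?
end

# `blob` is assumed to respond to `filename` and `open(tmpdir:)` the way
# ActiveStorage::Blob does: `open` yields a local tempfile and removes it
# once the block returns.
def blob_to_pdf(blob, output_dir: Dir.tmpdir, converter: method(:run_convert))
  base_name = File.basename(blob.filename.to_s, ".*")
  output_path = File.join(output_dir, "#{base_name}.pdf")

  blob.open(tmpdir: output_dir) do |local_file|
    # ImageMagick only ever sees a file on disk, never a remote URL.
    converter.call(local_file.path, output_path)
  end

  output_path
end
```

In a Rails app this might be called as `blob_to_pdf(user.avatar.blob)`, where `user.avatar` is the attachment from the earlier example.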

Another challenge I came across with ImageMagick was converting PDFs with any kind of transparent elements. The resulting image would either miss entire sections or have large black/gray elements as overlays. The solution was to add `-alpha remove` to the convert command.
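As an illustration, a PDF-to-image command with that flag could be built like this. The helper name `pdf_to_image_command` is hypothetical, and the explicit white `-background` is an assumption on my part: `-alpha remove` flattens transparency onto the current background color, so setting it explicitly makes the result predictable.

```ruby
# Builds a PDF-to-image command that flattens transparent areas onto a
# solid background. (Hypothetical helper; density and background color
# are assumptions, not values from a specific project.)
def pdf_to_image_command(pdf_path, image_path, density: 300, background: "white")
  [
    "convert",
    "-density", density.to_s,   # render the PDF at this resolution
    pdf_path,
    "-background", background,  # color used where the PDF is transparent
    "-alpha", "remove",         # composite the image over that background
    image_path
  ]
end
```

The array can be passed to `system(*pdf_to_image_command("input.pdf", "page.png"))` or to `Open3.capture3` if you need the error output.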

ConvertAPI

Most conversions to/from PDFs can be done via an external service. The advantage of using a third-party API is that you don't need to worry about the conversion process, managing PDF versions and edge cases, or implementing new image formats.

I have had a good experience converting different file formats with ConvertAPI. Their pricing is fair and the API is easy to implement in a Rails project. The ImageMagick version I was using in a project didn't support HEIC images, so I had to use ConvertAPI to convert them to JPEGs.

    require 'convert_api'

    ConvertApi.config.api_secret = 'SECRET'

    # Upload the HEIC file, convert it to JPEG and write the result to file.path
    ConvertApi.convert(
      'jpg',
      { File: ConvertApi::UploadIO.new(File.open(file.path)) },
      from_format: 'heic'
    ).save_files(file.path)

Using an external service is not always possible, though. Managing and testing the dependency can be difficult, every conversion adds a network round trip, and you might not want to send (confidential) customer information to a third party. In that case, ImageMagick remains a powerful alternative!
