Things I learnt migrating an application to Kubernetes

Let me paint the picture: an application left to gather dust and bugs. It was created in a rush, to be deployed solely through a custom in-house deployment system on ageing, unmaintained virtual machines.

Python2. Django 1. Hardcoding.

It was a mess. It was hard.

But it was a rewarding experience.

Knowledge of tooling is sometimes sparse

The application used Docker - thankfully - but used a lot of scripts to do little things.

Times change and knowledge drifts. But we can still overcome issues with a bash script.

First thing I did was remove all these scripts.

  • The script to run commands in the container was replaced with a Makefile, which serves as both documentation and one-command setup.

  • The script to set up local environment variables was replaced with default values set either in the Docker Compose file or in the application. Local development shouldn't require extra effort to set up and should work from a fresh pull.

  • The script to pull additional libraries was replaced with git submodules. Love or hate them, submodules are useful and cleaner when done right.

If you ever think about using a script to overcome an issue, double check the tools you are already using. You might find a better solution there.

Focus on what the Application needs when building a container

As originally built, the container image weighed in at a whopping 1GB-plus! But as I mentioned, it was a Django application, so how was it so big?

"Useful tools and packages"

Database Client, Text editors, etc., you name it and it was probably there.

The application didn't need any of it. A developer did.

The convenience of pre-installed tools and packages adds unnecessary bulk. This slows down builds and adds extra maintenance.

Ideally, you shouldn't need to shell into a container to debug. But if you do, you can install what you need while debugging and then destroy the container when done.

Containers are meant to be ephemeral.

After removing all those 'convenient' tools and packages, the image size dropped to about 300MB. 💪

Moving code repositories to a new host is easy?

The code needed to be moved from BitBucket to GitHub, but I had never done anything like this before.

After a bit of searching around and trying GitHub's Importer, which didn't work, I came across this:

git push --mirror {destination}

I had the power.

16 repositories pulled. 16 repositories created. 16 mirror pushes. Done.

As easy as it was with fresh, empty repositories, the command is destructive, so be wary: a mirror push overwrites the destination's refs and deletes any that don't exist in the source.
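For anyone repeating the pull, create, push routine across many repositories, here is a minimal Python sketch of one round. It assumes each repository was pulled as a mirror clone (git clone --mirror), which is my assumption rather than something I recorded at the time. The function names and the example URLs are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def mirror_commands(source_url, destination_url, clone_dir):
    """Build the two git invocations needed to mirror one repository."""
    return [
        # A mirror clone copies every ref (branches, tags, notes), not just the default branch.
        ["git", "clone", "--mirror", source_url, str(clone_dir)],
        # Destructive on the destination: refs missing from the clone are deleted there.
        ["git", "-C", str(clone_dir), "push", "--mirror", destination_url],
    ]


def mirror_repository(source_url, destination_url):
    """Mirror one repository through a throwaway local clone; raises if git fails."""
    with tempfile.TemporaryDirectory() as tmp:
        clone_dir = Path(tmp) / "repo.git"
        for command in mirror_commands(source_url, destination_url, clone_dir):
            subprocess.run(command, check=True)


# Hypothetical usage; these URLs are placeholders, not the real repositories:
# mirror_repository(
#     "git@bitbucket.org:example-team/example-app.git",
#     "git@github.com:example-org/example-app.git",
# )
```

The destination repository still has to exist (and ideally be empty) before the push, which is why the 16 repositories were created first.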

Helm + Terraform works but better used separately

We use a combination of tools. Terraform for all the infrastructure and Helm to template the application settings.

But Terraform can do the things Helm does.

The application is aged and reliant on older versions of services, e.g. Elasticsearch. I tried using the Elasticsearch Helm chart, but I couldn't get the exact version that the application was previously using, and it added extra things that I didn't need, e.g. multiple pods.

Using Terraform, I created a simple Kubernetes Deployment for the Elasticsearch service and that's all I needed. Job done.

Another challenge I encountered: because I had created the Helm chart for the application and referenced it locally from Terraform, updating the templates didn't cause Terraform to update the release.

A way around this was to make sure the values that change between releases, e.g. the image tag, were passed into the helm_release resource, so that changing them registered as a change in Terraform.

But looking back, if I had just created it all in Terraform, I wouldn't have had to do such things and I feel it would have been nicer overall.

Plus Helm templates are painful to debug. Too many spaces here, too little there. 🙃

Embracing Failing Fast

I thought I was being smart when I took the hardcoded Django settings (database name, user, etc.) and replaced them with environment variables that had defaults.

DATABASE_NAME = os.environ.get('DATABASE_NAME', 'postgres')

It worked fine for the local development setup as I didn't need to update the Compose file with an extra Environment Variable.

But then I noticed an issue in the live environment on the cluster: the application was using the wrong database. But how?!

...

I forgot to include it in the Helm Values.

...

A quick fix.

DATABASE_NAME = os.environ['DATABASE_NAME']

And now I'll never forget to include important settings.
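If there are several required settings, the same idea can be taken one step further by checking them all at startup and reporting every missing one at once. This is a rough sketch, not what the application actually runs: load_required and MissingSettingError are hypothetical names, and DATABASE_USER is an assumed variable name used only for illustration.

```python
import os


class MissingSettingError(RuntimeError):
    """Raised at startup when required environment variables are absent."""


def load_required(names, environ=os.environ):
    """Return {name: value} for every name, or fail listing all missing names."""
    missing = [name for name in names if name not in environ]
    if missing:
        raise MissingSettingError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}


# Hypothetical use in settings.py (DATABASE_USER is an assumed name):
# required = load_required(["DATABASE_NAME", "DATABASE_USER"])
# DATABASE_NAME = required["DATABASE_NAME"]
```

A plain os.environ lookup stops at the first missing key. Collecting them all means a forgotten Helm value shows up in a single failed start rather than one restart per setting.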

You won't always get thanks but should always take on the challenge

This took me a lot of time. Research, distractions, failures, repeat.

And from the Client and User side, nothing changed. The site is up and running.

But it was worth it.

There didn't need to be any fanfare. When you put in a lot of effort to make something a little bit better, more secure and more reliable, and you can be proud of what you've managed, then that's all you need.

The relief from finishing it and merging all the code after a clean was pure bliss.

I've learned a lot doing this and I would happily do it again.
