How to Serve Massive Computations Using Python Web Apps.

In a previous article, I wrote about the limitations of using Python web apps for analytics projects. Some of the points kindled readers' curiosity and inspired me to write another story to complement it.

The central question of this article is, "if Python has severe drawbacks because of its sync behavior, how do platforms such as Instagram and Spotify use it to serve millions around the world?"

While I have no official information from these platforms (or similar ones), I have insights into handling such massive requests from my experience.

For this article, I prepared a demo to show the basics. A real-life project might have several other technologies in the stack; I haven't included them all here. But I've hinted at some of my favorites and linked to resources wherever possible.

Let's begin by discussing the sync-async problem again.

Python's synchronous behavior and the cost of serving web requests.

Having a lean technology stack is an excellent choice. And the data science community is gifted with Python on that front. Because Python is a general-purpose language, you can use it to build data pipelines, machine learning models, web apps, and a lot more.

A little caveat when using Python web frameworks is their synchronous behavior. That is, Python handles tasks one at a time in every thread. Until the current request finishes, the others have to wait in a queue. If you need to serve more concurrent requests, you'll have to increase the cores and the number of instances.
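Here is a minimal sketch of that queueing effect, using only the standard library rather than a real web framework. The names handle_request and serve are hypothetical, the 0.2-second sleep stands in for a slow request, and the thread pool stands in for the server's workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    """Hypothetical request handler that blocks for 0.2 seconds."""
    time.sleep(0.2)
    return f"response {request_id}"


def serve(requests, workers):
    """Serve all requests with a fixed number of workers.

    Returns the responses and the total time taken in seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        responses = list(pool.map(handle_request, requests))
    return responses, time.perf_counter() - start


_, one_worker = serve(range(3), workers=1)     # requests wait in a queue
_, three_workers = serve(range(3), workers=3)  # requests are handled side by side
print(f"1 worker: {one_worker:.2f}s, 3 workers: {three_workers:.2f}s")
```

With a single worker, the three requests run back to back, so the total is about three times the cost of one request. Note that sleeping only imitates waiting; for CPU-heavy work in CPython, you would typically need separate processes rather than threads to get the same overlap.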

If your web app serves only a limited number of users, this isn't a problem. Also, if you don't perform heavy computations on demand, you're still okay. But when you grow and compute more, your hello-world type of web app could cause serious issues.

The same holds even if you choose a Node app, although Node is JavaScript-based and it's in the language's nature to run asynchronously. The point is that the cost, if unhandled, is higher when using a Python framework.

So, let's handle it.

Decoupling computation from the request-response cycle.

Imagine you come across the image of a magnificent blue whale on Instagram. Stunned by its beauty, you double-tap and like the picture. You wish you could see more of them. But this was the first time you encountered such a majestic photo on the platform.

You keep scrolling down; the following few images you see are the old data science memes and stuff.

A little later, you revisit Instagram. Splendid! You see more giant blue whales swimming and playing with their calves. It captivates you.

This is how most platforms suggest content to their users. Immediate learning is not required. Learning from a single action right away might even lead to false positives.

A clever way to handle this is to decouple computation from the request-response cycle. Use the prior knowledge about the user to serve the immediate request. Accumulate more data to learn and perform better in the future. This is why we keep seeing old data science memes as we scroll down, and more blue whales on the next visit.

In large-scale applications, we collect every action of users in a dedicated database. Datastores such as Cassandra are an excellent choice for this purpose. Based on the nature of the application, periodic tasks update the models with new data, or the system uses triggers to do the same. If they happen before your next visit to the platform, you're lucky to be served by an up-to-date model.
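The idea can be sketched with the standard library alone. Everything here is a stand-in I made up for illustration: the recommendations dictionary plays the role of the model, the in-memory queue plays the role of the datastore that collects user actions, and a background thread plays the role of the periodic task.

```python
import queue
import threading

recommendations = {"user-1": ["data science memes"]}  # prior knowledge per user
events = queue.Queue()  # stands in for the datastore collecting user actions


def handle_like(user, topic):
    """Request handler: answer from what is already known, then log the action."""
    response = list(recommendations[user])  # served from prior knowledge
    events.put((user, topic))               # learning happens later, elsewhere
    return response


def update_model():
    """Background worker: fold collected actions into the recommendations."""
    while True:
        user, topic = events.get()
        recommendations[user].insert(0, topic)
        events.task_done()


threading.Thread(target=update_model, daemon=True).start()

first_visit = handle_like("user-1", "blue whales")
events.join()  # wait until the background update has been applied
next_visit = list(recommendations["user-1"])

print(first_visit)  # what the user sees right after the like
print(next_visit)   # what the user sees on the next visit
```

The request handler never waits for the update. It only reads what is already known and records the new action, which is the pattern the rest of this article builds with Celery and Redis.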

That's enough explanation; it's time to get our hands dirty.

Performing heavy computations outside web requests.

For simplicity, I'll be using Flask. But this technique would work in any other Python framework just as well. We use Celery to handle heavy computations outside the request-response cycle. Lastly, we will use a database to collect user actions.

In this demo, we use the request itself as the trigger and begin computation immediately. But it may vary according to the nature of your application. Often, you might have to use a separate pipeline as well. In such scenarios, you may need technologies such as Apache Airflow or Prefect.

Installing a message broker—Redis.

A message broker does what its name suggests. You send a message to the broker, and the broker delivers it to the consumers waiting for it. With Celery, those consumers are worker processes that pick up tasks. Although this definition seems quite simple, life before message brokers was difficult for engineers.

Here we'll install Redis, which is widely used as a message broker. Alternatively, you can use RabbitMQ, Amazon SQS, and many other brokers with Celery.

The Redis official site has instructions for all the OSs. A platform-agnostic way is to use their official docker image.

docker run --name redis-server -p 6379:6379 -d redis

Installing Celery and Flask

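Installation of these modules is straightforward. Like many other Python packages, you can install them from PyPI with pip. The redis package is included because both Celery's Redis transport and the demo code later in this article need the Python Redis client.

pip install celery Flask redis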

Flask's purpose is to serve web requests over HTTP. It is a minimalistic framework, whereas its popular alternative, Django, is a one-stop solution for many needs. The techniques I'm about to share will work on either of them.

Celery is the bridge between Flask (or Django) and the message broker. It serializes your Python task calls into messages that the broker software can understand. More extensive usage includes task scheduling and periodic execution as well.
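To make the bridge idea concrete, here is a toy sketch built only on the standard library. It is not Celery's actual message format or API: the task decorator, the registry, and the in-memory broker queue are all hypothetical stand-ins that imitate the general shape of the mechanism.

```python
import json
import queue

broker = queue.Queue()  # in-memory stand-in for Redis
registry = {}           # task name -> Python function


def task(func):
    """Register a function and give it a delay() method that enqueues a message."""
    registry[func.__name__] = func

    def delay(*args):
        # The call is turned into plain text that any broker could carry.
        broker.put(json.dumps({"task": func.__name__, "args": args}))

    func.delay = delay
    return func


@task
def greet(name):
    return f"Hello {name}"


def run_next_message():
    """What a worker does: decode one message and call the matching function."""
    message = json.loads(broker.get())
    return registry[message["task"]](*message["args"])


greet.delay("Thuwarakesh")  # the caller only enqueues a message
print(run_next_message())   # a worker runs the function later
```

The caller and the worker share nothing but the message, which is why they can live in separate processes or even on separate machines.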

Decoupled hello world example.

Now, we have Celery and Flask installed, and our Redis service is running. Let's try to create our first decoupled app.

Create a file named app.py and add the below content.

from celery import Celery
from flask import Flask
import time

# 1. A function to create Celery object.
def make_celery(app):
    celery = Celery(
        app.import_name,
        backend=app.config["CELERY_RESULT_BACKEND"],
        broker=app.config["CELERY_BROKER_URL"],
    )
    celery.conf.update(app.config)

    class ContextTask(celery.Task):
        def __call__(self, *args, **kwargs):
            with app.app_context():
                return self.run(*args, **kwargs)

    celery.Task = ContextTask
    return celery


# 2. Create a Flask app
app = Flask(__name__)
app.config.update(
    CELERY_BROKER_URL="redis://localhost:6379",
    CELERY_RESULT_BACKEND="redis://localhost:6379",
)

# 3. Connect Flask with Celery
celery = make_celery(app)

# 4. Now, we can create Celery tasks by annotating @celery.task
@celery.task()
def massive_computation(name):
    time.sleep(5)
    return f"Hello {name}"

The above code has four parts. The first is a function (make_celery) that will return a Celery object.

The second is a Flask app creation. This part may be familiar to you if you have some experience using Flask. We pass in extra configs to inform the Flask app and Celery about the Redis server.

In the third part, we create the Celery app using the make_celery function. We pass in the Flask app instance, which contains the Redis configuration information.

With this setup, we can now create asynchronous tasks. This means their execution does not block the regular operations of the main thread. The example task here, massive_computation, will return "Hello" followed by the given name, not immediately, but after five seconds.

To test this, we first need to start the Celery worker. This is the decoupled process that will run your massive computation. The command below will do it.

celery -A app.celery worker -E

In a different terminal, start a Python interpreter in the same directory and run the commands as shown in the recording below.

This imports our massive_computation function from the app module and calls it. Note the unique delay method used for calling the function. The @celery.task decorator added this extra method.

Once you have called the function and stored its return value in a variable, you can use the result property to access the computed value.

As you can see in the recording, the result property has no value for the first few calls. After a few seconds, we have "Hello Thuwarakesh" printed on the screen.
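As a rough sketch of such a session, the helper below queues the task and waits for its value. It assumes the app.py from above is importable and that both Redis and the Celery worker are running. The name run_demo is my own; delay, ready, and get are part of Celery's task and result objects.

```python
def run_demo(name="Thuwarakesh", timeout=10):
    """Queue the greeting task from app.py and wait for its result."""
    # Imported here so this helper can be defined even when app.py
    # is not on the import path yet.
    from app import massive_computation

    async_result = massive_computation.delay(name)  # returns at once
    # Most likely False at this point: the worker is still sleeping.
    print(async_result.ready())
    # Block until the worker finishes, or raise after `timeout` seconds.
    value = async_result.get(timeout=timeout)
    print(value)
    return value
```

Calling run_demo() from an interpreter in the project directory should print the readiness flag first and the greeting about five seconds later.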

Until the function returns the value, you can perform other activities in the same thread. This isn't the typical behavior of apps developed using synchronous languages, such as Python.

Serving massive computation through web requests.

We've installed the technologies we need and created a basic task that runs outside the application's thread. Now we've got one final part to put in place: the web request handling.

Now replace the massive_computation function and everything below with the following code.

# Create a redis database instance
# -------------------------------------------------------------------------
import redis

# decode_responses=True makes r.get() return str instead of bytes.
r = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)
# -------------------------------------------------------------------------

r.set("last_calculated_value", 0)  # Set its initial value to 0.

# Asynchronous execution
# -------------------------------------------------------------------------
def fib(n):
    """This function will calculate the Fibonacci number"""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


@celery.task()
def massive_computation(num: int):
    r.set("last_calculated_value", fib(num))
# -------------------------------------------------------------------------

# Web api endpoints
# -------------------------------------------------------------------------
@app.route("/set/<int:num>")  # the int converter hands num over as an integer
def set_fib(num: int):
    massive_computation.delay(num)
    return "New Fibonacci number will be assigned soon."


@app.route("/current")
def get_current():
    last_calculated_value = r.get("last_calculated_value")
    return f"Current fibonacci number is {last_calculated_value}\n"

We have created an instance of the Redis database to store our decoupled computation results.

The computation we perform here is a Fibonacci calculation. The massive_computation function accepts a number, calculates the corresponding Fibonacci number, and stores it in the database.

We've also created two web API endpoints. The first one will set the last calculated value, and the other will read the current value.

To test this app, we need a Flask app running beside the already running Celery service. You can start the Flask app in a different terminal using the following command.

flask run

Now we have everything set up to test our massive computation using the Python web app. Let's try it out.

You can use either your browser or curl to do this. I'll be using curl in a terminal window.
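If you'd rather stay in Python, the same requests can be sketched with the standard library. The address below assumes the default one used by flask run, and the helper names get_text and wait_for_change are my own.

```python
import time
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumption: default address of `flask run`


def get_text(path):
    """Fetch one endpoint of the demo app and return the response body as text."""
    with urllib.request.urlopen(BASE_URL + path, timeout=5) as response:
        return response.read().decode("utf-8")


def wait_for_change(fetch, path, attempts=30, pause=2.0):
    """Poll `path` until its response differs from the first one seen."""
    first = fetch(path)
    for _ in range(attempts):
        time.sleep(pause)
        current = fetch(path)
        if current != first:
            return current
    return first  # unchanged within the allotted attempts


# Usage, with Flask, Celery, and Redis all running:
#   print(get_text("/current"))
#   print(get_text("/set/40"))
#   print(wait_for_change(get_text, "/current"))
```

The polling helper takes the fetch function as a parameter, so it can be tried out without a running server by passing in any callable that returns text.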

The recording above explains the usage clearly. When we query the '/current' endpoint, it gives the initial value of 0.

Then we assign a new value, the 40th Fibonacci number, through the '/set' endpoint. Calculating this number would take some time. Yet the system gives an immediate response, "New Fibonacci number will be assigned soon." This conveniently informs the user about the back-end task.

An immediate query of the '/current' endpoint again results in 0. This is because the calculation hasn't been completed, and the value in the database is still 0.

Yet, after a while, the same endpoint returns the value of the 40th Fibonacci number. The back-end calculations have finished and the Celery task updated the value in the database.
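To get a feel for why the 40th number takes a while with the naive recursion used in fib, this small sketch counts how many function calls the recursion makes. The count_calls helper is my own addition for illustration.

```python
def fib(n):
    """Naive recursive Fibonacci, the same logic as in app.py."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def count_calls(n):
    """Number of fib() invocations needed to evaluate fib(n) naively."""
    return 1 if n < 2 else 1 + count_calls(n - 1) + count_calls(n - 2)


for n in (10, 20, 25):
    print(n, fib(n), count_calls(n))
```

The number of calls grows by a factor of roughly 1.6 with every step, so fib(40) needs on the order of 331 million calls. That is what keeps the Celery worker busy while the web app stays responsive.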

Calculating a Fibonacci number isn't the massive computation you'd see in real-life scenarios. But it's a fair approximation of what they may look like. Instead of Fibonacci, you could trigger retraining of your machine learning model.

Final Thoughts

You might be wondering where the massive computation is. Yet, the article is not about the calculation itself. We've focused on how we serve massive computations. As a fair approximation, we've used a Fibonacci number calculator.

We've decoupled computations from the web server's request-response cycle. This allows us to perform calculations without interrupting the web server.

In this article, we used Celery to handle such decoupled tasks. While Celery is capable of running in production, modern alternatives offer more features. Apache Airflow and Prefect are some great technologies on this front.

Python is a synchronous language by nature. Attempting to make it asynchronous comes with a lot of overhead and drawbacks. Using Celery for heavy computations is an elegant way to handle the problem.

Yet, it doesn't mean asynchronous languages can get away with it. Massive computations aren't supposed to be served through the request-response cycle. They usually serve a different purpose than immediate, meaningful responses. Thus this method is helpful in every case.
