Python use case - Pagination

The use case

A common task in development is fetching data from an external backend.

For performance reasons, most backends implement pagination.

Let's see how we can get all the results from a paginated backend.

For these examples, I'll use httpx to perform the HTTP operations. It's an excellent client, you should use it.

Note that these are quick examples, not suitable for a production environment (no pydantic, no security, no retry, no error management, etc.).

In my examples I only print the length of the results; however, feel free to do something more interesting ;)

The backend

It provides a paginated API for a fake airline company.

A result looks like this:

{
    "totalPassengers": 19675,
    "totalPages": 1968,
    "data": [
        {
            "_id": "5ff393986feae02b6c22b251",
            "name": "Deepanss",
            "trips": 0,
            "airline": [
                {
                    "id": 3,
                    "name": "Cathay Pacific",
                    "country": "Hong Kong",
                    "logo": "https://upload.wikimedia.org/wikipedia/en/thumb/1/17/Cathay_Pacific_logo.svg/300px-Cathay_Pacific_logo.svg.png",
                    "slogan": "Move Beyond",
                    "head_quaters": "Cathay City, Hong Kong International Airport, Chek Lap Kok, Hong Kong",
                    "website": "www.cathaypacific.com",
                    "established": "1946"
                }
            ],
            "__v": 0
        },...
    ]
}
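The two counters in this payload are enough to sanity-check the page count. Here is a minimal sketch, assuming the page size of 10 that the examples in this article use by default (the helper name expected_pages is mine, not part of the API):

```python
import math


def expected_pages(total_items: int, page_size: int) -> int:
    """Number of pages needed to hold total_items at page_size items per page."""
    return math.ceil(total_items / page_size)


# Values taken from the sample payload; size=10 is an assumption.
print(expected_pages(19675, 10))  # 1968, matching "totalPages"
```

The last page is only half full here (19675 is not a multiple of 10), which is why the division has to be rounded up.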

The async implementation

A common implementation would look like this:

import asyncio

import httpx


async def get_passengers(
    page: int = 0,
    size: int = 10,
) -> httpx.Response:
    print(f"Getting page n°{page}")

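    # httpx.get() is synchronous and would block the event loop; use AsyncClient.
    async with httpx.AsyncClient() as client:
        response: httpx.Response = await client.get(
            f"https://api.instantwebtools.net/v1/passenger?page={page}&size={size}"
        )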
    response.raise_for_status()

    return response


async def get_all_passengers() -> None:
    page = 0
    result = []
    while True:
        response = await get_passengers(page=page)
        response_data = response.json()

        result += response_data["data"]
        if page >= response_data["totalPages"]:
            break
        page += 1

    print(f"total result:{len(result)} ")


if __name__ == "__main__":
    asyncio.run(get_all_passengers())

The output should look like this:

Getting page n°0
Getting page n°1
Getting page n°2
Getting page n°3
Getting page n°4
Getting page n°5
Getting page n°6
Getting page n°7
Getting page n°8
Getting page n°9
Getting page n°10
total result:110
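To try the loop without hitting the network, the HTTP call can be swapped for an in-memory stand-in. This is only a rough sketch: fake_get_page, collect_all and the 25 made-up records are hypothetical and merely mimic the totalPages / data shape shown earlier. It also assumes that pages are numbered from 0 to totalPages - 1, which you should check against the real backend.

```python
import asyncio
import math
import typing as t

# Hypothetical in-memory data set standing in for the remote backend.
FAKE_PASSENGERS = [{"name": f"passenger-{i}"} for i in range(25)]


async def fake_get_page(page: int = 0, size: int = 10) -> dict[str, t.Any]:
    """Return one page in the same shape as the payload shown earlier."""
    start = page * size
    return {
        "totalPassengers": len(FAKE_PASSENGERS),
        "totalPages": math.ceil(len(FAKE_PASSENGERS) / size),
        "data": FAKE_PASSENGERS[start : start + size],
    }


async def collect_all(size: int = 10) -> list[dict[str, t.Any]]:
    """Walk the pages one by one and accumulate every record."""
    page = 0
    result: list[dict[str, t.Any]] = []
    while True:
        payload = await fake_get_page(page=page, size=size)
        result += payload["data"]
        # Assumption: pages are numbered 0 .. totalPages - 1.
        if page + 1 >= payload["totalPages"]:
            break
        page += 1
    return result


if __name__ == "__main__":
    print(len(asyncio.run(collect_all())))  # 25
```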

The async for implementation

This implementation uses an async generator so that callers can iterate over the results with async for.

import asyncio
import typing as t

import httpx


async def get_passengers(
    page: int = 0,
    size: int = 10,
) -> httpx.Response:
    print(f"Getting page n°{page}")

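    # httpx.get() is synchronous and would block the event loop; use AsyncClient.
    async with httpx.AsyncClient() as client:
        response: httpx.Response = await client.get(
            f"https://api.instantwebtools.net/v1/passenger?page={page}&size={size}"
        )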
    response.raise_for_status()

    return response


async def get_all_passengers() -> t.AsyncGenerator[t.Any, None]:
    page = 0

    while True:
        response = await get_passengers(page=page)
        response_data = response.json()

        for passenger in response_data["data"] or []:
            yield passenger

        if page >= response_data["totalPages"]:
            break
        page += 1


async def get_all_passengers_async_for() -> None:
    result = [passenger async for passenger in get_all_passengers()]
    print(f"total result:{len(result)} ")


if __name__ == "__main__":
    asyncio.run(get_all_passengers_async_for())

This implementation is based on a generator. Because the async get_all_passengers function contains a yield, calling it returns an AsyncGenerator instead of a coroutine.
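One practical consequence is that pages are only requested when the consumer asks for more items. The sketch below illustrates this with a hypothetical source called numbers, where asyncio.sleep(0) stands in for the HTTP call; none of these names come from the article's code.

```python
import asyncio
import inspect
import typing as t

pages_fetched: list[int] = []  # records which hypothetical pages were requested


async def numbers(page_size: int = 3) -> t.AsyncGenerator[int, None]:
    """Hypothetical endless paginated source: yields 0, 1, 2, ... page by page."""
    page = 0
    while True:
        pages_fetched.append(page)
        await asyncio.sleep(0)  # stands in for the awaited HTTP call
        for offset in range(page_size):
            yield page * page_size + offset
        page += 1


async def first_n(limit: int) -> list[int]:
    """Consume only `limit` items, then stop iterating."""
    found: list[int] = []
    gen = numbers()
    async for value in gen:
        found.append(value)
        if len(found) >= limit:
            break
    await gen.aclose()  # release the generator explicitly after the early exit
    return found
```

Running asyncio.run(first_n(4)) gives [0, 1, 2, 3] and leaves pages_fetched at [0, 1]: the third page is never requested. inspect.isasyncgenfunction(numbers) returns True, which is a quick way to confirm what kind of function you are dealing with.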

AsyncGenerator implements the __aiter__ special method, which is why you can run async for over it, as in get_all_passengers_async_for.
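To make the role of __aiter__ more concrete, here is roughly what async for does under the hood, written out by hand. It is a simplified sketch for illustration: letters and manual_async_for are hypothetical names, and real code should just use async for.

```python
import asyncio
import typing as t


async def letters() -> t.AsyncGenerator[str, None]:
    """Tiny async generator used only for illustration."""
    for letter in "abc":
        yield letter


async def manual_async_for() -> list[str]:
    """Do by hand roughly what `async for` does."""
    collected: list[str] = []
    iterator = letters().__aiter__()  # async for calls __aiter__ first
    while True:
        try:
            item = await iterator.__anext__()  # then awaits __anext__ repeatedly
        except StopAsyncIteration:  # raised when the generator is exhausted
            break
        collected.append(item)
    return collected


if __name__ == "__main__":
    print(asyncio.run(manual_async_for()))  # ['a', 'b', 'c']
```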

To test code that consumes it, you can mock the __aiter__ attribute.

>>> import asyncio
>>> from unittest.mock import MagicMock
>>> mock = MagicMock()  # AsyncMock also works here
>>> mock.__aiter__.return_value = [1, 2, 3]
>>> async def main():
...     return [i async for i in mock]
...
>>> asyncio.run(main())
[1, 2, 3]
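The same trick can be used in a unit test for any function that consumes an async iterable. A small sketch, where count_items is a hypothetical consumer I made up for the example and the two records are fake data:

```python
import asyncio
import typing as t
from unittest.mock import MagicMock


async def count_items(source: t.AsyncIterable[t.Any]) -> int:
    """Hypothetical consumer: counts whatever the async iterable yields."""
    count = 0
    async for _ in source:
        count += 1
    return count


def test_count_items() -> None:
    fake_source = MagicMock()
    # Configure what `async for` will iterate over.
    fake_source.__aiter__.return_value = [{"name": "a"}, {"name": "b"}]
    assert asyncio.run(count_items(fake_source)) == 2
```

Passing the source in as a parameter is a design choice for this sketch: it lets the test hand over a mock directly, without patching anything.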

Conclusion

I hope you enjoyed reading this article :)

You now have two ways to handle pagination in your code, depending on your use case :)
