Quickly find and fix performance problems by combining performance, debug, and execution trace info

The most effective way to improve the performance of code is to create a local dev environment that stresses it. Once you can replicate a performance issue locally, you can use all your tools and tricks to address the issue. Avoiding guesswork and leveraging the power of modern code editors and tools helps you to reduce your mean time to repair (MTTR) when responding to an incident or ticket.

Once you can collect the performance data you need, it's a cardinal sin to start guessing about where the problem lies. Don't guess! Collect data about the performance of your code and use it. The primary reason that we are tempted to avoid profilers is that they have a (deserved) reputation for being clunky and hard to setup. Lately, I've been working on a way of collecting performance data that is dirt simple to setup, and as of today, our free, open source tool AppMap do this. AppMaps have always included timing data - and now, the AppMap UI shows performance data when you open an AppMap in your code editor. This tells you for sure if you can replicate a performance bug locally - and you can drill down through the code execution tree to find performance hotspots. You can also compare the AppMap data to a reported issue to verify that you can identify and replicate a performance issue.

Beyond verifying that the issue can be replicated locally, AppMap captures the entire context of code being recorded and executed, and provides detailed information about web services, function calls, parameter values, and SQL. Taken together, the AppMap contains a wealth of data that you can use to analyze performance issues. By virtue of its simple install and setup process, flexible methods for recording your code, and the comprehensive data included in the code maps, AppMap is a great way to get started with a performance optimization task.

Once you identify the root cause and make your fix, comparing AppMap’s before-and-after performance data allows you to confirm that you’ve addressed the performance issue (in addition to any associated test cases). These before-and-after data can also be shared in a post-mortem to communicate the problem, attached to a ticket that describes the fix, and saved in a knowledge base (to help prevent the issue from happening again!)

Here’s a brief demo of how to quickly troubleshoot and verify a fix of a performance issue in a popular Python application. (FYI, we’ve modified this app slightly for demo purposes). In case you prefer to read vs. watch, a written version of the demo follows.

Questions or feedback? We’d love to hear from you -- reach out in Discord or Twitter!

Demo transcript

  • 00:02 It’s come to my attention that in our community app, listings of past meetings (that were recently added by a new team member) feel notably slower than renderings of future meetings, and a quick glance at database utilization shows a rapid increase of queries since the new feature has been introduced.

  • 00:22 I’ll use a custom test that will amplify the performance difference between the two cases, simulating fetching and rendering of 1000 meetings.

    Test

  • To observe the behavior and measure the performance differences, I’m going to use the free and open source framework AppMap.

  • 00:39 Now let me run the test and record my code behavior with AppMap:

    AppMap with Performance numbers

  • This is interesting, because the future meetings took .16s while past meetings took a lot more time, .42s.

  • 00:54 Let’s drill down and compare the code flows of the two requests.

    Performance of function calls

  • This is a Django app and I can see that the renderer took .14s for future and .40s for past meetings. I can also instantly see that the first code flow has four children vs. 1505 of the second one.

  • 01:15 Let’s look at the first flow, first.

    Future meetings code flow

I can see that three SQL commands were executed for the future meetings request: The first lists all future meetings from the table, and the two following SQL commands pre-fetch data for all future meetings from related tables.

  • 01:33 Let’s drill down to the second flow, the request to view all past meetings.

    Past meetings code flow

  • The second sequence starts with fetching all past meetings. What follows is very different though: The ORM runs three queries for each past meeting in the database -- that’s 1500 trips to the database total for 500 past meetings created by the test. Even on my dev environment with no other database load, the performance difference between the two SQL command patterns is pretty remarkable.

  • 02:02 Let’s see what happens in the code that sets up the data entities. A quick glance in the view implementation reveals that the past meetings view lacks the select_related and prefetch_related method calls, causing the observed N+1 problem.

class FutureMeetingsAll(ListView):
    template_name = "meetings/future_meetings_all.html"
    queryset = Meeting.objects.filter(
        when__gt=datetime.datetime.now() - datetime.timedelta(hours=3)
    ).order_by("when").select_related("where").prefetch_related("topics", "meeting_sponsors")

class PastMeetingsAll(ListView):
    template_name = "meetings/past_meetings_all.html"
    queryset = Meeting.objects.filter(
        when__lt=datetime.datetime.now() - datetime.timedelta(hours=3)
    )
  • Note that despite the obvious code issue, the tests were passing previously -- functionally, the existing code works -- but it’s causing extra costs and risk, and puts an unnecessary strain on the database.

  • 02:34 Let’s fix and run the test again.

class PastMeetingsAll(ListView):
    template_name = "meetings/past_meetings_all.html"
    queryset = Meeting.objects.filter(
        when__lt=datetime.datetime.now() - datetime.timedelta(hours=3)
    ).order_by("when").select_related("where").prefetch_related("topics", "meeting_sponsors")
  • 02:50 And here we can see that with the fix in place, the number of the database queries dropped, and the performance of the past meetings flow has improved dramatically.

    Performance numbers after the fix

    Code flow after the fix

To recap this demo, AppMap helped me quickly identify a performance problem and verify a fix by providing me with all the clues required to discover and address the performance-impacting flaw in my code.

19