Learnings from creating recommendation engines with Amazon Personalize

Alexa...set a timer for 15 minutes. ⏳

Over the last year in my role as an AWS Solutions Architect, I was involved in several projects that aimed to make products more personalized. Recommendation engines are one good way to provide a personalized product experience.

I want to give you some insights about my personal learnings on building recommendation engines using Amazon Personalize.

What is Amazon Personalize?

As part of the AI Services, Amazon Personalize is a managed service that provides personalization and recommendations based on the same technology used at Amazon.com. The marketing claim further adds: "...with no ML experience required". In general that is true, but in my experience, having ML skills on board will boost your project. Let's dive deeper into this in the next sections.

With Amazon Personalize you get convenient APIs that solve a specific personalization and recommendation business problem. The trade-off for more convenience is less flexibility. This does not have to be bad at all, but it is good to keep in mind when choosing the level of the AWS ML stack you want to work on.

Amazon Personalize is a fully managed service. It generates highly relevant recommendations using deep learning techniques. It builds custom and private ML models using your own data. Private means that your data and models are not shared to improve the Amazon Personalize service itself, which makes GDPR compliance a bit easier 🥳

Five simple steps to your recommendation engine

The cool thing about Amazon Personalize is that it includes a fully managed ML pipeline that identifies features, selects the right hyperparameters, trains, optimizes and hosts your model, and provides a real-time feature store.

Your main task as a consumer is:

  1. Define your target schema for user interactions, item metadata and user metadata
  2. Preprocess or transform historical data to match your desired target schema
  3. Import your data into Amazon Personalize using a batch import
  4. Deploy your auto-trained recommendation model resulting in an API endpoint to
  5. Infer recommendations for your users
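
Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not a full pipeline: it defines an interactions schema in the Avro JSON format Personalize uses (USER_ID, ITEM_ID and TIMESTAMP are the required fields, EVENT_TYPE is an optional one I add here) and maps raw tracking events onto it. The shape of the raw events (visitor, article, action, ts) is hypothetical and will look different in your project.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Interactions schema in Avro JSON format. USER_ID, ITEM_ID and TIMESTAMP
# are the required fields; EVENT_TYPE is an optional addition.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "EVENT_TYPE", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def to_interaction_row(event):
    """Map one raw tracking event (hypothetical shape) to a schema row."""
    ts = datetime.fromisoformat(event["ts"])
    if ts.tzinfo is None:
        # Assumption: timestamps without an offset are meant as UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return {
        "USER_ID": str(event["visitor"]),
        "ITEM_ID": str(event["article"]),
        "EVENT_TYPE": event["action"],
        "TIMESTAMP": int(ts.timestamp()),  # Unix epoch seconds
    }


def to_csv(events):
    """Render events as CSV with a header row matching the schema fields."""
    columns = [field["name"] for field in INTERACTIONS_SCHEMA["fields"]]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for event in events:
        writer.writerow(to_interaction_row(event))
    return buffer.getvalue()


raw_events = [
    {"visitor": "u1", "article": "a42", "action": "read", "ts": "2021-01-01T00:00:00+00:00"},
    {"visitor": "u2", "article": "a7", "action": "click", "ts": "2021-01-01T00:01:00+00:00"},
]
schema_json = json.dumps(INTERACTIONS_SCHEMA)  # schema as a JSON string
csv_body = to_csv(raw_events)                  # CSV content for the batch import
```

With boto3, the JSON string would typically be passed as the schema argument of the Personalize client's create_schema call, and the CSV uploaded to S3 as the source of a dataset import job. Both of these steps are left out here so the sketch runs without an AWS account.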

Sounds easy. But having worked on three recommendation engine projects, I can tell you: there is a lot more to consider besides those 5 steps.

Amazon Personalize covers three foundational personalization use cases: user personalization, similar items and personalized ranking.
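
To make the three use cases more tangible, here is a hedged sketch of how each one maps to a call on the Personalize runtime client in boto3 (get_recommendations and get_personalized_ranking). The helper names, the stub client and the ARN value are my own illustrations. The stub only exists so the example runs offline, and each use case needs its own trained model and campaign in a real setup.

```python
def recommend_for_user(runtime, campaign_arn, user_id, k=10):
    """User personalization: top-k items for one user."""
    response = runtime.get_recommendations(
        campaignArn=campaign_arn, userId=user_id, numResults=k
    )
    return [item["itemId"] for item in response["itemList"]]


def similar_items(runtime, campaign_arn, item_id, k=10):
    """Similar items: top-k items related to a given item."""
    response = runtime.get_recommendations(
        campaignArn=campaign_arn, itemId=item_id, numResults=k
    )
    return [item["itemId"] for item in response["itemList"]]


def rerank_for_user(runtime, campaign_arn, user_id, candidate_ids):
    """Personalized ranking: reorder a given candidate list for one user."""
    response = runtime.get_personalized_ranking(
        campaignArn=campaign_arn, userId=user_id, inputList=candidate_ids
    )
    return [item["itemId"] for item in response["personalizedRanking"]]


class StubRuntime:
    """Offline stand-in for the runtime client, for illustration only."""

    def get_recommendations(self, **kwargs):
        items = [{"itemId": "a1"}, {"itemId": "a2"}, {"itemId": "a3"}]
        return {"itemList": items[: kwargs.get("numResults", 25)]}

    def get_personalized_ranking(self, **kwargs):
        # Pretend the model prefers the reverse of the input order.
        ranked = [{"itemId": i} for i in reversed(kwargs["inputList"])]
        return {"personalizedRanking": ranked}


# In a real project this would be boto3.client("personalize-runtime").
runtime = StubRuntime()
top_for_user = recommend_for_user(runtime, "arn:stub", "u1", k=2)
related = similar_items(runtime, "arn:stub", "a42", k=1)
reranked = rerank_for_user(runtime, "arn:stub", "u1", ["x", "y", "z"])
```

Passing the client in as a parameter keeps the helpers easy to test and makes it simple to swap campaigns later, for example when comparing two models.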

From PoC to MVP

Data analysis and feature engineering

Before you just import your historical data, you should gather knowledge about both your data and your business domain. Every recommendation engine project is unique in the data it has to process and in how the business works.

As a very first step in a proof-of-concept phase, it is all about finding answers to:

  1. What data do we need?
  2. How do we get those data?
  3. How do we identify a user?

Amazon Personalize works with three main types of data: user metadata, item metadata and user interactions. Personalize needs at least interaction data to operate. Item and user metadata are not strictly required, but having them will improve the recommendation output.

So even if you do not need data science skills to use Amazon Personalize, it is really helpful to have those skills on your team for data analysis and preparation.

Strong collaboration within your team helps you get quick answers to the questions above. Existing web analytics or user tracking data is often a very good source of interaction data, at least for creating an initial, production-ready recommendation model.

Data analysis might also reveal additional challenges. If data quality is poor, providing good recommendations from the start is unlikely to be easy. Wrong assumptions can lead to biased datasets, which in turn influence recommendations. In a nutshell: data analysis and feature engineering are key to understanding the context you are in.
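
A first data quality check does not need much tooling. The following sketch profiles interaction rows for a few signals I would look at early: incomplete rows, exact duplicates, users with only a single interaction, and how much of the traffic goes to the most popular item (a hint of popularity bias). The function name, the choice of signals and the sample rows are illustrative assumptions, not a Personalize feature.

```python
from collections import Counter

REQUIRED_FIELDS = ("USER_ID", "ITEM_ID", "TIMESTAMP")


def profile_interactions(rows):
    """Summarize basic quality signals for a list of interaction dicts."""
    complete = [
        row for row in rows
        if all(row.get(field) not in (None, "") for field in REQUIRED_FIELDS)
    ]
    keys = [tuple(row[field] for field in REQUIRED_FIELDS) for row in complete]
    per_user = Counter(row["USER_ID"] for row in complete)
    per_item = Counter(row["ITEM_ID"] for row in complete)
    top_item_share = max(per_item.values()) / len(complete) if complete else 0.0
    return {
        "rows": len(rows),
        "incomplete_rows": len(rows) - len(complete),
        "duplicate_rows": len(keys) - len(set(keys)),
        "users": len(per_user),
        "single_interaction_users": sum(1 for n in per_user.values() if n == 1),
        "top_item_share": top_item_share,
    }


sample_rows = [
    {"USER_ID": "u1", "ITEM_ID": "a1", "TIMESTAMP": 1},
    {"USER_ID": "u1", "ITEM_ID": "a1", "TIMESTAMP": 1},  # exact duplicate
    {"USER_ID": "u2", "ITEM_ID": "a1", "TIMESTAMP": 2},
    {"USER_ID": "", "ITEM_ID": "a2", "TIMESTAMP": 3},    # missing user id
]
report = profile_interactions(sample_rows)
```

Which thresholds count as "poor" depends on your domain, so the sketch only reports numbers and leaves the judgment to you and your team.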

Define your KPIs

Without knowing the exact business value you want to provide and how to measure success, it is very hard to judge whether your solution is successful. Personalized recommendations mean that you get different content than I do. Looking only at my own results, it is very hard to debug and judge whether the recommendations that I or one of my colleagues get are good enough. Simply trusting that Amazon Personalize will do a good job might also blur your results.

You have to define your KPIs upfront. What makes your recommendation engine successful? Is it an increase in session duration? Is it an increase in articles read per user per session? Is it an increase in products sold? Is retention the right KPI?
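
Writing a KPI down as code forces you to make its definition precise. As an illustration, here is one possible definition of the articles-per-session KPI mentioned above: the average number of distinct articles read per session, where sessions without any read still count. The event shape (session, article, action) is a hypothetical tracking format, and your own definition may reasonably differ.

```python
from collections import defaultdict


def articles_read_per_session(events):
    """Average number of distinct articles read per session.

    Sessions without any "read" event count with zero articles.
    """
    sessions = {event["session"] for event in events}
    if not sessions:
        return 0.0
    articles = defaultdict(set)
    for event in events:
        if event["action"] == "read":
            articles[event["session"]].add(event["article"])
    return sum(len(articles[s]) for s in sessions) / len(sessions)


events = [
    {"session": "s1", "article": "a1", "action": "read"},
    {"session": "s1", "article": "a2", "action": "read"},
    {"session": "s1", "article": "a1", "action": "read"},   # repeated read
    {"session": "s2", "article": "a3", "action": "click"},  # no read at all
    {"session": "s3", "article": "a1", "action": "read"},
]
kpi = articles_read_per_session(events)
```

Details like "do repeated reads count?" or "do empty sessions count?" change the number noticeably, which is one more reason to agree on them with your stakeholders early.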

Defining the right KPI is not an easy task, and doing it too late makes it even harder. So my recommendation is: do it as the very first step and align with your business stakeholders on which KPIs make sense.

A/B testing

A/B testing is mandatory when you consider building a recommendation engine. It should be prepared right from the start of a project. Otherwise things can become unnecessarily complex if we blindly pick an out-of-the-box solution without looking at what is really needed for a given workload.

A/B testing is a common technique for comparing the effectiveness of different recommendation strategies. It should help us get answers in terms of our KPIs. With A/B testing you can compare different kinds of user journeys:

  • user journeys including recommendations vs no recommendations to measure the impact of recommendations on your KPIs.
  • user journeys served by two different recommendation models, to decide which one performs better.
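
Both comparisons need two building blocks: a stable assignment of users to variants and a way to judge whether a KPI difference is more than noise. The sketch below shows one common approach, assuming a conversion-style KPI: hash-based bucketing and a two-proportion z-test. The experiment name, variant labels and numbers are made up for illustration, and other statistical tests may fit your KPI better.

```python
import hashlib
import math


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically assign a user to a variant via hashing.

    The same user always lands in the same variant for a given experiment,
    without storing any assignment state.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p_value


variant = assign_variant("u1", "recs-vs-no-recs")

# Made-up numbers: 100 of 1000 control users converted, 130 of 1000 treated.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
```

Including the experiment name in the hash means a user can land in different groups across experiments, which avoids always exposing the same users to the treatment.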

My personal conclusion after the past months of work in this field: data is key. If you want to use data to make better decisions quickly, you have to work on the accessibility, semantics and quality of your data.

Analytics has to scale, and here I do not mean the technical side of scaling. It has to scale organizationally with business requirements and the frequency of change. Focusing on AI/ML use cases introduces a lot of change, as you have to explore and experiment with data to measure your success. The more and better data you have, the better your foundation for experiments. Introducing experiments and A/B testing might also come with a change in organizational culture or people's mindset.

Kick start your proof of concept

During my learning path, I collected several resources that helped me understand more about Personalize and use existing implementations to set up a working recommendation engine in less than an hour. This is very helpful in a proof-of-concept phase to get fast results.

Below you find a curated list of helpful resources on GitHub and the AWS blog.

GitHub

AWS Blog

Workshops

My top three learnings

Amazon Personalize is an awesome service, and I am very impressed by the results. But Personalize can only provide good recommendations if you keep some things in mind. Here are my top three learnings.

  1. Scope out the business problem you want to solve
  2. Define your KPIs to measure success and implement A/B testing
  3. Analyze your data and gather as much knowledge as you can about the business domain and business problem

Alexa says, time is over...see you next time. Happy to get your feedback, experience and thoughts in the comments. 👋

