Getting Started with Amazon SageMaker Studio Lab

The best way to learn data science and machine learning is through hands-on labs, tutorials and experimentation. Unfortunately, a few common pain points add friction for aspiring data scientists who are just getting started.

These struggles include:

  • setting up hardware such as GPUs and installing frameworks on a personal laptop
  • cloud-hosted ML environments that are easy to set up but expensive
  • lack of persistent storage on free options (i.e. your data and environment reset after the session expires)

Introduction

At AWS re:Invent 2021, the AWS team announced SageMaker Studio Lab (currently in preview) to address these challenges and eliminate the setup hassle.

Amazon SageMaker Studio Lab is a free, no-configuration service that allows developers, academics and data scientists to learn and experiment with machine learning.

Unlike SageMaker Notebook Instances or SageMaker Studio, which require an AWS account (and therefore a credit card), Studio Lab only needs a valid email address to register for an account and start experimenting.

Overview

Amazon SageMaker Studio Lab is free (yes, FREE!). You can even choose between a CPU and a GPU runtime, depending on your project needs.
Your account is allocated 15 GB of persistent storage and 16 GB of RAM. This means you can save your project and datasets in the cloud (no need to start from scratch every time).

For those who are familiar with AWS, the underlying resources are as follows: g4dn.xlarge for GPU and t3.xlarge for CPU instances (subject to change).
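Once a runtime is started, a quick check from a notebook cell can show roughly what you were allocated. The sketch below is an illustration under assumptions, not an official Studio Lab API: it treats the presence of the `nvidia-smi` tool on the PATH as a rough sign of a GPU runtime, assumes the home directory sits on the persistent storage, and uses only the Python standard library. The helper name `describe_runtime` is hypothetical.

```python
import os
import shutil


def describe_runtime(path=None):
    """Return a rough summary of the current runtime (illustrative helper)."""
    # Assumption: the home directory lives on the persistent project storage.
    path = path or os.path.expanduser("~")
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        # Assumption: a GPU runtime exposes the NVIDIA driver tool on PATH.
        "gpu_tool_found": shutil.which("nvidia-smi") is not None,
        "cpu_count": os.cpu_count(),
        "disk_total_gib": round(usage.total / gib, 1),
        "disk_free_gib": round(usage.free / gib, 1),
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

The reported numbers depend on the machine you run this on, so treat them as a sanity check rather than a specification.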

Account Registration and Creation

  1. Visit https://studiolab.sagemaker.aws/ and request an account.

  2. Fill out the form with your details.
    Request Form (screenshot from author)

  3. Wait for request approval (AWS says the process takes 1 to 5 business days; my account was approved the day after my request).

  4. Upon receiving your approval email, follow the sign-up link in it and complete the account creation instructions.

Exploring the Interface

Upon reaching the landing page, you will need to start your project runtime. Select either a CPU or a GPU runtime; sessions last up to 12 hours (CPU) or 4 hours (GPU).

Once the session times out, you will have to start the project runtime again. Don’t worry, all your files are saved on the persistent project storage.
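Because long training jobs can be cut off when a session ends, it may help to track how much time is left. The following is a minimal sketch, assuming the 12-hour CPU and 4-hour GPU limits that applied at the time of writing (AWS may change them) and assuming you record the start time yourself; `SESSION_LIMITS` and `time_remaining` are hypothetical names, not part of Studio Lab.

```python
from datetime import datetime, timedelta

# Session limits at the time of writing; AWS may change them.
SESSION_LIMITS = {"cpu": timedelta(hours=12), "gpu": timedelta(hours=4)}


def time_remaining(started_at, runtime, now=None):
    """Return the time left in a session, never less than zero."""
    now = now or datetime.now()
    ends_at = started_at + SESSION_LIMITS[runtime.lower()]
    return max(ends_at - now, timedelta(0))


# Example: a GPU session started at 09:00, checked at 11:30.
start = datetime(2021, 12, 20, 9, 0)
print(time_remaining(start, "gpu", now=datetime(2021, 12, 20, 11, 30)))  # 1:30:00
```

Saving checkpoints to the project storage well before the remaining time reaches zero lets you resume after restarting the runtime.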

Learn and Experiment

AWS Machine Learning University (MLU)

MLU notebooks contain materials used to train Amazon’s own developers on machine learning. Courses include:

  • Natural Language Processing
  • Tabular Data
  • Computer Vision
  • Decision Trees and Ensemble Methods

Dive into Deep Learning (D2L)

Over 150 interactive Jupyter notebooks that teach the fundamentals of machine learning, adopted by 300 universities, including Stanford, MIT, Harvard and Cambridge.

Hugging Face

JupyterLab Interface

Since Studio Lab is based on the open-source JupyterLab, you can take advantage of open-source Jupyter extensions when running your notebooks.

You also have full control over your (virtual) environment, so you can use frameworks such as PyTorch, TensorFlow, MXNet and Hugging Face, and libraries such as scikit-learn, pandas and NumPy.
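Before starting a project, it can be handy to see which of these packages are already present in the active environment. The snippet below is a small sketch using only the standard library; the list of import names is my assumption about how each package is imported (for example, `sklearn` for scikit-learn and `transformers` for Hugging Face), and `installed_packages` is a hypothetical helper name.

```python
import importlib.util

# Assumed top-level import names for the frameworks and libraries above.
PACKAGES = ["torch", "tensorflow", "mxnet", "transformers", "sklearn", "pandas", "numpy"]


def installed_packages(names):
    """Map each top-level import name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in installed_packages(PACKAGES).items():
    print(f"{name:<14}{'installed' if found else 'missing'}")
```

Anything reported as missing can then be installed into the environment with your usual package manager.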

You can clone your own GitHub repository and work on it in SageMaker Studio Lab, which integrates with GitHub and supports Git for version control.

Furthermore, if you have a public GitHub repo with Jupyter notebooks, you can make it easy for others to open them in SageMaker Studio Lab.

All you need to do is add the Open in Studio Lab link (badge) to your README.md file or notebook. The markdown to include is as follows:

[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/org/repo/blob/master/path/to/notebook.ipynb)

The created badge will look like this.
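If a repository holds several notebooks, writing each badge by hand gets tedious. Here is a minimal sketch that builds the markdown from the URL pattern shown above; the function name `studio_lab_badge` and its parameters are hypothetical, and the default branch of `master` simply mirrors the example URL.

```python
BADGE_IMAGE = "https://studiolab.sagemaker.aws/studiolab.svg"
IMPORT_BASE = "https://studiolab.sagemaker.aws/import/github"


def studio_lab_badge(org, repo, notebook_path, branch="master"):
    """Build the 'Open In Studio Lab' badge markdown for one notebook."""
    # Strip a leading slash so the path joins cleanly onto the branch.
    target = f"{IMPORT_BASE}/{org}/{repo}/blob/{branch}/{notebook_path.lstrip('/')}"
    return f"[![Open In Studio Lab]({BADGE_IMAGE})]({target})"


# Placeholder values taken from the example above; replace with your own.
print(studio_lab_badge("org", "repo", "path/to/notebook.ipynb"))
```

Paste the printed line into README.md or into a markdown cell at the top of the notebook.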

Bonus: Hackathon

At the time of writing (December 2021), there’s an ongoing hackathon (AWS Disaster Response Hackathon) where you can explore and train your models in SageMaker Studio Lab. The deadline is 7 Feb 2022, 5:00 pm EST.

Thank you for reading this article; I hope you find it insightful. Get your SageMaker Studio Lab account before the waitlist gets longer.

Happy learning, everyone.
