AI4DEVS #10: Diverse knowledge is the key to growing the next generation of ML practitioners into AI engineers.

Our weekly newsletter becomes bi-weekly, with more content and deep dives, and lands on Medium's most relevant publication.

The road so far…

When the AI4DEVS newsletter started, it was just an experiment, following a gentle nudge from people asking for a reference on what’s new in AI and machine learning. The fun part of the story is that these requests came mainly from peers with no prior expertise in data science who wanted to get started somewhere.

The last few years saw an unprecedented evolution in the technical role of machine learning experts, growing from a mere wizard pulling complex spells out of arcane books into an engineering profession, with workflows and best practices to bring any machine learning project successfully to production. The well-known Facebook anecdote that “less than 10% of our AI projects make it to production” is no longer acceptable if these competencies are becoming mainstream and we’re approaching the Fourth Industrial Revolution.

AI4DEVS was born out of an attempt to empower people with the knowledge they needed to start their journey. Week after week, the feedback coming to the newsletter sharpened our sense of what people need: a newsletter that is a mere collection of URLs is good enough if you have time, but it is not enough to help people dive deep into the content. People, especially technical folks with no time to experiment on their own, need someone to shed light on these results, in the form of curated comments and discussion prompts, so they can build a strong opinion about what could and what couldn’t work for them. This is why AI4DEVS took a few weeks off and comes back with a new format that will hopefully help people make the leap into AI.

A new beginning

The AI4DEVS newsletter is evolving with its tenth issue, becoming longer and shifting to a bi-weekly publication. Content will be curated by a team of data scientists, developers, and guest experts willing to support this initiative. Moreover, this initiative is conceived by the community, for the community, so the obvious choice would be to […]. Our final goal remains the same: to empower people to adopt machine learning and AI in what they love, thus building the deep transformation we believe these technologies can bring to our lives.

Trending topics

In many applications, just having a good prediction is not enough: we also need a way to understand “why” a model produced that result. Think about a model deciding who will get a loan: regulatory constraints require knowing exactly why a given application is approved or rejected. This matters because bias can creep in either through the data or through the models themselves. Bias detection is one fundamental step in explainable AI, and it has been getting more and more attention in recent months. Shashwat Siddhant explains the issues related to bias in machine learning and how to identify them.
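To make the loan example a bit more concrete, here is a minimal sketch of one common bias check, the demographic parity difference: the gap in approval rates between groups. The data, the group labels, and the function names are all hypothetical and are not taken from the linked article; real audits use richer metrics and real protected attributes.

```python
from collections import defaultdict


def approval_rates(decisions, groups):
    """Return the share of approved applications (decision == 1) per group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for decision, group in zip(decisions, groups):
        total[group] += 1
        approved[group] += int(decision)
    return {group: approved[group] / total[group] for group in total}


def demographic_parity_difference(decisions, groups):
    """Gap between the highest and the lowest group approval rate."""
    rates = approval_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = loan approved, 0 = loan rejected.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = approval_rates(decisions, groups)
gap = demographic_parity_difference(decisions, groups)
print(rates, gap)
```

A gap close to zero means the groups are approved at similar rates; a large gap is a signal to investigate further, not proof of unfairness on its own.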

Real-life use cases

One of my favorite approaches to machine learning is the practical “learn by doing.” AWS offers a great use case in this post about **vessel arrival time prediction**. Maybe ship tracking is not your job, but this concrete use case offers a great example of handling structured tabular data. Moreover, the post showcases how to approach a machine learning project: framing the problem, understanding the data, and engineering features. In this scenario, both vessel position and vessel efficiency are computed. In the end, this post is a nice and comprehensive use case for **scikit-learn**.
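As a rough illustration of what engineering features from vessel positions can look like, here is a minimal sketch in plain Python. It is not the code from the AWS post: the record fields, the `engineer_features` helper, and the definition of route efficiency (straight-line distance covered divided by distance actually sailed) are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def engineer_features(record):
    """Turn one raw position record into model-ready features (hypothetical schema)."""
    # Distance still to cover, from the current position to the destination port.
    remaining_km = haversine_km(
        record["lat"], record["lon"], record["port_lat"], record["port_lon"]
    )
    # Straight-line distance covered so far, from the origin to the current position.
    direct_km = haversine_km(
        record["origin_lat"], record["origin_lon"], record["lat"], record["lon"]
    )
    # Assumed efficiency definition: 1.0 means the vessel sailed a perfectly direct route.
    sailed_km = record["distance_sailed_km"]
    efficiency = direct_km / sailed_km if sailed_km > 0 else 0.0
    return {
        "remaining_km": remaining_km,
        "route_efficiency": efficiency,
        "speed_knots": record["speed_knots"],
    }


# Hypothetical record: a vessel halfway along a two-degree trip on the equator.
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
record = {
    "origin_lat": 0.0, "origin_lon": 0.0,
    "lat": 0.0, "lon": 1.0,
    "port_lat": 0.0, "port_lon": 2.0,
    "distance_sailed_km": 2 * one_degree_km,  # the vessel took a detour
    "speed_knots": 12.0,
}
features = engineer_features(record)
print(features)
```

Features like these could then feed any tabular regressor, for instance one from scikit-learn, to predict the remaining time to arrival.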

One of the most underrated capabilities of Amazon Rekognition is Custom Labels, which enables customizing image classification on specific datasets. A great use case for this managed service comes from Ostervall, which uses this technology to classify insurance claims from driver images. Amazon Rekognition is used to classify the type of damage and identify where the damage is within an image.

On the Natural Language Processing side, a great application comes from Daniel Wellington’s service department, which implemented Amazon Translate to reduce translation costs for its customer care services and can now answer customer issues in the customer’s own language. An application developed in less than two weeks reduced operational costs by 90% and improved team response time.

Frameworks and libraries

To all folks in love with the Rust programming language, **linfa** is a promising library to check out: a toolkit in the spirit of the well-known scikit-learn library, which enables common preprocessing tasks and classical ML algorithms such as clustering, linear learners, logistic regression, and decision trees, as well as support vector machines and Bayesian algorithms such as Naive Bayes. We all know that Python holds the lion’s share of the machine learning language market, but if I were looking for something else, a super-fast Rust implementation would be my first stop.

Engineering Machine Learning

In a recent panel, Ryan Keenan brought together Andrew Ng, Robert Crowe, Laurence Moroney, Chip Huyen, and Rajat Monga to discuss how MLOps shapes the future of machine learning production engineering.

Speaking of MLOps, Julien Simon, technical evangelist at AWS, presents the comprehensive set of tools within the Amazon SageMaker platform for managing a machine learning project. It’s almost scary to see how many tools are available to address every aspect, from data cleaning to feature engineering, model training, and bias evaluation.

Where to go from here?

AI4DEVS is coming back in two weeks with a new issue full of machine learning use cases, MLOps material, and highlights from the world of AI.

If you are interested in joining the team, send me a DM, and we’ll start a discussion.