Machine Learning Must-Reads: Fall Edition
Getting a handle on the current state of machine learning is tricky: on the one hand, it takes time to catch up with foundational concepts and methods, even if you've worked in the field for a while. On the other hand, new tools and models keep popping up at a rapid clip. What's an ML learner to do?
We tend to favor a balanced, cumulative approach—one that recognizes that no single person can master all the knowledge out there, but that digesting well-scoped pieces of information at a steady, ongoing cadence will help you gain a firm footing in the field.
Our selection of highlights this week reflects that belief: we've chosen a few well-executed articles that cover both essential topics and cutting-edge ones, and that both beginners and more seasoned professionals can benefit from reading. Let's dive in.
- SHAP vs. ALE for Feature Interactions: Understanding Conflicting Results
  Making sense of model predictions is at the core of data professionals' work, but it's a process that is rarely straightforward. Valerie Carey's latest article focuses on a particularly thorny scenario where two explainability tools, SHAP and ALE, produce conflicting results, and expands on how to move beyond these confusing moments.
- The Olympics of AI: Benchmarking Machine Learning Systems
  Taking a cue from the athletes who first broke the 4-minute mile barrier, Matthew Stewart, PhD offers a panoramic overview of benchmarking in machine learning and explores how benchmarks facilitate innovation and improved performance: "A well-designed benchmark can guide a whole community toward breakthroughs that redefine a field."

- DINO – A Foundation Model for Computer Vision
  If you learn best by diving deep into a topic, you don't want to miss Sascha Kirch's series, which unpacks and contextualizes influential machine learning papers, one model at a time. In a recent installment, Sascha walked us through the inner workings of DINO, a foundation model based on the groundbreaking abilities of vision transformers (ViT).
- Exploring GEMBA: A New LLM-Based Metric for Translation Quality Assessment
  Machine translation isn't exactly a novel technology, but the rise of LLMs has generated new possibilities for enhancing current tools and workflows. Dr. Varshita Sher's latest article introduces us to GEMBA, a recent metric that leverages the power of GPT models to evaluate the quality of machine-translated text.
- Machine Learning, Illustrated: Incremental Learning
  For the visual learners out there, and especially those taking their first steps in the field, Shreya Rao's beginner-friendly guide to incremental learning addresses a key question: how do models maintain and build upon existing knowledge?
In the mood for branching out into other topics this week? We hope so—here are a few other recent standouts:
- If you find it difficult to make time in your busy schedule to explore new topics and expand your skill set, don't miss Zijing Zhu's guide to forming healthy continuous-learning habits as a data scientist.
- Bridging the gap between existing conversational-AI tools and real-world, user-facing UI systems is a serious challenge; Janna Lipenkova's deep dive provides a detailed roadmap to help you get there.
- What would a functional, useful AI ethics toolkit look like? Malak Sadek shares helpful insights based on a design-oriented workshop she recently led.
- For marketing- and business-focused data scientists, Damian Gil outlines several advanced customer-segmentation techniques (including one that relies on the power of LLMs) that can help you produce valuable insights.
- Governments around the world have several attempts underway to regulate AI tools. Viggy Balagopalakrishnan reflects on their shortcomings and advocates for a more pragmatic, mechanism-based approach.
Thank you for supporting our authors' work! If you enjoy the articles you read on TDS, consider becoming a Medium member – it unlocks our entire archive (and every other post on Medium, too).