Edge 267: A Summary of our Machine Learning Interpretability Series
11 issues that cover the fundamental topics in machine learning interpretability.
Over the last few weeks, we have been deep diving into different machine learning (ML) interpretability concepts, techniques, and technologies. ML interpretability is essential to the future of AI as models become bigger and harder to understand. From a value proposition standpoint, interpretability brings four clear benefits to ML solutions.
Causality: Interpretable ML models should make clear the relationships between the input variables and the possible outcomes.
Transferability: One of the benefits of explainable ML models is the ability to understand how model weights adapt to changes in the training environment, which is essential for generalizing across different scenarios.
Informativeness: ML models that are easily interpretable allow us to understand how specific features or intermediate layers can influence the final prediction.
Fairness: Interpretability is essential to foster fairness and mitigate bias in ML models.
One important characteristic of ML interpretability methods is whether their explanations depend on the model’s internal structure. From that perspective, there are two main groups of interpretability methods:
i. Model Agnostic: By far the most important group of ML interpretability methods, these techniques treat ML models as black boxes and ignore their internal architecture. Instead, model-agnostic interpretability methods rely on the model’s inputs, features, and outputs to explain its behavior.
ii. Model Specific: An alternative group of techniques is optimized for specific model architectures and assumes prior knowledge of the model's internals.
Additionally, ML interpretability methods can also be classified based on the scope of their explanations; the sketch after this list illustrates both dimensions.
i. Local: These interpretability techniques derive explanations from the outputs of individual predictions.
ii. Global: Interpretability methods that attempt to explain the complete behavior of a model.
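To make both dimensions concrete, here is a minimal sketch, assuming a generic tabular regression model, that contrasts a global model-agnostic explanation (permutation importance over the whole dataset) with a local one (SHAP attributions for a single prediction). The dataset, model, and library choices (scikit-learn and the shap package) are illustrative assumptions rather than setups prescribed in the series.

```python
# A minimal sketch contrasting global vs. local model-agnostic explanations.
# The dataset, model, and libraries (scikit-learn, shap) are illustrative
# assumptions, not the specific setups covered in the series.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
import shap

# Train a model that we will treat as a black box.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

# Global, model-agnostic: permutation importance only queries the model's
# predictions and summarizes each feature's impact across the whole dataset.
global_result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print("Global feature importances:", global_result.importances_mean)

# Local, model-agnostic: passing only the predict function (not the model
# internals) to SHAP yields attributions that explain a single prediction.
explainer = shap.Explainer(model.predict, X)
local_explanation = explainer(X[:1])
print("Local attributions for one prediction:", local_explanation.values[0])
```

Both techniques interact with the model only through its predictions, which is what makes them model agnostic; a model-specific alternative would inspect the forest’s trees or a network’s weights directly. Likewise, the permutation importances summarize the model globally, while the SHAP values are local to a single instance.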
Our ML interpretability series tried to provide a holistic yet deep view of the state of the art in ML interpretability. Here is a quick recap:
1) Edge 245 provides an intro to the ML interpretability series; it deep dives into Uber’s Manifold, a visual framework for interpreting ML models, and gives an overview of Meta AI’s Captum interpretability framework.
2) Edge 247 discusses a taxonomy for ML interpretability methods; explores Google’s famous paper about the building blocks of interpretability and reviews Microsoft Research’s TensorWatch.
3) Edge 249 provides some perspectives on the debate between model-intrinsic vs. post-hoc interpretability methods; discusses the research behind OpenAI’s activation atlases, a very effective ML interpretability method for computer vision models; and provides an overview of TensorFlow’s TensorBoard.
4) Edge 251 provides an overview of the most common global model-agnostic interpretability methods; it discusses OpenAI research on student-teacher models for interpretability and the Lucid Library.
5) Edge 253 discusses a popular global model-agnostic interpretability method known as partial dependence plots; the research behind the temporal fusion transformer model for interpretable time series forecasting; and Google’s Fairness Indicators toolkit.
6) Edge 255 continues exploring global model-agnostic interpretability techniques with an overview of the accumulated local effects (ALE) method; it also reviews the research behind OpenAI’s Microscope and discusses IBM’s AI Explainability 360 toolkit.
7) Edge 257 reviews the concept of local model-agnostic interpretability methods; deep dives into the research behind IBM’s ProfWeight interpretability technique and reviews the InterpretML framework.
8) Edge 259 discusses a popular local model-agnostic interpretability method known as SHapley Additive exPlanations (SHAP); it reviews MIT’s taxonomy for interpretable features in ML models and Berkeley’s iModels framework.
9) Edge 261 explores the popular Local interpretable model-agnostic explanations (LIME) method; it reviews a paper from Meta AI suggesting that interpretable neurons might affect the accuracy of neural networks and discusses the Alibi interpretability framework.
10) Edge 263 continues exploring local model-agnostic interpretability techniques with an overview of counterfactual explanations; it presents Google’s StylEx method for deriving visual explanations and reviews Microsoft’s original implementation of the DiCE method.
11) Edge 265 explores the state of interpretability methods optimized for deep neural networks; it reviews a research paper that details how OpenAI used interpretability methods to discover unknown properties of the CLIP model and provides an overview of the ELI5 framework.
I hope you enjoyed this series as much as we did. Next Tuesday, we will be starting another series about one of the hottest trends in machine learning.