Edge 426: Reviewing Google DeepMind’s New Tools for AI Interpretability and Guardrailing

Gemma Scope and ShieldGemma are some of the latest additions to DeepMind’s Gemma stack

Aug 29, 2024

∙ Paid

Google’s Gemma is one of the most interesting efforts in modern generative AI pushing the boundaries of small language models(SLMs). Unveiled last year by Google DeepMind, Gemma is a family of SLMs that achieved comparable performance to much larger models. A few days ago, Google released some additions to Gemma 2 that included a 2B parameter model but also two new tools that address some of the major challenges with foundation model adoption: security and interpretability.

The release of Gemma 2 provides an interpretability tool called GemmaScope and an approach to guardrailing by using an ML classifier called ShieldGemma.

TheSequence

Edge 426: Reviewing Google DeepMind’s New Tools for AI Interpretability and Guardrailing

Gemma Scope and ShieldGemma are some of the latest additions to DeepMind’s Gemma stack

Gemma Scope

This post is for paid subscribers