Edge 426: Reviewing Google DeepMind’s New Tools for AI Interpretability and Guardrailing
Gemma Scope and ShieldGemma are some of the latest additions to DeepMind’s Gemma stack
Google’s Gemma is one of the most interesting efforts in modern generative AI, pushing the boundaries of small language models (SLMs). Unveiled by Google DeepMind in February 2024, Gemma is a family of SLMs that achieve performance comparable to much larger models. A few days ago, Google released additions to Gemma 2 that included not only a 2B-parameter model but also two new tools that address some of the major challenges in foundation model adoption: security and interpretability.
The new release adds Gemma Scope, an interpretability tool, and ShieldGemma, an ML classifier used for guardrailing.