Edge 438: Meet DataGemma: Google DeepMind's Effort to Ground LLMs in Factual Knowledge
The model is accompanied by DataCommons, a large repository of factual data.
Grounding large foundation models such as LLMs in factual data is one of the biggest challenges of the current wave of AI systems. From reducing hallucinations to extending LLMs to mission-critical applications, validating LLM outputs against trustworthy data is rapidly becoming one of the most important building blocks of LLM applications. This is the topic of recent research from Google DeepMind that resulted in DataGemma, a series of open models that validate knowledge against a large factual data repository known as DataCommons. DataGemma is the latest addition to DeepMind’s Gemma family, its initiative around small language models.