Maybe Two Big Research Breakthroughs or Maybe Nothing
Multi-token prediction and a multi-layer perceptron alternative.
Next Week in The Sequence:
Edge 393: Our series about autonomous agents starts diving into planning! We evaluate the ADaPT planning method from Allen AI and XLang Agents framework.
Edge 394: We discuss the amazing Jamba model that combines transformers and SSMs in a single architecture!
You can subscribed to The Sequence below:
📝 Editorial: Maybe Two Big Research Breakthroughs or Maybe Nothing
Research breakthroughs always command a lot of attention in the generative AI space, as they can drive a new wave of innovation for foundation models. However, most of the so-called breakthroughs in the research sphere rarely make it into real implementations, so it's hard to determine whether a new breakthrough will really be influential or not.
Last week was particularly interesting because we saw two papers that, on the surface, seem to be quite a big deal for generative AI models, but it's quite hard to determine if they are ready for prime-time.
1. Multi-Token Predictions
In a paper titled "Better & Faster Large Language Models via Multi-token Prediction," several AI labs, including Meta AI, proposed a method for multi-token prediction in LLMs. It is a known fact that LLMs work by predicting one token at a time, so this seems like a big leap forward. In theory, the method works by predicting 'n' tokens using 'n' independent output heads. One of the tests with a 13B parameter model seems to solve 12% more problems on the HumalEval dataset than next-token model equivalents. One immediate limitation is that this only seems to be effective in very large models but still seems quite an improvement.
2. MLP Alternative
The multi-layer perceptron (MLP) is, arguably, the most important algorithm in the history of deep learning. In a paper titled "KAN: Kolmogorov-Arnold Networks (KANs)," researchers from MIT proposed an audacious alternative to MLP. The main innovation relies on using learnable activation functions on weights instead of the fixed activation functions on nodes in MLP. While conceptually sound, KANs have only been evaluated in a few targeted use cases.
Both the multi-token prediction and the KAN paper are clear indicators of the amazing speed of progress in AI research. Both papers are trying to disrupt established techniques. They both seem to be relevant breakthroughs on the surface. Or maybe not 😊"
🔥 Announcing Galileo Protect: Real-Time Hallucination Firewall
Can you stop hallucinations in real-time? We’re excited to support Galileo Protect – an advanced GenAI firewall that intercepts hallucinations, prompt attacks, security threats, and more!
See Galileo Protect live in action and learn:
- How Galileo Protect works, including our research-backed metrics
- The key features of an enterprise GenAI firewall
- What makes Galileo Protect unique
🔎 ML Research
LLM Juries vs. Judges
Cohere research published a paper proposing a technique to evaluate LLMs using a panel of diverse models. The method looks to mitigate the bias of single-model evaluation techniques and shows a strong performance across different evaluation benchmarks —> Read more.
Iterative Reasoning
Researchers from MetaAI and New York University published a paper outlining an iteractive reasoning technique for LLMs. The essence of the method is to optimize across different chain-of-thought(CoT) options to lead to the correct answer —> Read more.
Gemini in Medicine
Google DeepMind published a paper introducing Med-Gemini, a multimodal version of Gemini specialized in medical tasks. Med-Gemini was evaluated across 14 medical benchmarks outperforming GPT-4 in several of them —> Read more.
Multi-Token Prediction
Researchers from Meta AI, CERMICS Ecole des Ponts Paris Tech and LISN Université Paris-Saclay published a paper proposing an intriguing technique for multi-token prediction in LLMs. At each position of the training corpus, the method can predict a number of future tokens —> Read more.
MLP Prediction
Researchers from Massachusetts Institute of Technology, California Institute of Technology and Northeastern University published a paper introducing e KolmogorovArnold Networks (KANs), promising alternatives to Multi-Layer Perceptrons (MLPs). The core idea is to switch from the MLP’s fixed actiavation functions on neurons to learnable activation functions on weights —> Read more.
Math Benchmark Contamination
Researchers from Scale AI created a new benchmark for elementary math reasoning similar to GSM8k. The results show that several top models such as Mistral and Phi might be overfitting for existing math benchmarks while others such as Gemini, GPT-4 and Claude aren’t —> Read more.
🤖 Cool AI Tech Releases
XTuner
Xtuner is a new toolkit for fine-tuning LLMs that is gaining rapid traction —> Read more.
Amazon Q
Amazon Q, its Copilot alternative, hit general availability —> Read more.
Claude App
Anthropic released a new iOS App for Claude as well as a new teams subscription plan.
Jamba Instruct
AI21 released Jamba Instruct, an instruction fine-tuned version of their Jamba model that combines SSMs and transformers —> Read more.
🛠 Real World ML
Airbnb Brandometer
Airbnb discusses their AI language techniques to understand brand perception in social media channels —> Read more.
Amazon AI-Enhanced Catalogue Data
Amazon dives into the ML techniques used to evaluate the effectiveness of AI-enhaced catalogue data —> Read more.
📡AI Radar
GPU platform CoreWeave just announced a $1.1 billion Series C at a monster $19 billion valuation.
OpenAI and The Financial Times announced a strategic alliance.
LLM fine-tuning platform Lamini raised $25 million in a new round.
Atlassian unveiled Rovo, a new AI assistant for learning and taking actions across its product suite.
Crypto stablecoin maker Tether invested $200M in a brain computer interface startup.
MongoDB announced a new AI application program, a new release of its vector DB engine and a partnership with Cohere.
Ideogram announced a new pro subscription tier.
Robotics maker Sanctuary AI announced a new partnership with Microsoft.
SafeBase raised $33 million to enhance security reviews using AI.
Yelp launched a new AI assistant to help users connect with service professionals.
Vercel acquired ModelFusion to improve its AI capabilities in TypeScript.