Next Week in The Sequence:
Edge 435: Our series about SSMs continues with a discussion of Hungry Hungry Hippos (H3), which has become one of the most important layers in SSM models. We review the original H3 paper and discuss Character.ai’s PromptPoet framework.
Edge 436: We review Salesforce’s recent work on models specialized in agentic tasks.
📝 Editorial: Meta AI’s Big Announcements
Meta held its big conference, *Connect 2024*, last week, and AI was front and center. One of the two biggest headlines was the launch of the fully holographic Orion AI glasses, one of the most important products in Meta’s ambitious and highly controversial AR strategy. In addition to the impressive first-generation Orion glasses, Meta announced that it is developing a new brain-computer interface for the next version.
The other major release at the conference was Llama 3.2, which includes smaller language models of sizes 1B and 3B, as well as larger 11B and 90B vision models. This is Meta’s first major attempt to open source vision models, signaling its strong commitment to open-source generative AI. Additionally, Meta AI announced the Llama Stack, which provides standard APIs in areas such as inference, memory, evaluation, post-training, and several other aspects required in Llama applications. With this release, Meta is transitioning Llama from isolated models to a complete stack for building generative AI apps.
There were plenty of other AI announcements at *Connect 2024*:
Meta introduced voice capabilities to its Meta AI chatbot, allowing users to have realistic conversations with the chatbot. This feature puts Meta AI on par with its competitors, like OpenAI and Google, which have already introduced voice modes to their products.
Meta announced an AI-powered, real-time language translation feature for its Ray-Ban smart glasses. This feature will allow users to translate text from Spanish, French, and Italian by the end of the year.
Meta is developing an AI feature for Instagram and Facebook Reels that will automatically dub and lip-sync videos into different languages. This feature is currently in testing in the US and Latin America.
Meta is adding AI image generation features to Facebook and Instagram. The new feature will be similar to existing AI image generators, such as Apple’s Image Playground, and will allow users to share AI-generated images with friends or create posts.
It was an impressive week for Meta AI, to say the least.
🔎 ML Research
AlphaProteo
Google DeepMind published a paper introducing AlphaProteo, a new family of models for protein design. The models are optimized to design novel, high-strength proteins that can improve our understanding of biological processes —> Read more.
Molmo and PixMo
Researchers from the Allen Institute for AI published a paper detailing Molmo and PixMo, an open-weight family of vision-language models (VLMs) and its open datasets. Molmo shows how to train VLMs from scratch, while PixMo is the core set of datasets used during training —> Read more.
Instruction Following Without Instruction Tuning
Researchers from Stanford University published a paper detailing a technique called implicit instruction tuning, which surfaces instruction-following behaviors without explicitly fine-tuning the model. The paper also suggests some simple changes to a model’s distribution that can yield that implicit instruction-tuning behavior —> Read more.
Robust Reward Model
Google DeepMind published a paper discussing how traditional reward models (RMs) struggle to separate genuine preferences from prompt-independent artifacts. The paper introduces the notion of a robust reward model (RRM) that addresses this challenge and shows strong improvements in models like Gemma —> Read more.
Real Time Notetaking
Researchers from Carnegie Mellon University published a paper outlining NoTeeline, a real time note generation method for video streams. NoTeeline generates micronotes that capture key points in a video while maintaining a consistent writing style —> Read more.
AI Watermarking
Researchers from Carnegie Mellon University published a paper evaluating different design choices in LLM watermarking. The paper also studies different attacks that result in the bypassing or removal of different watermarking techniques —> Read more.
🤖 AI Tech Releases
Llama 3.2
Meta open sourced Llama 3.2 small and medium size models —> Read more.
Llama Stack
As part of the Llama 3.2 release, Meta open sourced the Llama Stack, a series of standardized building blocks for developing Llama-powered applications —> Read more.
Gemini 1.5
Google released two updated Gemini models and new pricing and performance tiers —> Read more.
Cohere APIs
Cohere launched a new set of APIs that improve its experience for developers —> Read more.
🛠 Real World AI
Data Apps at Airbnb
Airbnb discusses Sandcastle, an internal framework that allows data scientists to rapidly prototype data-driven apps —> Read more.
Feature Caching at Pinterest
The Pinterest engineering team discusses its internal architecture for feature caching in AI recommender systems —> Read more.
📡AI Radar
Meta introduced Orion, its very impressive augmented reality glasses.
James Cameron joined Stability AI’s Board of Directors.
The OpenAI soap opera continues with the resignation of its longtime CTO and rumors that the company may change its capped-profit status.
OpenAI’s Chief Research Officer also resigned this week.
Letta, one of the most anticipated startups from UC Berkeley’s Sky Computing Lab, just came out of stealth mode with a $10 million round.
Image model platform Black Forest Labs is closing a new $100 million round.
Google announced a new $120 million fund dedicated to AI education.
Airtable unveiled a new suite of AI capabilities.
Enterprise AI startup Ensemble raised $3.3 million to tackle the data quality problem in building models.
Microsoft unveiled its Trustworthy AI initiative.
Runway plans to allocate $5 million for producing AI generated films.
Data platform Airbyte can now create connectors directly from the API documentation.
Skills intelligence platform Workera unveiled a new agent that can assess, develop, and verify skills.
Convergence raised $12 million for building AI agents with long term memory.