Next Week in The Sequence:
Edge 425: Our series about SSMs dives into Mamba, the best-known SSM. We review the original Mamba paper by researchers at Carnegie Mellon University and Princeton and explore the Griptape framework for building LLM apps.
Edge 426: We discuss Gemma Scope and ShieldGemma, two new tools for interpretability and guardrailing released by Google DeepMind.
📝 Editorial: Black Forest Labs
One of the ideas I like about The Sequence is that it helps bring awareness to AI labs that may not have the media profile or the billions in fundraising of the big AI incumbents but are truly doing unique work in AI research. Today, I’d like to talk about a small startup called Black Forest Labs, which is loaded with world-class AI talent. Even though you might not have heard of Black Forest Labs, there’s a chance you’ve interacted with their work.
Have you used the new image generation features of xAI’s Grok on X? If so, then you’ve been using Black Forest Labs’ models. Grok-2’s new image generation capabilities are powered by a model called FLUX.1, created by Black Forest Labs. Who are these guys? Well, what if I told you that they were part of the team behind the famous Stable Diffusion model and also contributed to research breakthroughs like VQGAN and Latent Diffusion?
Black Forest Labs’ main model is FLUX.1, which comes in three variants:
FLUX.1 [schnell]: The fastest model, mostly used for local development and personal use.
FLUX.1 [dev]: An open-weight model for non-commercial usage.
FLUX.1 [pro]: The largest, state-of-the-art image generation model available via APIs.
The company recently raised $31 million from marquee firms like Andreessen Horowitz and General Catalyst, with participation from renowned angel investors such as Michael Ovitz and Garry Tan. Given their research talent, top-tier backers, and partnership with xAI, Black Forest Labs is one of the new startups likely to make some noise in the near future. For now, Grok-2 images are incredibly entertaining.
🔎 ML Research
Phi 3.5
Microsoft published the technical report for the Phi-3.5 family of small language models. The new release includes Phi-3.5-MoE as well as the new Phi-3.5-mini and Phi-3.5-vision models —> Read more.
FermiNet
Google DeepMind published a paper discussing FermiNet, a neural network architecture that can solve fundamental equations of quantum mechanics. FermiNet is the first neural network applied to computing the energy of atoms and molecules —> Read more.
DeepSeek-Prover-V1.5
DeepSeek-AI published a paper unveiling DeepSeek-Prover-V1.5, an LLM optimized for theorem proving. The model uses DeepSeekMath-Base as a baseline and fine-tunes it for theorem proving and proof generation using reinforcement learning —> Read more.
xGen-MM (BLIP-3)
Salesforce Research published a paper introducing xGen-MM, also known as BLIP-3, a framework for developing multimodal LLMs. The model showcases strong in-context learning capabilities and includes versions fine-tuned for instructions and safety —> Read more.
Hermes 3
Nous Research published the technical report behind its Hermes 3 family of models, which specialize in reasoning and creative capabilities. Hermes 3 scales up to 405B parameters and leverages a 128k context window —> Read more.
Speculative RAG
Google Research published a paper detailing Speculative RAG, a technique that tries to address the effectiveness vs. efficiency dilemma in RAG solutions. The method uses a RAG fine-tuned LLM to complement a generalist LLM in RAG workflows —> Read more.
🤖 AI Tech Releases
Jamba 1.5
AI21 released Jamba 1.5, a hybrid SSM-Transformer model built for long-context handling —> Read more.
NVIDIA Llama-3.1 Minitron
NVIDIA open-sourced Minitron, 4B and 8B distilled versions of Llama 3.1 —> Read more.
🛠 Real World AI
Google AI Edge's MediaPipe
Google provided a deep dive into the techniques for serving 7B parameter models in the browser —> Read more.
AI Infrastructure Videos
The videos from the @Scale AI infrastructure conference are now online —> Read more.
📡AI Radar
Midjourney released a new web experience for its image generation models.
AI ERP platform Opkey raised a $47 million Series B.
AI-blockchain startup Story raised $80 million in a new round.
AI construction platform Trunk Tools raised a $20 million Series A.
Skyfire Systems, a payments platform for AI agents, raised $8.5 million in a new round.
xAI started serving Grok 2 with SGLang, making it 2x faster.
Vectara released Portal, a new no-code interface for creating generative AI applications.
AI legend Andrew Ng is transitioning to the role of executive chairman at Landing AI.
AI agent for business travel Otto raised $6 million.
Boston Dynamics’ new Atlas can do push-ups.
AI ad platform Creatory raised a $10 million Series A.
Tidal, Google’s AI agriculture spinoff, raised new funding for its expansion plans.
AI fintech platform Magie raised $4 million.
LambdaTest introduced Kane AI, an AI testing assistant.