More Foundation Models from Stability AI

Every Sunday, The Sequence Scope brings a summary of the most important research papers, technology releases and VC funding deals in the artificial intelligence space.

Jul 30, 2023

Image: Created with SDXL (via PetaPixel)

Next Week in The Sequence:

Edge 313: Our popular series about foundation models continues with an overview of multimodal chain-of-thought (CoT) reasoning in LLMs, a review of Amazon’s original paper on multimodal CoT, and an overview of the hot Open Assistant project.
Edge 314: It’s time for a deep dive into Meta AI’s recently released Llama 2 model.


📝 Editorial: More Foundation Models from Stability AI

Stability AI has established itself as one of the prominent forces in the open-source generative AI space. Most people associate Stability AI with Stable Diffusion, the text-to-image model that opened the floodgates to open-source innovation in foundation models. However, that association captures only a narrow slice of Stability AI’s real capabilities. For the last few months, Stability AI has been on a release spree that has seen open-source contributions in language, computer vision, datasets, and other key areas of the generative AI landscape.

Last week was a big one for Stability AI. The open-source generative AI champion announced the release of SDXL 1.0, an upgraded version of its text-to-image model with a distinctive dual-network architecture: a 3.5B parameter base model paired with a refiner, for a 6.6B parameter ensemble pipeline. The first network generates noisy latents, which are then refined by the second network. The results are astonishing. SDXL not only generates images that rival Midjourney in quality but can also handle complex tasks such as spatial object positioning and complex concept recognition.

In addition to the SDXL release, Stability AI also open-sourced Stable Beluga 1 and Stable Beluga 2 (originally codenamed FreeWilly), two instruction-following language models. The models are based on LLaMA 65B and Llama 2 70B, respectively, and were fine-tuned using a methodology inspired by Microsoft’s Orca paper. At the time of writing, Stable Beluga 2 sits at the top of the Open LLM Leaderboard.

Stability AI's pace of open-source releases across multiple domains is beyond impressive. SDXL and the Stable Beluga models represent important milestones for the open-source foundation models space.

🔎 ML Research

Chain of Hindsight

Researchers from the University of California, Berkeley proposed a new method for fine-tuning LLMs on human preferences called Chain of Hindsight. The technique allows LLMs to learn from any form of feedback regardless of polarity by converting that feedback into sentences used to fine-tune the model —> Read more.
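As a rough illustration of the idea, the sketch below turns a prompt plus rated responses into a single training sentence, so that both positive and negative feedback end up as plain text. The function name and the feedback templates are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical feedback templates (assumption); the paper's exact wording may differ.
POSITIVE_PREFIX = "A good response is:"
NEGATIVE_PREFIX = "A bad response is:"


def to_hindsight_sentence(prompt: str, rated: List[Tuple[str, bool]]) -> str:
    """Build one fine-tuning sentence from a prompt and rated responses.

    Each (response, is_good) pair is prefixed with natural-language feedback,
    so responses of either polarity can be used as training text.
    """
    parts = [prompt.strip()]
    for response, is_good in rated:
        prefix = POSITIVE_PREFIX if is_good else NEGATIVE_PREFIX
        parts.append(f"{prefix} {response.strip()}")
    return " ".join(parts)


example = to_hindsight_sentence(
    "Summarize: The cat sat on the mat all day.",
    [("Cat.", False), ("A cat spent the day sitting on a mat.", True)],
)
print(example)
```

Sentences built this way would then be fed to an ordinary fine-tuning loop; that step is omitted here.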

Source-Free Domain Adaptation

Google Research published a comprehensive paper exploring source-free domain adaptation (SFDA) methods for bioacoustics. SFDA methods typically enable the adaptation of a pretrained model to a new environment using only unlabeled data —> Read more.

RT-2

DeepMind published a paper outlining Robotic Transformer 2 (RT-2), a vision-language-action (VLA) model that learns from web and robotic data and translates that knowledge into actions in a given environment. The research builds on its predecessor (RT-1) but shows important improvements in semantic and visual understanding —> Read more.

DECKARD

Researchers from the University of California, Irvine and AI2 published a paper detailing DECKARD, a method that uses LLMs to train reinforcement learning (RL) agents. DECKARD was tested in an LLM-guided exploration of Minecraft with amazing results —> Read more.

Generative AI Best Practices

Google Research published a paper detailing three key principles for building responsible generative AI applications. The principles encompass areas such as design, communication and adversarial testing —> Read more.

ICML Papers

Tech giants published summaries of their papers submitted to the ICML conference. Check out the summaries from Google, Microsoft and Amazon.

🤖 Cool AI Tech Releases

SDXL 1.0

Stability AI open sourced SDXL 1.0, a new text-to-image model with significant enhancements in open image generation —> Read more.

Beluga

Stability AI open sourced Stable Beluga 1 and Stable Beluga 2, two LLaMA-based instruction-following LLMs that show state-of-the-art reasoning capabilities —> Read more.

TensorFlow 2.13

A new version of TensorFlow is here with some minor changes for the core stack and Keras —> Read more.

Agents.js

Hugging Face unveiled Agents.js, a library for giving LLMs access to tools using JavaScript —> Read more.

Quivr

Quivr is an open source framework for storing and retrieving unstructured data using generative AI —> Read more.

🛠 Real World ML

Jupyter at Yelp

The Yelp engineering team discusses the architecture powering experimentation with Jupyter notebooks in their platform —> Read more.

Fast Data Access at Airbnb

Airbnb shares some details about Riverbed, their stack to enable fast data access across their infrastructure —> Read more.

ONNX at SIMZERO

The SIMZERO team shared how they used the ONNX Runtime to enable machine learning for scientific workloads —> Read more.

📡 AI Radar

Anthropic, Google, Microsoft, and OpenAI announced the creation of the Frontier Model Forum, a new industry body focused on AI safety.
HumanFirst raised $5 million in new funding to transform existing conversational data into AI agents.
Cybersecurity context-search platform Cyclops emerged from stealth mode with $6.4 million in funding.
Graft, a lightweight AI infrastructure platform, raised $10 million in a new funding round.
X.AI has added AI-risk thought leader Dan Hendrycks to its advisory board.
ServiceNow and NVIDIA announced the creation of the AI Lighthouse program to accelerate generative AI adoption in the enterprise.
Box announced its integration with Microsoft Copilot.
AWS expanded its Bedrock platform with additional generative AI models.
In an AI cybersecurity deal, CrowdStrike is inching closer to acquiring Bionic.
AutoGenAI, a generative AI platform for product pitches, announced that it has raised $22.3 million.
Beyond Work, a generative AI platform led by the CEO of TradeShift, raised a $2.5 million pre-seed round.
DoorDash is reportedly working on a chatbot for food ordering.
