Yann LeCun's Vision Starts Materializing
On Sundays, The Sequence Scope brings a summary of the most important research papers, technology releases and VC funding deals in the artificial intelligence space.
A Personal Thank You Note
Today marks the 450th edition of The Sequence, which includes 300 editions of The Sequence Edge (Tues-Thu) and 150 editions of The Sequence Scope (Sundays), plus numerous extra editions with interviews, sponsored content, etc. In an era when there are plenty of newsletters about AI news, we try to maintain a high bar by focusing on unique, deep technical content about AI research and tech. Today, we count over 160,000 subscribers, including many of the top AI labs in the world. It’s been a privilege to write for you for three years, and we can’t thank you enough for your trust and support. Now let’s go with today’s newsletter.
Jesus Rodriguez
Next Week in The Sequence:
Edge 301: Our series about new techniques in foundation models continues with an overview of retrieval-augmented foundation models. We discuss Google Research’s paper about REALM, the original retrieval-augmented foundation model, and the new version of the Ray platform that includes support for LLMs.
Edge 302: We deep dive into MPT-7B, an open source LLM that supports 65k tokens.
📝 Editorial: Yann LeCun's Vision Starts Materializing
With all the hype surrounding generative AI, we sometimes overlook the thrilling advancements in other areas of the deep learning ecosystem. One area that holds immense promise is self-supervised learning (SSL), which aims to mimic the learning processes of infants who begin with an innate understanding of the world and further develop it through experimentation. No one champions SSL architectures quite like Yann LeCun, Meta AI's Chief AI Scientist, Turing Award winner, and legendary figure in the field of AI. Last year, Mr. LeCun presented a vision for a novel SSL-based AI architecture that enables models to learn rapidly by creating representations of the world and adapting to unforeseen circumstances. Just a few days ago, Meta AI unveiled the first model based on Mr. LeCun's vision.
The Image Joint Embedding Predictive Architecture (I-JEPA) is a computer vision model that builds an internal model of the surrounding world by comparing abstract representations of images instead of directly comparing individual pixels. Notably, I-JEPA exhibits remarkable proficiency across various computer vision tasks, surpassing other commonly used models in terms of computational efficiency. The essence of I-JEPA (and similar models) lies in recognizing that humans effortlessly acquire a substantial amount of background knowledge about the world through passive observation alone. This wealth of common-sense information is considered crucial for enabling intelligent behavior, including efficient acquisition of new concepts, grounding, and planning. I-JEPA operates on the principle of predicting missing information within an abstract representation that is closer to the general understanding humans have of the world. In contrast to generative approaches that make predictions at the pixel or token level, I-JEPA focuses on abstract prediction targets, potentially disregarding unnecessary pixel-level details and enabling the model to grasp more semantic features.
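To make the contrast with pixel-level prediction concrete, here is a minimal toy sketch in Python. It is not Meta AI's implementation: the random linear "encoder", the mean-pooled context, the identity predictor, and every function name below are assumptions chosen only to show where the loss is computed. One patch is hidden, its embedding is predicted from the visible patches, and the error is measured between embeddings rather than between pixels.

```python
import random


def make_encoder(in_dim, out_dim, seed):
    """Toy stand-in for an image encoder: a fixed random linear map (hypothetical)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(patch):
        # Project a flat list of pixel values into a smaller embedding.
        return [sum(w * x for w, x in zip(row, patch)) for row in weights]

    return encode


def mean_vector(vectors):
    """Average a list of equally sized vectors element by element."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Sum of squared differences between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def embedding_space_loss(patches, target_index, context_encoder, target_encoder, predictor):
    """Hide one patch, predict its embedding from the visible ones, compare embeddings."""
    context = [p for i, p in enumerate(patches) if i != target_index]
    context_embedding = mean_vector([context_encoder(p) for p in context])
    predicted = predictor(context_embedding)
    target = target_encoder(patches[target_index])
    # The error lives in representation space; raw pixels are never reconstructed.
    return squared_distance(predicted, target)


# Four toy "patches" of four pixel values each (made-up numbers).
patches = [
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.1, 0.4, 0.3],
    [0.1, 0.2, 0.3, 0.4],
    [0.3, 0.3, 0.2, 0.2],
]
encoder = make_encoder(in_dim=4, out_dim=2, seed=0)
identity_predictor = lambda v: list(v)  # placeholder for a learned predictor

loss = embedding_space_loss(patches, 2, encoder, encoder, identity_predictor)
print(loss)
```

In a real system the encoders and predictor would be trained networks and the loss would drive their updates; the sketch only fixes the idea that the prediction target is an abstract representation of the hidden region.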
Given the remarkable progress in generative AI, it is challenging to envision architectures that could replace transformers as the next AI model powerhouse. SSL and Mr. LeCun's vision, however, stand as strong contenders, and we can anticipate Meta AI doubling down on the ideas behind I-JEPA.
🔎 ML Research
I-JEPA
Meta AI Research published a paper unveiling the Image Joint Embedding Predictive Architecture (I-JEPA), a computer vision model based on their vision of human-like AI systems. I-JEPA uses self-supervised learning to create an internal model of the outside world by comparing abstract representations of images —> Read more.
Imagen Editor and EditBench
Google Research published a paper outlining Imagen Editor and EditBench, two advanced techniques for text-guided image inpainting. Imagen Editor is a masked inpainting technique, while EditBench is a method for evaluating the quality of image editing models —> Read more.
Orca
Microsoft Research published a paper detailing Orca, a model for reasoning in LLMs. Orca is a 13-billion-parameter model that learns to imitate the reasoning of LLMs through a complex process of sampling and selection —> Read more.
Honest LLaMA
Researchers from Harvard University published a paper detailing Inference-time Intervention (ITI), a technique designed to enhance the truthfulness of large language models. The paper demonstrates how to apply ITI to different versions of the LLaMA, Alpaca and Vicuna models —> Read more.
🤖 Cool AI Tech Releases
MIMIC-IT
AI researchers from Microsoft and the Nanyang Technological University, Singapore, open sourced MultI-Modal In-Context Instruction Tuning (MIMIC-IT), a dataset with 2.8 million multimodal instructions for fine-tuning foundation models —> Read more.
🛠 Real World ML
Recommendation Systems at Pinterest
Pinterest discusses the ML techniques for multi-task predictions for closeup recommendations —> Read more.
Data Management at Airbnb
Airbnb disclosed some details about Metis, its latest data management platform —> Read more.
📡 AI Radar
NLP labeling platform Datasaur unveiled a new set of features supporting the evaluation of LLMs.
Primer Technologies announced a $69 million funding round to accelerate the development of AI solutions for the U.S. Government.
Video generation platform Synthesia announced a $90 million investment.
Hugging Face and AMD announced a strategic alliance for accelerating foundation models on GPU platforms.
Accenture announced that it is committing $3 billion to accelerate AI adoption.
Salesforce unveiled AI Cloud, a suite of generative AI services and applications for different areas such as content or code generation.
OctoML launched OctoAI, a platform for running, tuning and scaling generative AI models.
Training data platform Refuel AI announced $5 million in new funding.
Data lakehouse platform Dremio unveiled new generative AI capabilities.
AI monitoring platform Deepchecks raised $14 million in a new round.
ML monitoring startup WhyLabs unveiled LangKit, a new platform for safeguarding LLMs.
Microsoft unveiled Copilot for Dynamics 365.
Integration platform Workato announced new language capabilities powered by OpenAI.