LLM Scaling Laws vs. Everything Else
Every Sunday, The Sequence Scope brings you a summary of the most important research papers, technology releases and VC funding deals in the artificial intelligence space.
Next Week in The Sequence:
Edge 335: Our series about fine-tuning in foundation models continues with one of the most famous techniques ever created: LoRA. We discuss the original LoRA paper and the LoRA for Diffusers framework.
Edge 336: We discuss OPRO, Google DeepMind’s highly innovative prompt optimization method.
You can subscribe below:
📝 Editorial: LLM Scaling Laws vs. Everything Else
The prevailing mantra in recent years for large language models (LLMs) has been "bigger is better." Reality has demonstrated that LLMs truly shine at scale, revealing emergent capabilities that were not envisioned during pretraining. Nevertheless, counter-theories have recently emerged suggesting that LLM growth is reaching a plateau and that a new generation of models will ultimately converge toward more manageable sizes. Techniques such as distillation, RAG, quantization and, of course, data quality curation have been developed to empower smaller, more efficient models.
The question arises: are the LLM scaling laws approaching their limits? Which school of thought is correct?
While optimization methods undoubtedly contribute to the creation of more efficient, compact models, there is currently no empirical evidence to suggest that we are anywhere near experiencing diminishing returns within the realm of LLM scaling laws. Quite the contrary, new scaling frontiers remain perfectly attainable with the current generation of transformer architectures. Presently, the cost of pretraining a large-scale LLM typically ranges in the double-digit millions. In the near future, we may witness this cost escalate into the hundreds of millions or even billions. At such scales, LLMs are likely to exhibit properties that are difficult to fathom today. This trajectory can be accelerated by ongoing hardware breakthroughs, which seem to occur annually.
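For readers who want a concrete sense of what the scaling laws say, below is a minimal Python sketch of the Chinchilla-style parametric loss curve reported by Hoffmann et al. (2022). The constants are the paper's published point estimates and are included purely for illustration; this is not a claim about any specific model mentioned in this issue.

```python
# Minimal sketch of a Chinchilla-style scaling law (Hoffmann et al., 2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
# where N is the parameter count and D is the number of training tokens.
# The constants below are the published point estimates; treat them as illustrative.

E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for a model with n_params parameters
    trained on n_tokens tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling up parameters and data keeps lowering the predicted loss,
# which is the sense in which the scaling laws have not yet hit a wall.
for n, d in [(7e9, 1.4e12), (70e9, 1.4e12), (70e9, 2.8e12)]:
    print(f"N={n:.0e}, D={d:.0e} -> predicted loss ~ {predicted_loss(n, d):.3f}")
```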
Challenging the value of the scaling laws in the current generation of LLMs is not only foolish but also factually incorrect. Similar to any other phenomenon in physics, there will come a day when we reach the limits of the scaling laws. However, that day is not today.
📺 To Watch: Vector Database Fundamentals
These short videos explain vector index types like HNSW, ANNOY and IVF, and vector similarity metrics including Euclidean distance, cosine similarity, inner product and more.
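As a quick refresher, here is a minimal NumPy sketch (our own illustration, not taken from the videos) of the three similarity metrics mentioned above.

```python
import numpy as np

# Two toy embedding vectors (hypothetical values, for illustration only).
a = np.array([0.1, 0.3, 0.5])
b = np.array([0.2, 0.1, 0.4])

# Euclidean (L2) distance: smaller means more similar.
euclidean = np.linalg.norm(a - b)

# Inner (dot) product: larger means more similar; sensitive to vector magnitude.
inner = float(np.dot(a, b))

# Cosine similarity: inner product of the normalized vectors, in [-1, 1].
cosine = inner / (np.linalg.norm(a) * np.linalg.norm(b))

print(f"euclidean={euclidean:.4f}, inner={inner:.4f}, cosine={cosine:.4f}")
```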
🔎 ML Research
Who is Harry Potter?
Researchers from Microsoft published a paper exploring a technique to fine-tune LLMs to unlearn specific concepts. The paper evaluates the process of fine-tuning Llama2-7b to forget all knowledge of the Harry Potter books —> Read more.
LLaVA
Researchers from the University of Wisconsin-Madison, Microsoft Research and Columbia University published a paper detailing LLaVA, an instruction-tuned language-vision model. LLaVA extends Vicuna with a vision encoder in an architecture very similar to GPT-4 Vision —> Read more.
Stable Signature
Meta AI published a paper introducing Stable Signature, a method for watermarking generative AI images. Stable Signature embeds information in the image that is invisible to the naked eye but can be verified —> Read more.
RAG vs. Large Context in LLMs
Researchers from NVIDIA published a paper detailing a study that evaluates the performance of RAG vs. long context windows in LLMs. The research shows that 4k RAG-augmented models can achieve similar performance to 16k-context models, among other fascinating findings —> Read more.
SCREWS
AI researchers from ETH Zurich and Microsoft Semantic Machines present SCREWS, a new reasoning framework for LLMs. The technique combines different reasoning building blocks such as sampling, conditional resampling, selection and several others —> Read more.
🤖 Cool AI Tech Releases
SteerLM
NVIDIA open sourced SteerLM, a framework for customizing LLMs during inference —> Read more.
Zephyr-7B
Hugging Face unveiled Zephyr 7B, a fine-tuned version of Mistral that outperforms Llama-70B across different benchmarks —> Read more.
🛠 Real World ML
Trusted Notebooks at Salesforce
Salesforce engineering discusses their access control solution for securing notebooks in data science workflows —> Read more.
📡 AI Radar
Character.AI announced a new feature that enables group chats with AI agents.
Adobe announced new generative AI tools powered by Firefly.
AMD announced that it has acquired Nod.ai to expand its AI software capabilities.
Data cloud platform Modal came out of stealth mode with a $16 million series A.
Japan is set to propose some AI regulatory guidelines for G7 countries.
China is planning to boost its compute capacity for AI workloads.
Saronic raised $55 million to build autonomous ships.
Conveyor raised $12.5 million to streamline security reviews using LLMs.
Anysphere raised $8 million to build an AI native software development environment.
Didi’s autonomous driving arm announced that it has raised $149 million in new funding.
LLM security startup Lakera came out of stealth mode with $10 million in backing.