AWS’ Generative AI Strategy Starts to Take Shape and Looks a Lot Like Microsoft’s
AWS re:Invent was inundated with generative AI announcements.
Next Week in The Sequence:
Edge 349: We are almost at the end of our series about fine-tuning, and we are going to discuss the nascent space of reinforcement learning with AI feedback (RLAIF). We review the original RLAIF paper and NVIDIA’s NeMo framework.
Edge 350: We review Hugging Face’s Zephyr model, which has quickly become one of the most robust open source LLMs on the market.
You can/should/must subscribe below:
📝 Editorial: AWS’ Generative AI Strategy Starts to Take Shape and Looks a Lot Like Microsoft’s
The AWS re:Invent conference has long been regarded as the premier event of the year for cloud computing. The 2023 edition, however, was notably dominated by generative AI announcements, shedding light on AWS’s strategy in this area, which had previously been questioned. For years, Amazon was perceived as lagging behind cloud computing rivals Microsoft and Google in generative AI. In fact, in many earnings calls, generative AI has been highlighted as a trend through which Microsoft could surpass AWS as the leading cloud computing platform. re:Invent demonstrated that AWS is determined to compete, and while its strategy may not be unique, it appears to be robust.
The re:Invent announcements spanned a broad spectrum. Bedrock has emerged as the cornerstone of AWS’s generative AI strategy, now supporting Anthropic’s Claude 2.1 and open-source models like Llama 2. AWS also unveiled smaller, specialized models such as Titan Text Lite, Titan Text Express, and Titan Image Generator, which focus on summarization, text generation, and image generation, respectively. The support for large language models (LLMs) became even more compelling with the release of Titan Multimodal Embeddings, enabling multimodal search capabilities.
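For a sense of the developer experience, here is a minimal sketch of invoking Titan Text Express through the Bedrock runtime API with boto3. The model ID and payload shape follow Bedrock’s public documentation, but model availability varies by region and account, so treat this as illustrative rather than definitive:

```python
# Minimal sketch of calling Titan Text Express via Bedrock with boto3.
# Model ID and request/response schema follow public Bedrock docs; verify
# what is enabled in your region and account before relying on them.
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1",
    body=json.dumps({
        "inputText": "Summarize the main generative AI announcements at re:Invent 2023.",
        "textGenerationConfig": {"maxTokenCount": 256, "temperature": 0.5},
    }),
)
result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])
```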
An area that caught my attention was the enhanced support for RAG and agents. Bedrock now allows developers to integrate their own data sources to build RAG applications. Additionally, Amazon Q, an agent capable of performing various developer and DevOps operations, supports native integration with AWS services. AWS also introduced capabilities in model evaluation and data sharing, crucial for generative AI applications. Notably, there was also news on AI chips, with the launch of AWS Graviton4 and AWS Trainium2, optimized for generative AI workloads.
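To make the RAG support concrete, the toy sketch below shows the basic retrieve-then-generate pattern using Titan Embeddings for retrieval. The inline document list stands in for the data sources Bedrock can now connect to, and the embedding model ID is an assumption from public docs; this illustrates the pattern, not Bedrock’s own knowledge-base API:

```python
# Toy retrieve-then-generate loop: embed documents with Titan Embeddings,
# pick the closest one by cosine similarity, and ground the prompt with it.
import json

import boto3
import numpy as np

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> np.ndarray:
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",  # assumed embedding model ID
        body=json.dumps({"inputText": text}),
    )
    return np.array(json.loads(resp["body"].read())["embedding"])

docs = [
    "Bedrock now supports Anthropic's Claude 2.1 and open-source models.",
    "Graviton4 and Trainium2 are optimized for generative AI workloads.",
]
query = "Which AWS chips target AI workloads?"

doc_vecs = [embed(d) for d in docs]
q = embed(query)
sims = [float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q))) for v in doc_vecs]
context = docs[int(np.argmax(sims))]

# The retrieved context is then stuffed into the generation prompt.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```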
In summary, re:Invent showcased AWS's strength in the generative AI sector. Its strategy seems quite similar to Microsoft's, except that the latter benefits from broader distribution through Windows and Office. Among the three cloud giants, Google now appears to have the weakest offering, but this could change at the next conference.
🎁 Learn AI skills, win swag!
Join Zilliz (the creators of the Milvus vector database) and 23 other open source projects for the 2023 Advent of Code as we count down to the holidays! Earn points by starring repos and trying new technologies to win an exclusive swag pack.
Get all the contest details —>
🔎 ML Research
GAIA Benchmark
Researchers from Meta, Hugging Face, GenAI and AutoGPT published GAIA, a benchmark for general AI assistants. The benchmark measures tasks such as reasoning, multi-tasking, multimodality, web browsing and many others —> Read more.
Inflection-2
Inflection unveiled initial results from the training of Inflection-2, its next-generation LLM. The model performs extremely well on benchmarks ranging from question answering to reasoning —> Read more.
GNoME
Google DeepMind published a paper detailing Graph Networks for Materials Exploration (GNoME), a deep learning model that was able to discover new materials. Specifically, GNoME discovered 2.2 million new crystals, including 380,000 stable materials —> Read more.
The Power of Prompting
Microsoft Research published a paper demonstrating how generalist models like GPT-4 can perform as well as highly specialized models when given the right prompts. The paper compares GPT-4 against fine-tuned models in the medical domain —> Read more.
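As a rough illustration of the kind of prompting machinery involved, the sketch below implements one ingredient of this style of recipe: sampling several chain-of-thought completions and majority-voting the final answers. The `llm` function is a placeholder, and the actual paper also uses techniques such as dynamic few-shot selection:

```python
# Toy self-consistency ensemble: sample several chain-of-thought completions
# and majority-vote the final answer lines. `llm` is a stand-in for any
# GPT-4-style completion API; this is not the paper's full method.
from collections import Counter

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in a completion API here")

def ensemble_answer(question: str, samples: int = 5) -> str:
    prompt = f"{question}\nThink step by step, then end with 'Answer: <choice>'."
    finals = []
    for _ in range(samples):
        completion = llm(prompt)
        # Keep only the final 'Answer: ...' line for voting.
        finals.append(completion.strip().splitlines()[-1])
    return Counter(finals).most_common(1)[0][0]
```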
LQ-LoRA
Researchers from Carnegie Mellon University, MIT and others published a paper unveiling LQ-LoRA, a method for memory-efficient adaptation of LLMs. LQ-LoRA outperforms other quantization methods such as QLoRA and GPTQ-LoRA on well-established benchmarks —> Read more.
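The core idea is to decompose each pretrained weight matrix into a frozen quantized component plus a trainable low-rank correction. A toy PyTorch sketch of that decomposition, which omits the paper’s iterative algorithm and mixed-precision quantization, might look like this:

```python
# Toy sketch of the LQ-LoRA idea: W is approximated as a frozen, quantized
# matrix plus a trainable low-rank term (W ~ Q + B @ A). The real method uses
# an iterative decomposition; this only illustrates the structure.
import torch
import torch.nn as nn

class LQLinear(nn.Module):
    def __init__(self, weight: torch.Tensor, rank: int = 8):
        super().__init__()
        # Crude per-tensor symmetric 8-bit-style quantization stand-in.
        scale = weight.abs().max() / 127.0
        q = torch.round(weight / scale).clamp(-127, 127)
        self.register_buffer("q_weight", q * scale)  # frozen component
        # Initialize low-rank factors from the quantization residual via SVD.
        residual = weight - self.q_weight
        u, s, vh = torch.linalg.svd(residual, full_matrices=False)
        self.lora_b = nn.Parameter(u[:, :rank] * s[:rank])  # (out, rank)
        self.lora_a = nn.Parameter(vh[:rank, :])             # (rank, in)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only lora_a and lora_b receive gradients during fine-tuning.
        return x @ (self.q_weight + self.lora_b @ self.lora_a).T
```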
System 2 Attention
Meta AI published a paper detailing System 2 Attention (S2A), a method for improving reasoning in LLMs. Borrowing terminology from cognitive psychology, S2A leverages the native capabilities of LLMs to determine which parts of the context to attend to —> Read more.
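In practice, S2A is a two-pass prompting recipe: the model first regenerates the context, keeping only relevant, non-leading material, and then answers from the regenerated context alone. A minimal sketch, with `llm` as a placeholder for any completion API:

```python
# Two-pass System 2 Attention sketch: pass 1 rewrites the context to drop
# irrelevant or biasing content; pass 2 answers from the cleaned context.
# `llm` is a placeholder for any text-completion client.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def s2a_answer(context: str, question: str) -> str:
    # Pass 1: regenerate the context without irrelevant or leading content.
    cleaned = llm(
        "Extract only the parts of the following text that are relevant and "
        f"unbiased for answering the question.\n\nText: {context}\n\n"
        f"Question: {question}\n\nRelevant text:"
    )
    # Pass 2: answer from the regenerated context alone.
    return llm(f"Context: {cleaned}\n\nQuestion: {question}\n\nAnswer:")
```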
🤖 Cool AI Tech Releases
AWS Gen AI
Amazon unveiled a dozen generative AI releases at its re:Invent conference —> Read more.
PPLX Models
Perplexity introduced two new LLMs that can deliver up-to-date, factual responses —> Read more.
SDXL Turbo
Stability AI announced SDXL Turbo, a super fast text-to-image model —> Read more.
GPT Crawler
A cool framework that can crawl a website and create a custom OpenAI GPT based on the data —> Read more.
🛠 Real World ML
Content Moderation at LinkedIn
LinkedIn discusses the ML architecture powering its content moderation policies —> Read more.
Data Quality at Airbnb
Airbnb shares details about their ML methodology for scoring and enforcing data quality —> Read more.
RAG at NVIDIA
NVIDIA shared a reference architecture for retrieval-augmented generation (RAG) applications —> Read more.
📡 AI Radar
OpenAI published a blog post announcing management and board changes.
Cradle raised $24 million to apply generative AI to digital biology.
Tola Capital announced a new $230 million fund to invest in AI.
Together, a platform for open source generative AI, announced a $102.5 million Series A.
Pika, a generative video platform, announced a $55 million fundraise.
PhysicsX, an AI platform for the engineering sector, raised $32 million in new funding.
Layla, an AI travel recommendation app, raised $3 million in a new round.
AI legal platform Solve Intelligence raised a $3 million seed round.