The Sequence Pulse: How Uber Eats is Using Embeddings?
Two-Tower Embeddings have been the technique of choice for powering recommendations at Uber Eats
Embeddings are one of the machine learning (ML) techniques that have been popularized by the rise of foundation models. The embeddings space has created entirely new markets, such as vector databases, and has become one of the omnipresent components of modern ML solutions. However, best practices for using embeddings at scale are still in their early stages. Recently, the Uber engineering team disclosed details about the architecture behind a unique form of embeddings, called Two-Tower Embeddings (TTE), that powers recommendations at Uber Eats. Today, we would like to detail the techniques, architecture, and best practices Uber applied in its journey to adopt TTE.
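To make the idea concrete, here is a minimal sketch of a two-tower model in PyTorch: one tower embeds the user (the eater) and the other embeds the item (a restaurant or dish), and the model is trained so that matching pairs land close together in a shared embedding space. The layer sizes, dimensions, and in-batch softmax loss below are illustrative assumptions, not details of Uber's system.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tower(nn.Module):
    """One side of the two-tower model: maps raw features to an embedding."""
    def __init__(self, input_dim: int, embedding_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, embedding_dim),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # L2-normalize so the dot product between towers equals cosine similarity.
        return F.normalize(self.net(features), dim=-1)

class TwoTowerModel(nn.Module):
    def __init__(self, user_dim: int, item_dim: int, embedding_dim: int = 64):
        super().__init__()
        self.user_tower = Tower(user_dim, embedding_dim)
        self.item_tower = Tower(item_dim, embedding_dim)

    def forward(self, user_features: torch.Tensor, item_features: torch.Tensor) -> torch.Tensor:
        user_emb = self.user_tower(user_features)   # (batch, embedding_dim)
        item_emb = self.item_tower(item_features)   # (batch, embedding_dim)
        # In-batch softmax: score each user against every item in the batch,
        # treating the matching item on the diagonal as the positive example.
        logits = user_emb @ item_emb.T
        labels = torch.arange(logits.size(0))
        return F.cross_entropy(logits, labels)

# Toy usage with random features standing in for user and item inputs.
model = TwoTowerModel(user_dim=32, item_dim=48)
loss = model(torch.randn(8, 32), torch.randn(8, 48))
loss.backward()
```

Because the two towers share no weights, item embeddings can be precomputed offline and indexed in a vector database, while the user tower runs at request time to produce a query vector for nearest-neighbor retrieval.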
Uber has been one of the tech companies at the forefront of ML architecture innovation over the last few years. Uber's internal ML platform, known as Michelangelo, served as the inception point for ML platforms such as Tecton (feature store) and Predibase (low-code ML), as well as over a dozen open source ML projects. Throughout this article, we are going to refer to Michelangelo multiple times.