♦️⚡️♦️ Databricks' New ML Announcements
📝 Editorial
Databricks has been at the center of the big data movement, pioneering technologies such as Apache Spark. Machine learning (ML) has been a native component of Spark since the early days and, as a result, Databricks has gradually become an important force in the ML space with stacks such as MLflow. Last week, at its Data + AI Summit, Databricks unveiled a series of new releases that significantly enhance the ML capabilities of its platform.
Databricks’ ML announcements are about what Databricks does well: scalability and operational management. MLflow 2.0 was the highlight of the new releases, with capabilities that streamline the management of ML models’ lifecycle. Serverless Model Endpoints allow the serving of ML models as serverless functions, simplifying the infrastructure requirements for real-time ML applications. Model Monitoring is another addition to the Databricks stack that provides key performance metrics about the runtime behavior of ML models. Other complementary releases include capabilities such as Project Lightspeed, which simplifies the data streaming experience in Spark clusters.
Databricks is known as a big data infrastructure company, but at the summit it made clear that ML is a central element of its value proposition. Given the broad reach of the Databricks platform, we should expect meaningful adoption of these new ML capabilities.
🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻
🗓 Next week in TheSequence Edge:
Edge#205: we explain graph attention networks; discuss the original GAT paper; explore TF-GNN, a library for implementing GNNs in TensorFlow.
Edge#206: we deep dive into OpenAI’s paper detailing the new Transformer model that mastered Minecraft by using unlabeled videos.
Now, let’s review the most important developments in the AI industry this week:
🔎 ML Research
TabTransformer
Amazon Research published details about TabTransformer, a transformer model optimized for tabular datasets →read more on Amazon Research blog
Minerva
Google Research published a paper introducing Minerva, a language model able to solve mathematical and scientific problems using step-by-step reasoning →read more on Google Research blog
DALL·E 2 Pretraining
OpenAI discusses some of the pretraining guidelines used to reduce risks in image generation models like DALL·E 2 →read more on OpenAI blog
MiCS
Amazon Research published a paper unveiling MiCS, a communication optimization method for distributed training →read more on Amazon Science blog
💬 Useful Tweet
On our Twitter, we recommend books, share the most interesting events, highlight the week's research, and post hot intern positions!
🤖 Cool AI Tech Releases
Databricks ML Announcements
At its Data + AI Summit, Databricks unveiled a series of projects that are starting to paint the picture of a complete ML platform →read more in this blog post from Databricks
Project Lightspeed
Databricks also announced Project Lightspeed, a simpler stream processing architecture for Apache Spark →read more in this blog post from Databricks
🛠 Real World ML
Taxonomy Classification at Airbnb
Airbnb discusses some of the ML practices used for its taxonomy classification systems →read more on Airbnb Engineering blog
Forecasts at Spotify
Spotify details the architecture used to run user forecast models at scale →read more on Spotify Engineering blog
💸 Money in AI
AI-powered talent marketplace Gloat raised a $90 million Series D round led by Generation Investment Management. Hiring in New York/US, Tel Aviv/Israel, Singapore, and India.
Speech recognition startup Speechmatics raised $62 million in a Series B funding round led by Susquehanna Growth Equity. Hiring globally.
AI infrastructure startup Modular emerged from stealth with a $30 million seed round led by GV. Hiring remote.
Research assistant HeyDay raised a $6.5 million seed round led by Spark Capital. Hiring in San Francisco/US.
Low-code AI integration platform AI Squared raised a $6 million seed financing round led by NEA.
Business intelligence app Zing Data raised a $2.4 million round led by Kindred Ventures. Hiring remote in the US.
Acquisitions
Natural language processing company MeaningCloud was acquired by Reddit for an undisclosed amount.