TheSequence

♦️⚡️♦️ Databricks' New ML Announcements

Jul 3, 2022

📝 Editorial 

Databricks has been at the center of the big data movement, pioneering technologies such as Apache Spark. Machine learning (ML) has been a native component of Spark since its early days, and, as a result, Databricks has gradually become an important force in the ML space with stacks such as MLflow. Last week, at its Data + AI Summit, Databricks unveiled a series of new releases that significantly enhance the ML capabilities of its platform.

Databricks’ ML announcements are about what Databricks does well: scalability and operational management. MLflow 2.0 was the highlight of the new releases, with capabilities that streamline the management of ML models’ lifecycle. Serverless Model Endpoints allow the serving of ML models as serverless functions, simplifying the infrastructure requirements for real-time ML applications. Model Monitoring is another addition to the Databricks stack that provides key performance metrics about the runtime behavior of ML models. Other complementary releases include capabilities such as Project Lightspeed, which simplifies the data streaming experience in Spark clusters.  
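To make the monitoring idea concrete, here is a minimal sketch of what tracking "key performance metrics about the runtime behavior" of a served model can look like. It is plain Python and does not use any Databricks or MLflow API; the class name `RollingModelMonitor`, its methods, and the sample values are all hypothetical and exist only for illustration.

```python
from collections import deque
from statistics import mean


class RollingModelMonitor:
    """Hypothetical helper: tracks accuracy and latency over the last `window` predictions."""

    def __init__(self, window: int = 100) -> None:
        # Bounded deques drop the oldest entry automatically once `window` is reached.
        self._correct = deque(maxlen=window)     # 1 if the prediction matched the label, else 0
        self._latency_ms = deque(maxlen=window)  # per-request latency in milliseconds

    def record(self, prediction, label, latency_ms: float) -> None:
        """Store the outcome of one served prediction."""
        self._correct.append(1 if prediction == label else 0)
        self._latency_ms.append(latency_ms)

    def metrics(self) -> dict:
        """Return a snapshot of the metrics over the current window."""
        if not self._correct:
            return {"count": 0, "accuracy": None, "mean_latency_ms": None}
        return {
            "count": len(self._correct),
            "accuracy": mean(self._correct),
            "mean_latency_ms": mean(self._latency_ms),
        }


# Made-up sample traffic: (prediction, true label, latency in ms).
monitor = RollingModelMonitor(window=3)
for pred, label, ms in [(1, 1, 10.0), (0, 1, 20.0), (1, 1, 30.0), (0, 0, 40.0)]:
    monitor.record(pred, label, ms)

snapshot = monitor.metrics()
print(snapshot)
```

A production monitoring service would also persist these snapshots and alert on drift. The sliding window is the core of the idea: metrics reflect recent traffic rather than the model's whole history.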


Databricks is known as a big data infrastructure company, but last week it made clear that ML is a central element of its value proposition. Given the wide distribution of the Databricks platform, we should expect significant adoption of these new ML capabilities.


🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:

Edge#205: we explain graph attention networks; discuss the original GAT paper; explore TF-GNN, a library for implementing GNNs in TensorFlow. 

Edge#206: we dive deep into OpenAI’s paper detailing the new Transformer model that mastered Minecraft by using unlabeled videos.


Now, let’s review the most important developments in the AI industry this week

🔎 ML Research

TabTransformer

Amazon Research published details about TabTransformer, a transformer model optimized for tabular datasets →read more on Amazon Research blog

Minerva

Google Research published a paper introducing Minerva, a language model able to solve mathematical and scientific problems using step-by-step reasoning →read more on Google Research blog

DALL·E 2 Pretraining

OpenAI discussed some of the pretraining mitigations used to reduce risks in image generation models like DALL·E 2 →read more on OpenAI blog

MiCS

Amazon Research published a paper unveiling MiCS, a communication optimization method for distributed training →read more on Amazon Science blog


💬 Useful Tweet

On our Twitter, we recommend books, share the most interesting events, highlight the week's research, and post hot intern positions!

TheSequence @TheSequenceAI
3 free books, the most popular ones!
1. Fundamentals of Data Visualization
2. Hands-On Data Visualization
3. Reinforcement Learning: An Introduction
Share this post with your friends to spread the word! Links⬇️
3:50 PM ∙ Jun 30, 2022
1,930 Likes ∙ 570 Retweets



🤖 Cool AI Tech Releases

Databricks ML Announcements 

At its Data + AI Summit, Databricks unveiled a series of projects that are starting to paint the picture of a complete ML platform →read more in this blog post from Databricks 

Project Lightspeed 

Databricks also announced Project Lightspeed, a simpler stream processing architecture for Apache Spark →read more in this blog post from Databricks 


🛠 Real World ML 

Taxonomy Classification at Airbnb 

Airbnb discusses some of the ML practices used for its taxonomy classification systems →read more on Airbnb Engineering blog

Forecasts at Spotify 

Spotify details the architecture used to run user forecasting models at scale →read more on Spotify Engineering blog


💸 Money in AI

  • AI-powered talent marketplace Gloat raised a $90 million Series D round led by Generation Investment Management. Hiring in New York/US, Tel Aviv/Israel, Singapore, and India.

  • Speech recognition startup Speechmatics raised $62 million in a Series B funding round led by Susquehanna Growth Equity. Hiring globally.

  • AI infrastructure startup Modular emerged from stealth with a $30 million seed round led by GV. Hiring remote.

  • Research assistant HeyDay raised a $6.5 million seed round led by Spark Capital. Hiring in San Francisco/US.

  • Low-code AI integration platform AI Squared raised a $6 million seed financing round led by NEA.

  • Business intelligence app Zing Data raised a $2.4 million round led by Kindred Ventures. Hiring remote in the US.

Acquisitions

  • Natural language processing company MeaningCloud was acquired by Reddit for an undisclosed amount.

© 2023 Jesus Rodriguez