🧬 DeepMind’s AlphaFold Database
Weekly news digest curated by industry insiders
It’s hard to believe that an entire year has passed since DeepMind open-sourced AlphaFold, the model that astonished the machine learning (ML) world by predicting the structure of proteins from their amino acid sequences. AlphaFold can easily be considered the most relevant ML contribution to science in the last decade. The initial release of AlphaFold was accompanied by a less publicized project known as the AlphaFold Protein Structure Database (AlphaFold DB), which aimed to provide an open dataset of predicted protein structures. Last week, DeepMind doubled down on AlphaFold DB with a new and incredibly impressive release.
In collaboration with EMBL’s European Bioinformatics Institute (EMBL-EBI), DeepMind expanded AlphaFold DB with the predicted structures of nearly all catalogued proteins known to science: roughly 200 million protein structures from plants, animals, bacteria, fungi, and other organisms. The dataset has also been released as part of Google Cloud Public Datasets, making it even more accessible to researchers. By providing access to protein structures in a structured, searchable format, AlphaFold DB can drastically advance research across scientific areas ranging from biology to pharmaceuticals and food safety. It is another impressive open-source contribution by DeepMind.
🗓 Next week in TheSequence Edge:
Edge#213: we give an overview of the fundamental types of tests to apply to trained models; explain how Meta uses Bayesian optimization to run better experiments on ML models; explore TensorFlow’s What-If Tool.
Edge#214: we take a deep dive into NLLB-200, Meta AI’s new model that achieved new milestones in machine translation across 200 languages.
Now, let’s review the most important developments in the AI industry this week
🔎 ML Research
AlphaFold DB vNext
DeepMind expanded AlphaFold DB with predicted structures of nearly all catalogued proteins known to science →read more on the DeepMind blog
ML Code Completion
Google Research published an insightful blog post about the use of large language models and semantic rules engines to improve developer productivity →read more on the Google Research blog
Causal Inference and ML
Amazon Research published a paper proposing a technique to apply causal inference to scenarios with continuous variables →read more on the Amazon Research blog
🤖 Cool AI Tech Releases
Meta AI open-sourced Theseus, a library for differentiable nonlinear optimization that helps incorporate domain knowledge into ML models →read more on the Meta AI blog
ML monitoring platform Fiddler announced major improvements to its Model Performance Management (MPM) platform, achieving enhanced scalability and a deeper understanding of unstructured model behavior and performance →read more on the Fiddler blog
🛠 Real World ML
ML Education at Uber
Uber discusses the principles behind its internal ML education program →read more on the Uber Engineering blog
Load Testing TensorFlow Serving
The TensorFlow team details the techniques it used to load test the TensorFlow Serving REST interface, along with the results →read more on the TensorFlow blog