Size Matters
Weekly newsletter that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations
Editorial
The recent emergence of pre-trained language models and transformer architectures has pushed the creation of larger and larger machine learning models. Google's BERT presented attention mechanisms and the transformer architecture as the "next big thing" in ML, and the numbers since then seem surreal. OpenAI's GPT-2 set a record with 1.5 billion parameters, Microsoft's Turing-NLG followed with 17 billion, and the new GPT-3 now boasts an astonishing 175 billion parameters. Not to be outdone, just this week Microsoft announced a new release of its DeepSpeed framework (which powers Turing-NLG) that can train a model with up to a trillion parameters. That sounds insane, but it really isn't.
What we are seeing is a consequence of several factors. First, computation power and parallelization techniques have evolved to the point where it is relatively easy to train machine learning models across large clusters of machines. Second, and most importantly, in the current state of machine learning, larger models regularly outperform smaller, more specialized models. Knowledge-reusability methods like transfer learning are still in very nascent stages, and as a result it is really hard to build small models that can operate in uncertain environments. Furthermore, as models like GPT-3 and Turing-NLG have shown, there is some unexplainable magic that happens once models go past a certain size.
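For readers newer to the idea, here is a minimal sketch of what transfer learning, the kind of knowledge reuse mentioned above, looks like in practice: a pretrained encoder is frozen and only a small task-specific head is trained. The encoder below is a toy stand-in, not a real pretrained model; shapes and names are illustrative only.

```python
import torch
import torch.nn as nn

# Toy stand-in for a large pretrained encoder (e.g., a BERT-like model).
# Vocabulary/hidden sizes are typical BERT values, used here for illustration.
pretrained_encoder = nn.Sequential(
    nn.Embedding(30522, 768),
    nn.Linear(768, 768),
    nn.ReLU(),
)

# Freeze the pretrained weights: only the small task head is trained.
for param in pretrained_encoder.parameters():
    param.requires_grad = False

# Small task-specific head, e.g., binary sentiment classification.
task_head = nn.Linear(768, 2)
optimizer = torch.optim.Adam(task_head.parameters(), lr=1e-4)

def training_step(token_ids, labels):
    # token_ids: LongTensor (batch, seq_len); labels: LongTensor (batch,)
    with torch.no_grad():  # encoder is frozen, no gradients needed
        features = pretrained_encoder(token_ids).mean(dim=1)  # mean-pool tokens
    logits = task_head(features)
    loss = nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because only the tiny head is updated, the heavy lifting done during pre-training is reused rather than relearned; the editorial's point is that this kind of reuse is still far less reliable than simply training a bigger model end to end.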
Many of the immediate machine learning problems might be solved by scaling the current generation of neural network architectures. Plain and simple, when it comes to machine learning, size matters.
We would love to hear your opinions on the debate between larger, broader models and smaller, more specialized ones.
Now, to the most important developments in the AI industry this week.
ML Research
GPT-3 Falls Short in Machine Comprehension
Proposed by researchers from a few major American universities, a 57-task test to measure models' ability to reason poses challenges even for sophisticated models like GPT-3 ->read more in the original paper
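Benchmarks like this one score language models on multiple-choice questions, commonly by comparing the likelihood the model assigns to each answer option. The sketch below shows that generic evaluation loop with a stubbed scoring function; `option_log_likelihood` is a hypothetical placeholder, not the paper's actual harness.

```python
import random

def option_log_likelihood(question: str, option: str) -> float:
    """Hypothetical stand-in for a language model scorer. A real harness
    would return the model's log-probability of `option` given `question`
    (plus any few-shot examples in the prompt)."""
    random.seed(hash((question, option)) % (2**32))
    return random.random()

def evaluate(tasks):
    """tasks: list of (question, options, correct_index) tuples,
    gathered across all 57 subjects."""
    correct = 0
    for question, options, answer_idx in tasks:
        scores = [option_log_likelihood(question, opt) for opt in options]
        prediction = max(range(len(options)), key=lambda i: scores[i])
        correct += int(prediction == answer_idx)
    return correct / len(tasks)

sample = [("The derivative of x^2 is:", ["2x", "x", "x^2", "2"], 0)]
print(f"accuracy: {evaluate(sample):.2f}")
```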
Better Text Summarization
OpenAI published a paper showing a reinforcement learning from human feedback technique that can surpass supervised models ->read more on OpenAI blog
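At the core of this line of work is a reward model trained on human comparisons between pairs of summaries; the summarization policy is then optimized against that reward with RL. Below is a minimal sketch of the pairwise preference loss commonly used for such reward models, with a toy network standing in for the real one; this is our simplified illustration, not OpenAI's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy reward model: maps a pooled text representation to a scalar reward.
reward_model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

def preference_loss(preferred_repr, rejected_repr):
    """Pairwise loss: push the reward of the human-preferred summary
    above the reward of the rejected one."""
    r_preferred = reward_model(preferred_repr)
    r_rejected = reward_model(rejected_repr)
    return -F.logsigmoid(r_preferred - r_rejected).mean()

# One illustrative update on random vectors standing in for summary encodings.
preferred = torch.randn(8, 768)
rejected = torch.randn(8, 768)
loss = preference_loss(preferred, rejected)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```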
Reinforcement Learning with Offline Datasets
Researchers from the Berkeley AI Research (BAIR) Lab published a paper unveiling a method that uses offline datasets to improve reinforcement learning models ->read more on BAIR blog
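The defining feature of offline RL is that the agent learns entirely from a fixed dataset of previously logged transitions, with no further environment interaction. The sketch below shows that generic pattern with a plain Q-learning update on synthetic data; it illustrates the setting only, not the specific algorithm in the BAIR paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, N_ACTIONS, GAMMA = 4, 2, 0.99
q_network = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
optimizer = torch.optim.Adam(q_network.parameters(), lr=1e-3)

# A fixed, previously collected dataset of transitions (synthetic here):
# the key point is that no new environment interaction happens below.
N = 256
dataset = {
    "state":      torch.randn(N, STATE_DIM),
    "action":     torch.randint(0, N_ACTIONS, (N,)),
    "reward":     torch.randn(N),
    "next_state": torch.randn(N, STATE_DIM),
}

for _ in range(100):  # train purely from the logged data
    q_values = q_network(dataset["state"])
    q_taken = q_values.gather(1, dataset["action"].unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # bootstrapped TD target
        target = dataset["reward"] + GAMMA * q_network(dataset["next_state"]).max(dim=1).values
    loss = F.mse_loss(q_taken, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Naive bootstrapping like this tends to over-estimate values for actions the dataset never tried, which is exactly the failure mode that dedicated offline RL methods are designed to address.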
Cool AI Tech Releases
New Version of DeepSpeed
Microsoft released a new version of DeepSpeed, its open-source library for parallelized training, which can scale to models with 1 trillion parameters ->read more on Microsoft Research blog
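For context, here is a minimal sketch of how training code typically hands a model over to DeepSpeed. The config values are illustrative, and exact keyword names can vary between DeepSpeed versions; real runs are launched with the `deepspeed` CLI launcher, which sets up the distributed environment.

```python
import torch.nn as nn
import deepspeed  # pip install deepspeed

model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))

# Illustrative config; real jobs tune these and add further ZeRO/parallelism options.
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # partitions optimizer state across GPUs
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# DeepSpeed wraps the model and handles parallelism, mixed precision, etc.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# The training loop then uses the engine instead of raw PyTorch calls:
#   loss = compute_loss(model_engine(batch))
#   model_engine.backward(loss)
#   model_engine.step()
```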
Useful Tweet
AI is Reinventing Chess. Great read.
Money in AI
AI-powered customer experience management platform Sprinklr has raised $200 million (kudos to our subscribers from Sprinklr). Sprinklr's "AI listening processing" solution allows companies to extract structured, meaningful sentiment and insights from unstructured customer data that comes from public conversations on websites and social platforms.
Xometry, an on-demand industrial parts marketplace, raised $75 million in Series E funding. The company's platform digitally matches buyers with the right manufacturers.
Another example of AI applied to matching two sides of a deal: real estate tech company Orchard raised $69 million in its latest funding round. Orchard aims to digitize the whole real estate market by developing a solution that combines machine learning and rapid human assistance to smooth the search, match the right deal, and simplify the buying and selling process.
Cybersecurity startup Pcysys raised $25 million in its funding round. Pcysys' platform, which doesn't require installation or network reconfiguration, uses algorithms to scan and "ethically" attack enterprise networks.
Robotics farming company Iron Ox raised $20 million in a funding round. Its system of farming robots is still semi-autonomous; the company's goal is to become fully autonomous.
Insurtech company Descartes Underwriting raised $18.5 million. The company applies AI and machine learning to climate risk prediction and insurance underwriting.
Legaltech startup ThoughtRiver raised $10 million in its Series A round. Its AI solution for contract pre-screening aims to boost operational efficiency.
Medtech startup Skin Analytics raised $5.1 million in Series A funding. Skin Analytics has developed a clinically validated AI system that can identify not only the most important skin cancers but also treatable precancerous lesions and a range of benign lesions.
Amazon, along with several government organizations and three other industry partners, helped fund a high-priority AI research initiative at the National Science Foundation. The amount of funding was not disclosed.
TheSequence Scope, our Sunday edition with an overview of the industry's developments, is free. To receive high-quality educational content every Tuesday and Thursday, please subscribe to TheSequence Edge.
Next week in TheSequence Edge:
Edge#21: the concept of Machine Text Generation; the research behind Microsoft's Turing-NLG, one of the largest pre-trained language models in history; AllenNLP, an open-source framework for advanced natural language research.
Edge#22: the concept of Question-Answering Models; the paper in which the Google Research team presents a new dataset for training and evaluating question-answering systems; DeepPavlov, an open-source framework for advanced NLU methods, including question answering.
5 minutes of your time, 3 times a week: with TheSequence you will steadily become knowledgeable about everything happening in the AI space.