Security: The Most Ignored Area of MLOps

Every Sunday, The Sequence Scope brings a summary of the most important research papers, technology releases, and VC funding deals in the artificial intelligence space.

Dec 18, 2022

In the last few years, we have seen remarkable innovation across most areas of the MLOps stack. Model serving, monitoring, interpretability, and testing are some of the areas that have quickly become fragmented, with numerous startups and incumbents launching compelling offerings. Security seems to be the one area lagging behind in innovation in the ML space. This might seem surprising since, in the traditional DevOps space, security has become an integral part of the application lifecycle. In the case of ML, security is often treated as an afterthought or addressed with traditional stacks that don’t quite adapt to the dynamics of ML applications.

Securing ML pipelines is not only different but also quite challenging. The nature and surface of attacks on ML solutions don’t share the DNA of attacks on traditional applications, and often involve areas such as data or policy manipulation. This problem is even worse in the era of large foundation models, which currently dominate the ML landscape. If we don’t even understand how a large model makes predictions, how can we protect it?

Just like MLOps was the evolution of DevOps for the ML era, ML security needs a new stack. The ML space needs a new generation of ML-first security platforms. Most of the innovation in ML security has been confined to research and experimental efforts, but that’s starting to change. Last week, ML security startup Protect.ai came out of stealth mode, announcing a new series A and one of the most complete and pragmatic ML-first security stacks released so far. Their initial platform is split into two fundamental products. NB Defense is a tool that scans for security vulnerabilities directly in Jupyter notebooks, which encourages data scientists to incorporate security from the experimentation phase of an ML solution. AI Radar will be the second product of Protect.ai, with a focus on enabling a more comprehensive suite for testing and discovering vulnerabilities in ML pipelines.

From networking to cloud computing, every software trend in history has created a parallel cybersecurity industry. ML needs ML-first security. Protect.ai is a good starting point, but we are likely to see security evolve into its own market in the ML space.

🗓 Next week in TheSequence Edge:

Edge#253: Our series about ML interpretability continues by discussing partial dependence plot methods. The research section dives into interpretable time series forecasting transformers, and the technology section is dedicated to Google’s fairness interpretability indicators.

Edge#254: We review InstructGPT, one of the key models behind the ChatGPT phenomenon.

🔎 ML Research

Data2vec 2.0

Meta AI published a paper discussing Data2vec 2.0, a self-supervised learning model that can learn in three different modalities: speech, vision, and text.

Recorder’s Speaker Labeling

Google Brain published a paper detailing the technique used to label speakers in the Pixel Recorder app.

Robotics Transformer

No, this is not the movie, but a research paper published by Google Brain detailing RT-1, a transformer model that can handle robotics inputs.

🤖 Cool AI Tech Releases

Text-Embedding-Ada-002

OpenAI released text-embedding-ada-002, a new embedding model that is significantly smaller and more efficient than other embedding models in the OpenAI API, including the marquee Davinci model.

Five Years of SageMaker

Amazon SageMaker just turned five, and the team has some interesting reflections about the past and the future roadmap.

🛠 Real World ML

Causal Inference at LinkedIn

LinkedIn discussed Ocelot, their internal platform for observational causal inference.

💸 Money in AI

Protect AI just came out of stealth mode with its ML attack prevention platform and a $13.5 million series A.
Robco raised $14 million to expand its modular robotic solution to industrial SMBs.
Shield AI raised $60 million to complete a $225 million series E to build its autonomous fighting pilot technology.
AI powerhouse Dataiku completed a $200 million series F to continue expanding its ML platform.
AI-driven market research platform Zappi raised $170 million to expand its consumer insights solution.
EnCharge AI raised $21 million to develop AI accelerator hardware.
Sana Labs announced a $34 million series B for its corporate knowledge learning platform.
Vic.ai raised $52 million for its accounting automation platform.
LexCheck announced a $17 million series A for its contract acceleration solution.
Synthetic video generation platform Infinity AI raised $5 million in seed funding.
AI-powered talent management company Beamery raised a $50 million series D.

TheSequence