🎙Yinhan Liu/CTO of BirchAI about applying ML in the healthcare industry
It’s so inspiring to learn from practitioners. Getting to know the experience gained by researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight and inspiration.
👤 Quick bio / Yinhan Liu
Tell us a bit about yourself: your background, your current role, and how you got started in machine learning.
Yinhan Liu (YL): I started my undergrad as a Chemical Engineering major and added a math major – not focused at all on CS. I didn’t get my start in the field until I took an ML class during my first semester of grad school, which inspired me to spend a lot of personal time reading AI-related papers. I eventually made my way to Facebook AI Research, where I had the opportunity to work with some great people at an important time in NLP history. But, while I enjoyed the research side of things, I wanted to have a more direct impact on people. So, I decided to co-found BirchAI at AI2 with trusted colleagues I had known for 5 to 10 years. I’m now its CTO, leading Engineering and Science.
🛠 ML Work
BirchAI is focused on applying cutting-edge natural language processing (NLP) and speech analysis techniques to the healthcare space. Could you tell us a bit more about the vision behind the company?
YL: BirchAI is focused on applying AI to complex audio processes in healthcare – an area that Sumant (COO), Kevin (CEO), and I have been thinking about for a long time. There’s much more beyond this, but our initial focus is on automating complex After Call Work in healthcare call centers – think of a patient calling in about an issue with her pacemaker.
The healthcare industry faces several related business challenges that drive our ML challenges. For example, humans vary a lot in how they understand, classify, and summarize detailed healthcare conversations. For BirchAI, that means that if the data is labeled at all, it is usually labeled poorly. We have developed effective workarounds that have allowed us to achieve very high accuracy at scale.
That leads us to another point: the notion of “Explainable Human”. Many customers initially maintain that their call center teams already achieve consistency and accuracy of 98 or 99%. Invariably, we see that this is not true. Companies think they know how employees are doing the work, but that belief is based on crude, low-volume, manual sampling methods of Quality Assessment that fail to capture the semantic richness of conversations at scale and how that dialogue should be characterized. The BirchAI product highlights this variance and gives us the means to drive and maintain consistency and accuracy at a previously unattainable scale. Healthcare companies spend tens of billions of dollars trying to address these questions – we are addressing them at scale.
How did you achieve high accuracy at scale and what are other ML challenges you are trying to address?
YL: Our first challenge is that our data is not labeled – and large-scale pre-trained models do not work out of the box. We have built a complex AI-based pipeline to label the data we train on at scale, which lets us reach a high degree of accuracy.
Another challenge is that these problems cannot be met with a single module – so we’ve used a multi-modal approach to create a robust pipeline of models for our product.
In recent years, techniques such as pretrained language models and transformers have dominated the NLP space. What’s the main value proposition of these techniques for the healthcare industry compared to previous NLP techniques?
YL: Previous NLP technology was essentially as developed as it was going to get, yet it was not accurate or robust enough to meet customer needs for most healthcare use cases. As a result, many processes are still done manually. But pre-trained models with a transformer architecture now provide a higher performance starting point, and there is much more to be discovered. We intimately understand those opportunities in areas like voice and document AI, and we are actively exploiting those to build game-changing products in healthcare.
What sort of problems are you solving using artificial intelligence?
YL: 1. The first big problem we needed to overcome was Speech to Text – we have found that the off-the-shelf APIs do not provide a good enough input for our downstream models. That’s why we built our own STT that consistently outperforms the other STT models we can see. Of course, we will continue to improve this model, which is flexible enough to allow that.
2. Another problem has been how to optimize our models. We are not a consulting shop – we are a product company. How do we maximize production performance with the fewest possible models? For example, we have a large medical device customer with a single, high-quality dialogue summarization model working across four different products. We are starting to deploy that and are excited to see how we can extend that across all their products.
3. At the core of our capabilities has been the ability to use AI itself to create high-quality, large-scale, labeled data. This is similar in concept to back-translation, where we use AI models to create labels at scale, and we then train other AI models using those labels and other data. We’ve had great success with this and see many possibilities for the approach.
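As a rough illustration of the idea in point 3, here is a minimal Python sketch of model-generated labeling, often called pseudo-labeling. It is not BirchAI's pipeline, which the interview does not describe. Everything in it is an assumption made for illustration: the keyword-based `teacher_label` function stands in for a real labeling model, the `min_confidence` threshold of 0.8 is arbitrary, and the word-count "student" is a toy classifier.

```python
from collections import Counter, defaultdict

# Hypothetical keyword lists standing in for a real "teacher" model.
TEACHER_KEYWORDS = {
    "device_issue": {"pacemaker", "battery", "alarm"},
    "billing": {"invoice", "charge", "refund"},
}

def teacher_label(text):
    """Return (label, confidence) from the toy teacher, or (None, 0.0)."""
    words = set(text.lower().split())
    hits = {label: len(words & kws) for label, kws in TEACHER_KEYWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return None, 0.0
    best = max(hits, key=hits.get)
    return best, hits[best] / total

def pseudo_label(texts, min_confidence=0.8):
    """Keep only the examples the teacher labels with high confidence."""
    labeled = []
    for text in texts:
        label, conf = teacher_label(text)
        if label is not None and conf >= min_confidence:
            labeled.append((text, label))
    return labeled

def train_student(labeled):
    """Train a toy student model: word counts per label."""
    counts = defaultdict(Counter)
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts

def student_predict(counts, text):
    """Predict the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

unlabeled = [
    "my pacemaker alarm keeps beeping at night",
    "the battery on the pacemaker seems low",
    "i need a refund for a duplicate charge",
    "there is a wrong charge on my invoice",
    "refund for the battery",  # ambiguous: teacher confidence 0.5, dropped
    "hello can you hear me",   # no teacher signal: dropped
]
training_set = pseudo_label(unlabeled)
student = train_student(training_set)
```

In this toy setup, the student can label a phrase like "beeping at night again" even though it contains none of the teacher's keywords, because it learned from the surrounding words in the machine-labeled examples. Filtering on teacher confidence is one common way to keep noisy labels out of the training set.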
Deployment in the real world of healthcare raises a whole other set of challenges. How do you tackle this?
YL: It’s not enough to build a huge model that burns massive amounts of expensive compute to produce a result in a development environment. We’ve been lucky enough to recruit a great founding engineer, Gaurav Shegokar, who really understands how to optimize inference time and accuracy – or infrastructure and performance. That blend of traditional software and AI at scale is a key characteristic we look for in new engineering hires.
💥 Miscellaneous – a set of rapid-fire questions
Favorite math paradox?
What book would you recommend to an aspiring ML engineer?
An Introduction to Statistical Learning. It tells you everything you need to get started!
Is the Turing Test still relevant? Any clever alternatives?
We use a bit of a Turing Test approach when we show people our dialogue summaries. After more than 50 interactions, more people have identified our BirchAI-generated summary as the human-written one than have picked the summary that was actually written by a human. So yes, I guess it is still relevant.
Does P equal NP?
No, I don’t believe so.