We sat down with Achin Bhowmik, Ph.D., adjunct professor at Stanford and CTO and Executive Vice President of Engineering at Starkey, one of the biggest hearing aid manufacturers in the world.
In this Q&A, Achin Bhowmik shared his insights on perceptual computing, explored how tech is changing people’s lives, and explained how AI and healthable tech are revolutionizing healthcare.
Q: Could you share your journey in AI and tech?
My journey starts with perceptual computing. The whole idea behind this field is to look at how human perception works. We humans have an amazing sensory perception system: natural biological sensors that include the eyes and ears, the sense of balance, motor sensation, and the senses of touch, taste, and smell. All of these sensors send information to the cerebral cortex, where a remarkably efficient computational system helps us make sense of things. It helps us create a model of the world around us. This enables us not only to understand what's going on, but to navigate it and interact with each other and the world.
So why, as an engineer, am I interested in that?
Well, my journey is split into three phases:
1. Human-computer interaction
The first phase was human-computer interaction. The way that we interact with computers has evolved over time from the command prompt interface with old operating systems, to graphical user interfaces. But the vision has always been about interacting with the computer as we interact with each other. We want the computing systems to understand what we tell them, to recognize our faces, expressions, and gestures in the same way we make sense of each other's interactions.
2. Interactive computers
That leads to the next phase, which is about making those computers interactive. There's a slight difference between interacting with computers through natural interactions, rather than a keyboard and mouse, and the next step: using the same technologies to unleash interactivity in those computers. Now we're talking about cars that can not only recognize things around them, but drive without a driver. We're also talking about robots that can come out of industrial plants and into your home, helping you in your day-to-day life. Another example of 'next level' interactive computers are drones that know where they're going. This type of drone can fly and navigate its surroundings without crashing into things like trees and houses.
3. Technology to enhance human perception
The third phase is the one I'm on now, using pretty much the same technologies (sensors, sensing, computing, connectivity, etc.), but with the current focus on using technology to enhance human perception. To summarize my transition: I moved from using sensing, AI, computing, and connectivity to make devices smart and computers interactive, and to enable robotic perception, to now using them to enhance human perception. How do we help humans perceive and understand the world better? That is the journey that I've been undertaking for the past 20+ years.
Q: What is healthable tech?
Starkey is a privately held company, yet we have more than 5,000 employees and close to a billion dollars in revenue. We are singularly focused on using technology to help people hear better and live better.
First of all, being able to hear and understand is central to who we are as human beings. Helen Keller, who was both deaf and blind, is credited with the quote:
“Blindness cuts us off from things, but deafness cuts us off from people”.
We take the sense of hearing for granted as it just happens naturally, but there's a tremendous number of people that need help with hearing.
According to the World Health Organization, 466 million people live with disabling hearing loss. Disabling hearing loss means you have to amplify sound by over 40 decibels and that's a tremendous amount of amplification to barely hear the sound around you.
A great number of people between the ages of 70 and 80 experience hearing loss, and 80% of us will have hearing loss issues by the time we are 80 or older. This is something we should all take very seriously. After all, people are living longer now, and the 85-plus age group is the fastest-growing population in the world. Using technology to help people hear better is an attempt to help them live better.
At Starkey, we've undertaken the challenge of incorporating new technology (sensors and artificial intelligence) to transform what's known as a traditional hearing aid into a health device. With embedded sensors and AI, these hearing aids can track your physical activities, detect if you fall, track your social engagement versus loneliness, and alert loved ones or caregivers. We are also working on technology to measure multiple biometrics. The goal is to develop a system that knows about impending problems in your health before you have symptoms, and warns you in advance.
Q: Could you tell us about the challenges or the innovation in AI that you're actually enabling in these devices?
So, first of all, they're small devices. Our goal in designing them is to make them so discreet and comfortable that you can have them on and forget you have them on. I wear hearing aids all the time, even though I don't have hearing loss. Even when I'm in a room with somebody, I walk up close and tell them that I use hearing aids all the time, and they'll say, "but you don't have them on right now". If I were close to you, you wouldn't realize I'm wearing my device until I actually took it out of my ear.
You're all familiar with AirPods and many other earbuds, but unlike those, these devices are so discreet and comfortable that I can put them in in the morning and wear them all day, even to take a shower or go to sleep.
So why do I use it? It's my Bluetooth streaming device. If I get a phone call from you, I don't need to look around for my AirPods. I can stream an audiobook, a YouTube video, or a movie, and it has more than 20 hours of battery life.
These devices are fully custom; they were made for my ears, because every ear has a unique geometry. We create a physical 3D impression of the ear canal and make a device that fits you perfectly.
What do we do with AI here? First, we want to use AI to understand which sound is which. In other words, we want to break up this cacophony of complex acoustic waves into different kinds of sound. Today, with AI and machine learning, we're doing amazing work in recognizing the environment around us. In these devices, the sound processing engine adjusts automatically every few milliseconds, classifying the world around me and making parameter adjustments to optimize my listening experience.
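The classify-then-adjust loop he describes can be sketched very roughly in code. Everything below is hypothetical: the environment labels, the parameter presets, and the toy energy-based classifier are illustrative stand-ins, not Starkey's actual on-device engine, which runs a far more sophisticated neural classifier per frequency band.

```python
# Hypothetical sketch of an environment-driven sound-processing loop:
# classify each audio frame, then apply a matching parameter preset.
# Labels, presets, and the classifier are illustrative, not Starkey's.

PRESETS = {
    "quiet": {"gain_db": 10, "noise_reduction": 0.1},
    "speech": {"gain_db": 20, "noise_reduction": 0.4},
    "speech_in_noise": {"gain_db": 25, "noise_reduction": 0.8},
}

def classify(frame):
    """Stand-in for the on-device ML classifier: map an audio frame
    to one of the known acoustic environments."""
    energy = sum(s * s for s in frame) / max(len(frame), 1)
    # Toy energy heuristic in place of the real neural network.
    if energy < 0.01:
        return "quiet"
    elif energy < 0.1:
        return "speech"
    return "speech_in_noise"

def process_frame(frame):
    """Runs every few milliseconds: classify, then adjust parameters."""
    label = classify(frame)
    params = PRESETS[label]
    # Apply the preset's gain (real DSP adjusts per frequency band).
    gain = 10 ** (params["gain_db"] / 20)
    return [s * gain for s in frame], label
```

The point of the sketch is the cadence: classification and parameter adjustment happen continuously on-device, not once at fitting time.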
Q: Could you tell us more about the latest innovation you’ve worked on that lets you hear speech muffled by masks?
We have also been using AI to help people in situations where others are wearing masks. We have amazing technology, including lightweight AI algorithms, which means our models can run on less than a milliwatt. Since we're able to classify sound so accurately, we challenged ourselves to see if we could automatically determine how many decibels of attenuation a mask introduces, not just in overall sound level, but across the frequency scale.
Masks attenuate the speech signals around us. We collected enormous amounts of data and were able to come up with a solution we call 'Edge Mode'. If you find it challenging to understand conversations when people around you have masks on, particularly if you have hearing loss, it's really about two things. First, we collected a lot of data in the lab for all kinds of masks, to measure what attenuation, as a function of frequency, each mask imposes on the sound.
Second, you can't see the lips when somebody has a mask on, which also makes conversation harder to understand. With Edge Mode, all you have to do is double-tap the hearing aid (we have embedded sensors that recognize the double-tap), and it takes a quick snapshot of the acoustic environment and makes automatic adjustments to clarify speech.
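An Edge Mode-style adjustment could be sketched as follows. The attenuation numbers, band layout, and function names here are all made up for illustration; the real system uses lab measurements across many mask types and a richer snapshot of the acoustic scene.

```python
# Hypothetical sketch of an "Edge Mode"-style adjustment: on a
# double-tap, boost the frequency bands a mask typically attenuates.
# The attenuation table is illustrative, not Starkey's lab data.

# Per-band attenuation (dB) for one mask type (made-up numbers;
# masks tend to attenuate higher frequencies more).
MASK_ATTENUATION_DB = {
    "500Hz": 2.0,
    "1kHz": 4.0,
    "2kHz": 6.0,
    "4kHz": 8.0,
}

def edge_mode_adjust(current_gains_db):
    """Return new per-band gains that compensate for mask attenuation."""
    return {
        band: current_gains_db.get(band, 0.0) + MASK_ATTENUATION_DB[band]
        for band in MASK_ATTENUATION_DB
    }

def on_double_tap(current_gains_db):
    """Sensor callback: snapshot the environment, apply the boost."""
    return edge_mode_adjust(current_gains_db)
```

The key design idea from the interview survives the simplification: because the compensation is frequency-dependent, a flat volume boost would not do the job.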
Q: What kind of data collection, prep mechanism, and data lifecycle do you do to train your models?
There are three components to this: the data, the model itself, and how it lives in an ecosystem of devices.
Unlike the traditional discussion of AI, which tends to center on computer vision, we engineer products for millions of people who want to use these devices all day on a single charge, so the model becomes important. The whole idea is that, with a device like this, it's impossible to pre-program everything. Even hearing aids that have been out there for decades are being transformed with AI.
If you try to daisy-chain rules (if A then B, if not then C or D), you end up with millions and millions of conditions that we have no way to enumerate. Even when you program it, I could be exposed to a completely different environment that I haven't encountered before. So data becomes important: I don't need to know these rules, but I collect enormous amounts of acoustic data from all walks of life, from every scenario I might possibly be exposed to, to train the AI model.
The model itself is unique because I don't have laptop-class memory on these devices. So we super-optimize our AI models; it's not as if you can take something off the shelf and use it in a hearing aid.
During the day, my hearing aids will do everything for me, including classifying sound, enhancing the sound, clarifying speech, tracking physical activity, and detecting if I fall. We have worked with both Apple and Google to develop a proprietary custom audio streaming protocol and a handshake between these devices. When the device is connected with a smartphone wirelessly, they are able to take advantage of increased computation power and memory footprint on the phones and connect to the Cloud.
We are now able to take advantage of this distributed computing. Think of the milliwatts in my ear, to watts in my hand or pocket, to literally gigawatts in the Cloud. We have integrated a personal assistant with our hearing aid: you can double-tap, and the assistant will accept and answer your questions. Our older patients can program a reminder to take their medicine at 8 pm, and the hearing aid will speak up at 8 pm to remind them. Or it could be as simple as asking what the weather is like outside, or who won the Super Bowl in 1982, and the device will answer your question.
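The milliwatts-to-gigawatts tiering he describes amounts to a dispatch policy: run always-on tasks on the hearing aid, heavier ones on the phone, and the heaviest (translation, question answering) in the Cloud. A minimal sketch, where the task names, costs, and budgets are all hypothetical placeholders rather than anything from Starkey's actual protocol:

```python
# Hypothetical dispatch policy across the three compute tiers.
# Task costs and tier budgets are illustrative relative numbers only.

TIER_BUDGET = {
    "hearing_aid": 1,       # ~milliwatts: always-on classification
    "phone": 1_000,         # ~watts: heavier models, more memory
    "cloud": 1_000_000,     # effectively unbounded: translation, Q&A
}

TASK_COST = {
    "sound_classification": 1,
    "fall_detection": 1,
    "personal_assistant": 500,
    "translation": 100_000,
}

def dispatch(task, phone_connected=True, cloud_connected=True):
    """Pick the lowest (most local) tier whose budget covers the task."""
    cost = TASK_COST[task]
    if cost <= TIER_BUDGET["hearing_aid"]:
        return "hearing_aid"
    if phone_connected and cost <= TIER_BUDGET["phone"]:
        return "phone"
    if cloud_connected:
        return "cloud"
    raise RuntimeError(f"no available tier can run {task!r}")
```

This also makes his later point about translation concrete: a cloud-tier task degrades with connectivity, while on-ear tasks like sound classification keep working regardless.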
You could also use it to translate, using the power of the Cloud; we are translating between 27 languages. How well does it work? Because the translation engine itself runs in the Cloud, you're reliant on the connectivity between the phone and the Cloud. In my experience, it works very well in a one-on-one setting, but not as well in a group conversation where everybody is interacting and interrupting each other.
Q: Where do you think AI is heading? What are some of the challenges that you're seeing?
As a practitioner in the field, for me it's always about making an impact. I teach at Stanford, but I'm not a full-time academic. What I'm most passionate about is using technology to solve the problems of today.
AI is doing amazing things. With the developments in machine learning and computer vision, we can now solve problems that were previously thought to be impossible. In 2010, a book on computation said: "the computer's ability to recognize objects is inferior to a two-year-old human child". Now, computers perform better than humans in narrow domains. In the area of sound, extremely lightweight engines can tell the difference between what's speech and what's noise, and treat them differently.
With these technologies, from entertainment to productivity, finance, augmenting human senses, and human health, I believe we're at the onset of a major impact that can positively change how people live their lives, and society in general.
I'm extremely excited because we already have great features and technologies in the product we're shipping now. However, I'm even more excited about the things we're working on in the lab for the future. The products a year from now, two years from now, will make today's products look like toys.