Eavesdropping app will turn your smartphone into a virtual PA

By Hal Hodson

19 August 2015

Listening in (Image: Andrew Kelly / Reuters)

YOUR phone is listening to you. A program that runs in the background of your smartphone can digest and understand the sounds of the world around it. The information it gleans could inform the next generation of virtual assistants. It could also make search engines far more useful.

Neural networks – computer models that ape the complexity of the brain – dominate our online lives: Google Translate can whip English text into Russian, for example, while Facebook’s DeepFace can pick one face out of millions.

Now that power is moving offline. Nic Lane and colleagues at Bell Labs in Murray Hill, New Jersey, have built a listening neural network called DeepEar that runs on a phone without being connected to the internet.

The idea of continuous audio sensing isn’t new, but until now the algorithms haven’t worked well in a noisy environment. They were also too power hungry. Lane’s system uses about 6 per cent of a smartphone’s battery for a whole day of listening.

DeepEar works by training a neural network to listen for and recognise different kinds of aural scenes, identify individual human speakers, recognise emotion and detect stress. In the roar of a busy train station, for instance, DeepEar might be able to hear the time of the next train announced over the PA system, as well as the emotion in its owner’s voice as they argue with a slow ticket-seller. That information could help a virtual assistant learn to remind its user to allow plenty of extra time at that particular station in future.
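The multi-task idea described above, one shared network feeding several recognition tasks at once, can be illustrated with a toy sketch. This is a hypothetical example, not the published DeepEar architecture: the feature choice (per-frame log energy), layer sizes and the "scene" and "emotion" head names are all assumptions made for illustration.

```python
import math
import random

random.seed(0)

def log_energy_features(samples, frame_size=160):
    """Split raw audio into frames and return the log energy of each frame."""
    feats = []
    for i in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[i:i + frame_size]
        energy = sum(s * s for s in frame) / frame_size
        feats.append(math.log(energy + 1e-9))
    return feats

def softmax(xs):
    """Turn raw scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class TinyMultiTaskNet:
    """One shared hidden layer feeding a separate softmax head per task."""
    def __init__(self, n_in, n_hidden, head_sizes):
        rnd = lambda: random.uniform(-0.1, 0.1)
        self.w1 = [[rnd() for _ in range(n_in)] for _ in range(n_hidden)]
        self.heads = {name: [[rnd() for _ in range(n_hidden)] for _ in range(size)]
                      for name, size in head_sizes.items()}

    def forward(self, x):
        # Shared ReLU hidden layer, then one probability vector per task head.
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        return {name: softmax([sum(w * h for w, h in zip(row, hidden)) for row in head])
                for name, head in self.heads.items()}

# Usage: classify one second of synthetic 16 kHz audio (a 440 Hz tone)
# into made-up scene and emotion labels in a single forward pass.
audio = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
feats = log_energy_features(audio)          # 100 frames of 10 ms each
net = TinyMultiTaskNet(n_in=100, n_hidden=16,
                       head_sizes={"scene": 3, "emotion": 4})
probs = net.forward(feats)
```

Sharing one hidden layer across tasks is what makes this style of design cheap enough for a phone: the expensive feature processing runs once, and only the small per-task heads differ.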

“This means a very responsive virtual assistant that can understand both you and your environment and respond to you immediately,” says Dimitrios Lymberopoulos at Microsoft Research in Washington State. “This experience could make virtual assistants way more useful.”

Unlike the major commercial neural networks, DeepEar doesn’t rely on powerful computers accessed via an internet connection to do its heavy lifting. Instead it uses only the processors in a smartphone. This saves battery life and keeps the user’s personal information on their own device rather than in the cloud.

It’s not just audio that offline neural networks are targeting. Earlier this year, Lane and colleagues built a prototype device designed to capture high-resolution lifestyle data from wearable sensors. The system, worn on the lapel, could make inferences about what the person was doing by interacting with sensors such as Fitbits. It also listened to the wearer’s voice to detect issues such as stress.

Lymberopoulos says endowing mobile devices with the ability to continuously analyse data allows them to understand the environment someone is in, and adjust their interface accordingly.

“Because of the ubiquity of mobile devices and the robustness and accuracy of neural networks, we could use these systems to understand the physical world and index it in real-time,” says Lymberopoulos.

He also envisages adding the findings of neural nets like DeepEar to search engines, letting us search for cafes by ambience, for instance.

“Right now, there is no search engine that can understand what a ‘crowded bar’ or a ‘bar playing loud pop music’ is. With this type of sound classification, we can make this real,” he says.
