Microsoft Leverages Hologram and Audio Cues to Help the Blind

Peter Bosher, middle, an audio engineer who is blind, checks out the latest iteration of the system at Microsoft’s research lab in Cambridge, UK. Bosher worked with the Project Tokyo team early in the design process. He is joined in the picture by researchers Martin Grayson, left, and Cecily Morrison, right. (From Jonathan Banks/Microsoft)

It’s like seeing Star Trek Lieutenant Commander Geordi La Forge’s visor come to life.

Ed Cutrell, a senior principal researcher with Microsoft’s research organization in Redmond, Washington, is a co-leader of Project Tokyo. (From Jonathan Banks/Microsoft)

Microsoft has detailed how far Project Tokyo has progressed in helping people who are visually impaired go about their daily lives more easily, particularly by providing varying levels of information about who is in the user’s environment.

Project Tokyo is a research effort in which researchers at the tech giant are exploring how technologies such as artificial intelligence (AI) and augmented reality can serve that goal.

Ed Cutrell, a senior principal researcher with Microsoft’s research organization in Redmond, Washington, and a co-leader of Project Tokyo, said the team started by observing how people who are blind or have low vision interacted with others as they navigated airports, attended sporting venues and went sightseeing, among other activities. This was followed by a series of consultations with the visually impaired community.

A key finding was that a richer understanding of social context could help people who are blind or have low vision make sense of their environment, according to Cutrell.

Blind Audio Engineer Plays Role in Building Project Tokyo

Peter Bosher, an audio engineer in his mid-50s who has been blind most of his life, worked with the Project Tokyo team early in the design process.

Bosher said:

“Whenever I am in a situation with more than two or three people, especially if I don’t know some of them, it becomes exponentially more difficult to deal with because people use more and more eye contact and body language to signal that they want to talk to such-and-such a person, that they want to speak now. It is really very difficult as a blind person.”

Holograms and Audio Cues

With that input gathered, the team built on an original Microsoft HoloLens, a mixed reality headset that projects holograms into the real world, which users can tailor to their preferences.

Microsoft described the device as having:

“…an array of grayscale cameras that provide a near 180-degree view of the environment and a high-resolution color camera for high-accuracy facial recognition. In addition, the speakers above the user’s ears allow for spatialized audio – the creation of sounds that seem to be coming from specific locations around the user.”

It added:

“One model, for example, detects the pose of people in the environment, which provides a sense of where and how far away people are from the user. Another analyzes the stream of photos from the high-resolution camera to recognize people and determine if they have opted to make their names known to the system. All this information is relayed to the user through audio cues.”
“For example, if the device detects a person one meter away on the user’s left side, the system will play a click that sounds like it is coming from one meter away on the left. If the system recognizes the person’s face, it will play a bump sound, and if that person is also known to the system, it will announce their name.”
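
The layered cues Microsoft describes can be pictured as a simple decision chain. The Python sketch below is purely illustrative and is not Project Tokyo code: the `Detection` and `Cue` structures, the `cues_for` function and the bearing convention (negative degrees meaning the user’s left) are all hypothetical names and assumptions made for this example. It only decides which sounds to play and where, leaving actual audio playback out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """One person detected near the user (hypothetical structure)."""
    distance_m: float            # distance from the user, in meters
    bearing_deg: float           # assumed convention: negative = user's left
    face_recognized: bool = False
    name: Optional[str] = None   # set only if the person opted to share it


@dataclass
class Cue:
    """One audio event to play at a position around the user."""
    sound: str                   # "click", "bump" or "name"
    distance_m: float
    bearing_deg: float
    text: Optional[str] = None   # spoken text, used only for "name" cues


def cues_for(detection: Detection) -> List[Cue]:
    """Return the cues for one detected person, in playback order."""
    where = (detection.distance_m, detection.bearing_deg)
    # Every detected person gets a click placed at their position.
    cues = [Cue("click", *where)]
    if detection.face_recognized:
        # A recognized face adds a bump sound at the same position.
        cues.append(Cue("bump", *where))
        if detection.name is not None:
            # A person known to the system is announced by name.
            cues.append(Cue("name", *where, text=detection.name))
    return cues
```

Under these assumptions, `cues_for(Detection(1.0, -90.0))` yields a single click placed one meter to the user’s left, while a detection with a recognized face and a shared name yields a click, a bump and a name announcement at the same position.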

Bosher said his favorite feature is how the technology conveys the angle of gaze, making for “a great tool for learning body language.”

The Future of Project Tokyo

The team said it will continue to work with people who are blind or have low vision, including more children who have been blind since birth, to learn what more needs to be done.
