Voice training with depth images

Depth sensors allow us to precisely record and evaluate the mouth and facial movements of patients, without a physical examination by a speech pathologist. The technology can therefore give new impetus to voice therapy. Read on to learn more about how LAOLA uses depth imaging for voice therapy.

Imagine you could scan an object with your smartphone or tablet and insert it as a three-dimensional image into the recording of a real environment, and no one would notice it wasn’t part of the original scene! Or you could film your face and convert it into a three-dimensional emoji. Things like this are possible with smartphones and tablets equipped with depth sensors.

In LAOLA, we rely on so-called depth images because they will allow us to precisely analyze the mouth and facial movements of the users of our app. Until now, voice exercises had to be monitored in person by therapists. The LAOLA app will give patients and therapists automatic feedback on pronunciation and articulation during voice exercises. This will make therapy more effective for everyone involved.

How depth images are created

A depth sensor is a special camera that measures the distance between the camera and the objects in front of it. These distances are determined by methods such as measuring the travel time of light or sound signals (time-of-flight) or so-called stereo vision, which compares two images taken from slightly different positions. The resulting depth images contain the corresponding distance data for every point in the image. In contrast, conventional digital photos consist of “flat” color areas, where the impression of depth results only from subtle gradations within individual color tones.
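To make the two measuring principles more concrete, here is a minimal sketch of the textbook formulas behind them. It is not LAOLA's implementation and not how any particular sensor works internally. The function names and all example values are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: textbook formulas, hypothetical function names.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def time_of_flight_distance(round_trip_seconds, signal_speed=SPEED_OF_LIGHT_M_PER_S):
    """Distance in meters from the round-trip time of an emitted signal."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must not be negative")
    # The signal travels to the object and back, so only half the path counts.
    return signal_speed * round_trip_seconds / 2.0


def stereo_depth(focal_length_px, baseline_m, disparity_px):
    """Distance in meters from the pixel shift between two camera views."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    # Nearby objects shift more between the two views than distant ones.
    return focal_length_px * baseline_m / disparity_px


# Assumed example values: a light pulse returning after 2 nanoseconds,
# and a stereo pair with 600 px focal length and 5 cm between the cameras.
face_distance_tof = time_of_flight_distance(2e-9)
face_distance_stereo = stereo_depth(600.0, 0.05, 100.0)
```

With these assumed values, both calculations place the object at roughly 30 centimeters from the camera. A depth image is simply such a distance computed for every point in the picture.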

There are several types of depth sensors. Some commercially available consumer devices also include the technology, for example Apple’s iPhone 12 Pro, iPhone 13 Pro and iPad Pro. Their built-in TrueDepth system is fast and accurate, and is designed for facial recognition, an enhanced portrait mode and augmented reality features.

Data protection is a priority

From a privacy perspective, depth images are no more dangerous than regular videos. Neither deepfakes nor even the appearance of a person can be easily reconstructed from depth data. Nevertheless, there are ways in which depth images can be misused.

To protect the privacy of all LAOLA users, we follow a strict data protection concept. We inform all participants in our study comprehensively about how we collect data, the nature and volume of that data, and how it is stored and used within LAOLA. This means that users know exactly what happens to their data. Would you like to learn more about our scientific studies? Please feel free to contact us to become a study participant.
