Mr. Vedantam begins the podcast: "At some point in our lives, many of us realize that the way we hear our own voice isn't the way others hear us. And we begin to realize that our voices communicate so much more than mere information: they reveal our feelings, our temperament, our identity."
This sets the tone for the next 30 minutes, in which voice as identity is examined from several angles: a transgender woman's struggle to hear herself in her voice, a woman whose voice changed drastically after surgical intubation damaged her vocal cords, and, for people with speech disorders who rely on speech-generating devices to communicate, how modern speech synthesis technology can provide unique, identifiable vocal identities.
"Voice is about who you are. Our voice signals how old we are. Our voice signals our gender. Our voice signals, you know, things about our personality."
An important part of any technology conversation is the question "how do you mitigate the unintended outcomes?", and this is something Rupal and Shankar briefly touch upon. With the increase in deepfakes across media, Ms. Patel discussed the vulnerabilities and risks of new voice technologies, from political to financial impacts. She further stated that along with advances come ethical responsibilities that companies building these technologies must consider, and explained how VocaliD has designed ethical AI into our business.
In summary, this podcast is a wonderful introduction to the concept of voice as identity. Be sure to subscribe to Hidden Brain for more fascinating episodes as Shankar Vedantam uses science and storytelling to reveal the unconscious patterns that drive human behavior, shape our choices and direct our relationships.
This one-hour, in-depth interview was a deep dive into VocaliD as well as the history and science of speech synthesis, providing the listener with a solid understanding of the hows and whys of modern voice AI.
During the podcast, Rupal and Bret delved into the future of computer-generated voice and how it relates to the surge in voice-first products we are seeing (and hearing). The technological advances in machine learning will undoubtedly offer numerous benefits from both a consumer and brand standpoint.
One of the many interesting takeaways was the impact that today's advances in speech synthesis will have on inclusivity, helping communities feel less disenfranchised.
Rupal explained that, looking at the past, the prototypes for radio and television broadcasting featured a very limited range of voices and faces. There wasn't much diversity in the beginning, but now you are seeing, and hearing, a far more diverse range of communities in these two media. This has not yet caught on in the synthetic voice world, however, and Rupal is eager to see what will come now that VocaliD can offer unique, high-quality, diverse voices.
"Our world is diverse. From age, gender, sexual orientation, and accents, and we don't hear much of that at all in the synthetic voices we hear around us."
-Rupal Patel, CEO of VocaliD
Wrapping up this educational podcast, Ms. Patel discussed the ethical responsibilities that technology companies hold when creating new technologies that may bring unintended consequences, and the importance of building safeguards into the design of a technology to mitigate these risks.
In the past few decades, we have seen an explosion of voice interfaces. There are 500M speaking devices today, and by 2021 they will outnumber us. We are using voice to access financial accounts, health records, and other personal information. All of this exists today because of a wild idea, a moonshot. Voice is changing, evolving, exploding, and the opportunities and risks are limitless. Before we share our founder Rupal Patel’s vision for the Voice AI Moonshot, let’s have a look at the future of Voice AI.
Despite the sheer number of voice interfaces we have access to today, we are still treating voice as a functional modality: a way to transmit information. Even today’s spherical hardware devices that we refer to as conversational agents are merely for timely and topical information exchange. One monolithic voice – a butler of facts. How can we move past this and harness the true power of Voice AI?
"Our Voice AI Moonshot is a world where voice benefits all, not just some."
The future of voice AI lies in tapping into the intrinsic, human characteristics of voice as a social connector.
There is an evolutionary reason that we each have a unique voice. Our voice defines us – our age, size, cultural background, habits, sexual orientation, socioeconomic level and more. Specifically, voice is biometric data that can be used to predict and monitor physical and mental health, while also offering a window into cognition and learning readiness. This is the untapped power of voice.
The future of voice AI is about connection: creating contextually adaptive voices that can calm or inspire with the flip of a bit.
The future of voice AI is about trust: creating relatable, compassionate voices that can engage a toddler and the aging lonely. It is important to note that these voices would not substitute for human contact; they would augment it.
The future of voice AI is not one voice for all. It will be a multitude of vocal personas that capture the full range of human expression. Brands will design vocal personas that speak to their diverse audience, not just a few users. As individuals, we will each have our own vocal avatars.
As we harness and emulate this awesome and powerful human trait, we must anticipate the unintended consequences. We must proactively identify and protect against the potential for nefarious use. Voice is identity that cannot be swapped like passwords or PINs; it must be secured from the start.
The Voice AI Moonshot
For this reason, as technologists, it is important that we take seriously our role in the creation of new technologies. While our technology provides great social benefit, with every new advance we are proactively creating measures to ensure that it cannot be misused. We are committed to being active in the shaping and realization of the Voice AI future and the Voice AI Moonshot.
To our founder, Rupal Patel, the Voice AI Moonshot is a universe of voices that are convincing without being deceptive. The Voice AI future she envisions is one in which voices connect us rather than divide us, and where these voices will celebrate our diverse yet common humanity. Finally, our Voice AI Moonshot is a world where voice benefits all, not just some.
To learn more about Rupal Patel and VocaliD, please read our company page.