ReadSpeaker is dedicated to creating the future of text to speech (TTS). Today, we are excited to offer brands our innovative deep neural network (DNN) voices with Prosody Transfer. This state-of-the-art capability within the ReadSpeaker Neural TTS Engine enables our customers to achieve even more with our advanced, lifelike text-to-speech voices.
Prosody – An Important Aspect of Communication
“In modern phonetics the word ‘prosody’ and its adjectival form ‘prosodic’ are typically used to refer to properties of speech that cannot be derived from the segmental sequence of phonemes underlying human utterances. Examples of such properties are the controlled modulation of the voice pitch, the stretching and shrinking of segment and syllable durations, and the intentional fluctuations of overall loudness. On the perceptual level these properties lead amongst other things to perceived patterns of relative syllable prominences, coded in perceived melodical and rhythmical aspects of speech.”
Nooteboom, Sieb (1997). “The prosody of speech: Melody and rhythm.” The Handbook of Phonetic Sciences, 5.
Prosody Transfer – A New Milestone for Neural Text to Speech
Our cutting-edge technology enables us to transfer prosody models from one ReadSpeaker DNN voice to another within the same language. In other words, we can now take the prosodic style of a source voice and apply it to a target voice, leaving the target’s voice color unchanged.
Prosody Transfer – a capability that is exclusive to ReadSpeaker – is achieved through a pioneering prosody and voice color modeling technique.
ReadSpeaker’s Prosody Transfer helps brands speak to their target audiences in a more tailored way. For instance, a brand may want a neural text-to-speech voice based on a spokesperson whom consumers know and love from its advertising campaigns. However, that actor’s speaking style may not sound quite right for the intended use. With Prosody Transfer, a bespoke neural TTS voice can sound exactly as our customers want it to. The ReadSpeaker DNN engine enables the actor’s voice to acquire the more suitable prosodic traits of another voice while maintaining its unique voice color. The outcome is an even better experience in the Voice User Interface.
ReadSpeaker’s Neural TTS Engine is our state-of-the-art synthetic speech engine. It is built upon our foundational voice enablement platform, which today provides 90 off-the-shelf text-to-speech (TTS) voices in 40 languages.
The ReadSpeaker VoiceLab provides superior-quality text-to-speech voices using DNN-trained models. Not only do we produce voice solutions faster (in six to eight weeks, versus six months for legacy offerings), we also customize these voices to meet the unique needs of world-class brands.
If you’d like to find out more about how the ReadSpeaker VoiceLab can enhance your brand’s success in the Voice User Interface, please contact us today.