Following their study of the voice effect and the design of virtual humans, Scotty D. Craig of Arizona State University and Noah L. Schroeder of Wright State University have carried out follow-up investigations that extend their findings to instructional multimedia design and narration voices, without an avatar or other virtual human. The conclusion of this new research matches that of the earlier work:

“In most respects, those who learned from the modern text-to-speech engine were not statistically different in regard to their perceptions, learning outcomes, or cognitive efficiency measures compared with those who learned from the recorded human voice. Our results imply that software technologies may have reached a point where they can credibly and effectively deliver the narration for multimedia learning environments.” (1)

The ‘voice effect’ comes from earlier research and suggests that using recorded human voices for narration in multimedia learning environments leads to better learning outcomes than using computer-generated voices (2). However, in randomized tests with latest-generation synthetic voices, the researchers found minimal differences in the ways that participants perceived and learned from a modern computer-generated voice compared with a recorded human voice.

This finding has wide-reaching repercussions for educational institutions and instructional designers, given the cost and time savings of dynamically updated text-to-speech (TTS) solutions compared with recording human voices or finding experts willing to record numerous narrations.

Ongoing studies are now showing that voice engines have reached an acceptable level of performance for use within learning technologies. These findings point to an opportunity to use synthetic voices to develop more dynamic and less expensive learning technologies without compromising learning outcomes.

(1) Craig, S. D., & Schroeder, N. L. (2018). Text-to-Speech Software and Learning: Investigating the Relevancy of the Voice Effect. Journal of Educational Computing Research, 0735633118802877. 
(2) Mayer, 2014b