The original voice in education was, of course, human speech. Socrates taught Plato through conversation roughly 2,400 years ago, and he was late to the game. The earliest hominids to develop speech almost certainly used it to transmit skills and ideas to the next generation, perhaps as early as 1.75 million years ago.

Of course, when we talk about voice technology in education today, the emphasis is on technology, generally incorporating some element of synthetic speech, which is to say talking computers. How did we get from there to here? These milestones of voice educational technology (sometimes called voice edtech) trace the path of historical development—and suggest future ways to improve learning for every student.

Voice Technology in Education: A Brief Timeline

Here are the major technologies that brought voice edtech into the classroom, along with the ways these technologies lead to stronger learning outcomes.

1. Early 1930s: Audiobooks appear.

In the 1930s, long-playing records (LPs) didn’t provide enough audio fidelity to satisfy music lovers, but they were well suited to recording voices. In 1932, the American Foundation for the Blind made deals with publishers and recording companies to produce “talking books” for readers with blindness or visual impairments. These talking books were used in education, as a 1944 photograph of blind students learning by audiobook demonstrates.

2. 1980s: Text-to-speech (TTS) readers become widely available.

An early TTS reader, the Kurzweil Reading Machine, was introduced in 1976—but its top-tier model carried a hefty price tag ($30,000) and weighed hundreds of pounds. By the 1980s, these machines were more affordable, and they started showing up in classrooms around the country.

At this point, TTS readers were primarily used as assistive devices by students with blindness, dyslexia, or other learning disabilities. That narrow application of voice technology in education would soon broaden.

3. 1984: The Center for Applied Special Technology (CAST) is founded, laying the groundwork for Universal Design for Learning (UDL) principles.

A group of clinicians at North Shore Children’s Hospital in Salem, Massachusetts founded CAST to research the use of computers to more effectively teach students with learning disabilities. CAST’s work eventually led to the UDL framework, the first full version of which was published in 2002.

Universal Design for Learning popularized the theory of multimodal education, which holds that students (with and without disabilities) learn best when information is presented in multiple media: not just text on the page, but text paired with audio that reads it aloud, ideally with each word highlighted as it is spoken. This theory led to broad, ongoing adoption of TTS web readers and screen readers in classrooms at every grade level.

Whereas earlier applications of voice technology in education were designed to assist students with disabilities, the rise of UDL showed that these tools are effective for teaching all learners, regardless of ability.

4. 2010s: Deep neural networks create more lifelike TTS voices.

Deep neural networks (DNNs) are a machine learning architecture built from many layers of simple computing units, loosely modeled on the neuronal connections in the human brain. In the 2010s, new hardware like GPUs and more efficient algorithms made DNNs commercially viable. That allowed TTS pioneers like ReadSpeaker to craft neural TTS voices (synthetic voices built on DNNs) that produce increasingly lifelike speech.

Research published in 2017 and 2018 found that these advanced TTS voices can match or exceed the performance of human voice recordings, making voice technology in the classroom even more effective, especially in the context of multimodal education.

5. 2020s and beyond: In the wake of a global pandemic, eLearning becomes more widespread—and TTS tools go online to support teaching at a distance.

The COVID-19 pandemic forced a whole generation of students into remote learning, heightening the importance of digital TTS tools that enable multimodal learning. Teachers turned to web readers that integrate with learning management systems (LMSs), the digital platforms they use for online education.

These online TTS tools continue to promote educational inclusivity, helping to keep students at peer level in every subject. They support students with dyslexia and other learning disabilities, a larger population than we typically recognize. And they provide these benefits equally in traditional classrooms and online environments. We expect the use of voice technology in education to keep growing as the capabilities of the field advance.

