My name is Fredrik Larsson and I am the CTO of ReadSpeaker. My first contact with TTS (short for text-to-speech) was in the mid-80s. I was 14 years old, doing a period of work experience through school together with a blind history researcher. He used TTS together with software that could read the screen aloud to him. The voice was quite robotic, but I could hear what it was saying, and it gave me the idea that I might be able to use computers like my friends did. My next experience was a kind of Nim game played against the computer, and the computer talked to me. It was the same kind of synthetic voice, but it was almost like playing a game with a human being. This meant that I was even able to play text adventure games and some simpler ones that did not require graphics.
It was now time to do some serious investigation and try to get hold of a computer with text-to-speech for use in school. I got my first IBM computer, about 2 dm high and 5 dm wide and deep. It was not an ABC80 or C64, but my friends thought it was even cooler, and it could talk. Back then, a TTS system was a 4 dm long ISA card that you plugged into the computer and connected to a standard car speaker. It was controlled by a separate box with thick knobs for volume and on/off. Some years later, external text-to-speech systems came to market. They were boxes about the size of an iPad, but 2 inches thick, and you could carry them between computers and connect them without needing a screwdriver.
I used TTS quite heavily in the early 90s, mostly at home, since my university studies were more convenient to do in braille. However, I also listened to a lot of talking books, mostly in science, and they made me realize that having something that can talk is not enough. How the information is read also matters for making it understandable. Reading everything from left to right, top to bottom, is often not the best way to convey something aloud.
I then stayed away from TTS until 2001, when my work with ReadSpeaker technology started. By then TTS had changed dramatically. The voices sounded almost human, but unfortunately they also made some of the same mistakes people tend to make, such as speaking a bit sloppily. The old classic voices were very clear; you could even hear spelling errors in the original text. That was not as easy with the more natural-sounding voices, but they attracted new target groups: people who listened to text while reading it themselves, and people who wanted to consume information by listening instead of reading it on paper or screen. TTS in itself is not enough to produce a nice user experience when listening to complete tables, formulas, and image descriptions. It must be complemented with good automatic processing of documents and web pages to produce a high-quality narration. Listening to audio books, and to teachers describing what they wrote on the blackboard, during all those years in school and university has made it clear to me that quality is more than voice and audio. One of our greatest challenges for the future is to make new text-to-speech systems sound as clear and exact as the classic ones while keeping voices as pleasant as those of real people, but without the sometimes sloppy reading.