Speech Synth
Posted: November 18, 2025
Introduction
This was a quick speech synthesis experiment. I recorded several .wav files of myself saying different "letters" and strung them together in a single .wav file to make words.
Related Projects @mikekohn.net
Explanation
This was a quick, two-part project. The first part was to take some .wav file read / write code that had been written for different projects over the years and consolidate it into a single library. For the second part, a test program was made that takes a series of .wav files, trims the empty space from the start / end of each one, normalizes each one to the loudest volume, and appends them all into a single output file.
To start with, the original create_speech sample program simply had code that looked like this:
append_sound(write_wav, "assets/m.wav");
append_sound(write_wav, "assets/iy.wav");
append_sound(write_wav, "assets/k.wav");
append_pause(write_wav);
append_sound(write_wav, "assets/k.wav");
append_sound(write_wav, "assets/oh.wav");
append_sound(write_wav, "assets/n.wav");
This would string together enough sound files to say my name. It was later improved to read a kind of script from a text file:
./create_speech samples/hello.txt out.wav
Where hello.txt has this in it:
m
iy
k
-
k
oh
n
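Each line of the script loads the .wav file from the assets/ directory whose name is the text of that line with .wav added at the end. The - line presumably takes the place of the append_pause() call in the earlier code. A longer example of this that says "hello my name is mike kohn" is: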
hello.mp3
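The script format described above could be handled with something like the following sketch. It is not taken from the project's source: the function name script_line_to_path, the result codes, and the choice to skip blank lines are all assumptions made for illustration. Only the assets/ directory, the .wav suffix, and the - pause marker come from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes (not from the project). */
enum { LINE_ERROR = -1, LINE_BLANK = 0, LINE_PAUSE = 1, LINE_SOUND = 2 };

/* Hypothetical helper: classify one script line and, for a sound,
 * build "assets/<line>.wav" in path. A trailing newline is ignored. */
int script_line_to_path(const char *line, char *path, size_t path_size)
{
  assert(line != NULL && path != NULL);

  /* Length of the line up to (not including) any line ending. */
  int length = (int)strcspn(line, "\r\n");

  if (length == 0) { return LINE_BLANK; }
  if (length == 1 && line[0] == '-') { return LINE_PAUSE; }

  int n = snprintf(path, path_size, "assets/%.*s.wav", length, line);

  /* snprintf() returns the length it wanted to write, so a value of
   * path_size or more means the path was truncated. */
  if (n < 0 || (size_t)n >= path_size) { return LINE_ERROR; }

  return LINE_SOUND;
}
```

In a loop that reads the script with fgets(), a LINE_SOUND result would lead to append_sound(write_wav, path) and a LINE_PAUSE result to append_pause(write_wav), mirroring the hard-coded version shown earlier.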
Source code
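The project's actual trimming and normalizing code is not shown on this page. As a rough sketch of what those two steps from the Explanation could look like, the example below assumes 16-bit signed mono samples already loaded into memory, treats "empty space" as samples at or below a small threshold, and reads "normalize to the loudest volume" as peak normalization. The function names trim_silence and normalize_peak, the threshold, and the target peak are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: find the span of samples that is louder than
 * threshold. Stores the index of the first kept sample in *start and
 * returns how many samples to keep (0 if everything is silent). */
size_t trim_silence(const int16_t *samples, size_t count,
                    int threshold, size_t *start)
{
  assert(samples != NULL && start != NULL);

  size_t first = 0;
  size_t last = count;

  /* Skip quiet samples at the start, then at the end. */
  while (first < count && abs(samples[first]) <= threshold) { first++; }
  while (last > first && abs(samples[last - 1]) <= threshold) { last--; }

  *start = first;
  return last - first;
}

/* Hypothetical helper: scale samples in place so the loudest one
 * reaches target_peak (assumed to be between 1 and 32767). */
void normalize_peak(int16_t *samples, size_t count, int target_peak)
{
  assert(samples != NULL);

  int peak = 0;

  for (size_t i = 0; i < count; i++)
  {
    int magnitude = abs(samples[i]);
    if (magnitude > peak) { peak = magnitude; }
  }

  /* Nothing to scale if the whole buffer is silent. */
  if (peak == 0) { return; }

  for (size_t i = 0; i < count; i++)
  {
    /* 32768 * 32767 still fits in 32 bits, so this cannot overflow. */
    samples[i] = (int16_t)((int32_t)samples[i] * target_peak / peak);
  }
}
```

With these two helpers, each recording would be trimmed first and then normalized over only the kept span, so that silence at the edges does not take part in the peak search.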
Copyright 1997-2025 - Michael Kohn