Mike Kohn!

Speech Synth

Posted: November 18, 2025

Introduction

This was a quick speech synthesis experiment. I recorded several .wav files of myself saying different "letters" (really short sounds such as "m", "iy", and "k") and strung them together into a single .wav file to make words.

Related Projects @mikekohn.net

Related pages on www.mikekohn.net: MIDI guitar, Speech Synthesis

Explanation

This was a quick, two-part project. The first part was to take some .wav file read / write code that had been written for different projects over the years and consolidate it into a single library. For the second part, a test program was made that takes a series of .wav files, trims the empty space from the start / end of each one, normalizes each one to its loudest volume, and appends them all into a single output file.
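The trim and normalize steps could look roughly like the sketch below. This is not the code from the simple_wav repository: the function names trim_silence() and normalize(), the threshold parameter, and the assumption of 16-bit signed mono samples are all made up for illustration. "Normalize" is read here as scaling so that the loudest sample reaches full scale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from simple_wav): finds the part of the buffer
   between the first and last samples louder than `threshold`.
   Stores the index of the first kept sample in *start and returns the
   number of samples kept. Returns 0 if everything is below the threshold. */
size_t trim_silence(
  const int16_t *samples,
  size_t count,
  int threshold,
  size_t *start)
{
  assert(samples != NULL && start != NULL);

  size_t first = 0;
  while (first < count && abs(samples[first]) <= threshold) { first++; }

  *start = 0;
  if (first == count) { return 0; }

  size_t last = count - 1;
  while (last > first && abs(samples[last]) <= threshold) { last--; }

  *start = first;
  return last - first + 1;
}

/* Hypothetical helper (not from simple_wav): scales the buffer in place so
   the loudest sample ends up at +/-32767. Silent buffers are left alone. */
void normalize(int16_t *samples, size_t count)
{
  assert(samples != NULL);

  int peak = 0;
  for (size_t i = 0; i < count; i++)
  {
    int magnitude = abs(samples[i]);
    if (magnitude > peak) { peak = magnitude; }
  }

  if (peak == 0) { return; }

  for (size_t i = 0; i < count; i++)
  {
    /* 32-bit math so the multiply cannot overflow a 16-bit sample. */
    int32_t scaled = (int32_t)samples[i] * 32767 / peak;
    samples[i] = (int16_t)scaled;
  }
}
```

A caller would run trim_silence() first, then normalize() on only the kept samples, then write those samples to the output file.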

To start with, the original create_speech sample program simply had code that looked like this:


  append_sound(write_wav, "assets/m.wav");
  append_sound(write_wav, "assets/iy.wav");
  append_sound(write_wav, "assets/k.wav");
    
  append_pause(write_wav);
    
  append_sound(write_wav, "assets/k.wav");
  append_sound(write_wav, "assets/oh.wav");
  append_sound(write_wav, "assets/n.wav");

This strings together enough sound files to say my name. It was later improved to read a kind of script from a text file:


./create_speech samples/hello.txt out.wav

Where hello.txt has this in it:


m
iy
k
-
k
oh
n

Each line loads the .wav file from the assets/ directory whose name is the text of the line with .wav added to the end. A longer example that says "hello my name is mike kohn" is:
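The mapping from a script line to a file name can be sketched as below. The function script_line_to_path() and its return codes are hypothetical, not taken from create_speech. Treating a "-" line as a pause is an assumption, based on it sitting where append_pause() was in the earlier hard-coded version.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from create_speech): turns one script line into
   an asset path such as "assets/iy.wav".
   Returns  0 when `path` was filled in,
            1 when the line is "-" (assumed here to mean a pause),
           -1 when the line is blank or does not fit in the buffers. */
int script_line_to_path(const char *line, char *path, size_t path_size)
{
  assert(line != NULL && path != NULL);

  char name[64];

  /* Length of the line without the trailing newline / carriage return. */
  size_t length = strcspn(line, "\r\n");

  if (length == 0 || length >= sizeof(name)) { return -1; }

  memcpy(name, line, length);
  name[length] = '\0';

  if (strcmp(name, "-") == 0) { return 1; }

  int written = snprintf(path, path_size, "assets/%s.wav", name);

  if (written < 0 || (size_t)written >= path_size) { return -1; }

  return 0;
}
```

A reader loop would call this on each line from fgets(), then pass the resulting path to something like append_sound(), or call append_pause() when it returns 1.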


hello.mp3

Some things that might be interesting to try:

Record full sentences instead of individual sounds, and cut the sound parts out of them.
Make sure the sound samples that are appended together start / end around digital 0 to reduce pops.
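For the second idea, one possible approach (untested against the real recordings) is to move each clip's start point forward to the nearest zero crossing. The sketch below assumes 16-bit signed mono samples, and next_zero_crossing() is a made-up name rather than part of simple_wav.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from simple_wav): returns the index of the first
   sample at or after `from` that is exactly zero or whose sign differs from
   the sample before it. Returns `count` when no crossing is found. */
size_t next_zero_crossing(const int16_t *samples, size_t count, size_t from)
{
  assert(samples != NULL);

  for (size_t i = from; i < count; i++)
  {
    if (samples[i] == 0) { return i; }

    /* A sign change between neighbors means the wave passed through zero. */
    if (i > 0 && ((samples[i - 1] < 0) != (samples[i] < 0))) { return i; }
  }

  return count;
}
```

The end of a clip could be handled the same way by searching backward from the last sample. A short fade in / fade out would be another way to get a similar effect.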

Source code
git clone https://github.com/mikeakohn/simple_wav.git