MIDI Guitar (Frequency Analysis)
Posted: February 14, 2015
Introduction
A long time ago I played a guitar in a music store that had a special "midi" pickup and a midi device, so as I played, the sound coming out of the speaker was piano, violin, drums, etc. I always wanted one of those, but luckily didn't have the money to waste on it.
Over the past week or so I have been playing around with the idea of using a DCT (discrete cosine transform) to convert sound samples from an uncompressed .wav file into a list of frequencies. From those frequencies, midi notes are written to a .mid file so they can be played back as a piano. Below on this page are the source code, sound samples, and an explanation of what was done.
Here's an example of me playing Jan Johansson's Polska Från Medelpad on guitar followed by the generated midi output of the same song. There are obvious errors in the midi version of the song which are explained below. Also, I tune my guitar down a 1/2 step. When I play an A here it's really an A flat.
polska_fran_medelpad.mp3 (original mp3 of guitar)
The original Jan Johansson recording can be heard here: https://www.youtube.com/watch?v=B5dzy7G1yUc
I started by working with the MATLAB clone called Octave. Using the following piece of code I generated a wave of 440 Hz and used Octave's DCT function and plotted it:
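pkg load signal                  % dct() is provided by the signal package
a = cos(2*pi*440*[0:0.001:1]);   % 440 Hz cosine, 1001 samples at 1 ms steps
b = dct(a);                      % DCT coefficients of the test tone
plot(b)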
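The same experiment can be sketched in C++ with a direct O(N^2) DCT-II. This is a minimal sketch, not the code from this project: the function names (naiveDct, makeTone, peakIndex) are made up for illustration, and the orthonormal scaling is an assumption that only affects the size of the coefficients, not where the peak lands.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Direct O(N^2) DCT-II. The orthonormal scaling is an assumption made
// for this sketch; it changes magnitudes only, not the peak position.
std::vector<double> naiveDct(const std::vector<double> &x)
{
  const double pi = std::acos(-1.0);
  const std::size_t n = x.size();
  std::vector<double> out(n, 0.0);

  for (std::size_t k = 0; k < n; k++)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; i++)
    {
      sum += x[i] * std::cos(pi * (i + 0.5) * k / n);
    }
    const double scale = (k == 0) ? std::sqrt(1.0 / n) : std::sqrt(2.0 / n);
    out[k] = scale * sum;
  }

  return out;
}

// Builds cos(2*pi*frequency*t) for t = 0, step, 2*step, ...
// (hypothetical helper mirroring the Octave test signal).
std::vector<double> makeTone(double frequency, double step, std::size_t count)
{
  const double pi = std::acos(-1.0);
  std::vector<double> samples(count);

  for (std::size_t i = 0; i < count; i++)
  {
    samples[i] = std::cos(2.0 * pi * frequency * (i * step));
  }

  return samples;
}

// Returns the 0-based index of the coefficient with the largest magnitude.
std::size_t peakIndex(const std::vector<double> &coefficients)
{
  assert(!coefficients.empty());
  std::size_t best = 0;

  for (std::size_t k = 1; k < coefficients.size(); k++)
  {
    if (std::fabs(coefficients[k]) > std::fabs(coefficients[best])) { best = k; }
  }

  return best;
}
```

Calling peakIndex(naiveDct(makeTone(440.0, 0.001, 1001))) should give an index within a coefficient or two of 2 * 440 * 1001 / 1000, which is about 881, since coefficient k of a DCT-II over N samples corresponds to roughly k * rate / (2 * N) Hz. The exact index can shift by one because the energy of a tone that falls between two coefficients leaks into its neighbors.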
After getting a visualization of what my input is probably going to look like after the DCT, I started writing code. I originally started with plain C, but decided it might end up looking a little cleaner as C++. I used the version of DCT as described on this page: Octave DCT. Unfortunately, the complexity of this algorithm is O(N^2), so it's quite slow. There is an O(N * log(N)) way to do this called FCT (fast cosine transform), but I haven't found a nice web page describing it yet, so for now I can't do the calculations in real time.
The test_wav.cxx program expects a mono .wav file recorded at a 44.1kHz sampling rate. Samples are read in 8192 at a time and run through the DCT algorithm, producing 8192 DCT coefficients. The values of the coefficients tell which frequencies are present and how strong they are. The MidiMap.cxx module is used to figure out which notes are present by filtering on a threshold and rounding the frequency to the closest midi note. I first tried outputting the lowest note in the result from MidiMap, but then used the loudest frequency instead. I might try some kind of sliding window at a later time, but for now this is how it works.
Since my input sample rate is 44100 samples a second and I'm processing 8192 samples at a time, I can only get about 5.4 notes per second (44100 / 8192). If I play guitar faster than that, the notes are going to get garbled.
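MidiMap.cxx itself is not shown here, so the following is only a sketch of the threshold-and-round step under stated assumptions: the transform is a DCT-II (so coefficient k of an N-sample block sits at k * rate / (2 * N) Hz), notes follow the usual MIDI convention of A4 = 440 Hz = note 69, and the function names and the threshold parameter are hypothetical rather than the module's real interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Center frequency in Hz of DCT-II coefficient k for a block of `count`
// samples (assumes a DCT-II; other DCT variants map slightly differently).
double binToFrequency(std::size_t k, double sampleRate, std::size_t count)
{
  assert(count > 0);
  return k * sampleRate / (2.0 * count);
}

// Rounds a frequency to the nearest MIDI note number (A4 = 440 Hz = 69).
int frequencyToMidiNote(double frequency)
{
  assert(frequency > 0.0);
  return static_cast<int>(std::lround(69.0 + 12.0 * std::log2(frequency / 440.0)));
}

// Returns the MIDI note of the loudest coefficient whose magnitude is above
// `threshold`, or -1 if nothing in the block is loud enough.
// Hypothetical helper: not the actual MidiMap interface.
int loudestNote(const std::vector<double> &coefficients,
                double sampleRate,
                double threshold)
{
  std::size_t best = 0;
  double bestMagnitude = threshold;

  // Coefficient 0 is the DC offset, not a pitch, so start at 1.
  for (std::size_t k = 1; k < coefficients.size(); k++)
  {
    const double magnitude = std::fabs(coefficients[k]);
    if (magnitude > bestMagnitude)
    {
      bestMagnitude = magnitude;
      best = k;
    }
  }

  if (best == 0) { return -1; }

  return frequencyToMidiNote(binToFrequency(best, sampleRate, coefficients.size()));
}
```

With 8192 samples at 44100 Hz each coefficient is about 2.69 Hz wide, so a 440 Hz tone lands near coefficient 163 (about 438.7 Hz), which rounds to note 69. Picking the single loudest coefficient matches the "loudest frequency" choice described above, which also means a strong harmonic can win over the fundamental.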
Below is a chart of 4 seconds of sound processed on several platforms. I ran a disassembly on DCT.o to get an idea of what's going on and why one platform beats another. I can see that SIMD is being used on the x86 and ARM platforms, but not on PPC. The two columns labeled "Lookup" are the code running with a lookup table that replaces the cos() along with a multiplication. The top number represents the entire time the program ran and the bottom number is just the time it took to compute the DCTs with the look-up table.
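One way a lookup table can stand in for cos() is sketched below. This is an assumption about the general technique, not the actual contents of DCT.o: the term cos(pi * (2i + 1) * k / (2N)) depends only on (2i + 1) * k modulo 4N, so a table of 4N cosines covers every term of the transform. The class name TableDct and its members are hypothetical, and the output is left unscaled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// DCT-II that reads cosines from a precomputed table instead of calling
// cos() in the inner loop. Hypothetical sketch, output is unscaled.
class TableDct
{
public:
  explicit TableDct(std::size_t n) : size_(n), table_(4 * n)
  {
    assert(n > 0);
    const double pi = std::acos(-1.0);

    // table_[m] = cos(pi * m / (2N)), which repeats every 4N entries.
    for (std::size_t m = 0; m < table_.size(); m++)
    {
      table_[m] = std::cos(pi * m / (2.0 * n));
    }
  }

  // out[k] = sum over i of x[i] * cos(pi * (2i + 1) * k / (2N))
  std::vector<double> transform(const std::vector<double> &x) const
  {
    assert(x.size() == size_);
    std::vector<double> out(size_, 0.0);

    for (std::size_t k = 0; k < size_; k++)
    {
      double sum = 0.0;
      std::size_t m = k;               // (2 * 0 + 1) * k for i = 0
      const std::size_t step = 2 * k;  // index grows by 2k per sample

      for (std::size_t i = 0; i < size_; i++)
      {
        sum += x[i] * table_[m];
        m += step;
        // step < 4N and m < 4N, so one subtraction is enough to wrap.
        if (m >= table_.size()) { m -= table_.size(); }
      }

      out[k] = sum;
    }

    return out;
  }

private:
  std::size_t size_;
  std::vector<double> table_;
};
```

For blocks of 8192 samples the table holds 32768 doubles (256 KB) and is built once, so the inner loop is reduced to a table read, a multiply, and an add. The transform is still O(N^2); the table only removes the cost of the cos() calls.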
Copyright 1997-2021 - Michael Kohn