Sound to Graph to Sound with JavOICe

JavOICe: an antidote to skepticism

If you have suspected that a sound spectrogram merely extracts information about a sound and does not contain enough information to synthesize it, then browse on.

Peter Meijer in Eindhoven, NL, has developed a Java applet that "sonifies" 64 x 64 GIFs or JPEGs (256 colors max) by treating them as graphs of frequency against time: each pixel in the grid is assigned a tone, with brightness realized as loudness. The applet lets you draw in the grid and sonify the result. It can also convert a RIFF .wav file into a .gif, which it can then sonify, so that we are back where we started, though with considerable loss, since the conversion to the .gif is not of high resolution. On the other hand, the .gif resulting from a one-second .wav file is the size of a postage stamp and can be as little as one-fifth the size of the .wav file in kB. Furthermore, the applet will make a spectrogram from a sound and also display its waveform. It is a truly remarkable wormhole connecting sound and sight. It is also slow to load and initialize, and may require reloading, but it is worth the effort. If you use Netscape, you will probably have to download the two .class files and place them in your "java/classes" directory.
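The pixel-to-tone idea can be sketched in a few lines of Python. This is only an illustration of the principle described above, not Meijer's actual code: the linear spacing of row frequencies, the 11025 Hz sample rate, the 0.0 to 1.0 brightness scale, the division by the number of rows, and the function names `row_frequency` and `sonify` are all assumptions made for the example.

```python
import math

def row_frequency(row, n_rows, f_lo=250.0, f_hi=2500.0):
    """Frequency in Hz for an image row; row 0 is the top (highest pitch).

    Assumption: rows are spaced linearly between f_lo and f_hi.
    """
    step = (f_hi - f_lo) / (n_rows - 1)
    return f_hi - row * step

def sonify(image, duration=1.05, sample_rate=11025, f_lo=250.0, f_hi=2500.0):
    """Turn a grid of brightness values (0.0 to 1.0) into audio samples.

    image[row][col]: rows are frequency (top = high), columns are time slices.
    Each pixel contributes a sine tone whose amplitude is its brightness.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    n_samples = int(duration * sample_rate)
    freqs = [row_frequency(r, n_rows, f_lo, f_hi) for r in range(n_rows)]
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        # Which image column is "playing" at this instant.
        col = min(n * n_cols // n_samples, n_cols - 1)
        value = 0.0
        for r in range(n_rows):
            brightness = image[r][col]
            if brightness > 0.0:
                value += brightness * math.sin(2 * math.pi * freqs[r] * t)
        # Assumed normalization: keeps the sum within -1.0 .. 1.0.
        samples.append(value / n_rows)
    return samples

# A blank 64 x 64 grid with a single bright pixel in the top-left corner
# yields a short tone at the top frequency, followed by silence.
grid = [[0.0] * 64 for _ in range(64)]
grid[0][0] = 1.0
audio = sonify(grid)
```

The returned list of floats could be scaled to 16-bit integers and written to a .wav file; that step is left out here to keep the sketch short.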

The JavOICe, or The_vOICe applet, will run in MSIE or Netscape (if the browser supports Java). It is fastest, most stable, and most full-featured when run as a free-standing application (free-standing, that is, if you have a Java Virtual Machine on your computer). Information on how to set it up on your local machine is available on Meijer's javoice page.

Next is the applet, followed by a set of spectrograms of the British hVd vowel set. The front vowels are in the left column and the back vowels in the right; between them are the three Vr sounds for this non-rhotic dialect (hed="beard"; heed="bared"; hied="heard") and the three open diphthongs. Clicking on any one will cause it to be displayed in the JavOICe window. The display ranges from 250 to 2500 Hz (or at least it does if you leave it alone). This rather low upper limit cuts off the higher frequency bands distinctive of the sibilants (around 4 kHz) but shows the distinctive vowel formants in more detail. You may need to Reset the display. (Caution to Linux users: the Reset button currently causes a segfault. Leave it alone and use the Restart selection in the File pulldown instead.) You should also hear the sound as it is synthesized, provided you have sound enabled and the appropriate transducers attached.
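To see what the 250 to 2500 Hz range implies for a 64-row display, here is a small Python sketch. It assumes the rows are spaced linearly in frequency, which the page does not state; the function name `frequency_to_row` is hypothetical, and the 500 Hz value is just an illustrative in-range frequency, not a measured formant.

```python
def frequency_to_row(freq_hz, n_rows=64, f_lo=250.0, f_hi=2500.0):
    """Return the row index (0 = top) that shows freq_hz, or None if off-scale.

    Assumption: linear frequency spacing between f_lo (bottom) and f_hi (top).
    """
    if freq_hz < f_lo or freq_hz > f_hi:
        return None
    fraction = (f_hi - freq_hz) / (f_hi - f_lo)
    return round(fraction * (n_rows - 1))

# Width of one row under the linear-spacing assumption (about 35.7 Hz).
hz_per_row = (2500.0 - 250.0) / (64 - 1)

in_range = frequency_to_row(500.0)    # a row index near the bottom of the display
sibilant = frequency_to_row(4000.0)   # None: 4 kHz lies above the display
```

With the same 64 rows stretched up to 4 kHz or beyond, each row would cover a wider band, which is the trade-off between showing the sibilants and showing the formants in detail.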

Here are Peter Meijer's instructions for use. They assume you have pulled the window quite wide.

For warmup, try sonifying the images:

"Say Phonetics", "Phon Res. (rsynth)", "The Rain in Spain", "Sound Check", "Good Day to Die" (each shown as a spectrogram)
These are all 1.05-second windows, except for "Good Day". So:
Vowel spectrograms for Sonification

1) The bands are formants, mainly F1 and F2. Front vowels have higher F2s than their back counterparts; high vowels have lower F1s than their low counterparts.

2) All samples tail downward across the board, reflecting "sentence" intonation with a fall at the end; the key is that all bands decline in parallel. Many also show diphthongal movement, especially toward [i], even had; haud, however, clearly moves toward [u].

3) If you prefer, you can pick vowels (mostly front) out of the phrasal examples.


George L. Dillon