Simple Vowel Synthesis in MATLAB


 

 

Why speech synthesis? 

 

Speech synthesis is a fun way to experiment with sound. If you’re looking for new ways to inject voice-like features into your synths, guitars and pads, this is a great way to get started. We can use it both to generate synthetic speech and to apply the resonant characteristics of vowels to other recorded sound sources, so there are plenty of combinations to explore. Let’s start with a simple instrument that shapes the frequency content of our recordings with the timbre of speech sounds. MATLAB is great for experimenting with sound and testing new ideas for audio processing, so we can easily build and prototype our sound design instrument there.

 

How?

 

One method to synthesize vowels is to connect three bandpass resonators in cascade, one per formant. Our synthesizer’s model should look like this:

 

Figure 1

 

 

In order to build our synthesizer we need to complete the following tasks:

  • Implement a resonator: this will be our building block.
  • Connect in cascade three resonators according to the schematics in Figure 1.
  • Connect a test signal / audio file to the input of our series of resonators.
  • Go to the chart with the values of the formant frequencies (Figure 2).
  • Set the center frequencies of the resonators to the formant values in Figure 2.
  • Play the result.
  • Render the output to a .wav file.
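
The "test signal" step in the list above can be covered without any audio file. Here is a minimal sketch, assuming a 44.1 kHz sampling rate and a 110 Hz pitch (both arbitrary choices, not values from this post). It builds a train of unit pulses, which is rich in harmonics and so gives the resonators plenty of partials to shape.

```matlab
% Pulse-train test signal (base MATLAB only).
% Fs, f0 and dur are assumed values chosen for illustration.
Fs  = 44100;              % sampling rate in Hz
f0  = 110;                % fundamental frequency in Hz
dur = 1.0;                % duration in seconds

N      = round(dur*Fs);   % total number of samples
period = round(Fs/f0);    % samples between consecutive pulses

sig = zeros(N,1);         % column vector, like a mono audioread result
sig(1:period:N) = 1;      % one unit pulse at the start of each period
```

The resulting sig and Fs can stand in for the values returned by audioread in the script further down.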

 

The model in Figure 1 is a simplified version of a cascade formant synthesizer. However, it is a good starting point for filtering our recordings with vowels. The MATLAB code below gives our sounds the sonic character of vowels. We can experiment with different vowel sounds by changing the center frequencies of the filters, using the formant frequencies in Figure 2. These are example values we can set for the variables fc1, fc2 and fc3:

 

Vowel    F1 (Hz)   F2 (Hz)   F3 (Hz)
[a]      700       1220      2600
[ae]     620       1660      2430
[o]      540       1100      2300
[u]      320       900       2200
[i]      400       1800      2570

 

Figure 2: Formant frequencies of vowels
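
To switch vowels without retyping numbers, the values in Figure 2 can be stored once and looked up by label. This is a minimal sketch; the variable names vowels, formants and idx are made up for illustration, and only fc1, fc2 and fc3 come from the script in this post.

```matlab
% Formant table from Figure 2: one row per vowel, columns are F1 F2 F3 in Hz.
vowels   = {'a', 'ae', 'o', 'u', 'i'};
formants = [700 1220 2600;    % [a]
            620 1660 2430;    % [ae]
            540 1100 2300;    % [o]
            320  900 2200;    % [u]
            400 1800 2570];   % [i]

% Pick a vowel by its label and unpack the three center frequencies.
idx = find(strcmp(vowels, 'u'));
fc1 = formants(idx, 1);   % Hz, first formant
fc2 = formants(idx, 2);   % Hz, second formant
fc3 = formants(idx, 3);   % Hz, third formant
```

Changing the label passed to strcmp is then enough to select another row of the table.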

 

 

For more details on speech synthesis and formant frequency values:

Klatt, Dennis H. "Software for a cascade/parallel formant synthesizer." The Journal of the Acoustical Society of America 67, no. 3 (1980): 971-995.

 

 

 

I made an example that plays an original mono synth track and then filters the same track with all the vowel patterns in Figure 2. This is what it sounds like:

 

 

 

 

MATLAB implementation:  Formant Synthesizer

 

 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%    Simple cascade vowel synthesis.
%%%    author: Michele Pizzi
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

[sig,Fs] = audioread('mySound.wav'); % input audio file

 

fc1 = 700; % Hz first  formant
fc2 = 1220;% Hz second formant
fc3 = 2600;% Hz third  formant

 

band = 120; % Hz bandwidth of each resonator

 

f1 = peakf(sig,fc1,band,Fs); %filter first formant
f2 = peakf(f1,fc2,band,Fs);  %filter second formant
f3 = peakf(f2,fc3,band,Fs);  %filter third formant

 

out = (1./max(abs(f3))).*f3; %normalization


audiowrite('myVowel.wav',out,Fs);  % writes output to an audio file

 

 

The script above needs the function peakf.m in order to run. Note that peakf relies on iirpeak, which is a toolbox function rather than part of base MATLAB. The function can be implemented as follows:

 

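function out = peakf(in,Fc,bandW,Fs)
%peak filter

%in:    input signal
%Fc:    center frequency in Hz
%bandW: bandwidth of the filter in Hz
%Fs:    sampling frequency in Hz

wo = Fc/(Fs/2);          % center frequency normalized to Nyquist
bw = bandW/(Fs/2);       % bandwidth normalized to Nyquist
[b,a] = iirpeak(wo,bw);  % second-order peak filter coefficients

out = filter(b,a,in);    % apply the filter to the input signal
end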
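
If iirpeak is not available, a resonator can be built from base MATLAB alone. The sketch below uses the second-order digital resonator described in the Klatt paper cited above, not the iirpeak design used in peakf, so the results will not be identical. Klatt's resonator has unity gain at 0 Hz and no zeros, so it passes more low-frequency energy than a peak filter. The values of Fs, Fc and bandW are assumptions for illustration.

```matlab
% Klatt-style two-pole resonator: y(n) = A*x(n) + B*y(n-1) + C*y(n-2)
% Assumed example values: 44.1 kHz sampling rate, first formant of [a].
Fs    = 44100;   % sampling frequency in Hz
Fc    = 700;     % resonance (formant) frequency in Hz
bandW = 120;     % resonance bandwidth in Hz

T = 1/Fs;                                   % sampling period in seconds
C = -exp(-2*pi*bandW*T);                    % sets the pole radius
B = 2*exp(-pi*bandW*T)*cos(2*pi*Fc*T);      % sets the pole angle
A = 1 - B - C;                              % scales the gain at 0 Hz to 1

b = A;                % numerator coefficients for filter()
a = [1, -B, -C];      % denominator coefficients for filter()

x = [1; zeros(Fs-1,1)];   % unit impulse, one second long
y = filter(b, a, x);      % impulse response of the resonator
```

Replacing the iirpeak line inside peakf with these coefficients would give a toolbox-free variant. Because the two filters differ away from the resonance, it is worth comparing both by ear.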

 

 

 

 

 

 

 
