
seminar class · 28-04-2011, 03:18 PM

What is speech?
Speech refers to the processes associated with the production and perception of sounds used in spoken language.
Every speaker has a characteristic way of speaking. This enables us to differentiate one speaker from another.
Frequency domain
Loudness
Timbre (sound quality)
Pitch
To create a user recognition system by extracting certain important features of the speaker's voice.
FEATURE EXTRACTION
Extracts a small amount of data from the voice signal that characterizes a given speaker.
FEATURE VERIFICATION
Process of identifying the unknown speaker by comparing the features extracted from his/her voice signal with those of a set of known speakers.
Best in terms of feature extraction
A large number of coefficients.
The speech signal is represented parametrically using mel-frequency cepstrum coefficients (MFCC).
Frame Blocking
Windowing
FFT spectrum
Mel-frequency warping
Cepstrum
In this step the continuous speech signal is blocked into frames of N samples, with adjacent frames separated by M samples (M < N).
The first frame consists of the first N samples.
The second frame begins M samples after the first frame and overlaps it by N - M samples, and so on.
This process continues until all the speech is accounted for within one or more frames.
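The frame blocking described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the seminar: the function name `frame_blocking` is made up, and zero-padding the last frame when the signal does not end on a frame boundary is an assumption (the text only says that all the speech must be accounted for).

```python
def frame_blocking(signal, n, m):
    """Split `signal` into frames of n samples whose starts are m samples apart."""
    if not 0 < m < n:
        raise ValueError("expected 0 < M < N")
    frames = []
    start = 0
    while True:
        frame = list(signal[start:start + n])
        # Assumption: zero-pad a final frame that runs past the end of the signal.
        frame += [0.0] * (n - len(frame))
        frames.append(frame)
        if start + n >= len(signal):
            break  # every sample is now covered by at least one frame
        start += m
    return frames


signal = list(range(10))
frames = frame_blocking(signal, n=4, m=2)
# Adjacent frames share N - M = 2 samples.
overlap = [x for x in frames[0] if x in frames[1]]
```

With N = 4 and M = 2, each frame starts two samples after the previous one and repeats its last two samples.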
The next processing step is the Fast Fourier Transform, which converts each frame of N samples from the time domain into the frequency domain.
The FFT is a fast algorithm for computing the Discrete Fourier Transform (DFT), which is defined on the set of N samples {x_n} as follows:
$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1$$
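To make the relation between the DFT and the FFT concrete, the sketch below evaluates the standard DFT sum X_k = sum over n of x_n * exp(-j*2*pi*k*n/N) directly, and then computes the same values with a recursive radix-2 FFT. The function names and the sample frame are illustrative, and the radix-2 variant (which needs N to be a power of two) is just one common choice of FFT, assumed here for brevity.

```python
import cmath


def dft(frame):
    """Direct O(N^2) evaluation of X_k = sum_n x_n * exp(-2j*pi*k*n/N)."""
    n_samples = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n, x in enumerate(frame))
        for k in range(n_samples)
    ]


def fft(frame):
    """Recursive radix-2 Cooley-Tukey FFT; len(frame) must be a power of two."""
    n_samples = len(frame)
    if n_samples < 1 or n_samples & (n_samples - 1):
        raise ValueError("frame length must be a power of two")
    if n_samples == 1:
        return [complex(frame[0])]
    even = fft(frame[0::2])  # DFT of the even-indexed samples
    odd = fft(frame[1::2])   # DFT of the odd-indexed samples
    half = n_samples // 2
    out = [0j] * n_samples
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n_samples) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


frame = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 1.0]
slow = dft(frame)
fast = fft(frame)
max_error = max(abs(a - b) for a, b in zip(slow, fast))
# The squared magnitude of each bin is what the later steps work on.
power_spectrum = [abs(c) ** 2 for c in fast]
```

Both functions return the same spectrum up to rounding error; the FFT only gets there with far fewer multiplications.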
The human ear is more sensitive to the low-frequency components of sound.
So we stretch the low-frequency components of the FFT spectrum and shrink the high-frequency components.
This is accomplished by using filters that are spaced linearly for frequencies below 1000 Hz and logarithmically for higher frequencies.
Each filter in the bank has a triangular band-pass frequency response.
The spacing and bandwidth of the filters are determined by a constant mel-frequency interval.
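Formula (a commonly used approximation, with f in Hz):
$$\mathrm{mel}(f) = 2595 \, \log_{10}\!\left(1 + \frac{f}{700}\right)$$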
Apply the bank of filters, spaced according to the mel scale, to the spectrum
Each filter output is the sum of its filtered spectral components
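A rough sketch of such a filter bank follows. It assumes the widely used mapping mel(f) = 2595 * log10(1 + f / 700), which is approximately linear below 1000 Hz and logarithmic above; the function names, the 20 filters, the 256-point FFT and the 8000 Hz sampling rate are illustrative values, not taken from the seminar.

```python
import math


def hz_to_mel(f):
    # Assumed mel mapping; roughly linear below 1 kHz, logarithmic above.
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters with centres equally spaced on the mel scale.

    Returns num_filters rows, each holding n_fft // 2 + 1 weights in [0, 1].
    """
    num_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge points, equally spaced in mel, then mapped back
    # to Hz and expressed as (fractional) FFT bin positions.
    edges = [mel_to_hz(top * i / (num_filters + 1)) * n_fft / sample_rate
             for i in range(num_filters + 2)]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(num_bins):
            rising = (b - left) / (centre - left)
            falling = (right - b) / (right - centre)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def apply_filterbank(power_spectrum, bank):
    """Each filter output is the weighted sum of the spectral components."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]


bank = mel_filterbank(num_filters=20, n_fft=256, sample_rate=8000)
flat_spectrum = [1.0] * (256 // 2 + 1)
mel_spectrum = apply_filterbank(flat_spectrum, bank)
```

On a flat spectrum the higher filters collect more energy than the lower ones, simply because they are wider in Hz; that widening is the "shrinking" of the high-frequency region.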
The logarithm of the mel spectrum obtained above is converted back to the time domain, typically with a discrete cosine transform (DCT)
This gives us the mel-frequency cepstrum coefficients of the sound wave given as input
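The conversion back to the time domain can be sketched as follows. The sketch assumes the usual recipe of taking the logarithm of each mel filter output and applying a discrete cosine transform (DCT-II), and it skips the zeroth coefficient, which only carries the average log energy. The function name, the number of coefficients and the small floor that guards against log(0) are illustrative choices.

```python
import math


def mel_cepstrum(mel_energies, num_coeffs):
    """DCT-II of the log mel energies; returns c_1 .. c_num_coeffs."""
    k_total = len(mel_energies)
    floor = 1e-12  # assumption: avoids log(0) when a filter output is empty
    log_energies = [math.log(max(e, floor)) for e in mel_energies]
    return [
        sum(s * math.cos(n * (k + 0.5) * math.pi / k_total)
            for k, s in enumerate(log_energies))
        for n in range(1, num_coeffs + 1)
    ]


# Hypothetical filter outputs for one frame (20 mel filters).
mel_energies = [4.0, 6.5, 9.0, 7.2, 5.1, 3.3, 2.8, 2.0, 1.6, 1.2,
                1.0, 0.9, 0.7, 0.6, 0.5, 0.45, 0.4, 0.3, 0.25, 0.2]
mfcc = mel_cepstrum(mel_energies, num_coeffs=12)

# A perfectly flat mel spectrum has no spectral shape, so c_1 .. c_n vanish.
flat = mel_cepstrum([2.0] * 8, num_coeffs=5)
```

Doing this for every frame gives one short coefficient vector per frame, and together these vectors form the feature matrix for the utterance.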
The extracted coefficients are fed as input to the neural network.
The goal of pattern recognition is to classify objects of interest into one of a number of categories or classes.
Feature matching: supervised pattern recognition.
Concept of Vector Quantization.
Create a training set of feature vectors
VQ is a process of mapping vectors from a large vector space to a finite number of regions in that space. Each region is called a cluster and can be represented by its center, called a codeword. The collection of all codewords for a speaker is called a codebook.
First step: training phase
Second step: finding VQ distortion
In the recognition phase, an input utterance of an unknown voice is “vector-quantized” using each trained codebook and the total VQ distortion is computed.
The speaker corresponding to the VQ codebook with smallest total distortion is identified as the speaker of the input utterance.
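The two phases can be sketched as below. This is a toy illustration under several assumptions: the codebooks are trained with plain k-means iterations seeded from the first few training vectors (a stand-in for whatever clustering the seminar actually used), distortion is the squared Euclidean distance to the nearest codeword, and the two-dimensional feature vectors and speaker names are invented for the example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, size, iterations=10):
    """Training phase: k-means iterations seeded with the first `size` vectors."""
    codebook = [list(v) for v in vectors[:size]]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(len(codebook)),
                          key=lambda i: squared_distance(v, codebook[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def total_distortion(vectors, codebook):
    """Sum over all vectors of the distance to the nearest codeword."""
    return sum(min(squared_distance(v, c) for c in codebook) for v in vectors)


def identify_speaker(vectors, codebooks):
    """Recognition phase: pick the codebook with the smallest total distortion."""
    return min(codebooks, key=lambda name: total_distortion(vectors, codebooks[name]))


# Hypothetical training data: one list of feature vectors per known speaker.
training = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [0.9, 1.1]],
    "speaker_b": [[5.0, 5.0], [6.0, 4.0], [5.2, 4.9], [6.1, 4.2]],
}
codebooks = {name: train_codebook(vecs, size=2) for name, vecs in training.items()}

utterance = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.1]]
best = identify_speaker(utterance, codebooks)
```

In a real system each vector would be the MFCC vector of one frame, and the codebooks would hold many more codewords.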
Expressions and volumes
Misspoken or misread prompted phrases
Condition of the user
Background noises
Change in voice due to cold
These problems can be mitigated by training the network under different conditions
Can be developed further into a speech-to-text converter
Password recognition
Speaker recognition is a challenging problem, and there is still a lot of work to be done in this area.
This seminar demonstrates how a speaker recognition system can be designed with an artificial neural network (ANN), using the matrix of mel-frequency cepstrum coefficients of the voice as input to the ANN.
