SPEAKER VERIFICATION USING ARTIFICIAL NEURAL NETWORKS
What is speech?
Speech refers to the processes associated with the production and perception of sounds used in spoken language.
Every speaker has a characteristic way of speaking. This enables us to differentiate one speaker from another.
Frequency domain
Loudness
Timbre (sound quality)
Pitch
To create a user recognition system by extracting certain important features of the speaker's voice.
FEATURE EXTRACTION
Extracts a small amount of data from the voice signal that characterizes a given speaker.
FEATURE VERIFICATION
Process of identifying the unknown speaker by comparing the features extracted from his/her voice signal with those of known speakers.
Best in terms of feature extraction
A large number of coefficients.
The speech signal is parametrically represented using mel-frequency cepstrum coefficients (MFCC).
Frame Blocking
Windowing
FFT spectrum
MEL frequency wrapping
Cepstrum
In this step the continuous speech signal is blocked into frames of N samples, with adjacent frames being separated by M samples (M < N).
The first frame consists of the first N samples.
The second frame begins M samples after the first frame and overlaps it by N - M samples, and so on.
This process continues until all the speech is accounted for within one or more frames.
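The frame blocking just described, followed by the windowing step from the list of stages, can be sketched in plain Python. This is a minimal illustration and not the presenters' code: the names `frame_blocks` and `hamming` are hypothetical, the Hamming window is an assumption (the slides name the windowing step but not the window used), and zero-padding the final partial frame is one possible way to account for all of the speech.

```python
import math

def frame_blocks(signal, frame_len, hop):
    """Split signal into frames of frame_len samples (N), each starting
    hop samples (M) after the previous one; neighbours overlap by N - M."""
    if not 0 < hop <= frame_len:
        raise ValueError("expected 0 < hop <= frame_len")
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(list(signal[start:start + frame_len]))
        start += hop
    # Samples covered so far; zero-pad one last frame if any are left over.
    covered = start - hop + frame_len if frames else 0
    if covered < len(signal):
        tail = list(signal[start:])
        frames.append(tail + [0.0] * (frame_len - len(tail)))
    return frames

def hamming(frame_len):
    """Hamming window (assumed here): tapers both ends of a frame so the
    spectrum is less distorted by the abrupt frame boundaries."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]

# Usage: N = 4, M = 2, so adjacent frames share N - M = 2 samples.
frames = frame_blocks(list(range(10)), frame_len=4, hop=2)
window = hamming(4)
windowed = [[s * w for s, w in zip(frame, window)] for frame in frames]
```

In practice N and M are chosen so that a frame spans a few tens of milliseconds of speech; the tiny values above are only there to make the overlap easy to see.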
The next processing step is the Fast Fourier Transform, which converts each frame of N samples from the time domain into the frequency domain.
The FFT is a fast algorithm to implement the Discrete Fourier Transform (DFT), which is defined on the set of N samples {x_n} as follows:
$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1$$
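To make the transform concrete, the DFT of one frame can be evaluated directly from its definition. The sketch below is illustrative only: the function name `dft` is hypothetical, and the direct sum costs on the order of N² operations, whereas an FFT produces the same values in roughly N log N.

```python
import cmath
import math

def dft(frame):
    """Direct DFT: X[k] = sum over n of x[n] * exp(-j * 2 * pi * k * n / N)."""
    n_samples = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n, x in enumerate(frame))
        for k in range(n_samples)
    ]

# Usage: a cosine that completes 2 cycles in an 8-sample frame puts its
# energy in bins k = 2 and k = N - 2 = 6.
frame = [math.cos(2 * math.pi * 2 * n / 8) for n in range(8)]
magnitudes = [abs(value) for value in dft(frame)]
```

Only the magnitudes (or squared magnitudes) of the first N/2 + 1 bins are needed for the later filter bank stage, because the spectrum of a real-valued frame is symmetric.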
The human ear is more sensitive to the low-frequency components of sound.
So we stretch the low-frequency components of the FFT spectrum and compress its high-frequency components.
This is accomplished by using filters that are spaced linearly for frequencies below 1000 Hz and logarithmically for higher frequencies.
Each filter in the bank has a triangular band-pass frequency response.
The spacing and bandwidth are determined by a constant mel-frequency interval.
Formula:
$$\mathrm{mel}(f) = 2595 \, \log_{10}\!\left(1 + \frac{f}{700}\right), \qquad f \text{ in Hz}$$
Apply the bank of filters, spaced according to the mel scale, to the spectrum.
Each filter output is the sum of its filtered spectral components
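A mel filter bank of this kind can be sketched as follows. Everything here is a hedged illustration: the function names are hypothetical, the conversion uses the common 2595 · log10(1 + f/700) form of the mel scale, and placing the filter edges at equal mel intervals between 0 Hz and half the sample rate is an assumption about the layout.

```python
import math

def hz_to_mel(f_hz):
    """Common mel-scale approximation (roughly 1000 mel at 1000 Hz)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters over FFT bins 0..n_fft//2, with edges equally
    spaced in mel, so filters get wider as frequency increases."""
    n_bins = n_fft // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points: each filter spans three consecutive points.
    edges_hz = [mel_to_hz(top_mel * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        low, centre, high = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        weights = []
        for b in range(n_bins):
            f = b * sample_rate / n_fft  # centre frequency of FFT bin b
            if low < f <= centre:
                weights.append((f - low) / (centre - low))    # rising edge
            elif centre < f < high:
                weights.append((high - f) / (high - centre))  # falling edge
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank

def mel_spectrum(power_spectrum, bank):
    """Each filter output is the weighted sum of the spectral components."""
    return [sum(w * p for w, p in zip(weights, power_spectrum))
            for weights in bank]

# Usage with assumed values: 10 filters, 256-point FFT, 8 kHz sampling.
bank = mel_filter_bank(n_filters=10, n_fft=256, sample_rate=8000)
flat = mel_spectrum([1.0] * 129, bank)
```

With a flat input spectrum the later filters collect more energy than the earlier ones, which reflects their greater bandwidth at high frequencies.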
The log of the mel spectrum obtained above is converted back to the time domain, typically with the Discrete Cosine Transform (DCT).
This gives us the mel-frequency cepstrum coefficients of the sound wave given as input
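The cepstrum step can be sketched as a logarithm followed by a cosine transform of the filter outputs. This is an assumption about the usual MFCC recipe rather than something the slides spell out: the function name `mfcc_from_mel_spectrum` is hypothetical, the DCT-II form is the commonly used one, and the small floor inside the logarithm is added only to avoid taking the log of zero.

```python
import math

def mfcc_from_mel_spectrum(mel_energies, n_coeffs):
    """Cosine transform (DCT-II) of the log mel energies.

    c[n] = sum over k of log(S[k]) * cos(pi * n * (k + 0.5) / K), n = 1..n_coeffs.
    The n = 0 term, which only reflects overall level, is skipped.
    """
    n_filters = len(mel_energies)
    log_energies = [math.log(max(e, 1e-10)) for e in mel_energies]
    return [
        sum(log_e * math.cos(math.pi * n * (k + 0.5) / n_filters)
            for k, log_e in enumerate(log_energies))
        for n in range(1, n_coeffs + 1)
    ]

# Usage: 8 filter outputs reduced to 4 coefficients for one frame.
coeffs = mfcc_from_mel_spectrum([1.0, 2.0, 4.0, 8.0, 4.0, 2.0, 1.0, 0.5], 4)
```

Repeating this for every frame yields one coefficient vector per frame; stacked together they form the MFCC matrix that is used as input to the classifier.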
Coefficients extracted are fed as input to the neural networks.
The goal of pattern recognition is to classify objects of interest into one of a number of categories or classes.
Feature matching: supervised pattern recognition.
Concept of Vector Quantization.
Create a training set of feature vectors
VQ is a process of mapping vectors from a large vector space to a finite number of regions in that space. Each region is called a cluster and can be represented by its center, called a codeword. The collection of all codewords is called a codebook.
First step: training phase
Second step: finding VQ distortion
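The training phase can be illustrated with a small clustering routine that turns one speaker's feature vectors into codewords. Note the substitution: this sketch uses plain k-means with a deterministic start, which is a simpler stand-in and not necessarily the algorithm used in the seminar. The names `train_codebook`, `nearest`, and `squared_distance` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vector, codebook):
    """Index of the codeword closest to vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))

def train_codebook(vectors, n_codewords, n_iters=20):
    """Cluster training vectors and return the cluster centres (codewords)."""
    if not 0 < n_codewords <= len(vectors):
        raise ValueError("need at least one training vector per codeword")
    # Deterministic start: evenly spaced training vectors as initial codewords.
    step = len(vectors) // n_codewords
    codebook = [list(vectors[i * step]) for i in range(n_codewords)]
    for _ in range(n_iters):
        clusters = [[] for _ in codebook]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                dim = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook

# Usage: six toy 2-D feature vectors forming two obvious clusters.
training = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
codebook = train_codebook(training, n_codewords=2)
```

One codebook would be trained per enrolled speaker, each from that speaker's own MFCC vectors.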
In the recognition phase, an input utterance of an unknown voice is “vector-quantized” using each trained codebook and the total VQ distortion is computed.
The speaker corresponding to the VQ codebook with smallest total distortion is identified as the speaker of the input utterance.
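The recognition rule above can be written in a few lines. This is a sketch under stated assumptions: squared Euclidean distance is assumed as the distortion measure, the codebooks are hand-written toy values, and the names `total_distortion`, `identify_speaker`, "alice", and "bob" are purely illustrative.

```python
def total_distortion(features, codebook):
    """Sum, over all feature vectors, of the squared distance to the
    nearest codeword in the given codebook."""
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(vec, code)) for code in codebook)
        for vec in features
    )

def identify_speaker(features, codebooks):
    """Return the speaker whose codebook gives the smallest total distortion."""
    return min(codebooks,
               key=lambda name: total_distortion(features, codebooks[name]))

# Usage with toy 2-D codebooks for two enrolled speakers.
codebooks = {
    "alice": [[0.0, 0.0], [1.0, 1.0]],
    "bob": [[5.0, 5.0], [6.0, 6.0]],
}
utterance = [[0.1, 0.2], [0.9, 1.1]]
best_match = identify_speaker(utterance, codebooks)
```

For verification rather than identification, one option is to compute the distortion against the claimed speaker's codebook only and accept the claim when it falls below a chosen threshold.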
Expressions and volumes
Misspoken or misread prompted phrases
Condition of the user
Background noises
Change in voice due to cold
The problem can be overcome by training the network under different conditions
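One way to approximate training under different conditions is to add synthetic background noise to clean recordings before extracting features. The sketch below is only an illustration of that idea: white Gaussian noise at a chosen signal-to-noise ratio is an assumption, the name `add_noise` is hypothetical, and effects such as a cold or a change of expression are not modelled by it.

```python
import math
import random

def add_noise(signal, snr_db, seed=0):
    """Return a copy of signal with white Gaussian noise added so that the
    signal-to-noise ratio is approximately snr_db decibels."""
    rng = random.Random(seed)  # fixed seed keeps the example repeatable
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]

# Usage: one clean tone turned into training variants at several noise levels.
clean = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
variants = {snr: add_noise(clean, snr) for snr in (30, 20, 10)}
```

Each variant would then go through the same feature extraction as the clean recording, so the network sees examples of the same speaker under several noise levels.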
Can be developed to model a speech to text converter
Password recognition
Speaker recognition is a challenging problem, and there is still a lot of work that needs to be done in this area.
In this seminar, it is demonstrated how a speaker recognition system can be designed with an artificial neural network, using the Mel-Frequency Cepstrum Coefficients matrix of the voice as input to the ANN.