25-11-2014, 06:14 PM
Hi, I need source code for speaker verification.
Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information contained in the speech waveform. This technique makes it possible to use a speaker's voice to verify their identity and to control access to services such as voice dialing, banking by phone, telephone equipment, database access services, information services, voicemail, security control for sensitive information areas, and remote access to computers.
The identity of the speaker is correlated with the physiological and behavioral characteristics of the speaker's speech production. These characteristics appear both in the spectral envelope (vocal tract characteristics) and in the suprasegmental features (voice source characteristics and dynamic features spanning several segments).
The most common short-term spectral measurements currently used are cepstral coefficients derived from linear predictive coding (LPC), together with their regression coefficients. A spectral envelope reconstructed from a truncated set of cepstral coefficients is much smoother than one reconstructed from the LPC coefficients. It therefore provides a more stable representation from one repetition to another of a particular speaker's utterances. The regression coefficients, typically of first and second order, are extracted at every frame period to represent the spectral dynamics. These coefficients are derivatives of the time functions of the cepstral coefficients and are called, respectively, the delta-cepstral and delta-delta-cepstral coefficients.
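To make the feature extraction described above more concrete, here is a minimal sketch in plain Python. It assumes the predictor convention A(z) = 1 - sum_k a[k] z^-k for the LPC coefficients and a regression window of two frames on each side for the delta coefficients; the function names `lpc_to_cepstrum` and `delta` are only illustrative and do not come from any particular toolbox.

```python
def lpc_to_cepstrum(a, num_ceps):
    """Convert LPC predictor coefficients a[0..p-1] (for a_1..a_p) to
    cepstral coefficients c_1..c_num_ceps using the standard recursion.
    Assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k)."""
    p = len(a)
    c = [0.0] * (num_ceps + 1)  # c[0] is unused (gain term is ignored here)
    for n in range(1, num_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def delta(frames, width=2):
    """Regression (delta) coefficients over a list of per-frame vectors.
    Frames beyond the ends are replaced by the first or last frame."""
    num_frames = len(frames)
    denom = 2.0 * sum(n * n for n in range(1, width + 1))

    def frame_at(t):
        # Clamp the index so the edges reuse the nearest valid frame.
        return frames[min(max(t, 0), num_frames - 1)]

    result = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = sum(n * (frame_at(t + n)[k] - frame_at(t - n)[k])
                      for n in range(1, width + 1))
            row.append(acc / denom)
        result.append(row)
    return result


# Usage: cepstra for one frame, then delta and delta-delta over a sequence.
ceps = lpc_to_cepstrum([0.5], num_ceps=3)
sequence = [[1.0 * t, 3.0 * t] for t in range(10)]
d1 = delta(sequence)   # delta-cepstral coefficients
d2 = delta(d1)         # delta-delta-cepstral coefficients
```

Applying `delta` twice gives the second-order coefficients. In practice the static, delta, and delta-delta vectors are usually concatenated into one feature vector per frame.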
There is a simple and effective source code for speaker recognition. The code is based on the excellent submission by Amina Koohi already available here, and it improves the results by using more advanced distance metrics, so a higher recognition rate is achieved. On the original data set (8 speakers) we get a 100% recognition rate (the previous one was 87.5%), and we can achieve a similar recognition rate (100%) on a larger data set (11 speakers).
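The post does not say which distance metrics the code uses, so the following is only a rough sketch of distance-based speaker identification in general, not that code's actual method. It assumes each enrolled speaker is represented by the mean of their feature vectors, and it uses Euclidean and city-block distances purely as examples. All names (`identify`, `mean_vector`, `speaker_a`, and so on) are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def city_block(u, v):
    """City-block (L1) distance, shown as one alternative metric."""
    return sum(abs(a - b) for a, b in zip(u, v))


def mean_vector(frames):
    """Average a list of per-frame feature vectors into one template."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def identify(test_frames, templates, distance=euclidean):
    """Return the enrolled speaker whose template is closest to the
    mean feature vector of the test utterance."""
    probe = mean_vector(test_frames)
    return min(templates, key=lambda name: distance(probe, templates[name]))


# Toy enrollment data (made-up numbers, for illustration only).
templates = {
    "speaker_a": mean_vector([[1.0, 2.0], [1.2, 1.8]]),
    "speaker_b": mean_vector([[4.0, 0.5], [3.8, 0.7]]),
}

guess = identify([[1.1, 2.1], [0.9, 1.9]], templates)
guess_l1 = identify([[3.9, 0.4]], templates, distance=city_block)
```

For verification rather than identification, the same distance can be compared against a threshold for the claimed speaker. The threshold has to be tuned on real data.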