12-06-2012, 01:01 PM
Automatic speaker recognition
INTRODUCTION
Automatic speaker recognition is the method of automatically identifying who is speaking on the basis of individual information contained in speech waves. It covers two tasks: speaker verification and speaker identification. Speaker recognition makes it possible to use a speaker's voice to verify their identity and to control access to services such as banking by telephone, database access services, voice dialing, telephone shopping, information services, voice mail, security control for confidential information areas, and remote access to computers. AT&T and TI (with Sprint) have started field tests and actual applications of speaker recognition technology; Sprint's Voice Phone Card is already being used by many customers. Speaker recognition technology is expected to create new services that will make our everyday lives more secure.
LITERATURE SURVEY
In the late 1940s, the US Department of Defense proposed an automatic translation machine. The project failed, but it sparked research at MIT, CMU, and commercial institutions. In 1962, IBM introduced Shoebox, an early speech recognition machine. The first attempts at automatic speaker recognition were made in the 1960s, one decade later than those at automatic speech recognition. Pruzansky at Bell Labs [1] was among the first to initiate research, using filter banks and correlating two digital spectrograms as a similarity measure. Pruzansky and Mathews [2] improved upon this technique, and Li et al. [3] developed it further by using linear discriminators. Doddington at Texas Instruments (TI) [4] replaced filter banks with formant analysis. Intra-speaker variability of features, one of the most serious problems in speaker recognition, was intensively investigated by Endres et al. [5] and Furui [6].
PROBLEM DEFINITION
Speaker recognition involves two tasks, identification and verification, as shown in Figure 3.1. In identification, the goal is to determine which voice in a known group of voices best matches the speaker. In verification, the goal is to determine whether the speaker is who he or she claims to be.
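The difference between the two tasks can be illustrated with a minimal Python sketch. It is not a method taken from this report: it assumes each voice has already been reduced to a fixed-length feature vector, uses cosine similarity as a stand-in matching score, and picks an arbitrary threshold of 0.9. The names `enrolled`, `identify`, `verify` and all feature values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity score between two feature vectors (assumed stand-in metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(utterance, enrolled):
    """Identification: pick the best-matching voice from the known group."""
    return max(enrolled, key=lambda name: cosine_similarity(utterance, enrolled[name]))


def verify(utterance, claimed, enrolled, threshold=0.9):
    """Verification: accept the claimed identity only if the score is high enough.

    The threshold value is an illustrative assumption, not a recommended setting.
    """
    return cosine_similarity(utterance, enrolled[claimed]) >= threshold


# Hypothetical enrolled speakers with made-up three-dimensional feature vectors.
enrolled = {
    "alice": [1.0, 0.2, 0.1],
    "bob": [0.1, 1.0, 0.3],
    "carol": [0.2, 0.1, 1.0],
}
utterance = [0.9, 0.3, 0.1]  # features of the unknown speaker (made up)

best_match = identify(utterance, enrolled)              # 1-of-N decision
claim_alice = verify(utterance, "alice", enrolled)      # accept/reject decision
claim_bob = verify(utterance, "bob", enrolled)
print(best_match, claim_alice, claim_bob)
```

Identification compares the utterance against every enrolled voice and always returns one of them, so its difficulty grows with the size of the group. Verification compares against a single claimed voice and makes a yes/no decision, so its behaviour depends mainly on the chosen threshold.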
THE SPEECH SIGNAL
Human communication can be viewed as a chain of processes running from speech production by the talker to speech perception by the listener (see Figure 4.1). The chain has five elements: A. Speech formulation, B. Human vocal mechanism, C. Acoustic air, D. Perception of the ear, and E. Speech comprehension. The first element (A. Speech formulation) is associated with the formulation of the speech signal in the talker's mind. This formulation is used by the human vocal mechanism (B. Human vocal mechanism) to produce the actual speech waveform. The waveform is transferred via the air (C. Acoustic air) to the listener.