Audio-visual information fusion in human-computer interfaces and intelligent environments

1. INTRODUCTION
A machine can be considered intelligent if a human judge, engaged in a natural conversation with both a human and the machine, cannot distinguish between the two. The field of artificial intelligence thus has its roots in making machines human-like, with the ability to perceive, analyze and respond to their surroundings in a way that is natural and seamless to humans. Since human perception is multimodal in nature, with speech and vision being the primary senses, significant research effort has been focused on developing intelligent systems with audio and video interfaces.
Traditional interfaces such as the keyboard, the mouse and even close-talking microphones are considered too restrictive to facilitate natural interaction between humans and computers. Research efforts have therefore been focused on developing non-intrusive sensors such as cameras and far-field microphones, so that humans can communicate through natural means like conversational speech and gestures, without feeling encumbered by the presence of sensors. In other words, the computer has to fade into the background, allowing the users of the intelligent system to conduct their activities in a natural manner. This necessitates the use of multimodal, especially audio-visual, systems. Audio-visual systems are not restricted to human-computer interfaces (HCI) alone. In several applications, such as meeting archival and retrieval and human behavioral studies, audio-visual fusion can be applied as a post-processing step.
Another significant advantage of using multimodal sensors is the robustness to environment and sensor noise that can be achieved through careful integration of information from different types of sensors. This is particularly true in cases where a particular human activity can be deduced from two or more different sensory cues, as with audio and lip movements in the case of human speech. Many other tasks, such as person tracking, head pose estimation and affective state analysis, also exhibit significant overlap in the information conveyed over multiple modalities, especially audio and video.
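The noise-robustness idea above can be sketched in code. The following is a minimal illustration, not a method from the text: it assumes each modality has its own classifier emitting class posteriors, and combines them in the log domain with a reliability weight for the audio stream (which one would lower as acoustic noise increases). The function name, weights and numbers are all hypothetical.

```python
import numpy as np

def fuse_posteriors(audio_probs, video_probs, audio_weight):
    """Reliability-weighted fusion of per-class posteriors.

    audio_weight in [0, 1] reflects the estimated reliability of the
    audio stream; the visual stream gets the complementary weight.
    """
    log_fused = (audio_weight * np.log(audio_probs)
                 + (1.0 - audio_weight) * np.log(video_probs))
    fused = np.exp(log_fused)
    return fused / fused.sum()  # renormalize to a distribution

# Hypothetical example: audio evidence favors class 0, video favors class 1.
audio = np.array([0.7, 0.2, 0.1])
video = np.array([0.2, 0.7, 0.1])

clean = fuse_posteriors(audio, video, audio_weight=0.8)  # trust audio more
noisy = fuse_posteriors(audio, video, audio_weight=0.2)  # acoustically noisy
```

With a high audio weight the fused decision follows the audio stream; when the audio is down-weighted, the visual cue dominates, which is the essence of noise-robust audio-visual fusion.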
Though different sensors might carry redundant information, as suggested in the previous paragraph, these sensors are rarely equivalent; they also carry complementary information, making certain sensors more advantageous than others for certain tasks. This is clearly demonstrated in the case of speech and gesture analysis for HCI applications, where the information carried through gestures complements the information presented through speech. Utilizing both of these cues leads to a system that can understand the user more completely than using just one of the modalities.
Figure 1 shows an example of an audio-visual emotion detection scheme that illustrates some key points of information fusion.
2. AUDIO-VISUAL INFORMATION FUSION SCHEMES
The varied application domains of multimodal human activity analysis systems have always presented a challenge to the systematic understanding of their information fusion models and algorithms. The traditional approach classifies fusion schemes into early, late and intermediate fusion strategies and describes their associated merits. However, most multimodal systems are built to exploit one or more of these advantages. Thus, a classification of fusion strategies based on their "intent" provides a new angle from which to view these schemes. The various models used, probabilistic and otherwise, also need to be examined for their merits within these fusion schemes.
Humans are the ultimate intelligent systems, equipped with multimodal sensors and the capability to seamlessly process, analyze, learn from and respond to multimodal cues. Human beings seem to learn cross-modal correspondences early on and use them, along with other techniques, to combine multimodal information at various levels of abstraction. This seems to be the ideal approach to sensory information fusion, as exemplified by the success of hierarchical modeling schemes. However, significant progress is necessary before computers can begin to process multimodal information at the level of humans. The models and algorithms used in intelligent systems need not be motivated by human information processing alone; nevertheless, human cognition can provide valuable insight into the what and how of intelligent systems.
Figure 3 illustrates the various signal and semantic abstraction levels at which fusion of information can occur. The multimodal fusion schemes can be classified based on their primary "intent". For example, an audio-visual speech recognition system's intent would be to achieve robustness to environmental and sensor noise. Each of these main categories can be further subdivided by the traditional early/intermediate/late fusion strategies and by the different modeling techniques used within them. Thus, systems are classified as those that use multimodal sensors primarily for:
• Achieving robustness to environmental and sensor noise
• Facilitating natural human-computer interaction.
• Exploiting complementary information across modalities.
Achieving robustness to environmental and sensor noise is the traditional motivation for audio-visual information fusion, so this category includes the majority of the multimodal fusion strategies studied so far. The most widely accepted notion of sensory information fusion applies to these systems. Tasks that involve redundant cues in multiple modalities, due to the nature of the human activity, fall under this group. Audio-visual speech recognition is the classic example of such a task and one of the earliest areas to generate multimodal information fusion techniques; other areas include audio-visual person tracking, affect analysis and person identification. However, the organization here is based on the fusion strategies rather than on a particular application domain. The fusion strategies are classified as follows:
• Signal enhancement and sensor level fusion strategies
• Feature level fusion strategies
• Classifier level fusion strategies
• Decision level fusion strategies
• Semantic level fusion strategies
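Two of the levels listed above can be contrasted in a short sketch. This is an illustrative toy example, not drawn from the text: the feature vectors and class scores are invented, and real systems would feed these into trained classifiers (e.g. HMMs or neural networks) rather than use them directly.

```python
import numpy as np

# Hypothetical per-frame features from each modality.
audio_feat = np.array([0.9, 0.1])        # e.g. acoustic feature statistics
video_feat = np.array([0.3, 0.7, 0.5])   # e.g. lip-shape parameters

# Feature-level fusion: concatenate the streams into one joint feature
# vector before a single classifier sees them.
joint_feat = np.concatenate([audio_feat, video_feat])

# Decision-level fusion: each modality's classifier emits its own class
# scores, which are then combined (here, a simple average).
audio_scores = np.array([0.6, 0.4])
video_scores = np.array([0.3, 0.7])
fused_scores = 0.5 * (audio_scores + video_scores)
decision = int(fused_scores.argmax())
```

Feature-level fusion lets one model exploit cross-modal correlations but requires synchronized streams, while decision-level fusion keeps the modality-specific classifiers independent, which simplifies training and handling of asynchronous or missing streams.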