2007 IEEE International Conference on Image Processing - San Antonio, Texas, U.S.A. - September 16-19, 2007

Technical Program

Paper Detail

Paper: TA-P3.4
Session: Image and Video Modeling II
Time: Tuesday, September 18, 09:50 - 12:30
Presentation: Poster
Title: BLIND AUDIOVISUAL SOURCE SEPARATION USING SPARSE REPRESENTATIONS
Authors: Anna Llagostera Casanovas; Ecole Polytechnique Federale de Lausanne (EPFL) 
 Gianluca Monaci; Ecole Polytechnique Federale de Lausanne (EPFL) 
 Pierre Vandergheynst; Ecole Polytechnique Federale de Lausanne (EPFL) 
Abstract: In this work we present a method to jointly separate active audio and visual structures in a given mixture. Blind Audiovisual Source Separation is achieved by exploiting the coherence between a video signal and a one-microphone audio track. An efficient representation of the audio and video sequences makes it possible to build relationships between correlated structures in the two modalities. Video structures that are strongly correlated with the audio signal and spatially close to one another are grouped using a robust clustering algorithm that can count and localize audiovisual sources. Using this information and exploiting the audio-video correlation, the audio sources are also localized and separated. To the best of our knowledge, this is the first blind audiovisual source separation algorithm designed to work with a video sequence and its corresponding mono audio signal.
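
The grouping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the clustering method, so the sketch swaps in a plain distance-threshold (single-linkage) grouping. The function name `group_sources`, the `(x, y, corr)` tuple layout, and the parameters `min_corr` and `max_dist` are all hypothetical, chosen only for illustration.

```python
from math import hypot


def group_sources(atoms, min_corr=0.5, max_dist=20.0):
    """Group video structures into candidate audiovisual sources.

    atoms: list of (x, y, corr) tuples, where (x, y) is the image position
    of a video structure and corr is an assumed non-negative score of its
    correlation with the audio track.
    Returns a sorted list of (cx, cy) source locations, one per group.
    """
    # Keep only structures strongly (and strictly positively) correlated
    # with the audio signal.
    active = [a for a in atoms if a[2] >= min_corr and a[2] > 0]

    sources = []
    unvisited = set(range(len(active)))
    while unvisited:
        # Grow one group from an arbitrary seed by repeatedly absorbing
        # every unvisited structure within max_dist of a group member.
        seed = unvisited.pop()
        members = [seed]
        frontier = [seed]
        while frontier:
            i = frontier.pop()
            xi, yi, _ = active[i]
            near = [
                j for j in unvisited
                if hypot(xi - active[j][0], yi - active[j][1]) <= max_dist
            ]
            for j in near:
                unvisited.remove(j)
            members.extend(near)
            frontier.extend(near)

        # Localize the source as the correlation-weighted centroid.
        total = sum(active[k][2] for k in members)
        cx = sum(active[k][0] * active[k][2] for k in members) / total
        cy = sum(active[k][1] * active[k][2] for k in members) / total
        sources.append((cx, cy))

    # The number of groups is the estimated number of sources.
    return sorted(sources)
```

Calling `group_sources` on a list of scored positions returns one location per detected group, so the length of the result plays the role of the source count and each entry plays the role of a source location. A real system would need a more robust clustering step than this threshold rule, as the abstract itself indicates.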


