Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Video assisted speech source separation

Wang, Wenwu, Cosker, Darren P., Hicks, Yulia Alexandrovna ORCID: https://orcid.org/0000-0002-7179-4587, Sanei, Saeid and Chambers, Jonathon A. 2005. Video assisted speech source separation. Presented at: ICASSP 2005. IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, PA, USA, 18-23 March 2005.

Full text not available from this repository.

Abstract

We investigate the problem of integrating the complementary audio and visual modalities for speech separation. Rather than using independence criteria suggested in most blind source separation (BSS) systems, we use visual features from a video signal as additional information to optimize the unmixing matrix. We achieve this by using a statistical model characterizing the nonlinear coherence between audio and visual features as a separation criterion for both instantaneous and convolutive mixtures. We acquire the model by applying the Bayesian framework to the fused feature observations based on a training corpus. We point out several key existing challenges to the success of the system. Experimental results verify the proposed approach, which outperforms the audio-only separation system in a noisy environment, and also provides a solution to the permutation problem.
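
The permutation problem mentioned in the abstract can be illustrated with a toy example. The sketch below is not the method of the paper, which uses a trained statistical model of audio-visual coherence; it substitutes a plain Pearson correlation between each separated output's smoothed magnitude envelope and a hypothetical visual track. All signals, the mixing matrix `A`, and the stand-in visual feature `v` are synthetic assumptions made for illustration only.

```python
import math
import random

random.seed(0)
N = 2000

# Two synthetic "speech-like" sources: noise shaped by slow envelopes (assumed).
env1 = [0.5 * (1 + math.sin(2 * math.pi * t / 200)) for t in range(N)]
env2 = [0.5 * (1 + math.sin(2 * math.pi * t / 330 + 1.0)) for t in range(N)]
s1 = [e * random.gauss(0, 1) for e in env1]
s2 = [e * random.gauss(0, 1) for e in env2]

# Instantaneous mixture x = A s, with an arbitrary illustrative mixing matrix.
A = [[1.0, 0.6], [0.5, 1.0]]
x1 = [A[0][0] * a + A[0][1] * b for a, b in zip(s1, s2)]
x2 = [A[1][0] * a + A[1][1] * b for a, b in zip(s1, s2)]

# An unmixing matrix that separates correctly but with swapped output order:
# W = P A^{-1}, where P exchanges the two rows (the permutation ambiguity).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]
W = [A_inv[1], A_inv[0]]
y = [[W[i][0] * a + W[i][1] * b for a, b in zip(x1, x2)] for i in range(2)]

# Hypothetical visual feature for speaker 1 (e.g. a lip-opening track),
# modelled here as a noisy copy of that speaker's envelope.
v = [e + random.gauss(0, 0.05) for e in env1]


def smoothed_envelope(signal, win=50):
    """Moving average of the absolute value of the signal."""
    mags = [abs(sample) for sample in signal]
    out = []
    for t in range(len(mags)):
        lo, hi = max(0, t - win // 2), min(len(mags), t + win // 2 + 1)
        out.append(sum(mags[lo:hi]) / (hi - lo))
    return out


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Score each separated output against the visual track and pick the best match.
scores = [pearson(smoothed_envelope(yi), v) for yi in y]
speaker_1_output = max(range(2), key=lambda i: scores[i])
print("Output assigned to speaker 1:", speaker_1_output)
```

Because the rows of `W` were deliberately swapped, speaker 1 ends up in the second output, and the side information from `v` is what identifies it. An audio-only independence criterion has no basis for making that assignment, which is the gap the visual modality is meant to fill.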

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics; Engineering
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software
Uncontrolled Keywords: Bayesian framework; audio features; blind source separation; convolutive mixtures; feature extraction; feature fusion; instantaneous mixtures; nonlinear coherence; speech processing; speech separation; statistical analysis; unmixing matrix optimization; video assisted speech source separation; video signal processing; visual features
Additional Information: Digital Object Identifier: 10.1109/ICASSP.2005.1416331
Last Modified: 17 Oct 2022 09:41
URI: https://orca.cardiff.ac.uk/id/eprint/5169

Citation Data

Cited 49 times in Scopus.
