On-line speaker diarization for smart objects
Texte intégral
Documents relatifs
On the development data, the overall speaker diarization error was reduced from 24.8% for the best setting of the baseline system to 13.2% using BIC clustering, and to 7.1% with
These preliminary experiments show that the binary key SAD approach outper- forms a “classic” HMM-based audio segmentation tool using the NIST RT05 database.. In addition,
In this paper, we propose to perform speaker diariza- tion within the dialogue scenes of TV series by combining the au- dio and video modalities: speaker diarization is first
Both bottom-up and top-down approaches are generally based on Hidden Markov Models (HMMs) where each state is a Gaussian Mixture Model (GMM) and corresponds to a speaker.
While ESTER and REPERE campaigns allowed to train a single and cross-show diarization system based on i-vector/PLDA frameworks trained on labeled data, we proposed to train
The experimental results on two distinct target cor- pora show that using data from the corpora themselves to per- form unsupervised iterative training and domain adaptation of
Novel contributions include: (i) a new approach to BK extraction based on multi-resolution spectral analysis; (ii) an approach to speaker change detection using BKs; (iii)
While know- ing the number of speakers and the use of labelled data is also at odds with the traditional definition of diarization, many meeting scenarios involve a short