Affect in Multimedia: Benchmarking Violent Scenes Detection
Texte intégral
Documents relatifs
Two different data sets are proposed: (i) a set of 31 Holly- wood movies whose DVDs must be purchased by the partic- ipants, for the main task and (ii) a set of 86 short YouTube 2
The acoustic features (low-level features), the audio con- cept log-likelihoods (mid-level features) and the violence/non- violence scores from video are concatenated in a into
We submitted five runs in total (Fig 1): (R1) using training set A, we first select the best still image feature and fuse it with motion and audio features; (R2) using training set
As last year [4], we use the concept annotations as a step- ping stone for predicting violence: We train a separate clas- sifier for each of 10 different concepts on the
Besides using the low-level features for training the SVM classifiers for vi- olent scenes detection, we show the feasibility in using the concept detectors to infer the occurrence
semantic concept detection, global feature, local feature, mo- tion feature, audio feature, mid-level feature, late
Only shot-level runs obtained through shot aggregation have been submitted for the objective subtask, while both shot and chunk aggregation were used for the subjective one.. Run
Bizer and Schultz [7] proposed a SPARQL-like mapping lan- guage called R2R, which is designed to publish expressive, named executable mappings on the Web, and to flexible combine