
IMPACT OF NOMINAL EVENTS IN KEYWORD-BASED SEARCH OF VIDEO CLIPS

In the document MULTIMEDIA INFORMATION EXTRACTION (pages 139-143)

VISUAL SEMANTICS FOR REDUCING FALSE POSITIVES IN VIDEO SEARCH

7.5 IMPACT OF NOMINAL EVENTS IN KEYWORD-BASED SEARCH OF VIDEO CLIPS

To test the impact of detecting nominal events on the performance of an information retrieval system for searching video clips using keywords, we created a corpus of video clips and their accompanying text files using the test collection of the TREC 2002 Video Track. Of these, 174 video clips were downloaded from the Internet Archive website and two from The Open Video Project, for a total of 176 video clips. For each video clip, we created an accompanying text file using the following information gathered from the clip web page: title, description, reviews, and shot list. For each text file in the corpus, we created a list of all the nouns that describe events presented in the video clip and that have instances of events in the text file. There are two reasons for these annotations:

They serve as labels for events that appear in the video in order to automatically detect linguistic triggers.

They provide an upper bound for the improvement of the performance in retrieving video clips using keyword search.

TABLE 7.4. Results of the Experiments with Multinomial Model on the Corpus of Video Transcripts

Here we address the second reason: we performed experiments to measure the improvement in the retrieval and ranking of video clips using their corresponding text files. We used the Lucene information retrieval system with default settings to perform the experiments. The annotations described above were used to boost the term frequency scores of the nouns that relate to events in the video clip with a weight w. We experimented with three values: w = 3, w = 5, and w = 10.
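The boosting idea can be illustrated with a minimal sketch. This is not Lucene's scoring and not the implementation used in the experiments: it assumes that "boosting" means multiplying the raw term count of an annotated event noun by w, and it ranks documents by that count alone. The function names, the clip identifiers, and the toy documents are hypothetical.

```python
from collections import Counter

def boosted_term_frequencies(tokens, event_nouns, w=1):
    """Count terms; counts of annotated event nouns are multiplied by w (assumed scheme)."""
    counts = Counter(tokens)
    return {t: c * w if t in event_nouns else c for t, c in counts.items()}

def rank(query, docs, annotations, w=1):
    """Rank document ids for a single-word query by boosted term frequency."""
    scores = {}
    for doc_id, tokens in docs.items():
        tf = boosted_term_frequencies(tokens, annotations.get(doc_id, set()), w)
        if query in tf:
            scores[doc_id] = tf[query]
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical toy corpus: one clip shows a parade, the other only mentions
# the word metaphorically (and more often).
docs = {
    "clip_real": ["soldiers", "march", "in", "a", "parade"],
    "clip_metaphor": ["parade", "fashion", "like", "a", "parade"],
}
annotations = {"clip_real": {"parade"}, "clip_metaphor": set()}

unboosted = rank("parade", docs, annotations, w=1)
boosted = rank("parade", docs, annotations, w=3)
```

In this toy setting the metaphorical clip ranks first without boosting because it mentions the word twice, and the annotated clip overtakes it once w = 3 is applied.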

We created 50 queries based on words found in the text files corresponding to video clips. We noticed that queries based on two or more keywords (linked with the AND operator) would retrieve only a single document most of the time, so they cannot be used to show improvement in retrieval performance or ranking. This happens because the corpus is small. Therefore, we used single-word queries based on nouns that appear in the text files corresponding to video clips. The queries are presented in Table 7.5. Thirty-four (68%) of the nouns used in the queries can refer to a nominal event. For each query, we annotated the set of relevant documents.

We used two performance measures: mean average precision (MAP) and average reciprocal rank (ARR). The MAP measure is defined as:

[...] the document d_jk in the set of retrieval results.
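For reference, a commonly used textbook definition of MAP, with which the surviving notation d_jk appears consistent, can be written as follows. The symbols Q, m_j, and R_jk are assumptions taken from that standard definition, not from the text above, and no corresponding formula for ARR is given here because its definition cannot be inferred from the surrounding text.

```latex
\mathrm{MAP}(Q) = \frac{1}{|Q|} \sum_{j=1}^{|Q|} \frac{1}{m_j} \sum_{k=1}^{m_j} \mathrm{Precision}(R_{jk})
```

Under this reading, Q is the set of queries, m_j is the number of relevant documents for query j, and R_jk is the ranked list of results for query j from the top down to the k-th relevant document d_jk.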

We performed four experiments: one experiment using an unmodified information retrieval system with default settings, and three experiments where we increased the term frequency of the relevant nouns with the weight w: w = 3, w = 5, and w = 10.

Table 7.6 presents the results of these experiments. We can see from the table that giving more importance to the nouns referring to actual events in the video clip does increase the ranking performance. The MAP scores increase by 12 to 15 percentage points and the ARR scores increase by 9 to 11 percentage points when we increase the term frequency of the relevant nouns.

We found some specific examples where increasing the importance of the nominal events appearing in the video provides better ranking. We found eight relevant documents for the query “parade.” The document retrieval system using normal indexing retrieves two relevant documents in the first two positions; however, in the third position, we found a document related to the video clip “Aluminum on the march (Part II),” which contains no parade event. It is retrieved because one of the reviewers metaphorically compares the events in the video clip with a military parade: “an aluminum man and his metallic minions lurching across the screen in military parade fashion.” The document related to the first part of this clip, “Aluminum on the march (Part I),” is returned in the fifth position. The retrieval system that uses the term frequency of nominal events increased by w = 3 returns these “aluminum” documents in positions six and eight, and the systems with w = 5 and w = 10 return them in the seventh and ninth positions. Table 7.7 presents the ranking of the nonrelevant “aluminum” documents and the MAP and ARR scores for each retrieval system for this specific query.
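The effect of pushing the two nonrelevant documents down the list can be illustrated with a small average precision calculation. This is a sketch under simplifying assumptions: it uses the standard definition of average precision, and it assumes that the top ten results contain all eight relevant documents plus only the two “aluminum” documents. The actual result lists are not given, so the values below are not expected to reproduce the figures in Table 7.7; they only show the direction of the effect. The function names are hypothetical.

```python
def average_precision(ranking, num_relevant):
    """ranking: list of booleans, True where the result at that rank is relevant."""
    hits = 0
    precision_sum = 0.0
    for position, is_relevant in enumerate(ranking, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / position  # precision at this relevant hit
    return precision_sum / num_relevant

def ranking_with_nonrelevant_at(positions, length=10):
    """Build an idealized ranking: every position is relevant except those given."""
    return [i not in positions for i in range(1, length + 1)]

# Positions of the two nonrelevant documents as reported in the text.
baseline = average_precision(ranking_with_nonrelevant_at({3, 5}), 8)
boosted_w3 = average_precision(ranking_with_nonrelevant_at({6, 8}), 8)
boosted_w5 = average_precision(ranking_with_nonrelevant_at({7, 9}), 8)

print(f"{baseline:.4f} {boosted_w3:.4f} {boosted_w5:.4f}")
```

Each step that moves the nonrelevant documents lower raises the precision measured at the relevant documents that now precede them, which is why the score increases monotonically across the three idealized rankings.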

While the improvement in ranking using nominal events over simple keyword search may be relatively modest in these experiments, we expect a more dramatic effect in much larger collections. Because these experiments require manually annotated ground truth, we are limited in the size of the data sets we can experiment with.

TABLE 7.6. The Results of the Information Retrieval Experiments

System                        MAP Score (%)   ARR Score (%)
Unmodified term freq. (tf)    77.08           83.67
tf increased with w = 3       89.55           92.67
tf increased with w = 5       90.57           93.45
tf increased with w = 10      91.95           94.18
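The gains over the unmodified system can be checked directly from the figures in Table 7.6. The short sketch below computes both the absolute gain in percentage points and the relative gain with respect to the baseline; the dictionary keys and the helper name are hypothetical, while the numbers are copied from the table.

```python
# (MAP %, ARR %) per system, copied from Table 7.6.
table_7_6 = {
    "unmodified": (77.08, 83.67),
    "w=3": (89.55, 92.67),
    "w=5": (90.57, 93.45),
    "w=10": (91.95, 94.18),
}

def gains(baseline, value):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = value - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 2), round(relative, 2)

base_map, base_arr = table_7_6["unmodified"]
for system, (map_score, arr_score) in table_7_6.items():
    if system == "unmodified":
        continue
    print(system, "MAP", gains(base_map, map_score), "ARR", gains(base_arr, arr_score))
```

The absolute MAP gains fall between roughly 12.5 and 14.9 points and the absolute ARR gains between 9.0 and 10.5 points; expressed relative to the baseline, the same gains are larger (about 16 to 19% for MAP).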

7.6 SUMMARY

We have focused on the problem of granular and semantic searching of video, for example, searching for specific events involving specific entities. We are interested in situations where collateral audio or text is available, thereby permitting recent advances in text-based information extraction to be exploited. The problem of detecting nominal events and their significance in accurate event search has been explored in depth. Initial results are promising. Although the problem is challenging, the impact on video search is expected to be significant. Our next step will be to derive more useful linguistic triggers and to test on larger data collections.

TABLE 7.7. The Position of Nonrelevant “Aluminum” Documents for the Query “Parade” and the MAP and ARR for Each Retrieval System

System                        Position of Nonrelevant “Aluminum” Documents   MAP Score (%)   ARR Score (%)
Unmodified term freq. (tf)    3 and 5                                        76.84           84.46
tf increased with w = 3       6 and 8                                        93.00           97.03
tf increased with w = 5       7 and 9                                        96.65           98.42
tf increased with w = 10      7 and 9                                        96.65           98.42
