• Aucun résultat trouvé

Who Does Need Multimedia Data Mining?

1. Introduction into Multimedia Data Mining and Knowledge Discovery

1.2 Who Does Need Multimedia Data Mining?

To answer the above question, we consider the application areas of MDM and related industries and companies who are (potential) users of technology. The following are the five major application areas of MDM:

Customer Insight—Customer insight includes collecting and summarizing informa-tion about customers’ opinions about products or services, customers’ complains, customers’ preferences, and the level of customers’ satisfaction of products or ser-vices. All product manufacturers and service providers are interested in customer insight. Many companies have help desks or call centers that accept telephone calls from the customers. These calls are recorded and stored. If an operator is not available a customer may leave an audio message. Some organizations, such as banks, insurance companies, and communication companies, have offices where customers meet company’s representatives. The conversations between customers and sales representatives can be recorded and stored. The audio data serve as an input for data mining to pursue the following goals:

r Topic detection—The speech from recordings is separated into turns, i.e., speech segments spoken by only one speaker. Then the turns are transcribed into text and keywords are extracted. The keywords are used for detecting topics and estimat-ing how much time was spent on each topic. This information aggregated by day, week, and month gives the overview of hot topics and allows the management to plan the future training taking into account the emerging hot topics.

r Resource assignment—Call centers have a high turn around rate and only small percentage of experienced operators, who are a valuable resource. In case when the call center collects messages, the problem is how to assign an operator who will call back. To solve the problem, the system transcribes the speech and detects the topic. Then it estimates the emotional state of the caller. If the caller is agitated, angry, or sad, it gives the message a higher priority to be responded by an experienced operator. Based on the topic, emotional state of the caller, and operators’ availability the system assigns a proper operator to call back [9].

r Evaluation of quality of service—At a call center with thousands of calls per day, the evaluation of quality of service is a laborious task. It is done by people who listen to selected recordings and subjectively estimate the quality of service.

Speech recognition in combination with emotion recognition can be used for automating the evaluation and making it more objective. Without going into

deep understanding of the conversation meaning, the system assigns a score to the conversation based on the emotional states of speakers and keywords. The average score over conversations serves as a measure of quality of service for each operator.

Currently, there are several small companies that provide tools and solutions for customer management for call centers. Some tools include procedures that are based on MDM techniques. The extension of this market is expected in the future.

Surveillance—Surveillance consists of collecting, analyzing, and summarizing au-dio, video, or audiovisual information about a particular area, such as battlefields, forests, agricultural areas, highways, parking lots, buildings, workshops, malls, retail stores, offices, homes, etc. [10]. Surveillance often is associated with intel-ligence, security, and law enforcement, and the major uses of this technology are military, police, and private companies that provide security services. The U.S. Gov-ernment is supporting long-term research and development activities in this field by conducting the Video Analysis and Content Extraction (VACE) program, which is oriented on development technology for military and intelligence purposes. The National Institute of Standards and Technology is conducting the annual evaluation of content-based retrieval in video (TRECVID) since 2000 [11]. However, many civilian companies use security services for protecting their assets and monitoring their employees, customers, and manufacturing processes. They can get valuable insight from mining their surveillance data. There are several goals of surveillance data mining:

r Object or event detection/recognition—The goal is to find an object in an image or in a sequence of images or in soundtrack that belongs to a certain class of objects or represents a particular instance. For example, detect a military vehicle in a satellite photo, detect a face in sequence of frames taken by a camera that is watching an office, recognize the person whose face is detected, or recognize whether sound represents music or speech. Another variant of the goal is to identify the state or attributes of the object. For example, classify an X-ray image of lungs as normal or abnormal or identify the gender of a speaker. Finally, the goal can be rather complex for detecting or identifying an event, which is a sequence of objects and relationships among objects. For example, detect a goal event in a soccer game; detect violence on the street, or detect a traffic accident. This goal is often a part of the high-level goals that are described below.

r Summarization—The goal is to aggregate data by summarizing activities in space and/or time. It covers summarizing activities of a particular object (for example, drawing a trajectory of a vehicle or indicating periods of time when a speaker was talking) or creating a “big picture” of activities that happened during some period of time. For example, assuming that a bank has a surveillance system that includes multiple cameras that watch tellers and ATM machines indoors and outdoors, the goal is to summarize the activities that happened in the bank during 24 h.

To meet the goal, unsupervised learning and visualization techniques are used.

Summarization also serves as a prerequisite step to achieve another goal—find frequent and rare events.

r Monitoring—The goal is to detect events and generate response in real time. The major challenges are real-time processing and generating minimum false alarms.

Examples are monitoring areas with restricted access, monitoring a public place for threat detection, or monitoring elderly or disabled people at home.

Today we witness a real boom in video processing and video mining research and development [12–14]. However, only a few companies provide tools that have elements of data mining.

Media Production and Broadcasting—Proliferation of radio stations and TV channels makes broadcasting companies to search for more efficient approaches for creating programs and monitoring their content. The MDM techniques are used to achieve the following goals:

r Indexing archives and creating new programs—A TV program maker uses raw footage calledrushesand clips from commercial libraries calledstockshot li-brariesto create a new program. A typical “shoot-to-show” ratio for a TV pro-gram is in the range from 20 to 40. It means that 20–40 hours of rushes go into one hour of a TV show. Many broadcasting companies have thousands of hours of rushes, which are poorly indexed and contain redundant, static, and low-value episodes. Using rushes for TV programs is inefficient. However managers be-lieve that rushes could be valuable if the MDM technology could help program makers to extract some “generic” episodes with high potential for reuse.

r Program monitoring and plagiarism detection—A company that pays for broad-casting its commercials is interested to know how many times the commercial has been really aired. Some companies are also interested how many times their logo has been viewed on TV during a sporting event. The broadcasting compa-nies or independent third-party compacompa-nies can provide such service. The other goals are detecting plagiarism in and unlicensed broadcasting of music or video clips. This requires using MDM techniques for audio/video clip recognition [15]

and robust object recognition [16].

Intelligent Content Service—According to Forrester Research, Inc, (Cambridge, MA) the Intelligent Content Service (ICS) is “a semantically smart content-centric set of software services that enhance the relationship between information workers and computing systems by making sense of content, recognizing context, and un-derstanding the end user’s requests for information” [17]. Currently, in spite of many Web service providers’ efforts to extend their search services beyond basic keyword search to ICS, it is not available for multimedia data yet. This area will be the major battlefield among the Web search and service providers in next 5 years.

The MDM techniques can help to achieve the following goals:

r Indexing Web media and using advanced media search—It includes creating indexes of images and audio and video clips posted on the Web using audio and visual features; creating ontologies for concepts and events; implementing advance search techniques, such as search images or video clips by example or by sketch and search music by humming; using Semantic Web techniques to infer the context and improve search [18]; and using context and user feedback to better understand the user’s intent.

r Advanced Web-based services—Taking into account exponential growth of in-formation on the Web such services promise to be an indispensable part of

everybody’s everyday life. The services include summarization of events ac-cording the user’s personal preferences, for example, summarizing a game of the user’s favorite football or basketball team; generating a preview of a music album or a movie; or finding relevant video clip for a news article [19].

Knowledge Management—Many companies consider their archives of documents as a valuable asset. They spend a lot of money to maintain and provide access to their archives to employees. Besides text documents, these archives can contain draw-ings of designs, photos and other images, audio and video recording of meetdraw-ings, multimedia data for training, etc. The MDM approaches can provide the ICS for supporting knowledge management of companies.