
Distributed Video Editing, Archival and Retrieval


MILANESE, Ruggero, et al. Distributed Video Editing, Archival and Retrieval. 1998.

Available at: http://archive-ouverte.unige.ch/unige:48027



The retrieval tool (cf. figure 3, right) allows even naive users (e.g. journalists or program directors who have not been trained to use a database) to retrieve video material from the archive, using the metadata available from the catalog server. The database can be queried using both textual and visual data. Textual queries address specific fields of a video entity, defined by the documentalist during the archival process. Visual queries address visual metadata extracted during the preprocessing phase. For instance, one may specify an example image from those already available on the user's desktop, and ask for all video entities whose keyframes match its visual content. Alternatively, one may define the desired type of camera motion, for instance in terms of pan, tilt, and zoom.

Furthermore, all these aspects may be combined into a single query by means of weighting coefficients. Once the desired video entities are selected, it is possible to export them to the editing tool in order to compose a new video.
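The report does not spell out the combination rule. The following is a minimal sketch, assuming a weighted sum of per-criterion similarity scores normalized to [0, 1]; all names and the scoring functions themselves are illustrative assumptions (the actual tools were Java applets; Python is used here for brevity).

```python
# Illustrative sketch of weighted query combination; not the DVER code.
from dataclasses import dataclass

@dataclass
class VideoEntity:
    annotations: set     # keywords entered by the documentalist
    signature: list      # compact keyframe signature (e.g. wavelet-based)
    camera_motion: str   # "pan", "tilt", "zoom" or "stationary"

def text_score(entity: VideoEntity, keywords: set) -> float:
    # Fraction of query keywords found in the textual annotations.
    return len(entity.annotations & keywords) / max(len(keywords), 1)

def image_score(entity: VideoEntity, example: list) -> float:
    # Keyframe-signature similarity, mapped to (0, 1] via an L1 distance.
    d = sum(abs(a - b) for a, b in zip(entity.signature, example))
    return 1.0 / (1.0 + d)

def motion_score(entity: VideoEntity, wanted: str) -> float:
    return 1.0 if entity.camera_motion == wanted else 0.0

def combined_score(entity, keywords, example, wanted, w_text, w_img, w_mot):
    # Weighted sum of per-criterion scores, normalized by the total weight.
    total = (w_text + w_img + w_mot) or 1.0
    return (w_text * text_score(entity, keywords)
            + w_img * image_score(entity, example)
            + w_mot * motion_score(entity, wanted)) / total

# Example: rank two entities for a query weighted 50/30/20.
db = [VideoEntity({"election", "geneva"}, [0.2, 0.8], "pan"),
      VideoEntity({"weather"}, [0.1, 0.1], "stationary")]
ranked = sorted(db, key=lambda e: combined_score(
    e, {"election"}, [0.2, 0.7], "pan", 0.5, 0.3, 0.2), reverse=True)
```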

2. Dissemination of results

The system prototype has been demonstrated to several private companies and we are currently negotiating technology transfer agreements with two of them. The system architecture, the database schema, and the video and image processing algorithms have been presented at the Fribourg meeting of the MPEG-7 consortium (Oct. 1997), which aims at standardizing multimedia data representations for archival and retrieval.

Overall, the DVP project has allowed the vision group of the CUI to start a research group on image and video databases. This group has produced a number of scientific publications, and its activities are currently being continued through new research grants.

We would like to thank the Télévision Suisse Romande (TSR, Geneva) for their help in defining the user requirements. TSR joined the DVP consortium in 1996 as a sponsoring partner.

[1] R. Milanese, M. Cherbuliez, and T. Pun, Invariant Content-Based Image Retrieval using the Fourier-Mellin Transform. Submitted to Intl. Conference on Advances in Pattern Recognition (ICAPR'98), Forte House, Plymouth, UK, 23-25 November 1998.

[2] S. Startchik, R. Milanese, and T. Pun, Projective and illumination invariant representation of disjoint shapes. European Conference on Computer Vision, Freiburg, Germany, June 2-6, 1998.

[3] S. Startchik, R. Milanese and T. Pun, Projective and photometric invariant representation of planar disjoint shapes, Image and Vision Computing (accepted for publication, 1998).

[4] D. McG. Squire and T. Pun, Assessing Agreement Between Human and Machine Clusterings of Image Databases, Pattern Recognition (accepted for publication, 1998).

[5] R. Milanese and M. Cherbuliez, A rotation, translation, scale-invariant approach to content-based image retrieval. Submitted to J. Vis. Comm. and Image Repres., 1997.

[6] R. Milanese, F. Deguillaume, A. Jacot-Descombes, Video segmentation and camera motion characterization using compressed data, SPIE Conf. Multimedia Storage and Archiving Systems II, Dallas, Nov. 1997.

[7] C. Rauber, T. Pun and P. Tschudin, Retrieval of images from a library of watermarks for ancient paper identification, In Proc. Elektronische Bildverarbeitung und Kunst, Kultur, Historie, Berlin, Germany. Gesellschaft zur Foerderung angewandter Informatik e.V., November 1997.

[8] D. McG. Squire and T. Pun, A Comparison of Human and Machine Assessments of Image Similarity for the Organization of Image Databases, In M. Frydrych, J. Parkkinen and A. Visa, eds., The 10th Scandinavian Conf. on Image Analysis, pp. 51-58, Lappeenranta, Finland, June 1997.

[9] R. Milanese, D. McG. Squire and T. Pun, Correspondence Analysis and Hierarchical Indexing For Content-Based Image Retrieval, In P. Delogne, ed., IEEE Int. Conf. on Image Processing, vol. 3, pp. 859-862, Lausanne, Switzerland, September 1996.

[10] T. Pun and D. McG. Squire, Statistical structuring of pictorial databases for content-based image retrieval systems, Pattern Recognition Letters, 17, pp. 1299-1310, 1996.

[11] C. Rauber, P. Tschudin, S. Startchik and T. Pun, Archivage et recherche d'images de filigranes, 4ème Coll. National sur l'Ecrit et le Document, pp. 69-76, Nantes, France, July 1996.

[12] C. Rauber, P. Tschudin, S. Startchik and T. Pun, Archival and retrieval of historical watermark images, IEEE Int. Conf. on Image Processing, vol. 2, pp. 773-776, Lausanne, September 1996.

[13] S. Startchik, R. Milanese, C. Rauber, T. Pun, Planar shape databases with affine invariant search, 1st IAPR Workshop on Image Databases and Multimedia Search, Amsterdam, August 1996.


1.2 The Video Archival and Retrieval Subsystem

Catalog Server

The role of the catalog server is to compute and provide access to metadata extracted from video clips retrieved from the archive server. Metadata are of two types: descriptions of the video's visual content, and textual descriptions. Visual descriptions are extracted automatically by processing the video clip directly in the MPEG-1 format, without requiring decompression.

First, a clip is decomposed into smaller segments, and transitions between shots are detected by analyzing each frame's motion vectors. For each shot, still images (keyframes) are extracted for display purposes. Moreover, a compact, numerical representation is extracted from each keyframe using a wavelet decomposition, in order to enable browsing and search through similar images. Finally, camera and camera-lens motion properties (pan, tilt, zoom, stationary) are also computed from the motion vectors (cf. figure 2).
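The report does not give its estimator for these motion properties. A plausible minimal sketch is a least-squares fit of a three-parameter camera model, (u, v) ≈ (pan + zoom·x, tilt + zoom·y), over the macroblock motion vectors, with (x, y) the macroblock centres relative to the image centre; function names and thresholds below are assumptions, not the DVER code.

```python
# Sketch of pan/tilt/zoom estimation from MPEG-1 macroblock motion vectors.
import numpy as np

def fit_camera_motion(positions: np.ndarray, vectors: np.ndarray):
    """positions, vectors: (n, 2) arrays. Returns (pan, tilt, zoom)."""
    n = len(positions)
    A = np.zeros((2 * n, 3))
    b = np.empty(2 * n)
    A[0::2, 0] = 1.0                 # u depends on pan...
    A[0::2, 2] = positions[:, 0]     # ...and on zoom * x
    A[1::2, 1] = 1.0                 # v depends on tilt...
    A[1::2, 2] = positions[:, 1]     # ...and on zoom * y
    b[0::2] = vectors[:, 0]
    b[1::2] = vectors[:, 1]
    (pan, tilt, zoom), *_ = np.linalg.lstsq(A, b, rcond=None)
    return pan, tilt, zoom

def classify_motion(pan, tilt, zoom, t_translate=1.0, t_zoom=0.01):
    # Label the dominant motion types; a frame with no parameter above its
    # (illustrative) threshold is reported as stationary.
    labels = [name for name, value, thresh in
              (("pan", pan, t_translate), ("tilt", tilt, t_translate),
               ("zoom", zoom, t_zoom)) if abs(value) > thresh]
    return labels or ["stationary"]
```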

The second type of metadata consists of textual descriptions introduced by a documentalist at various abstraction levels of the clip, as well as closed-caption data retrieved from the original video.

Graphical User Interfaces

The CUI has developed two tools that run on the end-user workstation: the archival tool and the retrieval tool. Both have been implemented as Java applets in order to maximize portability. The archival tool (cf. figure 3, left) allows a documentalist to analyze the video's content, using the metadata automatically extracted from the clip (keyframes, temporal segmentation). The user can display and edit these results, play back the video using random access through the keyframe display, and may define groups of shots that, albeit visually different, share a common semantic content.

Once a video entity (shot, group of shots, clip) has been selected, the user can enter additional textual annotation.
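For illustration, the shot / group-of-shots / clip hierarchy and its annotations might be represented as follows; this is a sketch with hypothetical field names, not the actual DVER schema.

```python
# Illustrative data model for the annotatable video entities.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Shot:
    first_frame: int
    last_frame: int
    keyframes: List[int] = field(default_factory=list)  # keyframe numbers
    annotation: str = ""

@dataclass
class ShotGroup:
    # Shots that are visually different but share a common semantic content.
    shots: List[Shot] = field(default_factory=list)
    annotation: str = ""

@dataclass
class Clip:
    title: str
    shots: List[Shot] = field(default_factory=list)
    groups: List[ShotGroup] = field(default_factory=list)
    annotation: str = ""
```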

Figure 2: Example of MPEG-1 motion vectors (panel title: "maigrir, frame 190: #valid = 59.09 %, max(|x|,|y|) = (51, 27)").

Figure 3: Main panels of the (left) Archival Tool and (right) Retrieval Tool.


Distributed Video Editing, Archival and Retrieval

R. Milanese, A. Jacot-Descombes, T. Pun

F. Deguillaume, L. Petrucci, M. Cherbuliez, A. De Giacomi

Within the Distributed Video Production project (European Union ACTS A089), the goal of the Distributed Video Editing, Archival and Retrieval (DVER) application is to provide broadcasters with a complete solution for distributed video post-production. This system should combine archival, retrieval, and editing functionalities in order to increase the accessibility and reuse of archive material. Moreover, the system should be geographically distributed, guarantee a high degree of portability to different platforms, and employ digital video using standard compression formats.

All these objectives have been achieved, and a complete prototype has been integrated and put in operation by our end user (MegaChannel TV) for news post-production.

1. Work done by the CUI

The Centre Universitaire d'Informatique (CUI) of the University of Geneva has played a major role in this application, by contributing to the assessment of the state of the art in the field, by participating in the functional specifications of the complete DVER system, by completing the technical specifications of the archival subsystem, and finally by implementing the archival subsystem.

1.1 DVER Architecture Design

The system architecture includes an archive server, an editing server, a catalog server, and a client station for the end user (see figure 1). The archive server offers video streaming services through ATM, as well as file transfer through FTP/IP. Compressed digital videos are stored at two quality levels. The low-bitrate version (MPEG-1, 1.5 Mb/s) is used mainly for archival, retrieval, and browsing purposes; these videos can also be edited on a low-cost client station. The high-bitrate version (MPEG-2, 8-50 Mb/s) is stored for producing the final program, of suitable quality for broadcasting.
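To give a concrete sense of the trade-off between the two tiers, here is a back-of-the-envelope calculation of storage per hour at the stated bitrates (my arithmetic, not figures from the report):

```python
# Megabits per second -> gigabytes per hour (1 GB taken as 8000 megabits).
def gb_per_hour(mbit_per_s: float) -> float:
    return mbit_per_s * 3600 / 8000

print(gb_per_hour(1.5))   # MPEG-1 proxy:       ~0.7 GB/h
print(gb_per_hour(8.0))   # MPEG-2, low end:    ~3.6 GB/h
print(gb_per_hour(50.0))  # MPEG-2, high end:  ~22.5 GB/h
```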

Sun Microsystems' MediaCenter has been employed as the basic platform for the archive server. On top of it, an API has been developed for Java clients. Connectivity with the other system components is provided through an FTP server and an ATM interface.

The catalog server, running on a Sun UltraSparc workstation, downloads new video clips from the archive server, and preprocesses them in order to extract metadata. These metadata are stored and indexed in a relational database built on the Illustra DBMS. Database access is provided to Java clients through an HTTP server and a database connectivity server/driver (Wedji).

The client station allows remote users to perform archival and retrieval operations, using both the catalog and the archive servers. In order not to overload the client station and the network, only low-bitrate video material is used to this end.

Once some video segments of interest have been identified, a user can export them to an editing tool, in order to compose a new program. The editing list created by the editing tool is then transmitted to the editing server, which applies it to the corresponding high-bitrate material, in order to produce the ready-to-broadcast final video.
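A minimal sketch of this round trip follows: the edit list is built against low-bitrate proxies on the client, then applied by the editing server to the high-bitrate originals. The report does not define the list's exact format, so every name here is an assumption.

```python
# Illustrative edit-decision-list (EDL) round trip; not the DVER interfaces.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EdlEntry:
    clip_id: str     # archive identifier of the source clip
    in_frame: int    # first frame of the selected segment
    out_frame: int   # last frame of the selected segment

def conform(edl: List[EdlEntry],
            fetch_hbr_segment: Callable[[str, int, int], bytes]) -> List[bytes]:
    # The editing server resolves each entry against the high-bitrate copy
    # and concatenates the segments into the ready-to-broadcast program.
    return [fetch_hbr_segment(e.clip_id, e.in_frame, e.out_frame) for e in edl]
```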

Figure 1: DVER system architecture (archive server with high- and low-bitrate (HBR/LBR) storage, catalog server with textual and visual DataBlades on the Illustra ORDBMS, editing servers, video input station with transcoding, and the end-user workstation running the archival, retrieval, and editing tools; components exchange HBR/LBR video, captions, and edit decision lists (EDL)).


TECHNICAL REPORT No. 98.02

Distributed Video Editing, Archival and Retrieval

Ruggero Milanese, Alain Jacot-Descombes, Thierry Pun

Frédéric Deguillaume, Lori Petrucci, Michel Cherbuliez, André De Giacomi

Computer Vision Group (Groupe Vision)

Centre Universitaire d'Informatique, University of Geneva, 24 rue du Général Dufour, CH-1211 Geneva 4, Switzerland

Phone: +41 (22) 705-7660, Fax: +41 (22) 705-7780, E-mail: FirstName.LastName@cui.unige.ch

Date: February 19, 1998
