Multimedia Track

Introduction

Structured document retrieval allows for the retrieval of document fragments, i.e. XML elements, containing relevant information. The main INEX adhoc task focuses on text-based XML element retrieval. Although text dominates most XML document collections, other types of media can also be found in those collections. Existing research on multimedia information retrieval has already shown that it is far from trivial to determine the combined relevance of a document that contains several multimedia objects. The objective here is to exploit the XML structure, which provides a logical level at which multimedia objects are connected, to improve the retrieval performance of an XML-driven multimedia information retrieval system.

The multimedia track will continue to provide an evaluation platform for the retrieval of multimedia document fragments. In addition, we want to create a discussion forum where the participating groups can exchange their ideas on different aspects of the multimedia XML retrieval task. Ideas raised here may provide input for the task descriptions for this year and/or the coming years.

Task description

The task set for the multimedia track is to retrieve relevant document fragments based on an information need with a (structured) multimedia character. A structured document retrieval approach in that case should be able to combine the relevance of the different media types into a single (meaningful) ranking that is presented to the user. The INEX multimedia track differs from other approaches in multimedia information retrieval, like TRECVID [2] and IMAGECLEF [3], in the sense that it focuses on using the structure of the document to extract, relate, and combine the relevances of different multimedia fragments. The focus for 2006 is on the combination of text and image retrieval.
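
As an illustration of what combining the relevance of different media types into a single ranking can look like, the following is a minimal sketch in Python. It assumes a weighted linear combination of min-max normalised text and image scores. The track does not prescribe any fusion method; the function names, the fragment identifiers, and the weight are all hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse_rankings(
    text_scores: Dict[str, float],
    image_scores: Dict[str, float],
    text_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Combine per-fragment text and image scores into one ranked list.

    Assumption: a fragment missing from one of the two score sets
    contributes 0.0 for that medium.
    """
    text_n = min_max_normalize(text_scores)
    image_n = min_max_normalize(image_scores)
    fused = {
        fragment: text_weight * text_n.get(fragment, 0.0)
        + (1.0 - text_weight) * image_n.get(fragment, 0.0)
        for fragment in set(text_n) | set(image_n)
    }
    # Highest fused score first; ties are broken by fragment identifier.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical fragment identifiers and scores, for illustration only.
text_scores = {"doc1/section[1]": 2.0, "doc2/figure[1]": 1.0, "doc3/p[2]": 0.0}
image_scores = {"doc2/figure[1]": 0.9, "doc3/p[2]": 0.1}
for fragment, score in fuse_rankings(text_scores, image_scores):
    print(fragment, score)
```

In this toy example the fragment that scores reasonably on both media ends up ahead of the fragment that only has textual evidence. Other fusion schemes (for example rank-based ones) would fit the same interface.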

The table below gives a schematic overview of the combinations of information-need types and retrieval objects that are identified for the multimedia track. Only the areas marked in orange will be investigated.

Document Collections

The document collections for the multimedia track are based on Wikipedia data. We distinguish two main collections:
  • Wikipedia Ad Hoc XML collection: This is the same collection that is used for the INEX 2006 Ad Hoc track. The assumption is that a user will be able to see images from the multimedia corpus in-place in the XML fragments when assessing a fragment. The collection is available from the Wikipedia XML corpus pages at LIP-6.
  • Wikipedia image XML collection: This XML collection is specially prepared for the Multimedia track. It consists of XML documents containing image metadata. Each document in this collection contains exactly one image with (often) a short description. This corresponds to the information that is also available on Wikipedia; consider for instance: http://en.wikipedia.org/wiki/Image:AnneFrankHouseAmsterdam.jpg. The collection consists of 170,370 XML files, plus a text file and an XML file listing all image identifiers and their original names. Download the Wikipedia image XML collection [27MB].
The images referred to in these two collections are available from the Wikipedia XML corpus pages at LIP-6. The set of images in the multimedia corpus contains a subset of the images referred to in the Wikipedia Ad Hoc XML collection. Additional images referred to in the Ad Hoc collection have been removed from this collection (because of parsing problems or for copyright reasons). Only images in the Wikipedia image XML collection (listed in MMlistOfPictures.txt, which comes with the Wikipedia image XML collection) are assumed to be available to the user. The set of XML files that comes with the MM images at LIP-6 can be ignored, as they are replaced by the Wikipedia image XML collection introduced above.

Additional sources of information

In 2006 the two Wikipedia-based collections discussed above are the main search collections. The returned elements should come from these collections. A number of additional sources of information are provided to help participants get to the relevant information in these collections.
  • Image classification scores: For each image, the University of Amsterdam (UvA) provides classification scores for 101 different concepts (see the examples of each concept). The UvA classifier was trained on manually annotated TRECVID video data, and the concepts were chosen for the broadcast news domain. The performance of these classifiers on the broad collection of Wikipedia images varies greatly, but we believe they may still be a useful source of information. For an impression of the quality of the classifications, have a look at the top 100 images for each concept. To make the best use of this source, it may help to re-interpret some of the classes for the new domain. For example, the concept anchor person may not be widely present in this collection, but the classifier results can be useful for finding persons; likewise, the tennis classifier may be very bad at finding tennis but good at finding vertical lines. It may also be useful to first remove all non-photographic material (using the graphics concept?), since this seems to have confused the classifier often.

    Details of the classification techniques can be found in the following paper:
    C.G.M. Snoek, M. Worring, J.C. van Gemert, J.M. Geusebroek, and A.W.M. Smeulders. The Challenge Problem for Automated Detection of 101 Semantic Concepts in Multimedia. In Proceedings of ACM Multimedia, Santa Barbara, USA, October 2006.

    Download the full set of concept classifications [192MB]

  • Image features: The set of 120D feature vectors, one for each image, that was used to derive the image classification scores is also available. Participants can use these feature vectors to build a custom CBIR system without having to pre-process the image collection. The features are based on natural image statistics and compactly represent color-invariant texture information by a Weibull distribution.

    Details of the classification techniques can be found in the following paper:
    Jan C. van Gemert, Jan-Mark Geusebroek, Cor J. Veenman, Cees G.M. Snoek, and Arnold W.M. Smeulders. Robust Scene Categorization by Learning Image Statistics in Context. In CVPR Workshop on Semantic Learning Applications in Multimedia, New York, USA, June 2006.

    Download the Features [95MB]

  • CBIR system: RMIT provides an online service that returns a ranked list of similar images given a query image (from the collection). Details are at the CBIR system webpage.
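
To give an impression of how the feature vectors and classification scores listed above might be used together, here is a minimal sketch of a query-by-example ranking in Python. Several things are assumptions and not part of the track material: the in-memory dictionaries stand in for the downloadable files (whose format is not described here), Euclidean distance is just one possible similarity measure for the 120D vectors, and the direction and scale of the graphics concept score and its threshold are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimensionality")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_similar(
    query_id: str,
    features: Dict[str, Sequence[float]],
    top_k: int = 10,
    graphics_scores: Optional[Dict[str, float]] = None,
    graphics_threshold: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Return the top_k images closest to the query image.

    If graphics_scores and graphics_threshold are given, images whose
    (assumed higher-is-more-likely) graphics score exceeds the threshold
    are skipped, to filter out non-photographic material.
    """
    query = features[query_id]
    filtering = graphics_scores is not None and graphics_threshold is not None
    candidates = []
    for image_id, vector in features.items():
        if image_id == query_id:
            continue
        if filtering and graphics_scores.get(image_id, 0.0) > graphics_threshold:
            continue
        candidates.append((image_id, euclidean(query, vector)))
    # Smallest distance first; ties are broken by image identifier.
    candidates.sort(key=lambda pair: (pair[1], pair[0]))
    return candidates[:top_k]


# Toy 2D vectors stand in for the real 120D features.
toy_features = {"img_a": [0.0, 0.0], "img_b": [1.0, 0.0], "img_c": [3.0, 4.0]}
print(rank_similar("img_a", toy_features))
print(rank_similar("img_a", toy_features,
                   graphics_scores={"img_b": 0.9}, graphics_threshold=0.5))
```

A linear scan like this is the simplest option; for the full collection an index structure may be worth considering.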

Assessments and Evaluation Methodology

The MMfragments task is assessed as in the Ad Hoc track. Assessors highlight relevant passages using the X-Rai system.

MMimages is a document retrieval task, so we only need binary assessments at the document level. Assessors can do their work starting from the MMimages assessment page. For last year's evaluation, the standard TREC-based evaluation measures were used. It is currently open for discussion whether we will continue to use this set of metrics or adopt the evaluation methodology used for the Ad Hoc track. One aspect that participants perceived positively was the simplicity of the assessment procedure, which substantially reduced the time needed for the assessment phase.
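
The text above does not name the individual TREC-based measures. As an example of a measure commonly computed from binary document-level assessments, the following Python sketch calculates average precision per topic and its mean over topics. The choice of this measure, the function names, and the topic and document identifiers are assumptions made for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against binary judgements.

    Assumes the ranking contains no duplicate document identifiers.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-topic average precision over all judged topics."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(topic, []), relevant)
        for topic, relevant in qrels.items()
    )
    return total / len(qrels)


# Hypothetical topics, runs and judgements.
runs = {"topic1": ["d1", "d2", "d3", "d4"], "topic2": ["x", "y"]}
qrels = {"topic1": {"d1", "d3"}, "topic2": {"y"}}
print(mean_average_precision(runs, qrels))
```

For topic1 the relevant documents are found at ranks 1 and 3, giving (1/1 + 2/3) / 2; for topic2 the single relevant document is at rank 2, giving 1/2.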

Additional Information

The MM track guidelines for 2006 are available. They detail the tasks, the topic development procedure, the runs and the submission format. Information on assessments will be added later.

For more information about previous editions of this track, we refer to the INEX 2005 multimedia track report, the slides used for the presentation of the track report at the Dagstuhl 2005 Workshop, and the slides of the discussion session.

Schedule

The following preliminary schedule gives a timeline of the activities planned for the Multimedia track. Please note that most of the deadlines are set one month after the deadlines of the Ad Hoc track, to avoid possible clashes.
Mar 20 - Mar 27: The collection of XML documents will be distributed to all participants (we will be using a collection based on Wikipedia).
Jul 14: Participants will be provided with detailed instructions and formatting criteria for candidate topics/queries.
Aug 4: Submission deadline for candidate topics.
Aug 11: Distribution of final set of topics/queries to participants along with detailed information on the formatting requirements of the search results.
Oct 1: Submission deadline for search results.
Oct 8: Distribution of merged results to participants for relevance assessments.
Oct 27: Submission deadline for relevance assessments.
Nov 6: Distribution of relevance judgements and evaluation scores to participants.
Nov 27: Submission of papers for the workshop pre-proceedings.
Dec 08: Workshop pre-proceedings and workshop programme online.
Dec 18-20: Workshop at Schloss Dagstuhl (http://www.dagstuhl.de/).

Organisers

Roelof van Zwol
Yahoo! Research Barcelona
Ocata 1
08003 Barcelona
Spain
Email: roelof@yahoo-inc.com
Phone: +34 935421164
Fax: +34 935421150

Thijs Westerveld
Centre for Mathematics and Computer Science, CWI
Kruislaan 413
1098 SJ Amsterdam
The Netherlands
http://www.cwi.nl/~thijs/
Email: thijs@cwi.nl
Phone: +31-(0)20 5924316
Fax: +31-(0)20 5924312