
Relevance feedback task

The process of information retrieval is an uncertain one. Searchers may have less than well developed ideas of what they are searching for and the types of information available for retrieval; they may be unable to express a conceptual need for information in terms of a suitable query. Early in the development of IR, researchers recognized that although users often had difficulty expressing their informational needs precisely, they could recognize useful information when they saw it. That is, although searchers may be unable to readily convert informational needs into requests, once the system presents them with an initial set of documents, they can easily differentiate between those documents that do contain useful information and those that do not.

This recognition led to the notion of relevance feedback (RF): users evaluate (mark or select) a small set of documents as relevant or irrelevant with respect to an informational need. RF techniques use data from the evaluated documents (i.e., those returned by the system in response to the user's original query and then judged by the user for relevance) to automatically reformulate that query. They modify the initial query and produce a revised query - the feedback query - to be processed by the retrieval system. RF algorithms can also be used for Automatic Query Refinement (AQR) by applying an automatic process that marks the top results returned by the search engine as relevant and the tail results as non-relevant for use by subsequent iterations.
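
As an illustration of query reformulation, the following is a minimal sketch of a Rocchio-style update, one classic RF technique for plain-text retrieval. It is not the method prescribed by this track. The function names and the coefficients alpha, beta and gamma are illustrative assumptions. The second function mimics the AQR idea of treating the top results as relevant and the tail results as non-relevant.

```python
from collections import Counter


def rocchio_feedback(query, relevant_docs, nonrelevant_docs,
                     alpha=1.0, beta=0.75, gamma=0.15):
    """Return a reweighted query as a dict term -> weight.

    query: dict mapping term -> weight.
    relevant_docs / nonrelevant_docs: lists of token lists.
    alpha, beta, gamma are illustrative defaults, not values from the task.
    """
    new_query = Counter()
    for term, weight in query.items():
        new_query[term] += alpha * weight
    for docs, signed_coeff in ((relevant_docs, beta), (nonrelevant_docs, -gamma)):
        for doc in docs:
            for term, tf in Counter(doc).items():
                # Average term frequency over the documents in this set.
                new_query[term] += signed_coeff * tf / len(docs)
    # Keep only the terms whose final weight is positive.
    return {t: w for t, w in new_query.items() if w > 0}


def pseudo_feedback_sets(ranked_docs, top_k, tail_k):
    """AQR-style labelling: top results count as relevant, tail as non-relevant."""
    top = ranked_docs[:top_k]
    tail = ranked_docs[top_k:][-tail_k:] if tail_k > 0 else []
    return top, tail
```

With real user feedback, the relevant and non-relevant sets come from the user's judgements. With AQR, they come from `pseudo_feedback_sets` applied to the initial ranking.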

The aim of this track is to investigate relevance feedback in the context of XML retrieval. In standard full text search engines, RF has been translated into detecting a "bag of words" that are good (or bad) at retrieving relevant information. These terms are then added to (or removed from) the query and weighted according to their power in retrieving relevant information. With XML documents, a more sophisticated approach - one that can exploit the characteristics of XML - is necessary. The approach should ideally consider not only content but also the structural features of XML documents. The query reformulation process must therefore infer which content and structural elements are important for effectively retrieving relevant data.

Please note that participants in the track must register for the INEX initiative. To have access to the test collection and in particular the relevance assessments, participants must perform the relevance assessment task. Participants in the RF track are also required to submit retrieval runs to the ad-hoc task, since the ad-hoc runs will serve as baselines for the RF task.

By 14 July 2006, participants in the Relevance Feedback (RF) task should submit their retrieval runs (search results) as per the Ad Hoc task guidelines. Each participant's runs will serve as the baselines upon which RF will be performed. Participants should refer to the Ad Hoc retrieval task guidelines for detailed information on the formatting requirements of search results. On 22 October 2006, relevance assessment data will be distributed to the participants. RF runs can then be performed using relevance information from the assessments.

To limit the number of RF submissions, we chose a subset of the common ad-hoc tracks on which participants can test their RF algorithms. Participants may submit up to 3 RF runs for each of their original submitted Ad Hoc runs for the CO.Thorough and CAS.Thorough tracks. In total, there can be at most 9 CO submissions (3 RF * 3 original) and 9 CAS submissions (3 RF * 3 original).

Please note that some topics may not be used in the RF track if they are judged inadequate for that purpose (e.g., if they do not retrieve enough relevant elements). There are no restrictions on the number of iterations of relevance feedback for a given query. Participants must submit their RF runs by 30 Nov 2006.

An RF run is built as follows: The relevance of up to 20 elements is checked against the relevance assessment data and is used as input for the relevance feedback algorithm. For most algorithms, these elements will be taken from the top-ranked elements in the baseline run. A participant may apply several iterations of RF where in each round, feedback for up to 20 new elements is received and a new set of results is computed.
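
The procedure above can be sketched as a simple loop. This is only an illustration under stated assumptions: the element ids are hypothetical, the `rerank` callback stands in for a participant's own feedback algorithm, and elements missing from the assessment data are treated as non-relevant.

```python
def run_feedback(baseline, assessments, rerank, iterations=1, batch_size=20):
    """Simulate an RF run.

    baseline: ranked list of element ids (the ad hoc run).
    assessments: dict element id -> True/False. Missing ids are treated
        as non-relevant (an assumption of this sketch).
    rerank: hypothetical callback taking the feedback gathered so far
        (dict element id -> bool) and returning a new ranked list.
    """
    judged = {}
    ranking = list(baseline)
    for _ in range(iterations):
        # Take up to batch_size top-ranked elements with no feedback yet.
        new_elements = [e for e in ranking if e not in judged][:batch_size]
        if not new_elements:
            break
        for element in new_elements:
            judged[element] = assessments.get(element, False)
        ranking = rerank(judged)
    return ranking, judged
```

The returned `judged` dictionary records which elements were looked up, which is the information a run with non-default feedback elements would need to report.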

The submission format for the runs is derived from the AdHoc submission format, with two new attributes of the inex-submission element:

  • base_run_id - the id of the original ad-hoc run
  • iterations - number of iterations used for the RF submission
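
As a rough illustration, the two attributes could be set on the root element as follows. This sketch uses Python's standard `xml.etree.ElementTree` module. The run id is a made-up placeholder, and the other attributes and child elements required by submission.dtd are deliberately left out because they are defined in the Ad Hoc guidelines rather than here.

```python
import xml.etree.ElementTree as ET


def make_submission_root(base_run_id, iterations):
    """Create an inex-submission element carrying only the two RF attributes."""
    root = ET.Element("inex-submission")
    root.set("base_run_id", str(base_run_id))  # id of the original ad-hoc run
    root.set("iterations", str(iterations))    # number of RF iterations used
    return root


# Hypothetical values, for illustration only.
root = make_submission_root("hypothetical-adhoc-run-1", 1)
xml_text = ET.tostring(root, encoding="unicode")
```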

Unlike in previous years, the submission must reflect *exactly* the results of the RF run without any postprocessing such as freezing. All postprocessing will be done when the runs are evaluated. We want to experiment with different postprocessing strategies that eliminate the influence of elements with known relevance on the results, among them freezing of the top-20 results used for feedback and several variants of the residual collection method.
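
For orientation, the two families of postprocessing can be sketched as follows. These are simplified illustrations, not the evaluation code used by the track. In particular, the residual variant shown removes only the feedback elements themselves, whereas other variants may remove more. The function names are hypothetical.

```python
def freeze_top(baseline, rf_run, k=20):
    """Freezing: keep the top-k baseline results at their original ranks,
    then append the RF results that are not already among them."""
    frozen = list(baseline[:k])
    seen = set(frozen)
    rest = [e for e in rf_run if e not in seen]
    return frozen + rest


def residual(run, feedback_elements):
    """Residual collection (simplest variant): drop every element whose
    relevance was already used for feedback."""
    known = set(feedback_elements)
    return [e for e in run if e not in known]
```

Since this postprocessing is applied at evaluation time, submitted runs should contain the raw RF ranking (`rf_run` in the sketch), not the output of either function.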

The XML file must follow submission.dtd (there is also a corresponding XML Schema definition).

This year we introduce some optional features for RF runs. The first feature is that we allow submission of runs generated with non-standard methods beyond top-20 feedback. For example, a participant can decide to select some other elements as the feedback elements. Such a run has to list, for each topic, the results whose relevance was looked up for feedback (feedback elements), the iteration in which each was used, and the rank at which it was found in the result list. This is defined by the element as described in the DTD. Note that if the run used the default top 20 elements, there is no need to specify the elements.

Another optional feature is to specify the expanded query used for the topics in the RF run. To compare the expanded queries generated by different feedback algorithms, participants can store, for each topic, the expanded query with the results whenever it is appropriate for their algorithm. This can be specified by the element as described in the DTD. The format for this expanded query is NEXI with additional, optional weights for the terms, e.g.,

//article[about(., 0.5*XML 0.75*database -0.3*index)]
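
To compare expanded queries programmatically, the weighted terms can be pulled out of such an expression. The following is a minimal sketch that handles only a single about() clause with space-separated terms, as in the example above. Quoted phrases and nested paths are not handled, and the default weight of 1.0 for terms without an explicit weight is an assumption of this sketch.

```python
import re

# Optional "<number>*" prefix followed by the term itself.
TERM = re.compile(r"(?:([+-]?\d+(?:\.\d+)?)\*)?([^\s*]+)")
# First about(path, keywords) clause in a NEXI expression.
ABOUT = re.compile(r"about\(\s*([^,]+?)\s*,\s*([^)]*)\)")


def parse_weighted_terms(keywords, default_weight=1.0):
    """Return a list of (term, weight) pairs from a keyword string."""
    return [
        (term, float(weight) if weight else default_weight)
        for weight, term in TERM.findall(keywords)
    ]


def parse_about(nexi):
    """Return (path, [(term, weight), ...]) for the first about() clause."""
    match = ABOUT.search(nexi)
    if match is None:
        raise ValueError("no about() clause found")
    path, keywords = match.groups()
    return path, parse_weighted_terms(keywords)
```

Calling `parse_about` on the example query yields the path `.` together with the three terms and their weights, including the negative weight for `index`.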

The comparison of different algorithms will be made with the following standard setting for CO.Thorough and CO+S.Thorough: a single iteration of feedback for the top-20 elements of the baseline run, using freezing of the top-20 results.

Each participant is required to submit at least one run for a CO.Thorough baseline.

The reported evaluation scores for each RF submission will measure the improvement of the RF run over the original base run.

Schedule

Jul 14: Submission deadline of search results in the ad-hoc track
Sep 15: Submission deadline for relevance assessments of the ad-hoc runs
Oct 15: Distribution of assessment pool to participants in the RF track
Nov 30: Submission deadline for relevance feedback runs
Dec 12: Distribution of evaluation scores to participants in the RF track
Dec 18-20: Workshop in Schloss Dagstuhl (http://www.dagstuhl.de/)

Organisers

Yosi Mass
Information Retrieval Group
IBM Research Lab,
Haifa 31905, Israel
Email: yosimass@il.ibm.com


Ralf Schenkel
Max-Planck-Institut für Informatik
Stuhlsatzenhausweg 85
66123 Saarbrücken
Email: schenkel@mpi-inf.mpg.de
Phone: (+49) 681 9325 504
Fax: (+49) 681 9325 599