Link the Wiki track
Link-the-Wiki: At INEX we use the Wikipedia collection, about 5 GB of
XML comprising 660,000 documents. The document set
is extensively hyperlinked. The Link-the-Wiki task aims at evaluating
the state of the art in automated discovery of document hyperlinks.
The objective of the task is to provide an evaluation forum and a set of
standard tasks and corresponding achievable results. We aim to create a
reusable resource for evaluating and comparing different
state-of-the-art systems and approaches to automated link discovery.
More specifically, given a new orphan Wikipedia document, the task is to
analyse its text and recommend both incoming links (from anchor text in
the existing collection to the new document) and outgoing links (from
anchor text in the new document to the existing collection). Going beyond
traditional text document analysis, in the context of INEX we aim to
operate at the XML element level. This means that anchor text or anchor
elements will link not only to a related document, but to a specific XML
element within it, or to the best entry point from which to start
reading the referenced material.
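
To make the task concrete, here is a minimal sketch of a naive baseline,
not a method prescribed by the track. It assumes the collection is
available as a mapping from page titles to document identifiers and
proposes a link wherever a title occurs in the text. The names Link,
outgoing_links and incoming_links are hypothetical, and the sketch works
at the document level only; choosing a target XML element or best entry
point is left out.

```python
from __future__ import annotations

import re
from typing import NamedTuple


class Link(NamedTuple):
    source: str   # id of the document that contains the anchor
    anchor: str   # anchor text as it appears in the source
    offset: int   # character offset of the anchor in the source text
    target: str   # id of the document being linked to


def _title_pattern(title: str) -> re.Pattern:
    # Case-insensitive whole-word match on the title (a simplification).
    return re.compile(r"\b" + re.escape(title) + r"\b", re.IGNORECASE)


def outgoing_links(orphan_id: str, orphan_text: str,
                   titles: dict[str, str]) -> list[Link]:
    """Propose links from the orphan into the collection.

    `titles` maps a page title to a document id (assumed input format).
    Longer titles are tried first so that a multi-word title wins over
    a shorter title contained in it.
    """
    links: list[Link] = []
    taken: list[tuple[int, int]] = []  # spans already used as anchors
    for title in sorted(titles, key=len, reverse=True):
        for match in _title_pattern(title).finditer(orphan_text):
            start, end = match.span()
            if any(start < e and s < end for s, e in taken):
                continue  # overlaps an anchor chosen earlier
            taken.append((start, end))
            links.append(Link(orphan_id, match.group(0), start, titles[title]))
            break  # link only the first usable occurrence of each title
    return sorted(links, key=lambda link: link.offset)


def incoming_links(orphan_id: str, orphan_title: str,
                   collection: dict[str, str]) -> list[Link]:
    """Propose links from existing documents to the orphan.

    `collection` maps a document id to its plain text (assumed format).
    """
    pattern = _title_pattern(orphan_title)
    links: list[Link] = []
    for doc_id, text in collection.items():
        match = pattern.search(text)
        if match is not None:
            links.append(Link(doc_id, match.group(0), match.start(), orphan_id))
    return links
```

A real system would rank candidate anchors rather than accept every
title match, and would resolve each target to an element inside the
destination document instead of stopping at the document identifier.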