Robust WSD Task @ CLEF 2009
Word Sense Disambiguation for (Cross-Lingual) Information Retrieval
Following the success of the 2007 joint SemEval-CLEF task and the 2008 Robust-WSD task at CLEF, a follow-up pilot task is planned with the aim of exploring the contribution of Word Sense Disambiguation (WSD) to monolingual and multilingual Information Retrieval.
The organizers of the task will provide documents and topics which have been automatically tagged with word senses from WordNet using several state-of-the-art Word Sense Disambiguation systems.
Robust-WSD at CLEF 2008 (see working notes) showed that some top-scoring systems improved their IR and CLIR results with the use of WSD tags.
The Robust-WSD task in 2008 used two languages often used in previous CLEF campaigns (English and Spanish). Documents were in English, and topics in both English and Spanish. The document collections are based on the widely used LA Times 94 and Glasgow Herald 95 news collections. For 2009 we will repeat the same basic layout and document collections.
- Please join the mailing list so we can keep you updated with the relevant news. To join, enter your e-mail address in the form to the left and press subscribe. You can also browse the mailing list messages.
Instructions for participation
Please note the following steps in order to participate:
1. Registration is via the CLEF website (press the CLEF 2009 button). Participants must sign an agreement restricting use of the data and regulating publication and dissemination of results. Registration closes around 1 May.
2. Join the mailing list (see subscribe button to the left) for updates.
3. (Optional) Explore the public dataset to replicate the 2008 setting
4. Once you get the passwords, you can download the document collection and training data (topics plus relevance data) starting 15 March:
a. enter the registered space for participants, then follow link to Data Collections for CLEF 2009 (you will use your first pair of user-password)
b. scroll down to "English", and get the WSD versions of LA 1994 and GH 1995 (second pair of user-password)
c. scroll down to the bottom, to "AdHoc Robust Track", and get the 2009 training data, with the queries and relevance judgements for English and Spanish (third pair of user-password)
5. Download the English and Spanish WordNets (see below)
6. Read the guidelines
7. Download the test topics from DIRECT starting 24 April, and the disambiguated test topics from the URL provided by the organizers.
8. Submit results by 9 June (instructions to be announced in due time).
Note that you can play around with the public data from the 2008 exercise, available here.
Time Schedule
· Registration Opens - 1 February 2009 (closes on 1 May)
· Data Release - from 15 March 2009
· Topic Release - 24 April 2009
· Submission of Runs by Participants - 9 June 2009 (23:59 CET)
· Release of Relevance Assessments and Individual Results - from 26 June 2009
· Submission of Paper for Working Notes - 30 August 2009
· Workshop - 30 September to 2 October 2009 (co-located with ECDL 2009)
Description
The Robust-WSD task brings semantic and retrieval evaluation together. The participants will be offered topics and document collections from previous CLEF campaigns which were annotated by systems for word sense disambiguation (WSD). The goal of the task is to test whether WSD can be used beneficially for retrieval systems.
The organizers believe that polysemy is among the reasons why information retrieval (IR) systems fail. WSD could allow more targeted retrieval. Last year, the SemEval and CLEF campaigns cooperated and created a task where participants were required to provide WSD annotations on CLEF data collections. In a retrieval experiment by the organizers, the WSD data was used for retrieval but did not lead to improvements. This year, participants are given the WSD data (or can derive their own) and can run their own retrieval experiments with various retrieval strategies.
The WSD data is based on WordNet version 1.6 and will be supplemented with data from the English and Spanish WordNets in order to test different expansion strategies. Several leading WSD experts will run their systems, and provide those WSD results for the participants to use.
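As an illustration of one such expansion strategy, here is a minimal, hypothetical sketch (not part of the task infrastructure): query terms tagged with WordNet synset offsets are expanded with the other lemmas of their synset. The `synset_words` mapping and the offset shown are made-up examples.

```python
def expand_query(tagged_terms, synset_words):
    """Expand each (word, synset_offset) pair with the synonyms listed
    for that synset; terms without a synset tag pass through unchanged."""
    expanded = []
    for word, offset in tagged_terms:
        expanded.append(word)
        for lemma in synset_words.get(offset, []):
            if lemma.lower() != word.lower():
                expanded.append(lemma)
    return expanded

# Made-up example: "bank" disambiguated to a financial-institution sense.
synset_words = {"06074152": ["bank", "depository financial institution"]}
print(expand_query([("bank", "06074152"), ("loan", None)], synset_words))
# ['bank', 'depository financial institution', 'loan']
```

Whether expanding with English synonyms or crossing over to Spanish lemmas of the same synset helps retrieval is exactly what the task is designed to test.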
Participants are required to submit at least one baseline run without WSD and one run using the WSD data. They can submit up to four further baseline runs without WSD and four runs using the WSD data in various ways.
The robust task will use two languages often used in previous CLEF campaigns (English, Spanish). Documents will be in English, and topics in both English and Spanish.
A subset of highly ambiguous topics will be identified by the organizers and used for a separate evaluation to see how WSD works for these hard topics.
The evaluation will be based on Mean Average Precision (MAP) as well as Geometric Mean Average Precision (GMAP). The robust measure GMAP is intended to reward stable performance over all topics instead of high average performance in mono- and cross-language IR ("ensure that all topics obtain minimum effectiveness levels", Voorhees 2005, SIGIR Forum).
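To make the difference between the two measures concrete, here is a small sketch. It assumes per-topic average precision scores have already been computed; the epsilon floor for near-zero AP topics follows the usual trec_eval convention, though the exact value is an assumption.

```python
import math

def mean_average_precision(ap_scores):
    """MAP: arithmetic mean of per-topic average precision."""
    return sum(ap_scores) / len(ap_scores)

def geometric_map(ap_scores, eps=1e-5):
    """GMAP: geometric mean of per-topic AP; the epsilon floor keeps a
    zero-AP topic from driving the whole score to zero (assumed convention)."""
    return math.exp(sum(math.log(max(ap, eps)) for ap in ap_scores) / len(ap_scores))

# One near-failed topic barely moves MAP but collapses GMAP,
# which is why GMAP rewards stable performance across all topics.
aps = [0.60, 0.50, 0.001]
print(round(mean_average_precision(aps), 3))  # 0.367
print(round(geometric_map(aps), 3))
```

A system that trades a few hard topics for gains on easy ones may keep its MAP but lose badly on GMAP, which is the point of the robust evaluation.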
Data for Robust Task
- The documents will comprise one or both of the ad-hoc collections which were available at CLEF 2001:
- LA Times 94 (with WSD data): 72,027,935 tokens
- Glasgow Herald 95 (with WSD data): 27,731,946 tokens
- The topics (with WSD data) will be split into training and test sets, and will consist of a combination of topics from past CLEF exercises (see below)
- monolingual IR (English)
- bilingual IR (Spanish -> English)
Topics
The test and train topics are distributed as follows:
- CLEF years 2001-2002, 2004: for training
- CLEF years 2003, 2005-2006: for testing
The topics are available in Spanish and English.
The following table summarizes the test/train topics and their corresponding target collections:
| Split | Year      | Topics  | English Documents              |
|-------|-----------|---------|--------------------------------|
| Train | CLEF 2001 | 41-90   | LA Times 94                    |
| Train | CLEF 2002 | 91-140  | LA Times 94                    |
| Train | CLEF 2004 | 201-250 | Glasgow Herald 95              |
| Test  | CLEF 2003 | 141-200 | LA Times 94, Glasgow Herald 95 |
| Test  | CLEF 2005 | 251-300 | LA Times 94, Glasgow Herald 95 |
| Test  | CLEF 2006 | 301-350 | LA Times 94, Glasgow Herald 95 |
Test and Training Data for Robust 2009
Data formats, additional files, and source of WSD tags
These are the DTDs for the disambiguated topics and documents. Please check these sample topics (original) and documents (original).
Spanish topics will be disambiguated using the first sense heuristic. English topics and documents will be annotated with the following word sense disambiguation systems:
1. Agirre, Eneko & Lopez de Lacalle, Oier (2007). UBC-ALM: Combining k-NN with SVD for WSD. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007). pp. 341-345. Prague, Czech Republic.
2. Chan, Yee Seng, & Ng, Hwee Tou, & Zhong, Zhi (2007). NUS-PT: Exploiting Parallel Texts for Word Sense Disambiguation in the English All-Words Tasks. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007). pp. 253-256. Prague, Czech Republic.
In order to expand from WordNet synset numbers to words in English and Spanish, you will need the following:
- The English WordNet version 1.6 available from here
- The Spanish WordNet, available free for research from here
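Since the WSD tags refer to WordNet 1.6 synset offsets, the expansion step needs a way to map an offset back to its lemmas. A minimal sketch, assuming the standard WordNet data-file format (e.g. `data.noun` from the download above; the path in the usage note is a placeholder):

```python
def load_synset_words(data_path):
    """Parse a WordNet data file (e.g. data.noun) into a dict
    mapping the 8-digit synset offset to its list of lemmas."""
    synsets = {}
    with open(data_path) as f:
        for line in f:
            if line.startswith("  "):      # skip the license header block
                continue
            fields = line.split()
            offset = fields[0]             # 8-digit byte offset: the synset id
            w_cnt = int(fields[3], 16)     # word count is in hexadecimal
            # lemmas alternate with a lex_id field: word lex_id word lex_id ...
            words = [fields[4 + 2 * i].replace("_", " ") for i in range(w_cnt)]
            synsets[offset] = words
    return synsets

# Usage (path is a placeholder for your local WordNet 1.6 install):
# synsets = load_synset_words("wordnet-1.6/dict/data.noun")
# synsets["00001740"]  # -> the lemmas of that synset
```

The same loader works for the verb, adjective, and adverb data files, since they share the leading field layout.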
Contact
Thomas Mandl, University of Hildesheim, Germany, mandl at uni-hildesheim de
Eneko Agirre, University of the Basque Country, Basque Country, e.agirre at ehu es