CLEF 2008

Robust WSD Task @ CLEF 2008

Word Sense Disambiguation for (Cross-Lingual) Information Retrieval

Following the 2007 joint SemEval-CLEF task, a related pilot task is planned with the aim of exploring the contribution of Word Sense Disambiguation to monolingual and multilingual Information Retrieval. The organizers of the task will provide document collections (from the news domain) and topics which have been automatically tagged with word senses from WordNet using several state-of-the-art Word Sense Disambiguation systems. QA WSD@CLEF is a closely related task on question answering.

News

2009 Feb 9: Data to replicate 2008 exercise available here
Sep 20: After the success of the first competition, ROBUST-WSD will be held again at CLEF 2009 with a similar setting. If interested, please join the mailing list and/or send e-mail to Eneko Agirre (see Contact below).
Sep 13: working notes and slides from the workshop available
Jul. 28: final results available in DIRECT
May 21: new result submission deadline: 30 June
Apr. 24: new version of guidelines
Apr. 14: small bug in WSD results of UBC fixed. Please download again

Mailing list

You can browse the e-mail list. To join, enter your e-mail address in the box to the left and press subscribe.

Instructions for participation

Please note the following steps in order to participate:

1. Registration is via the CLEF website. Participants must sign an agreement restricting use of the data and regulating publication and dissemination of results. Registration closes 1 May.

2. Join the mailing list (see subscribe button to the left) for updates.

3. Download the document collection and training data (topics plus relevance data) from 1 March.

4. Download the English and Spanish WordNets (see below).

5. Read the guidelines (updated 24/4).

6. Download the test topics from 1 May.

7. Submit results by 15 June.

Time Schedule

· Registration Opens - 10 February 2008 (closes 1 May 2008)
· Data Release - from 1 March 2008
· Topic Release - from 1 May 2008
· Submission of Runs by Participants - 19 June 2008
· Release of Relevance Assessments and Individual Results - 15 July 2008
· Submission of Paper for Working Notes - 15 August 2008
· Workshop - 17-19 September 2008

Description

The robust task brings semantic and retrieval evaluation together. Participants will be offered topics and document collections from previous CLEF campaigns, annotated by word sense disambiguation (WSD) systems. The goal of the task is to test whether WSD can benefit retrieval systems.

The organizers believe that polysemy is one of the reasons information retrieval (IR) systems fail, and that WSD could allow more targeted retrieval. Last year, the SemEval and CLEF campaigns cooperated on a task in which participants were required to provide WSD for CLEF data collections. In a retrieval experiment run by the organizers, using the WSD data did not improve retrieval. This year, participants are given the WSD data (or can derive their own) and can run their own retrieval experiments with various retrieval strategies.

The WSD data is based on WordNet version 1.6 and will be supplemented with data from the English and Spanish WordNets in order to test different expansion strategies. Several leading WSD experts will run their systems, and provide those WSD results for the participants to use.

Participants are required to submit at least one baseline run without WSD and one run using the WSD data. They can submit four further baseline runs without WSD and four further runs using WSD in various ways.

The robust task will use two languages often used in previous CLEF campaigns (English, Spanish). Documents will be in English, and topics in both English and Spanish.

A subset of highly ambiguous topics will be identified by the organizers and used for a separate evaluation to see how WSD works for these hard topics.

The evaluation will be based on Mean Average Precision (MAP) as well as Geometric Mean Average Precision (GMAP). The robust measure GMAP is intended to reward stable performance over all topics rather than high average performance in Mono- and Cross-Language IR (“ensure that all topics obtain minimum effectiveness levels”, Voorhees 2005, SIGIR Forum).
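To make the difference between the two measures concrete, here is a minimal Python sketch of MAP and GMAP computed from per-topic average precision. It is an illustration only, not the official evaluation code. The function names are hypothetical, and the small floor applied to zero scores before taking logarithms (1e-5) is an assumption borrowed from common evaluation practice; the value used in the official evaluation may differ.

```python
import math


def average_precision(ranked_doc_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant documents."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            # Precision at this rank, counted only at relevant documents.
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(ap_values):
    """Arithmetic mean of per-topic average precision (MAP)."""
    return sum(ap_values) / len(ap_values)


def geometric_map(ap_values, floor=1e-5):
    """Geometric mean of per-topic average precision (GMAP).

    `floor` is an assumed lower bound that keeps log() defined for topics
    with an average precision of zero.
    """
    logs = [math.log(max(ap, floor)) for ap in ap_values]
    return math.exp(sum(logs) / len(logs))


# Hypothetical per-topic scores: two easy topics and one nearly failed topic.
example_ap = [0.8, 0.8, 0.0001]
example_map = mean_average_precision(example_ap)
example_gmap = geometric_map(example_ap)
```

For these made-up scores, MAP is about 0.53 while GMAP is 0.04: a single badly handled topic barely moves the arithmetic mean but pulls the geometric mean down sharply, which is why GMAP is used as the robust measure.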

Data for Robust Task

Ad-hoc collections which were available at CLEF 2001

- LA Times 94 (with WSD data) : 72,027,935 tokens
- Glasgow Herald 95 (with WSD data): 27,731,946 tokens

Topics (with WSD data)

- 2001-2002,2004: for Training
- 2003, 2005-2006: for Testing

Tasks

- monolingual IR (English)
- bilingual (Spanish -> English)

Data Collections for the Robust Task

CLEF Year	Topics No.	English
2001	41-90	LA Times 94	x
2002	91-140	LA Times 94	x
2003	141-200	LA Times 94	Glasgow Herald 95
2004	201-250	x	Glasgow Herald 95
2005	251-300	LA Times 94	Glasgow Herald 95
2006	301-350	LA Times 94	Glasgow Herald 95
2007	only for 2002

Test and Training Data for Robust 2008

Data formats, additional files, and source of WSD tags

These are the DTDs for the disambiguated topics and documents. Please check these sample topics (original) and documents (original).

Spanish topics will be disambiguated using the first sense heuristic. English topics and documents will be annotated with the following word sense disambiguation systems:

1. Agirre, Eneko & Lopez de Lacalle, Oier (2007). UBC-ALM: Combining k-NN with SVD for WSD. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007). pp. 341-345. Prague, Czech Republic.

2. Chan, Yee Seng, Ng, Hwee Tou & Zhong, Zhi (2007). NUS-PT: Exploiting Parallel Texts for Word Sense Disambiguation in the English All-Words Tasks. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007). pp. 253-256. Prague, Czech Republic.
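The first sense heuristic mentioned above for the Spanish topics can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the organizers' implementation: the function name, the in-memory sense inventory, and the sense identifiers are all hypothetical placeholders.

```python
def first_sense_tag(tokens, sense_inventory):
    """Tag each token with the first sense listed for it, or None if unknown.

    `sense_inventory` is a hypothetical mapping from a lower-cased word to
    its senses, ordered as in the underlying lexical resource.
    """
    tagged = []
    for token in tokens:
        senses = sense_inventory.get(token.lower(), [])
        tagged.append((token, senses[0] if senses else None))
    return tagged


# Made-up inventory; the identifiers are placeholders, not real synset numbers.
toy_inventory = {
    "banco": ["sense-A", "sense-B"],
    "río": ["sense-C"],
}
toy_tags = first_sense_tag(["Banco", "del", "río"], toy_inventory)
```

The heuristic ignores context entirely, so every occurrence of a word receives the same sense.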

In order to expand from WordNet synset numbers to words in English and Spanish, you will need the following:

The English WordNet version 1.6 available from here
The Spanish WordNet, available free for research from here
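As a rough illustration of the expansion step, the Python sketch below reads lines in the layout of the English WordNet data files (synset offset, lexicographer file number, part of speech, a hexadecimal word count, then alternating words and lex ids, with the gloss after a `|`) and maps synsets to their words. It assumes that layout applies to the version 1.6 files; the function names, the `(offset, part of speech)` key, and the sample lines are made up for illustration. The synset identifiers in the task data follow the DTDs and may need converting to this key. The Spanish WordNet is distributed in a different format and is not covered here.

```python
def parse_data_line(line):
    """Return (offset, part_of_speech, words) for one WordNet data-file line."""
    # Everything after "|" is the gloss; the fields before it are space-separated.
    fields = line.split("|", 1)[0].split()
    offset, ss_type = fields[0], fields[2]
    word_count = int(fields[3], 16)  # the word count is written in hexadecimal
    # Words and their lex ids alternate, starting at field 4.
    words = [fields[4 + 2 * i].replace("_", " ") for i in range(word_count)]
    return offset, ss_type, words


def load_synsets(lines):
    """Build a {(offset, part_of_speech): [words]} table from data-file lines."""
    table = {}
    for line in lines:
        # Assumption: licence header lines start with two spaces; skip them.
        if not line.strip() or line.startswith("  "):
            continue
        offset, ss_type, words = parse_data_line(line)
        table[(offset, ss_type)] = words
    return table


def expand(synset_keys, table):
    """Collect the words of the given synsets, without duplicates, in order."""
    terms = []
    for key in synset_keys:
        for word in table.get(key, []):
            if word not in terms:
                terms.append(word)
    return terms


# Made-up lines in the data-file layout; offsets and glosses are placeholders.
SAMPLE_LINES = [
    "  1 illustrative licence header line",
    "00000001 05 n 02 bank 0 depository_financial_institution 0 000 | placeholder gloss",
    "00000002 17 n 01 bank 1 000 | placeholder gloss",
]
sample_table = load_synsets(SAMPLE_LINES)
sample_terms = expand([("00000001", "n"), ("00000002", "n")], sample_table)
```

Whether to add all words of a synset to the query, or only those of the top-ranked sense, is one of the expansion strategies the task leaves to participants.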

Contact

Thomas Mandl, University of Hildesheim, Germany, mandl at uni-hildesheim de
Eneko Agirre, University of the Basque Country, Basque Country, e.agirre at ehu es