CLEF 2008

Robust WSD Task @ CLEF 2008

Word Sense Disambiguation for (Cross-Lingual) Information Retrieval

Following the 2007 joint SemEval-CLEF task, a related pilot task is planned with the aim of exploring the contribution of Word Sense Disambiguation to monolingual and multilingual Information Retrieval. In this case, the organizers of the task will provide document collections (from the news domain) and topics which have been automatically tagged with Word Senses from WordNet using several state-of-the-art Word Sense Disambiguation systems. QA WSD@CLEF is a closely related task on question answering.

News

  • 2009 Feb 9: Data to replicate 2008 exercise available here
  • Sep 20: After the success of the first competition, ROBUST-WSD will be held again at CLEF 2009 with a similar setting. If interested, please join the mailing list and/or send an e-mail to Eneko Agirre below.
  • Sep 13: working notes and slides from the workshop available
  • Jul. 28: final results available in DIRECT
  • May 21: new result submission deadline: 30 June
  • Apr. 24: new version of guidelines
  • Apr. 14: small bug in WSD results of UBC fixed. Please download again

Mailing list

  • You can browse the e-mail list. To join, enter your e-mail address in the form to the left and press subscribe.

Instructions for participation

Please note the following steps in order to participate:

1. Registration is via the CLEF website. Participants must sign an agreement restricting use of the data and regulating publication and dissemination of results. Registration closes 1 May.
2. Join the mailing list (see subscribe button to the left) for updates.
3. Download the document collection and training data (topics plus relevance data) from 1 March.
4. Download the English and Spanish WordNets (see below).
5. Read the guidelines (updated 24/4).
6. Download the test topics from 1 May.
7. Submit results by 15 June.

Time Schedule

· Registration Opens - 10 February 2008 (closes 1 May 2008)
· Data Release - from 1 March 2008
· Topic Release - from 1 May 2008
· Submission of Runs by Participants - 19 June 2008
· Release of Relevance Assessments and Individual Results - 15 July 2008
· Submission of Paper for Working Notes - 15 August 2008
· Workshop - 17-19 September 2008

Description

The robust task will bring semantic and retrieval evaluation together. The participants will be offered topics and document collections from previous CLEF campaigns which were annotated by systems for word sense disambiguation (WSD). The goal of the task is to test whether WSD can be used beneficially for retrieval systems.

The organizers believe that polysemy is among the reasons for information retrieval (IR) systems to fail. WSD could allow a more targeted retrieval. Last year, the campaigns SemEval and CLEF cooperated and created a task where participants were required to provide WSD on CLEF data collections. In a retrieval experiment by the organizers the WSD data was used for retrieval but did not lead to improvement. This year, participants are given the WSD data (or can derive their own) and can run their own retrieval experiments with various retrieval strategies.

The WSD data is based on WordNet version 1.6 and will be supplemented with data from the English and Spanish WordNets in order to test different expansion strategies. Several leading WSD experts will run their systems, and provide those WSD results for the participants to use.

Participants are required to submit at least one baseline run without WSD and one run using the WSD data. They can submit up to four further baseline runs without WSD and four further runs that use WSD in various ways.

The robust task will use two languages often used in previous CLEF campaigns (English, Spanish). Documents will be in English, and topics in both English and Spanish.

A subset of highly ambiguous topics will be identified by the organizers and used for a separate evaluation to see how WSD works for these hard topics.

The evaluation will be based on Mean Average Precision (MAP) as well as Geometric Mean Average Precision (GMAP). GMAP, the robust measure, is intended to reward stable performance over all topics rather than high average performance in mono- and cross-language IR (“ensure that all topics obtain minimum effectiveness levels”, Voorhees 2005, SIGIR Forum).
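To illustrate how the two measures differ, the following is a minimal Python sketch, not the official evaluation code. The function names are hypothetical, and the small floor value that keeps a zero-precision topic from driving the geometric mean to zero is an assumption made for this example.

```python
import math
from typing import Iterable, List, Set


def average_precision(ranked_doc_ids: Iterable[str], relevant_ids: Set[str]) -> float:
    """Average precision for one topic: the precision at each rank where a
    relevant document appears, averaged over all relevant documents."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(ap_scores: List[float]) -> float:
    """MAP: arithmetic mean of the per-topic average precision values."""
    return sum(ap_scores) / len(ap_scores)


def geometric_map(ap_scores: List[float], floor: float = 1e-5) -> float:
    """GMAP: geometric mean of the per-topic average precision values.
    `floor` is an assumed lower bound so that log(0) is never taken."""
    logs = [math.log(max(ap, floor)) for ap in ap_scores]
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Two hypothetical systems with the same MAP over three topics.
    uneven = [0.9, 0.9, 0.0]   # strong on two topics, fails on one
    steady = [0.6, 0.6, 0.6]   # moderate on every topic
    for name, scores in (("uneven", uneven), ("steady", steady)):
        print(name, mean_average_precision(scores), geometric_map(scores))
```

In this toy comparison both systems have the same MAP, but the system that fails completely on one topic receives a far lower GMAP, which is the behaviour the robust measure is meant to expose.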

Data for Robust Task

  • Ad-hoc collections which were available at CLEF 2001

- LA Times 94 (with WSD data) : 72,027,935 tokens
- Glasgow Herald 95 (with WSD data): 27,731,946 tokens

  • Topics (with WSD data)

- 2001-2002,2004: for Training
- 2003, 2005-2006: for Testing

  • Tasks

- monolingual IR (English)
- bilingual (Spanish -> English)

Data Collections for the Robust Task

CLEF Year    Topics No.    English
2001         41-90         LA Times 94    x
2002         91-140        LA Times 94    x
2003         141-200       LA Times 94    Glasgow Herald 95
2004         201-250       x              Glasgow Herald 95
2005         251-300       LA Times 94    Glasgow Herald 95
2006         301-350       LA Times 94    Glasgow Herald 95
2007         only for 2002

Test and Training Data for Robust 2008

Data formats, additional files, and source of WSD tags

These are the DTDs for the disambiguated topics and documents. Please check these sample topics (original) and documents (original).

Spanish topics will be disambiguated using the first sense heuristic. English topics and documents will be annotated with the following word sense disambiguation systems:

1. Agirre, Eneko & Lopez de Lacalle, Oier (2007). UBC-ALM: Combining k-NN with SVD for WSD. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007). pp. 341-345. Prague, Czech Republic.

2. Chan, Yee Seng, Ng, Hwee Tou & Zhong, Zhi (2007). NUS-PT: Exploiting Parallel Texts for Word Sense Disambiguation in the English All-Words Tasks. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007). pp. 253-256. Prague, Czech Republic.

In order to expand from WordNet synset numbers to words in English and Spanish, you will need the following:

  • The English WordNet version 1.6, available from here
  • The Spanish WordNet, available free for research from here
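The expansion step can be sketched as follows. This is a minimal Python illustration under stated assumptions: the synset identifiers, the word lists, and the in-memory dictionary are invented placeholders, and loading the actual WordNet files is left out because their layout is not described here. It also assumes that Spanish topic words and English words can be linked through a shared synset identifier.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical lookup table: synset identifier -> English words in that synset.
# A real table would be built from the English WordNet 1.6 files.
ENGLISH_WORDS: Dict[str, List[str]] = {
    "synset-A": ["car", "automobile"],
    "synset-B": ["accident"],
}


def expand_terms(
    tagged_terms: Iterable[Tuple[str, str]],
    synset_to_words: Dict[str, List[str]],
    keep_surface: bool = True,
) -> List[str]:
    """Turn (surface word, synset id) pairs into a list of query terms.

    With keep_surface=True the original word is kept and synonyms are added
    (monolingual expansion). With keep_surface=False only the words found in
    the lookup table are returned (e.g. Spanish topic -> English terms).
    """
    expanded: List[str] = []
    seen = set()
    for surface, synset_id in tagged_terms:
        candidates = list(synset_to_words.get(synset_id, []))
        if keep_surface:
            candidates.insert(0, surface)
        for word in candidates:
            key = word.lower()
            if key not in seen:  # avoid repeating a term in the query
                seen.add(key)
                expanded.append(word)
    return expanded


if __name__ == "__main__":
    # Monolingual: keep the topic word and add its synonyms.
    print(expand_terms([("car", "synset-A")], ENGLISH_WORDS))
    # Bilingual: replace Spanish topic words by English words of the same synset.
    spanish_topic = [("coche", "synset-A"), ("accidente", "synset-B")]
    print(expand_terms(spanish_topic, ENGLISH_WORDS, keep_surface=False))
```

Whether to add all words of a synset, only some of them, or to weight them differently is one of the expansion strategies participants are expected to explore; this sketch simply adds every word once.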

Contact

Thomas Mandl, University of Hildesheim, Germany, mandl at uni-hildesheim de
Eneko Agirre, University of the Basque Country, Basque Country, e.agirre at ehu es