This compressed file (approx. 1.3 GB) contains the set of unordered words occurring in the GH95 and LA94 document collections. There is one file per document, organized in directories.

For each word occurrence in a document there is a line with the following information:
 - word form
 - part of speech
 - lemma
 - WSD output of the UBC system, given as WordNet 1.6 synsets with their corresponding weights

See for more details

Authors:
 - Arantxa Otegi
 - Eneko Agirre
 - Oier Lopez de Lacalle
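The per-line record described above (word form, part of speech, lemma, WSD synsets with weights) could be read with something like the sketch below. The description does not specify the field delimiter or how synsets and weights are laid out, so this is only a guess: it assumes whitespace-separated fields, with the three fixed fields followed by alternating synset identifier and weight tokens. The names `WordOccurrence` and `parse_line` are hypothetical, and the parsing should be checked against a real file and adjusted before use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordOccurrence:
    word_form: str
    pos: str
    lemma: str
    # (synset identifier, weight) pairs from the WSD output
    senses: List[Tuple[str, float]]


def parse_line(line: str) -> WordOccurrence:
    """Parse one word-occurrence line.

    ASSUMPTION: fields are whitespace-separated, in the order
    word form, part of speech, lemma, then zero or more
    alternating synset/weight tokens. The real files may differ.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields, got {len(fields)}")
    word_form, pos, lemma = fields[:3]
    rest = fields[3:]
    if len(rest) % 2 != 0:
        raise ValueError("synset/weight tokens must come in pairs")
    # Pair each synset token with the weight token that follows it
    senses = [(rest[i], float(rest[i + 1])) for i in range(0, len(rest), 2)]
    return WordOccurrence(word_form, pos, lemma, senses)
```

A word with no WSD output (for example a function word) would simply yield an empty `senses` list under these assumptions.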