Machines need computing tools that are more powerful than conventional dictionaries for tasks such as information extraction and the disambiguation of word meanings. This is precisely the function of the Euskal WordNet application, developed by the IXA Group (UPV/EHU), which can already be consulted and downloaded free of charge.
This is the first Lexical Knowledge Base (LKB) developed for the Basque language: a “semantic dictionary” or “store” that compiles and organises lexical and semantic information. “It’s like a database, but the difference is that it not only gathers the usual information found in a dictionary (the meanings of words and their corresponding definitions and examples) but also links the concepts with each other,” pointed out Eneko Agirre, an IXA Group computer programmer.
If we look up the entry hatz (“finger”, “digit” or “toe” in Basque), the result is as follows: “Each of the five appendages at the end of human hands and feet.” That is what the term means. But the entry offers much more than this definition: a finger or toe is an appendage of the body; the thumb is a finger; fingers are part of the hand; hands, in turn, are part of the arm; and fingers are used to touch objects. In short, all the concepts are interrelated hierarchically. Every concept is also related to its equivalents in other languages: digit, hatz, dedo, dixito and dit.
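The relations described for hatz can be pictured as a small graph of concepts joined by typed links. The following Python sketch is only an illustration under stated assumptions: the class `TinyLexicon`, its methods and the relation names (`is_a`, `part_of`, `used_for`) are hypothetical and are not the actual format or interface of the Basque WordNet.

```python
class TinyLexicon:
    """Toy lexical knowledge base: concepts, glosses and typed links.

    Hypothetical structure for illustration only; it does not reproduce
    the real Basque WordNet data model.
    """

    def __init__(self):
        self.gloss = {}          # concept -> definition
        self.links = {}          # (concept, relation) -> set of target concepts
        self.equivalents = {}    # concept -> words in other languages

    def add_concept(self, concept, gloss="", equivalents=()):
        self.gloss[concept] = gloss
        self.equivalents[concept] = list(equivalents)

    def link(self, source, relation, target):
        self.links.setdefault((source, relation), set()).add(target)

    def related(self, concept, relation):
        """Direct neighbours of a concept under one relation."""
        return sorted(self.links.get((concept, relation), ()))

    def closure(self, concept, relation):
        """Follow one relation transitively, e.g. thumb -> finger -> appendage."""
        seen = []
        frontier = [concept]
        while frontier:
            current = frontier.pop()
            for target in self.related(current, relation):
                if target not in seen:
                    seen.append(target)
                    frontier.append(target)
        return seen


def build_example():
    """Encode only the facts given in the article's hatz example."""
    lex = TinyLexicon()
    lex.add_concept(
        "finger",
        gloss="Each of the five appendages at the end of human hands and feet.",
        equivalents=["digit", "hatz", "dedo", "dixito", "dit"],
    )
    lex.link("finger", "is_a", "appendage")
    lex.link("thumb", "is_a", "finger")
    lex.link("finger", "part_of", "hand")
    lex.link("hand", "part_of", "arm")
    lex.link("finger", "used_for", "touch")
    return lex


lex = build_example()
print(lex.closure("thumb", "is_a"))      # chain of more general concepts
print(lex.closure("finger", "part_of"))  # chain of wholes the finger belongs to
print(lex.equivalents["finger"])         # equivalents in other languages
```

Storing links by relation type is what separates such a resource from a plain dictionary: a program can ask what a thumb is, or what a finger belongs to, without parsing any definition text.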
This database is tremendously useful in various fields, such as machine translation, information extraction, the disambiguation of word meanings and question-answering systems. In machine translation, for example, the system has to understand which word it is translating, a task for which it needs a “semantic dictionary” of this type. “For a quality translation, it is necessary to be able to distinguish the most appropriate meaning from among the various ones,” stressed Agirre.
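To give a rough idea of what choosing the most appropriate meaning can involve, the sketch below uses a simplified form of the Lesk approach: it picks the sense whose definition shares the most words with the surrounding sentence. This is not presented as the method used by the IXA Group or in QTLeap; the function names, the sense labels and the two toy definitions for the English word "digit" are assumptions made purely for illustration.

```python
import re

# Very small stopword list, chosen by hand for this toy example.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "and", "or", "is", "at", "for", "with"}


def tokens(text):
    """Lowercase the text and return its set of content words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def pick_sense(context, senses):
    """Return the sense whose gloss overlaps most with the context.

    On a tie, the sense listed first in `senses` wins.
    """
    context_words = tokens(context)
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.items():
        overlap = len(context_words & tokens(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


# Hypothetical sense inventory with illustrative glosses (not taken from WordNet).
DIGIT_SENSES = {
    "digit#finger": "one of the appendages at the end of the human hand or foot",
    "digit#numeral": "a symbol used to write a number, from zero to nine",
}

print(pick_sense("She hurt a digit on her left hand", DIGIT_SENSES))
print(pick_sense("The last digit of the number is nine", DIGIT_SENSES))
```

Word overlap is a deliberately crude signal. A knowledge base that also links concepts to one another gives a system further evidence to draw on, such as the related concepts of each candidate sense.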
“Our aim (within the framework of the European QTLeap project) is to improve the quality of machine translations by using WordNet,” he pointed out.
Over the 2014-2015 academic year, the university Master’s degree in Language Analysis and Processing (LAP), run by the IXA Group at the UPV/EHU, will cover the Basque WordNet and other language technologies used to develop similar applications.
Master’s in Language Analysis and Processing (LAP)
The aim of the University Master’s in Language Analysis and Processing is to analyse language and to learn about the techniques and applications available for processing it with the help of computers.
This Master’s has been organised by the UPV/EHU’s IXA Group and is geared towards anybody who combines linguistics and computing: philologists and linguistics experts, computing and telecommunications engineers, mathematicians, translators, etc. To apply, it is enough to hold a university degree, have some experience and show an interest in the subject.
The Master’s will take a year and a half, and classes will be held at the Computing Faculty of the UPV/EHU (University of the Basque Country). It can also be spread over two or three academic years, to accommodate working professionals.
The pre-registration period is already open, and applications will be accepted until June 30. For further information on the Master’s, please check out http://ixa.si.ehu.es/master/.