award – Ixa Group. Language Technology.
News from the Ixa Group in the University of the Basque Country
https://www.ehu.eus/ehusfera/ixa

Koldo Mitxelena award for PhD theses to Arantxa Otegi
Thu, 07 Feb 2013
https://www.ehu.eus/ehusfera/ixa/2013/02/07/koldo-mitxelena-award-for-phd-theses-to-arantxa-otegi/

Last January our colleague Arantxa Otegi won the III Koldo Mitxelena Award for PhD Theses, organized by Euskaltzaindia (the Academy of the Basque Language) and the University of the Basque Country.

CONGRATULATIONS Arantxa!

Congratulations to her supervisors (Xabier Arregi and Eneko Agirre).

The title of this thesis is ‘Expansion for information retrieval: contribution of word sense disambiguation and semantic relatedness’.

The whole text is available here. This is the abstract:

Information retrieval (IR) aims at finding documents that satisfy the information need of a user. To that end, an IR system informs the user about relevant documents, that is, those documents that contain the information they need as formulated in the query. Well-known search engines like Google and Yahoo are prime examples of IR systems.
A perfect IR system should retrieve only, and all, the relevant documents, rejecting the non-relevant ones. However, perfect retrieval systems do not exist. One of the main problems is the so-called vocabulary mismatch problem between query and documents: some documents might be relevant to the query even if the specific terms used differ substantially, and some documents might not be relevant to the query even if they have some terms in common. The former arises because several words or phrases can be used to express the same idea or item (synonymy). The latter is caused by ambiguity, where one word can have more than one interpretation depending on the context. Owing to these facts, if an IR system relies only on terms occurring in both the query and the document when deciding whether a document is relevant, it might be difficult to find some of the interesting documents, and also to reject non-relevant documents. It seems fair to think that there will be more chances of successful retrieval if the meaning of the text is also taken into account.
Even though the vocabulary mismatch problem has been widely discussed in the literature from the early days of IR, it remains unsolved, and most search engines just ignore it. This PhD dissertation explores whether natural language processing (NLP) can be used to alleviate this problem.
In a nutshell, we expand queries and documents making use of two NLP techniques: word sense disambiguation and semantic relatedness. For each of these techniques we propose an expansion strategy, in which we obtain synonyms and other related words for the words in the query and documents. We also present, for each case, a method to combine the expansions and original words effectively in an IR system. Furthermore, as the expansion technique we propose is useful for translating queries and documents, we show how a cross-lingual information retrieval system could be improved using such an expansion technique.

Our extensive experiments on three datasets show that the expansion methods explored in this dissertation help overcome the mismatch problem, consequently improving the effectiveness of an IR system.
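The vocabulary mismatch problem and the idea of expansion described in the abstract can be illustrated with a small sketch. It is only a toy example and not the method of the thesis: the related-word table, the function names (expand, score) and the 0.5 weight are all assumptions made for illustration, whereas the thesis obtains its expansions through word sense disambiguation and semantic relatedness.

```python
from collections import Counter

# Hypothetical hand-made table of related words (an assumption for this
# sketch). The thesis derives expansions with NLP techniques instead.
RELATED = {
    "car": ["automobile", "vehicle"],
    "buy": ["purchase"],
}


def expand(terms, weight=0.5):
    """Map each term to a weight: 1.0 for original terms, `weight` for expansions."""
    weighted = {t: 1.0 for t in terms}
    for t in terms:
        for related in RELATED.get(t, []):
            # Never lower the weight of a term that is already in the query.
            weighted.setdefault(related, weight)
    return weighted


def score(query_terms, doc_terms, weight=0.5):
    """Weighted count of query (and expansion) terms that occur in the document."""
    counts = Counter(doc_terms)
    return sum(w * counts[t] for t, w in expand(query_terms, weight).items())


docs = {
    "d1": "purchase a used automobile".split(),
    "d2": "the weather today".split(),
}
query = ["buy", "car"]

# Plain term matching: the relevant document d1 shares no term with the query.
plain = {name: sum(Counter(terms)[q] for q in query) for name, terms in docs.items()}

# Matching with expansion: d1 now matches through "purchase" and "automobile".
expanded = {name: score(query, terms) for name, terms in docs.items()}

print(plain)
print(expanded)
```

Giving expansion terms a lower weight than original terms is one simple way to combine both, so that a document matching the user's own words still ranks above one matching only related words.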

Mitxelena Award for PhD theses: Maite Oronoz and Larraitz Uria
Wed, 06 Apr 2011
https://www.ehu.eus/ehusfera/ixa/2011/04/06/mitxelena-award_oronoz-uria/

Last Monday our colleague Maite Oronoz won the II Koldo Mitxelena Award for PhD Theses, organized by Euskaltzaindia (the Academy of the Basque Language) and the University of the Basque Country.

CONGRATULATIONS Maite!

In addition, our colleague Larraitz Uria’s PhD thesis was also nominated for this award.

Both theses address language error detection. Maite’s thesis deals with it from a computational point of view, while Larraitz’ work approaches it from a linguistic perspective.

Title of Maite’s thesis: Euskarazko errore sintaktikoak detektatzeko eta zuzentzeko baliabideen garapena: datak, postposizio-lokuzioak eta komunztadura.
(Saroi, a system to detect and correct syntactic mistakes: dates, complex postpositions, and agreement.)
Maite’s supervisors: Arantza Diaz de Ilarraza and Koldo Gojenola
Title of Larraitz’ thesis: Euskarazko erroreen eta desbideratzeen analisirako lan-ingurunea. Determinatzaile-erroreen azterketa eta prozesamendua.
(A framework for the analysis of errors and deviations in Basque texts. Analysis and processing of errors in the use of determiners.)
Larraitz’ supervisors: Igone Zabala and Montse Maritxalar
Publications:
