Project – Ixa Group. Language Technology.
News from the Ixa Group in the University of the Basque Country
https://www.ehu.eus/ehusfera/ixa

Workshop: Resources and tools for the automatic processing of the languages of the Pyrenees (2021-05-12, Online, Free)
Fri, 30 Apr 2021
https://www.ehu.eus/ehusfera/ixa/2021/04/30/workshop-resources-and-tools-for-the-automatic-processing-of-the-languages-of-the-pyrenees-2021-04-12-online-free/

The European project EFA 227/16/LINGUATEC “Development of cross-border cooperation and knowledge transfer in language technologies” is organizing this workshop, open to all researchers, with the aim of disseminating the work carried out within the project and presenting some of the advances made for Basque and Occitan.
This project is co-financed by the European Regional Development Fund (ERDF).

Registration is free. Please use this registration form.

12 May, 2021
Online. Presentations are in English, Spanish and French, with simultaneous translation into the other two languages.

10h – Opening

10h15 Invited talks: Catalan processing

Lluis Padró (Universitat Politècnica de Catalunya)
Morphological and Syntactic Resources in FreeLing
Presentation in English – Simultaneous translation in Spanish and French

Mariona Taulé (Universitat de Barcelona)
AnCora: un corpus anotado a diferentes niveles lingüísticos
AnCora: a corpus annotated at different linguistic levels
Presentation in Spanish – Simultaneous translation in English and French

11h15 – Break

11h30 Presentations: Corpora for Occitan, Basque and other under-resourced languages

Assaf Urieli, Joliciel
Talismane, Jochre: automatic syntax analysis and OCR for under-resourced languages
Presentation in English – Simultaneous translation in Spanish and French

Aleksandra Miletic and Dejan Stosic, CLLE
Mutualisation des ressources pour la création de treebanks : le cas du serbe et de l’occitan
Pooling resources for the creation of syntactic tree banks: the case of Serbian and Occitan
Presentation in French – Simultaneous translation in English and Spanish

Ainara Estarrona (IXA, HiTZ, UPV/EHU)
Construcción del corpus histórico en euskera
Construction of a historical corpus in Basque
Presentation in Spanish – Simultaneous translation in English and French

13h – Break

14h30 Invited talk: Use of Neural Networks

Mans Hulden (University of Colorado)
Neural Networks in Linguistic Research
Presentation in English – Simultaneous translation in Spanish and French

15h30 Presentation: Language processing

Rodrigo Agerri (IXA, HiTZ, UPV/EHU)
Contextual lemmatization for inflected languages: statistical and deep-learning approaches
Presentation in English – Simultaneous translation in Spanish and French

16h – Break

16h15 – Presentations: Results of the LINGUATEC project

Myriam Bras, Aleksandra Miletic, Marianne Vergez-Couret, Clamença Poujade, Jean Sibille, Louise Esher, CLLE
Automatic processing of Occitan: construction of the first annotated corpora
Video in Occitan with accessible subtitles in English, Spanish and French

Elhuyar
Creation and improvement of Basque resources within the framework of Linguatec
Video in Occitan with accessible subtitles in English, Spanish and French

16h45 – Conclusions

Presentation in Spanish and French – No simultaneous translation

17h – Closing

The Ixa research group wins an award in the artificial intelligence competition on COVID-19 promoted by the US government
Thu, 07 May 2020
https://www.ehu.eus/ehusfera/ixa/2020/05/07/ixa-awarded-in-the-artificial-intelligence-competition-related-to-covid-19-disease/

The CORD-19 competition (COVID-19 Open Research Dataset Challenge) was organized by several organizations, including the Allen Institute for AI, the Chan Zuckerberg Initiative, Georgetown University, Microsoft Research, the National Institutes of Health and the White House Office of Science and Technology Policy. The organizers have made available to the global research community more than 50,000 scientific articles on COVID-19, SARS-CoV-2 and other coronaviruses. At the same time, they issued a call to action to artificial intelligence researchers to apply recent advances in natural language processing to help scientists fighting COVID-19 find the information they need in the scientific literature.

In the first phase of the competition there were 10 awards, and the system developed by the Ixa group of the HiTZ centre received one of them. The system was developed by University of the Basque Country researchers Arantxa Otegi and Jon Ander Campos and professors Eneko Agirre and Aitor Soroa. It analyzes the aforementioned scientific articles to find answers to high-priority questions posed by experts about COVID-19 and the SARS-CoV-2 virus. It is therefore useful for answering questions on topics such as the history of coronaviruses, the transmission and diagnosis of the virus, prevention measures for contact between humans and animals, and the lessons of previous epidemiological studies. The results of the system were evaluated by a group of experts from the NIH of the United States, and it was selected as the system that best answered a set of questions on the topic “What do we know about diagnostics and surveillance?”. The answers given by the system can be seen here.

See here some examples

 

Eneko Agirre won the Google prize for the third consecutive year
Wed, 01 Apr 2020
https://www.ehu.eus/ehusfera/ixa/2020/04/01/eneko-agirre-won-for-the-third-consecutive-year-the-google-prize/

Eneko Agirre won a Google prize again last March. He is one of the few researchers who have obtained the Google Faculty Research Award on three occasions. The $62,000 prize will fund the project ‘Conversational Question Answering agents that learn after deployment’ to develop user dialogue systems, chatbots and artificial intelligence.

Eneko Agirre, a member of the Ixa Group and professor at the Faculty of Computer Science of the UPV/EHU, is the director of the newly created HiTZ Research Center. The other 6 colleagues in the project are professors Aitor Soroa and Gorka Azkune, researcher Arantxa Otegi, doctoral student Jon Ander Campos, Aitor Agirre, a student in the Master in Language Analysis and Processing, and Eduardo Vallejo, an undergraduate student in Computer Science.

Although the project focuses mainly on English dialogues (questions about cooking and food), the team is also working with Basque dialogues. For this purpose, last year the Ixa Group launched a campaign to recruit volunteers for the collection of interviews in Basque. The campaign was successful and many personal interviews were collected in Basque (http://ixa.eus/lagundu).

 

Meeting of LINGUATEC project in Donostia (2019-02-21)
Tue, 26 Feb 2019
https://www.ehu.eus/ehusfera/ixa/2019/02/26/meeting-of-linguatec-project-in-donostia-2019-02-21/

LINGUATEC project: Development of cross-border cooperation and knowledge transfer in language technologies.

LINGUATEC is a European project funded by FEDER via POCTEFA (Programa INTERREG V-A España-Francia-Andorra). The partners are the following:

  • Elhuyar Fundazioa
  • Lo Congrès Permanent de la Lenga Occitana
  • Universidad Del País Vasco / Euskal Herriko Unibertsitatea (Ixa Taldea)
  • CNRS (Centre National de la Recherche Scientifique) – Délégation Régionale Midi-Pyrénées
  • Euskaltzaindia – Real Academia de la Lengua Vasca
  • Sociedad De Promoción y Gestión del Turismo Aragonés

The main objective of LINGUATEC is to develop, test and disseminate innovative linguistic resources, tools and solutions to improve the level of digitalization of the Aragonese, Basque and Occitan languages. The expected results include, among others, (1) a road map for the digitalization of Aragonese, (2) new monolingual and bilingual lexicons and morphosyntactic and syntactic analysers for Occitan, (3) a speech recognition system for Northern Basque and several linguistic tools, and (4) new solutions for Aragonese, Basque and Occitan.

This cross-border cooperation will enable knowledge transfer and the development of linguistic solutions with potential market uptake, benefiting language professionals, easing access to multilingual content, and fostering the development of a cross-border language technology cluster.

After one year of work, last Wednesday we held a project meeting in Donostia, organized by Euskaltzaindia. The Ixa Group presented its progress on an improved Neural Machine Translation system for the Spanish-Basque pair.

 

PhD position in Innsbruck with Michael Ustaszewski
Thu, 13 Apr 2017
https://www.ehu.eus/ehusfera/ixa/2017/04/13/phd-position-in-innsbruck-with-michael-ustaszewski/

After finishing our Erasmus Mundus LCT Master in 2016, Michael Ustaszewski is now a postdoc assistant at the University of Innsbruck and Unit Manager (liaison with the Department of Translation Studies) at the Innsbruck Translation Centre. His group works on Corpus-Based Translation and has asked us to publish this call for PhD position candidates:

The Department of Translation Studies at the University of Innsbruck invites applications for a PhD position in the framework of the two-year research project “TransBank: A Meta-Corpus for Translation Research” funded by the Austrian Academy of Sciences.

The goal of the project is to build a large, open and expandable bank of translated texts and their original texts. Its main innovative feature is the ability to exploit a rich set of metadata labels characterising each text and text pair for the compilation and download of sub-corpora, tailored to the requirements of specific translation-related research questions.

The PhD student will be involved in all stages of the corpus building process, thus having the opportunity to gather translation data relevant to his/her specific research interest. The student will work autonomously on the development of the metadata labelset and on collecting translation data, on the basis of which he or she will conduct quantitative and/or qualitative analyses for his/her thesis. Work will be carried out in close collaboration with the project’s two principal investigators and two MA students.

The following requirements are looked for in the successful candidate:

  • Master’s degree in Translation Studies, Corpus Linguistics, Computational Linguistics or a related field
  • proven familiarity with translation theory
  • strong interest in data-driven research methodologies and linguistic annotation
  • excellent teamwork skills
  • proficiency in English on a level suitable for written and spoken scientific communication
  • solid programming skills in a scripting language (e.g. Python) will be an asset, as will knowledge of German or any other language(s)

The two-year position with a weekly working time of 20 hours (50%) commences in September 2017 and offers an annual stipend of € 19,117 plus allowances for conference attendance. The position involves enrolment in the PhD programme in Linguistics and Media Studies at the University of Innsbruck.

Applications should include:

  1. A cover letter (1 page maximum) that relates the candidate’s  experience and interest in the TransBank project
  2. A two-page thesis proposal describing the research question and methodology underlying the candidate’s envisaged analyses using TransBank data
  3. A CV listing any publications
  4. Copies of relevant diplomas and certificates
  5. A recommendation letter by the candidate’s MA thesis supervisor or a university professor
  6. A copy of the MA thesis or the latest draft

To apply, please submit the documents in two PDF files (one containing documents 1 to 5, one containing document 6) by 10 April 2017 via the upload form at http://transbank.info/jobs

Shortlisted applicants will be interviewed in person or via Skype towards the end of April.

Further information:

Details on the research project can be found on the project website http://www.transbank.info
For enquiries about the position and the application process, please contact mail[at]transbank.info
Information about the Department of Translation Studies at the University of Innsbruck: http://translation.uibk.ac.at
For information on the PhD programme in Linguistics and Media Studies at the University of Innsbruck and the enrolment process, please refer to
https://www.uibk.ac.at/studium/angebot/phd-sprach-und-medienwissenschaft/index.html.en

Are the meanings of these two words related? (Eneko’s Google Award)
Tue, 05 Apr 2016
https://www.ehu.eus/ehusfera/ixa/2016/04/05/are-the-meanings-of-these-two-words-related-enekos-google-award/

Measuring the distance between word meanings

Eneko Agirre has received a Google Research Award following Google's annual open call for proposals on computer science and related topics, including machine learning, speech recognition, natural language processing, and computational neuroscience. Google received 950 proposals covering 55 countries and over 350 universities, and decided to fund 151 projects, 10 of them related to Natural Language Processing.

Eneko will spend the $50,000 prize on research into “Learning Interlingual Representations of Words and Concepts”. In the field of language processing, Eneko is one of the 10 researchers awarded by Google. The other awardees include researchers at universities such as Harvard, Berkeley, Edinburgh and Washington.

Eneko Agirre: “The goal of this basic research is the interlingual representation of the meaning of words, i.e., knowing automatically when the meanings of two words are related, within one language or across different languages. It would be like having a special dictionary that tells us which words have similar meanings. For example, knowing that the meaning of the word ‘banco’ in Spanish is similar to ‘caja de ahorros’ (‘savings bank’) or to ‘silla’ (‘chair’), depending on the word sense, but not to words like ‘cat’ or ‘Monday’. We visualize the distance between different senses of a word, and we can thus represent the two senses of ‘banco’, one that resembles ‘savings bank’ and the other similar to ‘chair’. Our proposal is able to represent the meanings of words from several languages in the same space, which allows us to know that one sense of ‘banco’ is similar to ‘bank’ and ‘kutxa’ in English and Basque, respectively, and that the other sense of ‘banco’ is similar to ‘chair’ and ‘aulki’, but that neither of the two meanings is similar to ‘cat’ or ‘katu’.”
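To make the idea of a shared space more concrete, here is a minimal sketch in Python. It is not the method used in the awarded project: the three-dimensional vectors are invented for illustration, the sense labels such as "banco#finance (es)" are hypothetical, and cosine similarity is simply one common way to measure how close two meaning vectors are.

```python
import math

# Hypothetical sense vectors in one shared cross-lingual space.
# All numbers are invented for illustration; real systems learn
# vectors with hundreds of dimensions from text.
SENSE_VECTORS = {
    "banco#finance (es)": [0.90, 0.10, 0.05],
    "bank (en)":          [0.85, 0.15, 0.10],
    "kutxa (eu)":         [0.80, 0.20, 0.05],
    "banco#seat (es)":    [0.10, 0.90, 0.05],
    "chair (en)":         [0.15, 0.85, 0.10],
    "aulki (eu)":         [0.05, 0.90, 0.10],
    "cat (en)":           [0.05, 0.10, 0.95],
    "katu (eu)":          [0.10, 0.05, 0.90],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(sense, k=2):
    """Return the k entries whose vectors are most similar to `sense`."""
    target = SENSE_VECTORS[sense]
    scored = [
        (cosine_similarity(target, vector), name)
        for name, vector in SENSE_VECTORS.items()
        if name != sense
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    for sense in ("banco#finance (es)", "banco#seat (es)"):
        print(sense, "->", nearest(sense))
```

With these toy vectors, the financial sense of ‘banco’ lands next to ‘bank’ and ‘kutxa’, the seat sense next to ‘chair’ and ‘aulki’, and both stay far from ‘cat’ and ‘katu’, which mirrors the example in the quote.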

The methods that support this kind of research are taught in Eneko Agirre’s course in the Master “Language Analysis and Processing” at the Faculty of Informatics of the University of the Basque Country in Donostia.

CONGRATULATIONS Eneko!

http://player.vimeo.com/video/129409827?title=0&byline=0

Eneko Agirre awarded by Google Research
Wed, 17 Feb 2016
https://www.ehu.eus/ehusfera/ixa/2016/02/17/eneko-agirre-awarded-by-google-research/

Last week, Eneko Agirre received a Google Research Award (Fall 2015) following Google's annual open call for proposals on computer science and related topics, including machine learning, speech recognition, natural language processing, and computational neuroscience. Google received 950 proposals covering 55 countries and over 350 universities, and decided to fund 151 projects, ten of them related to Natural Language Processing.

Eneko will spend the $50,000 prize on research into “Learning Interlingual Representations of Words and Concepts.”

CONGRATULATIONS Eneko!

Ten awards in the area of NLP

2nd Workshop on Semantics-Driven Machine Translation (SedMT)
Wed, 17 Feb 2016
https://www.ehu.eus/ehusfera/ixa/2016/02/17/2nd-workshop-on-semantics-driven-machine-translation-sedmt/

Eneko Agirre and Nora Aranberri, of the Ixa Group at the University of the Basque Country, together with Deyi Xiong (Soochow University), Kevin Duh (Johns Hopkins University) and Houfeng Wang (Peking University), are the organizers of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT), to be held in San Diego, California, USA, on June 16, 2016 (co-located with NAACL).

This workshop seeks to build on the success of its precursor S2MT 2015, which was held in conjunction with ACL 2015 in Beijing. It will continue efforts to promote the shift of interest from syntax to semantics in machine translation, exploring new horizons and cultivating ideas for cutting-edge models and algorithms for semantic machine translation.

QTLeap project Best Paper Award

This year SedMT will give a best paper award to a paper that advances MT using lexical semantics and deep language processing. The award is sponsored by the European Union QTLeap project.

The Ixa Group is a partner in QTLeap (Quality Translation by Deep Language Engineering Approaches), a project run by a European consortium of eight partners: Bulgarian Academy of Sciences, Charles University in Prague, German Research Center for Artificial Intelligence, Higher Functions Lda., Humboldt University in Berlin, University of the Basque Country, University of Groningen and University of Lisbon. For more information and contact details please visit: qtleap.eu.

 Important Dates

Paper submission: March 8, 2016
Notification of acceptance: March 25, 2016
Camera-ready papers due: April 7, 2016
Workshop: June 16, 2016

Submissions: http://hlt.suda.edu.cn/workshop/s2mt/index.html

QTLeap European project: Meeting in Donostia-San Sebastian (June/30 – July/1)
Mon, 29 Jun 2015
https://www.ehu.eus/ehusfera/ixa/2015/06/29/qtleap-european-project-meeting-in-donostia-san-sebastian-june30-july1/

The Ixa Group is organizing a meeting of the European project QTLeap in Donostia-San Sebastián from Monday June 29th to Wednesday July 1st.

Recently, at the beginning of June, this project successfully organized SSST-9, the Ninth Workshop on Syntax, Semantics and Structure in Statistical Translation, in Denver, Colorado, co-located with NAACL 2015 (June 4, 2015).

The QTLeap project (Quality Translation by Deep Language Engineering Approaches) investigates and develops an innovative methodology for Machine Translation that explores new solutions, using deep language engineering approaches to achieve higher quality translations. The project is run by a European consortium of eight partners: Bulgarian Academy of Sciences, Charles University in Prague, German Research Center for Artificial Intelligence, Higher Functions Lda., Humboldt University in Berlin, University of the Basque Country, University of Groningen and University of Lisbon. For more information and contact details please visit: qtleap.eu.

R. Agerri: mentor in Google Summer of Code 2015
Wed, 13 May 2015
https://www.ehu.eus/ehusfera/ixa/2015/05/13/r-agerri-mentor-in-google-summer-of-code-2015/

Our colleague Rodrigo Agerri has been selected as the mentor of the project Word Sense Disambiguation – Supervised Techniques, presented by the Apache Software Foundation to the Google Summer of Code 2015.

The objective of Word Sense Disambiguation (WSD) is to determine which sense of a word is meant in a particular context. Apache OpenNLP currently lacks a WSD module. The purpose of this project is therefore to design and build a WSD module that implements the algorithms of common supervised techniques. The implemented techniques could serve as examples for any future contributor who would like to add other approaches.
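As a rough illustration of what a supervised WSD technique does, here is a minimal sketch in Python. It is not the OpenNLP module or its API: it uses a bag-of-words Naive Bayes classifier with add-one smoothing, chosen here only because it is one of the simplest supervised approaches. The training examples, the sense labels and the function names `train` and `disambiguate` are all invented for this example.

```python
import math
from collections import Counter, defaultdict

# Invented training data for the ambiguous word "bank":
# each example is (context words, sense label).
TRAINING = [
    ("deposit money account loan".split(), "bank/finance"),
    ("interest loan money teller".split(), "bank/finance"),
    ("river water shore fishing".split(), "bank/river"),
    ("water mud river grass".split(), "bank/river"),
]


def train(examples):
    """Count how often each sense and each (sense, context word) pair occurs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocabulary.update(words)
    return sense_counts, word_counts, vocabulary


def disambiguate(context, model):
    """Return the sense with the highest Naive Bayes log-probability."""
    sense_counts, word_counts, vocabulary = model
    total_examples = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        # Log prior of the sense.
        score = math.log(count / total_examples)
        # Add-one smoothing so that unseen (sense, word) pairs get a
        # small non-zero probability instead of zero.
        denominator = sum(word_counts[sense].values()) + len(vocabulary)
        for word in context:
            if word in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[sense][word] + 1) / denominator)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


if __name__ == "__main__":
    model = train(TRAINING)
    print(disambiguate("she opened an account to deposit money".split(), model))
    print(disambiguate("they went fishing by the river".split(), model))
```

A real module would train on a sense-annotated corpus and use richer features (surrounding lemmas, part-of-speech tags, collocations), but the overall shape is the same: learn from labelled contexts, then score each candidate sense for a new context.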

List of projects accepted into Google Summer of Code 2015
