The main research objective of BigKnowledge is to advance the state of the art in open NLP systems for deep learning and Big Data text processing, cross-lingual semantic interpretation, the construction of large-scale lexical knowledge resources, multi-faceted domain adaptation, and benchmarking. The project will exploit these advances in a number of advanced content-based applications in the eScience and eHealth domains, covering English and the main official languages in Spain (Spanish, Catalan, Basque, and Galician).
This project aims to develop transfer learning and deep learning approaches to address the lack of knowledge resources for many NLP tasks and domains, focusing mostly on Spanish, English, Basque, Catalan and Galician. Even though deep learning and monolingual embeddings have improved the state of the art in NLP across tasks and languages, higher-level semantic tasks still require high-quality, large-coverage cross-lingual knowledge bases for advanced semantic processing. For many languages and domains, such knowledge bases are limited or simply non-existent, leading to much lower results than those obtained for English. The project also explores how to apply deep learning techniques to automatically build large-scale lexical knowledge bases from scratch for any language and domain. Several sectors and domains can benefit from the results of BigKnowledge, although we will focus on eHealth and eScience.
The project runs for two years, starting April 2019.