EPEC-DEP (BDT)
The Basque treebank (EPEC-DEP) is the Reference Corpus for the Processing of Basque (EPEC), manually annotated at the syntactic level on the basis of dependency relations. The EPEC corpus is a collection of texts written in standard Basque, totalling 300,000 words. One third was taken from the Statistical Corpus of 20th Century Basque (www.euskaracorpusa.net) and the other two thirds from the newspaper Euskaldunon Egunkaria. It is annotated at several levels (morphology, partial syntax, and semantics), using both manual and automatic methods.
In the EPEC-DEP treebank, 200,000 words have been manually annotated following the theory of Dependency Grammar (Tesnière, 1959). In this theory, the syntactic tree of a sentence (also called a dependency tree) is obtained by linking the words of the sentence in pairs. These trees represent, on the one hand, the head/dependent relations between the words at the nodes and, on the other, the syntactic function that the dependent fulfils in each link between two words, expressed by means of dependency labels (Aranzabe, 2008).
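As a rough illustration of the head/dependent structure described above, the following Python sketch stores a dependency tree as a list of labelled word pairs. The example sentence, the `Dependency` class, the helper functions, and the labels (`subject`, `object`, `auxiliary`, `root`) are assumptions made for this sketch. They are not the EPEC-DEP file format or the dependency tagset of Aranzabe (2008).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    head: int       # 1-based position of the governing word (0 = artificial root)
    dependent: int  # 1-based position of the dependent word
    label: str      # syntactic function of the dependent (illustrative label)


# "Txakurrak katua ikusi du" = "The dog has seen the cat"
words = ["Txakurrak", "katua", "ikusi", "du"]

# Hypothetical analysis: the main verb "ikusi" governs the other three words.
arcs = [
    Dependency(head=3, dependent=1, label="subject"),
    Dependency(head=3, dependent=2, label="object"),
    Dependency(head=0, dependent=3, label="root"),
    Dependency(head=3, dependent=4, label="auxiliary"),
]


def is_tree(n_words, arcs):
    """Return True if every word has exactly one head, there is a single
    root, and following heads from any word reaches the root (no cycles)."""
    heads = {}
    for arc in arcs:
        if arc.dependent in heads:
            return False  # a word with two heads is not a tree
        heads[arc.dependent] = arc.head
    if set(heads) != set(range(1, n_words + 1)):
        return False  # some word has no head, or an index is out of range
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False  # exactly one word must attach to the root
    for start in heads:
        seen = set()
        node = start
        while node != 0:
            if node in seen or node not in heads:
                return False  # cycle, or a head that is not a word
            seen.add(node)
            node = heads[node]
    return True


def dependents_of(head, arcs):
    """List (position, label) pairs for the words governed by `head`."""
    return [(a.dependent, a.label) for a in arcs if a.head == head]


if __name__ == "__main__":
    print(is_tree(len(words), arcs))
    for position, label in dependents_of(3, arcs):
        print(f"{words[2]} -> {words[position - 1]} ({label})")
```

Storing one head per word keeps the well-formedness check simple: a valid tree is any assignment in which each word has a single governor and every chain of governors ends at the root.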
Tools and services
Averell
Averell is a Python library and command-line interface for downloading and standardizing corpora from ten multilingual poetry repositories
Jollyjumper
Jollyjumper is our enjambment detection Python library for Spanish
Rantanplan
Rantanplan is a Python library for the automated scansion of Spanish poetry
PoetryLab app
PoetryLab: An Open Source Toolkit for the Analysis of Spanish Poetry Corpora
PDMapping
Tool for documenting and analyzing speakers' judgments about spatial and sociocultural linguistic variation.
Ferramenta On-Line de ExpeRimentación PerceptivA (FOLErPa)
FOLERPA is an online tool for carrying out perceptual experiments.
Cartografía dos apelidos de Galicia
Research tool for the study of the geographical distribution of surnames in Galicia.
Vocabulary analyzer Web Service
This web service calculates different lexicometric measures and displays them graphically (tokens, types, hapaxes & type/token ratio).
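The four measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the web service's implementation: the function name `lexicometric_measures`, the lowercasing, and the `\w+` tokenization rule are assumptions, and a real analyzer may tokenize differently.

```python
import re
from collections import Counter


def lexicometric_measures(text):
    """Compute token count, type count, hapaxes and type/token ratio.

    Tokenization here is an assumption: lowercase the text and take
    every run of word characters as one token.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)   # running words
    n_types = len(counts)    # distinct word forms
    # Hapaxes are the forms that occur exactly once.
    hapaxes = sorted(word for word, freq in counts.items() if freq == 1)
    ttr = n_types / n_tokens if n_tokens else 0.0
    return {
        "tokens": n_tokens,
        "types": n_types,
        "hapaxes": hapaxes,
        "type_token_ratio": ttr,
    }


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the bird"
    for name, value in lexicometric_measures(sample).items():
        print(f"{name}: {value}")
```

For the sample sentence, the sketch counts 11 tokens and 6 types, with "and", "bird" and "cat" as hapaxes. Note that the type/token ratio depends on text length, so it is only comparable between texts of similar size.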
Pedersen's Ngram Statistics
Pedersen's Ngram Statistics Package
UPF Freeling-based part-of-speech tagger
This is the UPF Freeling-based part-of-speech tagger.
Dependency relation analysis
This web service performs dependency parsing using Bohnet's graph-based parser. It accepts input as plain text or in CoNLL format and supports English and Spanish.
Freeling Named Entity Recognition - NER
Freeling-based Named Entity Recognition - NER
WSD-IXA
Word-Sense Disambiguation
Ixa pipes
Multilingual NLP tools
ixaKat
A modular chain of Natural Language Processing tools for Basque
Maltixa
Statistical syntactic analyzer for Basque
Eustagger
Morphosyntactic tagger for Basque
Xuxen
Spelling and grammar checker for Basque
BASYQUE
A web application to analyse syntactic variation of Basque dialects
Analhitza
Category analyzer