ETSC-CBST
Corpus of Basque simplified texts (CBST)
Euskarazko Testu Sinplifikatuen Corpusa (ETSC)
Description (en):
The Corpus of Basque Simplified Texts (CBST) compiles 227 original sentences from the science popularisation domain, together with two simplified versions of each sentence. The simplified versions were created following two different approaches: the structural and the intuitive. The sentences are divided into three texts (Bernoulli, Etxeko and Exoplanetak) and have been aligned with their respective simplified versions.
Description:
The Euskarazko Testu Sinplifikatuen Corpusa (ETSC) is a collection of texts that we have compiled from manually simplified texts and their original versions. To describe the operations performed when simplifying the texts, we have designed an annotation scheme, and we have annotated the texts using the BRAT tool (Stenetorp et al., 2012).
Download
Link to access online or download:
Type:
Corpora
Contact person:
Itziar González
Contact person's email:
itziar.gonzalezd@ehu.eus
Research group:
IXA-UPV/EHU
Euskara
Tools and services
Averell
Averell is a Python library and command line interface to download and to standardize corpora from ten multi-lingual poetry repositories
Jollyjumper
Jollyjumper is our enjambment detection Python library for Spanish
Rantanplan
Rantanplan is a Python library for the automated scansion of Spanish poetry
PoetryLab app
PoetryLab: An Open Source Toolkit for the Analysis of Spanish Poetry Corpora
PDMapping
Tool for documenting and analyzing speakers' judgments about spatial and sociocultural linguistic variation.
Ferramenta On-Line de ExpeRimentación PerceptivA (FOLErPa)
FOLERPA is an online tool for carrying out perceptual experiments.
Cartografía dos apelidos de Galicia
Research tool for the study of the geographical distribution of surnames in Galicia.
Vocabulary analyzer Web Service
This web service calculates different lexicometric measures and displays them graphically (tokens, types, hapaxes & type/token ratio).
Ngram Statistics de Pedersen
Pedersen's Ngram Statistics Package
UPF Freeling-based part-of-speech tagger
This is the UPF Freeling-based part-of-speech tagger.
Análisis de relaciones de dependencias (Dependency relation analysis)
This web service performs dependency parsing using Bohnet's graph-based parser. The input is plain text or CoNLL format. The supported languages are English and Spanish.
Freeling Named Entity Recognition - NER
Freeling-based Named Entity Recognition - NER
WSD-IXA
Word-Sense Disambiguation
Ixa pipes
Multilingual NLP tools
ixaKat
A modular chain of Natural Language Processing tools for Basque
Maltixa
Statistical Syntactic analyzer for Basque
Eustagger
Morphosyntactic tagger for Basque
Xuxen
Spelling and grammar checker for Basque
BASYQUE
A web application to analyse syntactic variation of Basque dialects
Analhitza
Category analyzer