ETSC-CBST

Corpus of Basque simplified texts (CBST)
Euskarazko Testu Sinplifikatuen Corpusa (ETSC)
Description (en):
The Corpus of Basque Simplified Texts compiles 227 original sentences from the science popularisation domain, together with two simplified versions of each sentence. The simplified versions were created following two different approaches: the structural and the intuitive. The sentences are divided into three texts (Bernoulli, Etxeko and Exoplanetak) and have been aligned with their respective simplified versions.
Description:

The Corpus of Basque Simplified Texts (ETSC) is a text collection that we have compiled from manually simplified texts and their original versions. To describe the operations performed when simplifying the texts, we have designed an annotation scheme, and we have annotated the texts with the BRAT tool (Stenetorp et al., 2012).

Download
Link to access online or download:
Type:
Corpora
Contact person:
Itziar González
Contact person email:
itziar.gonzalezd@ehu.eus
Research group:
IXA-UPV/EHU
Euskara

Grammars and language models

EDGK
Rule-based Dependency Grammar for Basque

BERTeus
BERT language model for Basque

Tools and services

Averell
Averell is a Python library and command line interface to download and to standardize corpora from ten multi-lingual poetry repositories
Jollyjumper
Jollyjumper is our enjambment detection Python library for Spanish
Rantanplan
Rantanplan is a Python library for the automated scansion of Spanish poetry
PoetryLab app
PoetryLab: An Open Source Toolkit for the Analysis of Spanish Poetry Corpora
PDMapping
Tool for documenting and analyzing speakers' judgments about spatial and sociocultural linguistic variation.
Ferramenta On-Line de ExpeRimentación PerceptivA (FOLErPa)
FOLERPA is an online tool for carrying out perceptual experiments.
Cartografía dos apelidos de Galicia
Research tool for the study of the geographical distribution of surnames in Galicia.
Vocabulary analyzer Web Service
This web service calculates different lexicometric measures and displays them graphically (tokens, types, hapaxes & type/token ratio).
Pedersen's Ngram Statistics
Pedersen's Ngram Statistics Package
UPF Freeling-based part-of-speech tagger.
This is the UPF Freeling-based part-of-speech tagger.
Dependency relation analysis
This WS performs dependency parsing using Bohnet's graph-based Parser. The input is text in plain text or CoNLL format. The languages supported are English and Spanish.
Freeling Named Entity Recognition - NER
Freeling-based Named Entity Recognition - NER
WSD-IXA
Word-Sense Disambiguation
Ixa pipes
Multilingual NLP tools
ixaKat
A modular chain of Natural Language Processing tools for Basque
Maltixa
Statistical Syntactic analyzer for Basque

Eustagger
Morphosyntactic tagger for Basque

Xuxen
Spelling and grammar checker for Basque
BASYQUE
A web application to analyse syntactic variation of Basque dialects
Analhitza
Category analyzer