MT4All Resources
- Translation Models: Neural translation models trained via unsupervised machine translation for the MT4All project
- Dictionaries: Automatically created dictionaries that have been derived from the training of unsupervised machine translation models
- Crosslingual Word Embeddings: Word embeddings that have been mapped into a shared vector space using VecMap (https://github.com/artetxem/vecmap), so that similar words across languages lie close together
- Corpora: Corpora that have been collected for the MT4All project
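
As a rough illustration of how aligned embeddings relate to dictionaries, the sketch below looks up, for each source word, the target word with the highest cosine similarity. It assumes the embeddings are stored in word2vec text format (a "count dim" header followed by one word and its vector per line). The toy vectors and the function names `load_embeddings` and `nearest_translation` are made up for the example, and plain nearest-neighbour retrieval is only one simple option, not necessarily the procedure used to build the MT4All dictionaries.

```python
import io
import math


def load_embeddings(handle):
    """Read word2vec text format and return {word: unit-length vector}."""
    header = handle.readline().split()
    dim = int(header[1])  # header is "<word count> <dimension>"
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed or empty lines
        values = [float(x) for x in parts[1:]]
        norm = math.sqrt(sum(v * v for v in values)) or 1.0
        vectors[parts[0]] = [v / norm for v in values]
    return vectors


def nearest_translation(word, src, tgt):
    """Return the target word whose vector is most similar to `word`."""
    query = src[word]
    # Vectors are unit length, so the dot product equals cosine similarity.
    return max(tgt, key=lambda cand: sum(a * b for a, b in zip(query, tgt[cand])))


# Toy data standing in for two aligned embedding files (hypothetical values).
SRC_TOY = "2 2\netxe 1.0 0.1\nur 0.1 1.0\n"
TGT_TOY = "2 2\nhouse 0.9 0.2\nwater 0.2 0.9\n"

src = load_embeddings(io.StringIO(SRC_TOY))
tgt = load_embeddings(io.StringIO(TGT_TOY))

# Induce a small dictionary by pairing each source word with its neighbour.
induced = {word: nearest_translation(word, src, tgt) for word in src}
print(induced)
```

With real files, the `io.StringIO` objects would be replaced by opened embedding files; for large vocabularies a vectorised library would be a more practical choice than this pure-Python loop.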
Code
- MT4All/monoses: Adapted version of Monoses for the MT4All project
External Resources
- Paracrawl: Broader/Continued Web-Scale Provision of Parallel Corpora for European Languages
- Common Crawl: Open repository of web crawl data that can be accessed and analyzed by anyone
- OSCAR: Open Super-large Crawled ALMAnaCH coRpus is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus
- OpusMT: Neural machine translation (NMT) systems trained on all the languages and corpora available in OPUS