COVID update: Due to the pandemic, the course is also available online. In addition to onsite attendance, the classes will be broadcast live online. The practical labs are also available online, in split groups with one lecturer each. Given our online teaching experience in recent months, we can offer a high-quality and engaging course in both the theoretical and the hands-on practical sessions.

Deep learning neural network models have been successfully applied to natural language processing, and are now radically changing how we interact with machines (Siri, Amazon Alexa, Google Home, Skype Translator, Google Translate, or the Google search engine). These models are able to infer a continuous representation for words and sentences, instead of using hand-engineered features as in other machine learning approaches. The seminar will introduce the main deep learning models used in natural language processing, giving attendees a hands-on understanding of these models and practice implementing them in TensorFlow.

This course is a 35-hour introduction to the main deep learning models used in text processing, covering the latest developments, including Transformers and pre-trained (multilingual) language models like GPT, BERT and XLM-R. It combines theoretical and practical hands-on classes. Attendees will be able to understand and implement the models in TensorFlow.

Student profile

The course is addressed to professionals, researchers and students who want to understand and apply deep learning techniques to text. The practical part requires basic programming experience, a university-level course in computer science and experience in Python. Basic math skills (algebra or pre-calculus) are also needed.


Introduction to machine learning and NLP with TensorFlow

Machine learning, Deep learning
Natural Language Processing
A sample NLP task with ML
. Sentiment analysis
. Features
. Logistic Regression
LABORATORY: Sentiment analysis with logistic regression

Multilayer Perceptron

Multiple layers ~ Deep: MLP
Backpropagation and gradients
Learning rate
More regularization
LABORATORY: Sentiment analysis with Multilayer Perceptron

Embeddings and Recurrent Neural Networks

Representation learning
Word embeddings
From words to sequences: Recurrent Neural Networks (RNN)
LABORATORY: Sentiment analysis with RNNs

Seq2seq, Neural Machine Translation and better RNNs

Applications of RNNs:
. Language Models (sentence encoders)
. Language Generation (sentence decoders)
. Sequence to sequence models and Neural Machine Translation (I)
Problems with gradients in RNNs
LABORATORY: Sentiment analysis with GRUs

Attention, Transformers and Natural Language Inference

Re-thinking seq2seq:
. Attention, memory, Transformers
. State of the art NMT
Natural Language Inference with siamese networks
LABORATORY: Attention Model for NLI

Pre-trained transformers, BERTology

CNNs for text
Pre-trained language models
Deep learning frameworks
Last words
LABORATORY: Pre-trained transformers for sentiment analysis and NLI

Bridging the gap between natural languages and the visual world

Brief introduction to Deep Learning for Computer Vision
Convolutional Neural Networks
Image captioning
Visual question answering
Text-based image generation
LABORATORY: Image captioning with CNNs and RNNs


Eneko Agirre

Professor, member of IXA

Oier Lopez de Lacalle

Postdoc researcher at IXA

Gorka Azkune

Assistant professor, member of IXA

Ander Barrena

Postdoc researcher at IXA

Invited talk

Kyunghyun Cho

Associate Professor at NYU, CIFAR Associate Fellow

Self-Supervised Manifold Based Data Augmentation

Vicinal risk minimization (VRM) is a learning principle under which data augmentation can be naturally integrated into supervised learning. For this use of VRM, we need to be aware of the so-called local density, which characterizes the probability distribution underlying the input. In this work, we use the idea of denoising to learn a data manifold from a large amount of unlabelled data and to stochastically augment data. Across multiple problems (text classification and machine translation) and multiple datasets (from Amazon review data and natural language inference to low-resource translation), we observe consistent improvement with the proposed approach, which we refer to as self-supervised manifold based data augmentation (SSMBA).
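
As a rough illustration of the corrupt-then-reconstruct idea described in the abstract, the sketch below masks random tokens in a labelled sentence and fills them back in, keeping the original label for each new sample. It is a toy under stated assumptions, not the method from the talk: the names `corrupt`, `reconstruct` and `augment` are hypothetical, and the reconstruction step simply draws random words from a small vocabulary where a real system would sample from a denoising model trained on unlabelled data.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token for corrupted positions


def corrupt(tokens, mask_prob, rng):
    """Replace each token with MASK independently with probability mask_prob."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def reconstruct(tokens, vocab, rng):
    """Toy stand-in for a denoising model (an assumption of this sketch):
    each MASK is filled with a word drawn uniformly from vocab."""
    return [rng.choice(vocab) if tok == MASK else tok for tok in tokens]


def augment(text, label, vocab, n_samples=3, mask_prob=0.15, seed=0):
    """Return n_samples (augmented_text, label) pairs in the vicinity of text."""
    rng = random.Random(seed)
    tokens = text.split()
    samples = []
    for _ in range(n_samples):
        noisy = corrupt(tokens, mask_prob, rng)
        samples.append((" ".join(reconstruct(noisy, vocab, rng)), label))
    return samples


if __name__ == "__main__":
    vocab = ["good", "bad", "great", "movie", "plot"]
    for new_text, new_label in augment("the movie was really good", "positive", vocab):
        print(new_label, "|", new_text)
```

The augmented pairs would then be added to the training set alongside the original example. With a uniform random filler the new sentences are often implausible; the point of learning the manifold with a denoiser is to keep them close to natural text.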

Practical details

General information

Bring your own laptop (needed for the practical sessions).
Part of the Language Analysis and Processing master program.
14 sessions, theoretical and hands-on labs (35 hours).
Scheduled from January 11th to 29th, 2021, 17:30-20:00,
except January 20th (bank holiday).

Where: Online or onsite, E4 Lab, Computer Science Faculty, San Sebastian
(practical classes will also be available onsite or online, in split groups).
The university provides some limited information about accommodation in San Sebastian (Basque/Spanish) and the Basque Country (English).
Teaching language: English.
Capacity: 60 attendees in total (first come, first served).
Cost: 274 euros (270 for UPV/EHU members).


Registration is open until the 7th of January 2021 (or until the room is full).
Please register by email to (subject "Registration to DL4NLP" and CC
Mention your preference: online or onsite
Same email for any enquiry you might have.
The university provides official certificates. Please apply AFTER completing the course.
Public universities are not allowed to produce invoices, but we can provide a payment certificate.

Basic programming experience, a university-level course in computer science and experience in Python. Basic math skills (algebra or pre-calculus) are also needed.
Bring your own laptop (no need to install anything).

Previous editions

Online class of July 2020 (left), with a handful of the 70 participants. To the right, the screen the lecturers had to talk to :-)

Class of January 2020

Class of July 2019