I am Diego Antognini, a research scientist at IBM Research AI working on machine learning and natural language processing. My research is centered around efficient machine learning for NLP: I am developing efficient alternatives to transformer-based language models, with model sizes on the order of one megabyte for resource-constrained embedded systems and data centers. My goal is to achieve similar predictive performance to large models while offering lower inference latencies for on-device NLP (which also improves user privacy). I am also interested in different pre-training strategies to reach high-quality or lower-cost pre-training. Altogether, higher efficiency can be achieved both at training time and at inference time. I also teach deep learning for natural language processing classes at the Lucerne University of Applied Sciences (HSLU).

I obtained my Ph.D. degree in Computer Science at the Swiss Federal Institute of Technology in Lausanne (EPFL) in the Artificial Intelligence Laboratory (LIA) under the supervision of Professor Boi Faltings. My Ph.D. thesis, "Textual Explanations and Critiques in Recommendation Systems", is available here. During my Ph.D., I developed models to infer high-quality explanations from text documents in a scalable and data-driven manner via selective rationalization. Furthermore, I designed new models to make explanations actionable (called critiquing) and examined two important applications in natural language processing and conversational recommendation systems. I also worked on multi-document summarization and multi-objective optimization in recommendation systems.

I have 6+ years of research experience in natural language processing (NLP), machine learning (ML), and single- and multi-objective recommendation systems. I am experienced in developing efficient models for on-device inference (on the order of one-megabyte model size and one-millisecond latency) and interpretable models that generate personalized and actionable textual explanations. I supervised 60+ B.Sc. and M.Sc. projects/theses and assessed 50+ student projects. I offer consulting services in natural language processing, recommender systems, and machine learning. For a quick overview of my background, you can download my résumé.

Additionally, I give talks (e.g., at the NLP Meetup in Zürich, where I presented one of my past works here) and participate in challenges with students. We won a $10k prize at IARPA 2018. EPFL News (English) and 24 Heures (French) each wrote an article about it.

On this website, I present some publications I've been working on and some of my most exciting projects from before my Ph.D. If you have any questions, would like to see other projects (including OpenGL, realistic image synthesis, web frameworks), or cannot find something, feel free to contact me here.


Here are some of my publications. You can also consult my Google Scholar profile. Don't hesitate to contact me if you have any questions!

21-23) To be announced
20) Assistive Recipe Editing through Critiquing Paper
Diego Antognini, Shuyang Li, Boi Faltings, Julian McAuley
2023, EACL

TL;DR: Generate recipes and allow users to edit them using critiquing. The system coherently rewrites recipes to satisfy users’ feedback.
19) Unsupervised Term Extraction for Highly Technical Domains Paper
Francesco Fusco, Peter Staar, Diego Antognini
2022, EMNLP (Industry track)

TL;DR: We present a novel, fully unsupervised method for term extraction that generalizes across domains. Our setup improves the predictive performance and decreases the inference latency on both CPUs & GPUs.
18) Active Learning for Imbalanced Civil Infrastructure Data Paper
Thomas Frick, Diego Antognini, Mattia Rigotti, Ioana Giurgiu, Benjamin Grewe, Cristiano Malossi
2022, ECCV Workshop on Computer Vision for Civil and Infrastructure Engineering (CVCIE)

TL;DR: We present a novel method capable of operating on datasets that suffer from heavy class imbalance by replacing the traditional active learning acquisition function with an auxiliary binary discriminator.
17) Textual Explanations and Critiques in Recommendation Systems Paper or Paper
Diego Antognini
2022, EPFL Ph.D. thesis

TL;DR: This dissertation focuses on two fundamental challenges. The first involves explanation generation: inferring high-quality explanations from text documents in a scalable and data-driven manner. The second challenge consists in making explanations actionable, and we refer to it as critiquing. This dissertation examines two important applications in natural language processing and recommendation tasks.
16) Positive & Negative Critiquing for VAE-based Recommenders Paper
Diego Antognini, Boi Faltings
2022, CoRR

TL;DR: Fast negative and positive critiquing generalized for variational autoencoders, with up to 15% higher success rate than state-of-the-art models. The key is to model positive and negative critiques as different modalities and to use a multi-modal VAE with weak supervision.
15) Interlock-Free Multi-Aspect Rationalization for Text Classification
Shuangqi Li, Diego Antognini, Boi Faltings

TL;DR: Addressing the interlocking dynamics for multi-aspect rationalization thanks to a new self-supervised contrastive loss and multi-stage training to generate more semantically diverse rationales.
14) Interacting with Explanations through Critiquing (T-RECS) Paper Paper Video (Chrome only)
Diego Antognini, Claudiu Musat, Boi Faltings
2021, IJCAI (acceptance rate: 13.9%)

TL;DR: How to extract explanations that humans significantly prefer over those produced by state-of-the-art models, and how to make them actionable so that users can interact with them iteratively to improve the recommendation.
13) Fast Multi-Step Critiquing for VAE-based Recommender Systems (M&Ms-VAE) Paper Paper Video
Diego Antognini, Boi Faltings
2021, RecSys (acceptance rate: 18.4%)

TL;DR: Fast critiquing generalized for variational autoencoders, up to 26x faster and with up to 20% higher success rate than state-of-the-art models. The key is to model the problem using a multi-modal VAE and weak supervision.
12) Multi-Step Critiquing User Interface for Recommender Systems Paper Paper Video
Diana Petrescu*, Diego Antognini*, Boi Faltings
2021, RecSys Demo

TL;DR: We propose and demonstrate a new way of interacting with recommender systems.
11) Multi-Dimensional Explanation of Target Variables from Documents (MTM) Paper Video
Diego Antognini, Claudiu Musat, Boi Faltings
2021, AAAI (acceptance rate: 21%)

TL;DR: One model to extract interpretable, meaningful, and coherent multi-faceted rationales for multi-task text classification problems.
10) Rationalization through Concepts Paper Video
Diego Antognini, Boi Faltings
2021, ACL Findings (acceptance rate: 21.3% (main) + 14.9% (findings))

TL;DR: Generalization of MTM: how to extract interpretable multi-faceted concepts (i.e., rationales) for single-task classification problems.
9) Addressing Fairness in Classification with a Model-Agnostic Multi-Objective Algorithm Paper Video
Kirtan Padh, Diego Antognini, Emma L. Glaude, Boi Faltings, Claudiu Musat
2021, UAI (acceptance rate: 26.5%)

TL;DR: A novel differentiable relaxation that approximates fairness notions, and a novel model-agnostic multi-objective architecture that optimizes multiple fairness notions and sensitive attributes.
8) Multi-Gradient Descent for Multi-Objective Recommender Systems Paper
Nikola Milojkovic, Diego Antognini, Giancarlo Bergamin, Boi Faltings, Claudiu Musat
2020, AAAI Workshop on Interactive and Conversational Recommendation Systems (WICRS)

TL;DR: An efficient stochastic multi-gradient descent approach for multi-objective recommender systems.
7) HotelRec: a Novel Very Large-Scale Hotel Recommendation Dataset Paper
Diego Antognini, Boi Faltings
2020, LREC

TL;DR: A new dataset with 50 million hotel reviews with meta-attributes, user information, and multiple rated dimensions.
6) Recommending Burgers based on Pizza Preferences: Addressing Data Sparsity with a Product of Experts Paper
Martin Milenkoski, Diego Antognini, Boi Faltings
2021, RecSys Workshop on Cross-Market Recommendation

TL;DR: We tackle data sparsity and create recommendations in domains with limited knowledge about the user preferences.
5) Modeling Online Behavior in Recommender Systems: The Importance of Temporal Context Paper
Milena Filipovic*, Blagoj Mitrevski*, Diego Antognini, Emma L. Glaude, Boi Faltings, Claudiu Musat
2021, RecSys Workshop on Perspectives on the Evaluation of Recommender Systems

TL;DR: Omitting temporal context when evaluating recommender systems leads to false confidence. We propose an evaluation protocol and a model-agnostic training procedure to incorporate temporal context.
4) Momentum-based Gradient Methods in Multi-objective Recommender Systems Paper
Blagoj Mitrevski*, Milena Filipovic*, Diego Antognini, Emma L. Glaude, Boi Faltings, Claudiu Musat
2021, RecSys Workshop on Multi-Objective Recommender Systems

TL;DR: A coordinated multi-objective optimization method where each objective is optimized via an Adam-like algorithm.
3) GameWikiSum: a Novel Large Multi-Document Summarization Dataset Paper
Diego Antognini, Boi Faltings
2020, LREC

TL;DR: A non-news, domain-specific dataset for multi-document summarization, which is 100x larger than commonly used datasets.
2) Learning to Create Sentence Semantic Relation Graphs for Multi-Document Summarization Paper
Diego Antognini, Boi Faltings
2019, EMNLP Workshop on New Frontiers in Summarization

TL;DR: How to leverage universal and domain-specific sentence embeddings using a graph structure for multi-document summarization.
1) Dataset Construction via Attention for Aspect Term Extraction with Distant Supervision Paper
Athanasios Giannakopoulos*, Diego Antognini*, Claudiu Musat, Andreea Hossmann and Michael Baeriswyl
2017, ICDM Workshop on Sentiment Elicitation from Natural Text for Information Retrieval and Extraction (SENTIRE)

TL;DR: How to use large corpora to better extract aspect terms using distant supervision.

Projects (prior to Ph.D.)

From Relation Extraction to Knowledge Graphs - M.Sc. thesis

My master's thesis at Iprova in the domains of machine learning and natural language processing. View more

NeoBrain - B.Sc. thesis

A research project about optimizing the processing of neuronal activity maps using massively parallel technologies. View more


Scalable decentralized system that aggregates secondary storage devices in a cluster with the aim of supporting parallel scans of data stored across them. View more

Optimized flocking algorithm for e-pucks

Implement, test, analyse and optimize a flocking algorithm for e-pucks. The robots should avoid obstacles within the arena while retaining the collective formation. Work in a multidisciplinary team! View more


Implementation of a complete Texas Hold'em poker game with an artificial intelligence. View more

Starfighter 4K

Shoot 'em up game using motion recognition with Kinect and Wiimotes to control the tilt and firing of the spaceships. View more


Multiple mini-projects for learning about GPGPU technologies, mainly CUDA. View more

Image classification

Classifier that recognizes the object present in an image using advanced models. Objects can be classified as horse, airplane, car, or other. View more

Social Recommendation System

Recommender system for events based on the user's data and Facebook profile. View more

Facial recognition among profiles

Detect whether a person is wearing sunglasses using a set of profile pictures of different people. Each person has pictures with different head positions and emotions, with and without sunglasses. View more

Pattern classification and machine learning project 1

Project about regression and classification using linear models. One dataset per task is given without any information. View more

Recommender System challenge

Third task of the European Semantic Web Conference challenge on top-N recommendation of books (ESWC-14 Challenge). GitHub Report


A movie directory with heavy database background using real data from IMDb. View more


Planetarium software showing a current view of the sky at the current location. View more


If you have any questions or would like to get in touch with me, feel free to contact me in one of the following ways: