Michael A. Hedderich


I'm a researcher in Machine Learning and Natural Language Processing at Cornell University. My research goal is to open up machine learning to more fields and domains by

  • developing approaches that require less training data and work in low-resource settings, such as weak supervision techniques and learning with noisy or unreliable labels,
  • improving the interpretability of complex machine learning models, and
  • evaluating with a focus on human-computer interaction to ensure that new methods perform better in real life and not just on the leaderboard.

To this end, I'm working on both foundational methods and applications in fields ranging from archeology to medical research.


Current affiliations:

In the past, I had the pleasure to work with


  • Nov 2022: Excited to start a post-doc position at Cornell in the group of Qian Yang at the intersection of NLP and HCI. 🇺🇸
  • July 2022: Where did it go wrong? We will present our paper on label-descriptive patterns and their application to characterizing classification errors at ICML'22. 📝
  • May 2022: I will visit Disney Research Studios in Zürich for three months. If you are in Switzerland and want to connect, just let me know! 🇨🇭
  • November 2021: I will have the pleasure to visit Antti Oulasvirta's HCI group at Aalto University. 🇫🇮
  • August 2021: I'll give an invited talk at the Vienna Workshop on Weak Supervision and Natural Language Processing 🇦🇹
  • August 2021: We published a visual guide for low-resource NLP on Towards Data Science 🎨
  • June 2021: We'll present our survey on recent approaches for NLP in low-resource scenarios at NAACL'21 📚
  • May 2021: Bringing together researchers on weak and distant supervision: We are organizing the WeaSuL workshop at ICLR'21 🦊
  • May 2021: We'll present our ANEA tool for NER distant supervision at PML4DC@ICLR'21 🛠
  • May 2021: We'll present an HCI work on robust microgestures at CHI'21 🖖
  • April 2021: Looking forward to our workshop LANTERN @EACL'21: The 3rd Workshop Beyond Vision and Language: Integrating Real-World Knowledge 🖼️
  • Feb 2021: We presented our latest work on noisy labels, looking at it from a theoretical perspective and proposing a new evaluation dataset, at AAAI'21 📊
  • Feb 2021: This time working on a very different kind of text: Reconstructing tablets with cuneiform from archeological finds ⛏️
  • Jan 2021: We are organizing a new talk series with invited PhD students on NLP topics: NLPhD 🖥
  • Nov 2020: We presented a conference paper on low-resource learning for African languages at EMNLP 🌎 and a Findings paper on fine-tuning and probing at BlackboxNLP 📦
  • July 2020: I'm a co-organizer of the Business Meets Technology conference at Hochschule Ansbach, Germany 💼
  • May 2020: We presented our work on machine learning for low-resource African languages at AfricaNLP@ICLR'20 and PML4DC@ICLR'20.
  • Nov 2019: We presented our latest work on learning with noisy labels at EMNLP 2019.
  • Nov 2019: I'll be in Ulm, Germany, for an invited talk at a text mining symposium.
  • Oct 2019: We gave a talk about our work on intelligibility and language modelling at RAILS 2019.
  • Jun 2019: I gave a talk on machine learning in low-resource scenarios at TaCoS 2019.
  • Jun 2019: A former student presented our work on learning with noisy data from self-training at the NAACL SRW 2019.
  • May 2019: I presented our work on multi-sense word embeddings with a talk at IWCS 2019.
  • Feb 2019: I gave a guest lecture at the Ambient Intelligence group at Aalto University. The slides can be found here.
  • Jul 2018: At the ACL 2018 workshop DeepLo, I gave a talk about our work on learning with noisily labeled, automatically annotated data.
  • Jun 2018: I gave an invited talk at the Speech Recognition group at Aalto University.


Low-Resource, Weak Supervision & Noisy Labels

  • Modern machine learning approaches often require large amounts of labeled training data. We study how one can train such models in low-resource scenarios.
  • This includes transfer learning and distant supervision for African low-resource languages.
  • Distant and weak supervision make it possible to leverage expert insights efficiently and to label large amounts of unlabeled data automatically. However, this labeling tends to contain errors. We propose methods to model the label noise and leverage these labels more effectively.

Peer-Reviewed Conferences

Peer-Reviewed Workshop Papers

* equal contribution

WeaSuL Workshop

Weak and distant supervision is a popular topic in machine learning, computer vision and NLP, from both a theoretical and an applied/industry perspective. To bring together researchers from these different perspectives and to help newcomers enter the field, we organize the WeaSuL workshop at ICLR'21.
Workshop website

Visual Guide

As a companion to our survey, we published a more applied and visual guide for low-resource NLP. It is available on Towards Data Science.


I gave a guest lecture as part of the course “Machine Learning for Mobile and Pervasive Systems” at Stephan Sigg's Ambient Intelligence group at Aalto University. It discusses ways to obtain large amounts of data through crowdsourcing and automatic annotation techniques, and how to deal with noise in this data using noise modeling techniques such as MACE or the noise layer approach.


  • I see a better understanding of neural networks, their decision-making and their training processes as a key requirement for applying them in the real world.


* equal contribution

Natural Language Processing

  • Languages and text are a core aspect of human life. NLP tries to process and understand them automatically.


Most of my work in NLP is listed under the topics above. Beyond that, I have worked on:

LANTERN Workshop

At EACL'21, we organized the third iteration of the LANTERN workshop with the aim of bringing together researchers from different fields who interconnect language, vision and other modalities by leveraging external knowledge.
Workshop website

Human-Computer Interaction & Computer Graphics

  • Finding ways to improve how humans interact with computers and making computers more accessible.
  • Also, these projects have nice pictures 😁

Peer-Reviewed Publication


Digital Humanities

  • Computational techniques can open new perspectives for many research fields such as archeology or linguistics.
  • Understanding each other across the research fields is a crucial (and fun) aspect of it.



I'm the co-founder of the game development group Little Factory Games. Our games:


Couch-Multiplayer (Browser, Windows, Linux) | Website

"Entry for the Global Game Jam 2022 on Duality. The Balance between Light and Dark. In the eternal battle of opposites, one shall not forget the dependence on one another. Experience duality with one of your friends and fight together in an ever-changing world of light and dark."


Couch-Multiplayer for PC | Website

"Grab some friends, jump on the couch and get ready for this fast-paced and colorful arena game. In the world of these cats, there can only be one winner!"

Tiny Taxis

Android | Website

"Welcome to your new taxi business. Your job is to control the taxis in your city. To be successful, you have to keep your passengers' requests in mind. Businessmen are always in a hurry, so be quick enough to catch them. Tourists like to see the city’s highlights, and everyone likes a ride in a fancy car. But be careful not to get stuck in rush hour.

Invest your money wisely: Hire new drivers, buy new cars and upgrade your fleet. If you are good enough, you might even be able to buy a limousine to transport your clients in style. Earn some extra money by completing challenges and try to beat the high scores."


Windows, Linux, Mac | Website

A 2D multiplayer arena game in a fantasy setting that offers a combination of fast gameplay and tactical resource management.