Imitation learning for structured prediction
in natural language processing

Andreas Vlachos, Gerasimos Lampouras
Department of Computer Science
University of Sheffield

Sebastian Riedel
Department of Computer Science
University College London

Your name sounds familiar

Imitation learning: an advanced behavior whereby an individual observes and replicates another's behavior


Legged locomotion
(Ratliff et al., 2006)

Autonomous helicopter flight
(Coates et al., 2008)

And more: outdoor navigation (Silver et al., 2008), Super-Mario (Ross et al., 2011), autonomous driving (Zhang and Cho, 2017)...

Your name sounds more(!) familiar

Dynamic oracles for parsing
(Goldberg and Nivre, 2012)

Incremental coreference resolution
(Clark and Manning, 2015)

Recurrent Neural Network training
(Ranzato et al., 2016)

Search-based structured prediction (Daumé III et al., 2009)

Imitation Learning in a nutshell

Meta-learning: learn a better model (≈ policy) by generating better training data from demonstrations.

Is it supervised learning?

Yes: we assume gold standard
output for training

But: we train a classifier to predict
actions constructing the output.

Actions not in gold;
IL is rather semi-supervised
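
To make "actions not in gold" concrete, here is a minimal sketch for one task, dependency parsing. The gold standard gives only the head of each word; the action sequence that builds the tree has to be derived. The arc-standard transition system and the function name `oracle_actions` are assumptions chosen for illustration, not something these slides prescribe, and the sketch only handles projective trees.

```python
def oracle_actions(heads):
    """Derive arc-standard actions from gold heads (illustrative sketch).

    heads[i] is the gold head of token i + 1 (tokens are 1-indexed, 0 is ROOT).
    Assumes a projective tree; raises ValueError otherwise.
    """
    n = len(heads)
    head = {i + 1: h for i, h in enumerate(heads)}
    # pending[x] = number of gold dependents of x that are not attached yet
    pending = {i: 0 for i in range(n + 1)}
    for h in heads:
        pending[h] += 1
    stack, buffer, actions = [0], list(range(1, n + 1)), []
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # LEFT-ARC: the second item on the stack is a dependent of the top
            if below != 0 and head[below] == top and pending[below] == 0:
                actions.append("LEFT-ARC")
                stack.pop(-2)
                pending[top] -= 1
                continue
            # RIGHT-ARC: the top is a dependent of the second item,
            # and the top has already collected all of its own dependents
            if head[top] == below and pending[top] == 0:
                actions.append("RIGHT-ARC")
                stack.pop()
                pending[below] -= 1
                continue
        if not buffer:
            raise ValueError("gold tree is not projective")
        actions.append("SHIFT")
        stack.append(buffer.pop(0))
    return actions


# "She reads books": She <- reads -> books, with "reads" attached to ROOT
print(oracle_actions([2, 0, 2]))
# ['SHIFT', 'SHIFT', 'LEFT-ARC', 'SHIFT', 'RIGHT-ARC', 'RIGHT-ARC']
```

The gold annotation contains three heads, while the classifier is trained on six actions that appear nowhere in the annotation.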

Is it reinforcement learning?

Yes (of a kind): we train a policy to
maximize rewards/minimize losses

But learning is facilitated by an expert

Why should I care?

In NLP we train classifiers to imitate experts in many tasks:

Imitation learning has been used to improve accuracy in all the above with SOTA results!

Part 1: Imitation Learning for Structured Prediction

Imitation learning algorithms:

  • Dataset Aggregation (DAgger)
  • V-DAgger
  • Locally Optimal Learning to Search (LOLS)
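
As a preview of the first algorithm in the list, the following is a rough sketch of a DAgger-style training loop on a toy tagging task. Everything in it is an assumption made for illustration: the lookup-table classifier in `train` stands in for a real classifier, the expert simply returns the gold tag, the mixing between expert and learned policy is done per step, and the names `dagger`, `train`, `predict`, and `beta_decay` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train(dataset):
    """Fit a lookup-table classifier: most frequent expert action per feature tuple."""
    counts = defaultdict(Counter)
    for features, action in dataset:
        counts[features][action] += 1
        counts[features[:1]][action] += 1  # back-off: word only
        counts[()][action] += 1            # back-off: global majority

    def policy(features):
        for key in (features, features[:1], ()):
            if key in counts:
                return counts[key].most_common(1)[0][0]

    return policy


def dagger(data, expert, iterations=5, beta_decay=0.5, seed=0):
    """Aggregate expert-labelled states visited by a mix of expert and learned policy."""
    rng = random.Random(seed)
    dataset, learned, beta = [], None, 1.0  # beta = probability of following the expert
    for _ in range(iterations):
        for words, gold in data:
            prev = "<s>"
            for t, word in enumerate(words):
                features = (word, prev)
                expert_action = expert(gold, t)
                # Label the visited state with the expert's action
                dataset.append((features, expert_action))
                # Roll in: follow the expert with probability beta, else the learned policy
                use_expert = learned is None or rng.random() < beta
                prev = expert_action if use_expert else learned(features)
        learned = train(dataset)  # retrain on the aggregated dataset
        beta *= beta_decay
    return learned


def predict(policy, words):
    """Greedy decoding: each prediction feeds the next state's features."""
    prev, tags = "<s>", []
    for word in words:
        prev = policy((word, prev))
        tags.append(prev)
    return tags


data = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
        (["dogs", "bark"], ["NOUN", "VERB"])]
policy = dagger(data, expert=lambda gold, t: gold[t])
print(predict(policy, ["the", "dog", "barks"]))
```

The point of the sketch is the data flow: states come from rolling in with the current mixture policy, labels always come from the expert, and the dataset grows across iterations instead of being replaced.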

Interpretations and connections

  • Reinforcement Learning
  • Recurrent Neural Networks

Part 2: NLP Applications and practical advice

Applications:


  • Dependency parsing
  • Natural language generation
  • Semantic parsing

Practical advice

  • Expert policy definition
  • Accelerating cost estimation
  • Troubleshooting


  • Understand how IL works via unified algorithmic presentations

  • Clarify its connections to other learning frameworks

  • Know representative NLP applications

  • Recognize when and how to apply IL