Imitation learning for structured prediction
in natural language processing

Andreas Vlachos, Gerasimos Lampouras
{a.vlachos,g.lampouras}@sheffield.ac.uk
Department of Computer Science
University of Sheffield

Sebastian Riedel
s.riedel@ucl.ac.uk
Department of Computer Science
University College London

sheffieldnlp.github.io/ImitationLearningTutorialEACL2017/

Your name sounds familiar

Imitation learning: an advanced behavior whereby an individual observes and replicates another's behavior

Robotics

Legged locomotion
(Ratliff et al., 2006)

Autonomous helicopter flight
(Coates et al., 2008)

And more: outdoor navigation (Silver et al., 2008), Super Mario (Ross et al., 2011), autonomous driving (Zhang and Cho, 2017)...

Your name sounds more(!) familiar

Dynamic oracles for parsing
(Goldberg and Nivre, 2012)

Incremental coreference resolution
(Clark and Manning, 2015)

Recurrent Neural Network training
(Ranzato et al., 2016)

Search-based structured prediction (Daumé III et al., 2009)

Imitation Learning in a nutshell

Meta-learning: obtain a better model (≈ policy) by generating better training data from expert demonstrations.

Is it supervised learning?

Yes: we assume gold-standard output for training

But: we train a classifier to predict the actions that construct the output.

The actions are not in the gold standard, so IL is closer to semi-supervised learning
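
To make the point about actions not being part of the gold standard concrete, here is a minimal sketch for dependency parsing. It assumes an arc-standard transition system and a projective gold tree; the function name `expert_actions` and the action labels are illustrative choices, not something defined in this tutorial. The gold standard only contains head indices, and the expert has to derive the action sequence that builds them.

```python
def expert_actions(heads):
    """Derive arc-standard actions from gold heads (assumed transition system).

    heads maps each token index (1..n) to its gold head; 0 is the root.
    Returns the list of actions a static expert would take.
    """
    stack, buffer = [0], sorted(heads)
    attached = set()  # tokens that already received their head
    actions = []
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # LEFT-ARC: the second item on the stack is a dependent of the top.
            if below != 0 and heads[below] == top:
                actions.append("LEFT-ARC")
                attached.add(below)
                del stack[-2]
                continue
            # RIGHT-ARC: the top depends on the item below it, and the top
            # has already collected all of its own dependents.
            top_done = all(d in attached for d in heads if heads[d] == top)
            if heads[top] == below and top_done:
                actions.append("RIGHT-ARC")
                attached.add(top)
                stack.pop()
                continue
        if not buffer:
            raise ValueError("gold tree is not projective")
        # SHIFT: move the next buffer token onto the stack.
        stack.append(buffer.pop(0))
        actions.append("SHIFT")
    return actions


# "She eats fish": She <- eats, eats <- ROOT, fish <- eats
gold_heads = {1: 2, 2: 0, 3: 2}
print(expert_actions(gold_heads))
```

The gold tree has three head assignments, while the derived sequence has six actions; the classifier is trained on pairs of parser state and expert action, not on the gold heads directly.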

Is it reinforcement learning?

Yes (of a kind): we train a policy to maximize rewards/minimize losses

But: learning is facilitated by an expert

Why should I care?

In NLP we train classifiers to imitate experts in many tasks:

Imitation learning has been used to improve accuracy in all the above with SOTA results!

Part 1: Imitation Learning for Structured Prediction

Imitation learning algorithms:

  • Dataset Aggregation (DAgger)
  • V-DAgger
  • Locally Optimal Learning to Search (LOLS)

Interpretations and connections

  • Reinforcement Learning
  • Recurrent Neural Networks

Part 2: NLP Applications and practical advice

Applications:

  • Dependency parsing
  • Natural language generation
  • Semantic parsing

Practical advice

  • Expert policy definition
  • Accelerating cost estimation
  • Troubleshooting

Outcomes

  • Understand how IL works via unified algorithmic presentations

  • Clarify its connections to other learning frameworks

  • Know representative NLP applications

  • Recognize when and how to apply IL