Papers recommended for presentation are marked with ♥.
Introduction (Lecture 1)
General: ML in NLP
The following two papers are included for historical reasons: they are surveys of the state of the art in machine learning for NLP as of 1999 and 2005.
Generative and Discriminative Models
Multiclass
Basic Structured Models: Sequential Models
Background
Inference with Classifiers
CRF
Structured Perceptron
BONUS: To learn how to implement the averaged perceptron efficiently (without storing every intermediate weight vector), refer to Fig. 2.3 on page 19 of Hal Daumé's thesis.
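The idea behind this trick can be sketched in a few lines. The following is a minimal illustration for binary classification, not a transcription of the figure in the thesis: the function names, the toy data, and the choice to fold the bias into a constant feature are all assumptions made for this example. Instead of summing the weight vector after every example, it keeps a second vector `u` in which each update is weighted by the step at which it occurred, and recovers the average at the end as `w - u / c`. Under the convention used here, that average includes the initial zero vector.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy types: a dense feature vector and a label in {-1, +1}.
Example = Tuple[Sequence[float], int]


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def averaged_perceptron(data: List[Example], epochs: int = 5) -> List[float]:
    """Averaged perceptron that touches the weights only on mistakes."""
    dim = len(data[0][0])
    w = [0.0] * dim  # current weight vector
    u = [0.0] * dim  # each update, scaled by the step at which it happened
    c = 1            # step counter (incremented once per example)
    for _ in range(epochs):
        for x, y in data:
            if y * dot(w, x) <= 0:  # mistake (or zero margin): update
                for i, xi in enumerate(x):
                    w[i] += y * xi
                    u[i] += c * y * xi
            c += 1
    # Equals the mean of the weight vectors w_0, w_1, ..., w_T.
    return [wi - ui / c for wi, ui in zip(w, u)]


def averaged_perceptron_reference(data: List[Example], epochs: int = 5) -> List[float]:
    """Reference version: adds the full weight vector to a sum after every example."""
    dim = len(data[0][0])
    w = [0.0] * dim
    total = [0.0] * dim  # running sum, starting from the initial zero vector
    n = 1                # number of vectors summed so far (w_0 counts as one)
    for _ in range(epochs):
        for x, y in data:
            if y * dot(w, x) <= 0:
                for i, xi in enumerate(x):
                    w[i] += y * xi
            for i in range(dim):
                total[i] += w[i]
            n += 1
    return [t / n for t in total]


# Toy, linearly separable data; the last feature is a constant bias term.
TOY_DATA: List[Example] = [
    ([1.0, 2.0, 1.0], 1),
    ([2.0, -1.0, 1.0], -1),
    ([-1.0, 1.5, 1.0], 1),
    ([0.5, -2.0, 1.0], -1),
]

w_avg = averaged_perceptron(TOY_DATA)
w_ref = averaged_perceptron_reference(TOY_DATA)
```

The reference version does work proportional to the feature dimension on every example, while the first version only does so when a mistake occurs, which matters most with sparse, high-dimensional features.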
SVM
Constrained Conditional Models
Constraint-based Models
BONUS: To learn how to convert boolean constraints to ILP constraints, refer,
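As a rough illustration of what such conversions look like (a sketch of common textbook encodings, not taken from the referenced material), each Boolean constraint over 0/1 variables can be paired with a linear inequality or equality that holds for exactly the same assignments. The rule table and the `agrees` helper below are names made up for this example; the check simply enumerates every 0/1 assignment.

```python
from itertools import product

# Each entry pairs a Boolean constraint with a candidate linear encoding
# over binary variables a, b, c in {0, 1}.
RULES = {
    "a or b": (
        lambda a, b, c: a or b,
        lambda a, b, c: a + b >= 1,
    ),
    "a implies b": (
        lambda a, b, c: (not a) or b,
        lambda a, b, c: a <= b,
    ),
    "(a and b) implies c": (
        lambda a, b, c: (not (a and b)) or c,
        lambda a, b, c: a + b - c <= 1,
    ),
    "exactly one of a, b, c": (
        lambda a, b, c: (a and not b and not c)
        or (not a and b and not c)
        or (not a and not b and c),
        lambda a, b, c: a + b + c == 1,
    ),
    "c iff (a and b)": (
        lambda a, b, c: bool(c) == bool(a and b),
        # A conjunction of three linear inequalities.
        lambda a, b, c: c <= a and c <= b and c >= a + b - 1,
    ),
}


def agrees(boolean, linear, n_vars: int = 3) -> bool:
    """True if both forms accept exactly the same 0/1 assignments."""
    return all(
        bool(boolean(*v)) == bool(linear(*v))
        for v in product((0, 1), repeat=n_vars)
    )


results = {name: agrees(b, l) for name, (b, l) in RULES.items()}
```

Exhaustive enumeration like this is only practical for a handful of variables, but it is a convenient sanity check when writing constraints by hand for an ILP solver.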
Training Paradigms
Training Paradigms: Constraint-based Models
Distributed Output Representations
Applications
Unsupervised Learning and Indirect Supervision
Constraint-Driven Learning
- M. Chang, L. Ratinov, N. Rizzolo and D. Roth, Learning and Inference with Constraints AAAI 2008.
- ♥ M. Chang, L. Ratinov, and D. Roth, Guiding Semi-Supervision with Constraint-Driven Learning ACL 2007.
- ♥ K. Ganchev, J. Graca, J. Gillenwater and B. Taskar, Posterior Regularization for Structured Latent Variable Models JMLR 2010.
- ♥ K. Hall, R. McDonald, J. Katz-Brown and M. Ringgaard, Training dependency parsers by jointly optimizing multiple objectives EMNLP 2011.
Latent Variables
- ♥ M. Chang, D. Goldwasser, D. Roth and V. Srikumar, Discriminative Learning over Constrained Latent Representations NAACL 2010.
- ♥ Chun-Nam John Yu and T. Joachims, Learning Structural SVMs with Latent Variables ICML 2009.
- A. McCallum, K. Bellare and F. Pereira, A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance UAI 2005.
- ♥ X. Sun, T. Matsuzaki, D. Okanohara and J. Tsujii, Latent Variable Perceptron Algorithm for Structured Classification IJCAI 2009.
- T. Matsuzaki, Y. Miyao and J. Tsujii, Probabilistic CFG with Latent Annotations ACL 2005.
- ♥ R. Collobert and J. Weston, A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning ICML 2008.
- S. Petrov, L. Barrett, R. Thibaux and D. Klein, Learning Accurate, Compact, and Interpretable Tree Annotation COLING/ACL 2006.
- P. Liang, S. Petrov, M. Jordan, and D. Klein, The Infinite PCFG using Hierarchical Dirichlet Processes EMNLP 2007.
Indirect Supervision
Inference
Inference
- ♥ T. Finley, T. Joachims, Training Structural SVMs when Exact Inference is Intractable ICML 2008.
- ♥ C. Sutton and A. McCallum Piecewise Pseudolikelihood for Efficient Training of Conditional Random Fields ICML 2007
- ♥ T. Joachims, T. Finley, Chun-Nam Yu, Cutting-Plane Training of Structural SVMs Machine Learning 2009.
- ♥ T. Koo, A. M. Rush, M. Collins, T. Jaakkola, and D. Sontag, Dual Decomposition for Parsing with Non-Projective Head Automata. EMNLP 2010.
- ♥ V. Srikumar, G. Kundu and D. Roth On Amortizing Inference Cost for Structured Prediction EMNLP 2012.
Search Based Inference
- ♥ H. Daumé III, J. Langford, and D. Marcu, Search-based Structured Prediction Machine Learning 2009.
- J.R. Doppa, A. Fern and P. Tadepalli, HC-Search: A Learning Framework for Search-based Structured Prediction JAIR 2014
- K.-W. Chang, A. Krishnamurthy, A. Agarwal, H. Daumé III, J. Langford, Learning to Search Better Than Your Teacher ICML 2015
- T. Vieira and J. Eisner, Learning to Prune: Exploring the Frontier of Fast and Accurate Parsing TACL 2017
Deep Learning
Applications
- R. Socher, J. Bauer, C. D. Manning, and A. Y. Ng, Parsing with Compositional Vector Grammars ACL 2013.
- I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to Sequence Learning with Neural Networks NIPS 2014.
- ♥ S. Wiseman and A. M. Rush, Sequence-to-Sequence Learning as Beam-Search Optimization EMNLP 2016.
- ♥ A. Karpathy, A. Joulin, and L. Fei-Fei, Deep Fragment Embeddings for Bidirectional Image Sentence Mapping NIPS 2014.
- ♥ L. Kong, C. Dyer, N. A. Smith Segmental Recurrent Neural Networks ICLR 2016.
- ♥ L. Yu, P. Blunsom, C. Dyer, E. Grefenstette, T. Kocisky The Neural Noisy Channel ICLR 2017.
- ♥ Y. Kim, C. Denton, L. Hoang, A. M. Rush Structured Attention Networks ICLR 2017.
- ♥ E. Kiperwasser, Y. Goldberg Easy-First Dependency Parsing with Hierarchical Tree LSTMs TACL 2016.