General Information

What is this course about?

Natural language processing (NLP), the automatic computational processing of human languages, is one of the most important technologies today. It includes algorithms that take human-produced text as input or produce text as output. People communicate almost everything in language: emails, phone calls, language translation, web searches, reports, books, social media, and more. Human language is symbolic in nature, and it is also highly ambiguous and variable, which makes comprehending it a crucial and challenging part of artificial intelligence. A large variety of underlying tasks and machine learning models sit behind NLP applications, and deep learning approaches have recently achieved high performance on many of these tasks. This course provides an introduction to machine learning and deep learning research applied to NLP. We will cover topics including word vector representations, neural networks, recurrent neural networks, convolutional neural networks, seq2seq models, and some attention-based models.


This course is fast-paced and covers a lot of ground, so it is important that you have a solid foundation on both the theoretical and empirical fronts. You should have a background in Python programming, probability theory, linear algebra, calculus, and the foundations of machine learning.



Submissions: All assignments (homework problems and project milestones) must be submitted on Canvas by 6:30 PM on the due dates.

Schedule (tentative)

Week Date Topic Reading Event
1 Sep. 2 Introduction to NLP CH1-3 SLP; CH1 NNLP
2 Sep. 9 Machine Learning Basics & Neural Networks CH2-5 DL; CH2-5 NNLP
3 Sep. 16 Vector Semantics: TF-IDF, CBOW, Skip-gram, GloVe CH 6 SLP; CH6-8, CH10-11 NNLP
- Mikolov et al. Efficient Estimation of Word Representations in Vector Space. 2013 (word2vec)
- Mikolov et al. Distributed Representations of Words and Phrases and their Compositionality. 2013 (Negative sampling)
- Pennington et al. GloVe: Global Vectors for Word Representation. 2014
4 Sep. 23 Deep Feedforward Networks CH6-8 DL
- Wager et al. Dropout Training as Adaptive Regularization. 2013
- Ning Qian. On the momentum term in gradient descent learning algorithms
- Sebastian Ruder. An overview of gradient descent optimization algorithms. 2017
HW1 due
5 Sep. 30 Language Modeling CH 3 SLP; CH9 NNLP
- Bengio et al. A Neural Probabilistic Language Model. 2003
6 Oct. 7 Recurrent Neural Networks (RNNs) CH 9, 13 SLP; CH 14 NNLP
- Afshine Amidi and Shervine Amidi. Recurrent Neural Networks cheatsheet. 2019
HW2 due
7 Oct. 14 More on RNNs (vanishing gradients) CH10 DL; CH15-16 NNLP
- Sepp Hochreiter and Jürgen Schmidhuber. Long Short-Term Memory. 1997 (LSTM)
- Cho et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. 2014
8 Oct. 21 Convolutional Neural Networks (CNNs) CH9 DL; CH 13 NNLP
- LeCun et al. Gradient-based learning applied to document recognition. 1998
HW3 due
9 Oct. 28 Midterm exam
10 Nov. 4 Machine Translation, Seq2seq models, Attention models CH 10 DL; CH 17 NNLP
- Sutskever et al. Sequence to Sequence Learning with Neural Networks. 2014
Proposal due
11 Nov. 11 Natural Language Generation
- Rush et al. A Neural Attention Model for Abstractive Sentence Summarization. 2015
- Yue Dong. A Survey on Neural Network-Based Summarization Methods. 2018
HW4 due
12 Nov. 18 Tree Recursive Neural Networks and Constituency Parsing CH 18 NNLP
- Socher et al. Parsing with Compositional Vector Grammars. 2013
- Socher et al. Semantic Compositionality through Recursive Matrix-Vector Spaces. 2012
- Socher et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. 2013
- Kitaev and Klein. Constituency Parsing with a Self-Attentive Encoder. 2018
13 Nov. 25 Thanksgiving - No class
14 Dec. 2 Dependency Parsing CH13 SLP
- Chen and Manning. A Fast and Accurate Dependency Parser using Neural Networks. 2014
HW5 due
15 Dec. 9 Project presentation
16 Dec. 16 Project presentation
Project report due