- Time: Wednesdays 6:30 PM - 9:00 PM, Fall 2020
- Meeting Location: Zoom Meetings
- Instructor: Yue Ning
- Office: Gateway South 448
- Office Hours: Mondays 8:00 AM - 10:00 AM
- Teaching assistant: Kun Wu (firstname.lastname@example.org)
- Course details: We will use Canvas for online discussion, announcements, and homework submission. You are encouraged to ask and answer questions on the forum as long as you do not give away solutions to homework problems. Participation in the online forum counts toward class participation.
What is this course about?

Natural language processing (NLP) is one of the most important technologies in artificial intelligence: the automatic computational processing of human languages. This includes algorithms that take human-produced text as input or produce text as output. People communicate almost everything in language: emails, phone calls, translations, web searches, reports, books, social media, and more. Human language is symbolic in nature, yet highly ambiguous and variable, which makes comprehending it a crucial and challenging part of artificial intelligence. A large variety of underlying tasks and machine learning models power NLP applications, and deep learning approaches have recently achieved high performance on many of them. This course provides an introduction to machine learning and deep learning research applied to NLP. We will cover topics including word vector representations, neural networks, recurrent neural networks, convolutional neural networks, seq2seq models, and attention-based models.
Prerequisites

This course is fast-paced and covers a lot of ground, so it is important that you have a solid foundation on both the theoretical and empirical fronts. You should have a background in Python programming, probability theory, linear algebra, calculus, and the foundations of machine learning.
Reference books

- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning (DL). MIT Press, 2016. We will cover topics including basic neural networks, backpropagation, RNNs, and CNNs.
- Dan Jurafsky and James H. Martin. Speech and Language Processing, 3rd ed. draft (SLP). 2018.
- Yoav Goldberg. Neural Network Methods for Natural Language Processing (NNLP). 2017.
Coursework

Submissions: All assignments (homework problems and project milestones) must be submitted on Canvas by 6:30 PM on the due date.
- Homework (50%): There will be biweekly homework assignments with both written and programming parts. Each assignment is centered on an application and will deepen your understanding of the theoretical concepts.
- Midterm Exam (15%): The midterm exam evaluates your understanding of the material covered in the first half of the course.
- Project (25%): The final project is an opportunity to use the tools from class to build something interesting of your choice. You will present your project in class and submit a written report.
- Participation (5%): Canvas discussions and class attendance.
- Quizzes (5%): Online pop quizzes.
Schedule

| Week | Date | Topic | Readings |
|------|------|-------|----------|
| 1 | Sep. 2 | Introduction to NLP | CH 1-3 SLP; CH 1 NNLP |
| 2 | Sep. 9 | Machine Learning Basics & Neural Networks | CH 2-5 DL; CH 2-5 NNLP |
| 3 | Sep. 16 | Vector Semantics: TF-IDF, CBOW, Skip-gram, GloVe | CH 6 SLP; CH 6-8, 10-11 NNLP<br>Mikolov et al. Efficient Estimation of Word Representations in Vector Space. 2013 (word2vec)<br>Mikolov et al. Distributed Representations of Words and Phrases and their Compositionality. 2013 (negative sampling)<br>Pennington et al. GloVe: Global Vectors for Word Representation. 2014 |
| 4 | Sep. 23 | Deep Feedforward Networks | Wager et al. Dropout Training as Adaptive Regularization. 2013<br>Ning Qian. On the momentum term in gradient descent learning algorithms. 1999<br>Sebastian Ruder. An overview of gradient descent optimization algorithms. 2017 |
| 5 | Sep. 30 | Language Modeling | CH 3 SLP; CH 9 NNLP<br>Bengio et al. A Neural Probabilistic Language Model. 2003 |
| 6 | Oct. 7 | Recurrent Neural Networks (RNNs) | CH 9, 13 SLP; CH 14 NNLP<br>Afshine Amidi and Shervine Amidi. Recurrent Neural Networks cheatsheet. 2019 |
| 7 | Oct. 14 | More on RNNs (vanishing gradients) | CH 10 DL; CH 15-16 NNLP<br>Sepp Hochreiter and Jürgen Schmidhuber. Long Short-Term Memory. 1997 (LSTM)<br>Cho et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. 2014 |
| 8 | Oct. 21 | Convolutional Neural Networks (CNNs) | CH 9 DL; CH 13 NNLP<br>LeCun et al. Gradient-based learning applied to document recognition. 1998 |
| 9 | Oct. 28 | Midterm exam | - |
| 10 | Nov. 4 | Machine Translation, Seq2seq Models, Attention Models | CH 10 DL; CH 17 NNLP<br>Sutskever et al. Sequence to Sequence Learning with Neural Networks. 2014 |
| 11 | Nov. 11 | Natural Language Generation | Rush et al. A Neural Attention Model for Abstractive Sentence Summarization. 2015<br>Yue Dong. A Survey on Neural Network-Based Summarization Methods. 2018 |
| 12 | Nov. 18 | Tree Recursive Neural Networks and Constituency Parsing | CH 18 NNLP<br>Socher et al. Parsing with Compositional Vector Grammars. 2013<br>Socher et al. Semantic Compositionality through Recursive Matrix-Vector Spaces. 2012<br>Socher et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. 2013<br>Kitaev et al. Constituency Parsing with a Self-Attentive Encoder. 2018 |
| 13 | Nov. 25 | Thanksgiving - no class | - |
| 14 | Dec. 2 | Dependency Parsing | Chen and Manning. A Fast and Accurate Dependency Parser using Neural Networks. 2014 |
| 15 | Dec. 9 | Project presentations | - |
| 16 | Dec. 16 | Project presentations | Project report due |