CS 695-002

Natural Language Processing (Special Topics)

Instructor

Antonios Anastasopoulos (antonis [at] gmu [dot] edu)
Office Hours: TBD.

Meets

Mondays, 4:30 to 7:10 PM, Online.

Course Web Page

TBD -- We will also communicate through Piazza.

Course Description

Computers process massive amounts of information every day in the form of human language. Although they do not understand it, they can learn how to do things like answer questions about it, or translate it into other languages. Neural networks provide powerful new tools for modeling language, and have been used both to improve the state-of-the-art in a number of tasks and to tackle new problems that were not easy in the past. This class will start with a brief overview of NLP and Neural Networks, then spend the majority of the class demonstrating how to apply neural networks to natural language problems.

Each section will introduce a particular problem or phenomenon in natural language, describe why it is difficult to model, and demonstrate recent models that were designed to tackle this problem. In the process of doing so, the class will cover different techniques that are useful in creating neural network models, including handling variably sized and structured sentences, semi-supervised and unsupervised learning, structured prediction, and multilingual modeling. The class will include assignments culminating in a final project.

Online Classroom Specifics

I'll expect you to attend the class, as we will be holding the quizzes synchronously, in addition to any presentations and discussions. I'll be using Zoom for video-conferencing. I won't require you to have your video on, but it will be greatly appreciated (no one likes to just talk to a screen). I will try to adjust to your needs, and to do so I will poll the class for your preferences at the beginning of the semester as well as throughout -- we're in this together!

Prerequisites

Ideally, (a) Algorithms and Data Structures, (b) Artificial Intelligence or Data Mining, and (c) Probability and Statistics (STAT 344) or equivalent.

Students should be experienced with writing substantial programs in Python. The ideal set of prerequisites includes Algorithms and Data Structures, Artificial Intelligence or Data Mining, and linear algebra and calculus, but please contact the instructor if you have questions about the necessary background.

Class Format

As the class aims to familiarize students with cutting-edge NLP research and to give them the skills to do it, the classes and assignments will be at least partially implementation-focused. In general, each class will take the following format:

Grading

Your final grade will be dependent on the in-class individual quizzes (20%) and the assignments/project (80%). There will be no final exam.

Quizzes: Worth 20% of the grade. Your lowest 2 quiz grades will be dropped. If you are sick or traveling on business (e.g., to a conference, for a job interview, or delayed in return due to visa issues), send a doctor's note or evidence of the reason for being away to the instructor within a week of the absence, and you will be excused. I expect excused quizzes to be relatively rare, so if you'll be away for more than, say, 2 classes over the semester, please consult the instructor in advance. (Of course, given the current situation, the requirements might be adjusted as the semester unfolds; when in doubt, just email the instructor.)

Assignments: There will be 4 assignments, worth 10%, 10%, 20%, and 40% of the grade, respectively. Brief assignment details (more to follow on the class webpage):

Late Day Policy: In case there are unforeseen circumstances that don't let you turn in your assignment on time, 5 late days total over the first three assignments will be allowed (late days may not be applied to the final project, assignment 4). Note that the third assignment is harder than the first one, so it'd be a good idea to try to save your late days for the third assignment if possible. Assignments that are late beyond the allowed late days will be graded down one half-grade per day late.
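To make the grading arithmetic above concrete, here is a minimal sketch of how a final score could be computed from the stated weights. It is not the official grading script. It assumes that all scores are on a 0-100 scale, that quizzes are weighted equally, and that excused quizzes are simply left out of the list. The function names are hypothetical, and late-day penalties are not modeled.

```python
# Weights come from the Grading section; everything else is an assumption.
QUIZ_WEIGHT = 0.20
ASSIGNMENT_WEIGHTS = [0.10, 0.10, 0.20, 0.40]  # assignments 1-4
DROPPED_QUIZZES = 2  # the lowest two quiz grades are dropped


def quiz_average(quiz_scores, dropped=DROPPED_QUIZZES):
    """Mean of the quiz scores (0-100) after dropping the lowest `dropped`."""
    kept = sorted(quiz_scores)[dropped:]
    if not kept:
        raise ValueError("need more quiz scores than dropped quizzes")
    return sum(kept) / len(kept)


def final_score(quiz_scores, assignment_scores):
    """Weighted final score (0-100) from quiz and assignment scores."""
    if len(assignment_scores) != len(ASSIGNMENT_WEIGHTS):
        raise ValueError("expected one score per assignment")
    assignments = sum(
        weight * score
        for weight, score in zip(ASSIGNMENT_WEIGHTS, assignment_scores)
    )
    return QUIZ_WEIGHT * quiz_average(quiz_scores) + assignments


if __name__ == "__main__":
    # Hypothetical student: five quizzes, four assignments.
    print(final_score([50, 60, 80, 90, 100], [100, 90, 80, 85]))
```

In this hypothetical example the two lowest quizzes (50 and 60) are dropped, so the quiz average is taken over the remaining three scores before the 20% weight is applied.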

Readings

For each topic/class the instructor will provide a list of papers as suggested readings. One paper will be required reading and will be tested with a quiz (see above). Students should be able to understand the course content just by following the lecture and by doing the readings. However, the following textbooks serve as good references.

Tentative Schedule

We will try to cover a lot of ground in the first weeks in order to lay the foundations for the projects, but then we will focus more on specific NLP tasks and linguistic phenomena.

Week | Date | Topic | Homework Due
1 | 8/24 | Introduction (NLP, Neural Networks) |
2 | 8/31 | Language Models, Smoothing, and Recurrent Neural Networks |
3 | 9/7 | NO CLASS [Labor Day] |
4 | 9/14 | Topic Models, Distributional Semantics and Word Vectors | Assignment 1 Due 9/18
5 | 9/21 | Contextual Representations and Text Classification |
6 | 9/28 | Alignment, Translation and Encoder-Decoder Models |
7 | 10/5 | Attention, Self-Attention, and Variations thereof | Assignment 2 Due 10/9
8 | 10/12 | Unsupervised Learning from lots of text |
9 | 10/19 | Morphology and Syntax |
10 | 10/26 | POS Tagging, Structured Prediction I |
11 | 11/2 | Entity Recognition, Structured Prediction II | Assignment 3 Due 11/6
12 | 11/9 | Parsing and the CKY Algorithm |
13 | 11/16 | Summarization |
14 | 11/23 | Question Answering and Dataset Biases |
15 | 11/30 | Languages of the World, Multimodal, Multitask, and Multilingual Models |
16 | 12/7 | Discourse and Dialog |
17 | 12/14 | Project Presentations | Project Report (Assignment 4) Due

Honor Code

The class enforces the GMU Honor Code, as well as the more specific honor code policy of the Department of Computer Science. You will be expected to adhere to this code and policy.

Note to Students

Take care of yourself! As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, global pandemics, feeling down, difficulty concentrating and/or lack of motivation. All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of having a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. GMU services are available, and treatment does work. You can learn more about confidential mental health services available on campus at: https://caps.gmu.edu/. Support is always available (24/7) from Counseling and Psychological Services: 703-527-4077.

Disabilities

If you have a documented learning disability or other condition which may affect academic performance, make sure this documentation is on file with the Office of Disability Services and come talk to me about accommodations. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Services, I encourage you to contact them at ods@gmu.edu.