•   When: Wednesday, March 29, 2023 from 11:00 AM to 12:00 PM
  •   Speaker: Nikita Kitaev, Ph.D. student in Computer Science at UC Berkeley
  •   Location: ENGR 4201

Abstract

The field of natural language processing has recently unlocked a wide range of new capabilities through the use of large language models, such as GPT-4. The growing application of these models motivates developing a more thorough understanding of how and why they work, as well as further improvements in both quality and efficiency.

In this talk, I will present my work on analyzing and improving the Transformer architecture underlying today's language models through the study of how information is routed between multiple words in an input. I will show that such models can predict the syntactic structure of text in a variety of languages, and discuss how syntax can inform our understanding of how the networks operate. I will also present my work on structuring information flow to build radically more efficient models, including models that can process text of up to one million words, which enables new possibilities for NLP with book-length text.
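As background for the routing framing above: in a standard Transformer, self-attention is the mechanism that moves information between word positions, with each position computing a weighted mixture of every other position's representation. The sketch below is a minimal NumPy illustration of scaled dot-product self-attention, assuming toy dimensions and randomly initialized projection matrices; it is generic background, not code from the speaker's work.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one sequence.

    X:          (seq_len, d_model) input word representations.
    Wq, Wk, Wv: (d_model, d_head) projections (illustrative assumptions).
    Returns     (seq_len, d_head): each row is a mixture of all positions.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Similarity of each word's query to every word's key.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax over each row: a routing distribution across positions.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Route value vectors between positions according to those weights.
    return weights @ V

# Toy usage: 5 "words" with 8-dimensional representations.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8)
```

Note that the score matrix here is quadratic in sequence length, which is precisely what makes scaling plain attention to very long inputs, such as book-length text, expensive.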

Bio

Nikita Kitaev is a final-year Ph.D. student in Computer Science at UC Berkeley, advised by Dan Klein. His research spans natural language processing and machine learning, with a focus on understanding the structure and operation of large language models, including leveraging insights from the field of syntax. He is the recipient of an NSF Graduate Research Fellowship, and his work has been recognized with a best paper award at ACL. Previously, he received a B.S. in Electrical Engineering and Computer Science from UC Berkeley.
