Danqi Chen is an Assistant Professor of Computer Science at Princeton University and co-leads the Princeton NLP Group. Danqi’s research focuses on deep learning for natural language processing, with an emphasis on the intersection of text understanding and knowledge representation/reasoning, and on applications such as question answering and information extraction. Before joining Princeton, Danqi worked as a visiting scientist at Facebook AI Research in Seattle. She received her Ph.D. from Stanford University (2018) and B.E. from Tsinghua University (2012), both in Computer Science. She is a recipient of a Facebook Fellowship, a Microsoft Research Women’s Fellowship, and paper awards at ACL’16 and EMNLP’17.

Advancing Textual Question Answering

In this talk, I will discuss my recent work on advancing textual question answering: enabling machines to answer questions based on a passage of text and, more realistically, on a very large collection of documents (a.k.a. “machine reading at scale”). In the first part, I will examine the importance of pre-trained language representations (e.g., BERT, RoBERTa) in state-of-the-art QA systems. In particular, I will introduce a span-based pre-training method that is designed to better represent and predict spans of text and that demonstrates superior performance on a wide range of QA tasks. Although these models have already matched or surpassed human performance on some standard benchmarks, a large gap remains when they are scaled up to the open-domain setting. In the second part, I will present two new directions: one replaces the traditional keyword-based retrieval component with dense embeddings for passage retrieval, and the other answers questions based on a structured graph of text passages. Both approaches show promise for future textual QA systems.
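To make the dense-retrieval direction concrete, here is a minimal sketch of the core idea: encode the query and every passage into dense vectors and rank passages by inner-product similarity, rather than by keyword overlap. This is not code from the talk; the `toy_encoder` below is a hypothetical hash-based stand-in for a learned neural encoder (in practice, a BERT-style model would produce the embeddings).

```python
import hashlib
import numpy as np

def toy_encoder(text, dim=64):
    # Stand-in for a learned dense encoder: each token is hashed to a
    # deterministic random vector, and the passage embedding is their
    # normalized sum. A real system would use a trained neural network.
    vec = np.zeros(dim)
    for token in text.lower().split():
        seed = int(hashlib.md5(token.encode()).hexdigest()[:8], 16)
        rng = np.random.default_rng(seed)
        vec += rng.standard_normal(dim)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def retrieve(query, passages, top_k=1):
    # Dense retrieval: score every passage by inner product with the
    # query embedding and return the top-k highest-scoring passages.
    p_vecs = np.stack([toy_encoder(p) for p in passages])
    q_vec = toy_encoder(query)
    scores = p_vecs @ q_vec
    order = np.argsort(-scores)[:top_k]
    return [(passages[i], float(scores[i])) for i in order]

passages = [
    "Paris is the capital of France",
    "The cat sat on the mat",
    "Neural networks learn representations from data",
]
results = retrieve("What is the capital of France", passages)
```

Because shared tokens produce correlated embeddings, the query about France scores highest against the first passage; in a real system, a learned encoder would also capture such matches through semantic similarity rather than token overlap, and an approximate nearest-neighbor index would replace the brute-force dot product at scale.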