Speaker abstracts and bios
Tel Aviv University / Allen Institute for AI
Symbolic and distributed representations for question answering
Models for question answering (QA) have recently been dominated by fully differentiable neural networks based on large pre-trained language models. However, when answering a question requires multiple reasoning steps, symbolic approaches offer a natural and often more interpretable alternative. In this talk, I will describe recent work on the pros and cons of symbolic and distributed approaches to question answering. First, I will describe QDMR, a symbolic meaning representation for questions, inspired by semantic parsing, that can be annotated at scale by non-experts. QDMR was used to annotate BREAK, a benchmark for question understanding that contains 83K questions with their meaning representations, drawn from 10 existing datasets across three modalities. Then, I will show how a symbolic representation such as QDMR can be used to (a) improve accuracy on open-domain QA benchmarks that require multiple retrieval steps, and (b) improve the faithfulness of compositional neural networks for answering complex questions. I will then turn to two cases where end-to-end differentiable models provide advantages over symbolic approaches. Specifically, one can automatically generate data at scale for both numerical and logical reasoning, and easily endow pre-trained language models with those missing capabilities for answering complex questions.
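As a toy illustration of the idea (the decomposition format below mimics QDMR, but the example question and helper functions are invented for this sketch and are not from the BREAK release), a QDMR-style decomposition is an ordered list of steps, where "#k" refers to the answer of step k, forming a small computation graph:

```python
import re

# Illustrative sketch: a QDMR-style decomposition represented as ordered
# steps, where "#k" refers back to the answer of step k. The question and
# decomposition are hand-written examples, not drawn from BREAK.

def decompose(question):
    """Return a hand-written QDMR-style decomposition for a sample question."""
    # "Which team that Jordan played for won the most titles?"
    return [
        "return teams that Jordan played for",  # step 1
        "return titles of #1",                  # step 2
        "return number of #2 for each #1",      # step 3
        "return #1 where #3 is highest",        # step 4
    ]

def referenced_steps(step):
    """Extract the indices of earlier steps a decomposition step depends on."""
    return [int(m) for m in re.findall(r"#(\d+)", step)]

steps = decompose("Which team that Jordan played for won the most titles?")
# Step 4 depends on steps 1 and 3, so the decomposition forms a small DAG
# that can drive multiple retrieval or computation steps.
print(referenced_steps(steps[3]))  # → [1, 3]
```

Each step is simple enough for a non-expert to write, which is what makes annotation at scale feasible.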
Jonathan Berant is a senior lecturer (assistant professor) at the School of Computer Science at Tel Aviv University and a research scientist at the Allen Institute for AI. Jonathan earned a Ph.D. in Computer Science at Tel Aviv University, under the supervision of Prof. Ido Dagan. He was a post-doctoral fellow at Stanford University, working with Prof. Christopher Manning and Prof. Percy Liang, and subsequently a post-doctoral fellow at Google Research, Mountain View. Jonathan has received several awards and fellowships, including the Rothschild Fellowship, the ACL 2011 Best Student Paper Award, the EMNLP 2014 Best Paper Award, and the NAACL 2019 Best Resource Paper Award, as well as several honorable mentions. Jonathan is currently an ERC grantee.
KR for KBQA and KBC
KB completion (KBC) can be viewed as answering structured queries against a KB. Since answering these queries requires more than simply retrieving known facts, some non-trivial processing is needed, making the task broadly similar to logical inference in a conventional symbolic KB. KB question answering (KBQA) also answers queries against a KB, but in this case the queries are unstructured text. So both KBQA and KBC use some analog of “reasoning” over a KB. This raises the question: what can we learn about KBQA and KBC from the classical AI subfield of knowledge representation (KR)? In KR the central question is how to represent knowledge in a form that supports efficient, expressive reasoning. In my talk I will revisit this question in the context of modern neural learning methods, and tie the themes explored in classical KR to recently proposed methods for KBC and KBQA.
Harvesting knowledge from the semi-structured web
Knowledge graphs have been used to support a wide range of applications and to enhance search and QA for Google, Amazon Alexa, etc. However, we often miss long-tail knowledge, including unpopular entities, unpopular relations, and unpopular verticals. In this talk we describe our efforts in harvesting knowledge from semi-structured websites, which are often populated from vast volumes of data stored in underlying databases according to templates. We describe our AutoCeres ClosedIE system, which improves the accuracy of fully automatic knowledge extraction on semi-structured data from the 60%+ of the prior state of the art to 90%+. We also describe OpenCeres, the first OpenIE system for semi-structured data, which is able to identify new relations not readily included in existing ontologies. In addition, we describe our other efforts in ontology alignment, entity linkage, graph mining, and QA, which allow us to best leverage the knowledge we extract for search and QA.
Xin Luna Dong is a Principal Scientist at Amazon, leading the efforts in constructing the Amazon Product Knowledge Graph. She was one of the major contributors to the Google Knowledge Vault project, and led the Knowledge-based Trust project, which was called the “Google Truth Machine” by The Washington Post. She co-authored the book “Big Data Integration”, was named an ACM Distinguished Member, and received the VLDB Early Career Research Contribution Award for “advancing the state of the art of knowledge fusion” as well as the Best Demo Award at SIGMOD 2005. She serves on the VLDB Endowment and the PVLDB advisory committee, and has been a PC co-chair for VLDB 2021, ICDE Industry 2019, VLDB Tutorials 2019, SIGMOD 2018, and WAIM 2015.
University of Illinois Urbana-Champaign
Event-centric Knowledge Base Construction
Understanding events and communicating about them are fundamental human activities. However, it is much more difficult to populate event-related knowledge than entity-related knowledge. For example, most people in the United States can answer the question “Who is President Barack Obama’s wife?”, but very few can give a complete answer to “Who died in the September 11 attacks?”. We propose a new research direction: event-centric knowledge base construction from multimedia, multilingual sources. Our minds represent events at various levels of granularity and abstraction, which allows us to quickly access and reason about old and new scenarios. Progress in natural language understanding and computer vision has helped automate some parts of event understanding, but the current, first-generation, automated event understanding is overly simplistic: it is local, sequential, and flat, while real events are hierarchical and probabilistic. Understanding them requires knowledge in the form of a repository of abstracted event schemas (complex event templates), understanding the progression of time, using background knowledge, and performing global inference. Our approach to second-generation event understanding builds on an incidental-supervision approach to inducing an event schema repository that is probabilistic, hierarchically organized, and semantically coherent. Low-level primitive components of event schemas are abundant, and can be part of multiple, sparsely occurring, higher-level schemas. Consequently, we combine bottom-up, data-driven approaches across multiple modalities with top-down consolidation of information extracted from a smaller number of encyclopedic resources. This facilitates inducing higher-level event and time representations that analysts can interact with, and allows them to guide further reasoning and extract events by constructing a novel structured cross-media common semantic space.
When complex events unfold in an emergent and dynamic manner, the multimedia multilingual digital data from traditional news media and social media often convey conflicting information. To understand the many facets of such complex, dynamic situations, we have also developed cross-media cross-document event coreference resolution and event-event relation tracking methods for event-centric knowledge population.
Heng Ji is a professor in the Computer Science Department of the University of Illinois Urbana-Champaign. She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Information Extraction and Knowledge Base Population. She was selected as a “Young Scientist” and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. Her awards include the “AI’s 10 to Watch” Award from IEEE Intelligent Systems in 2013 and an NSF CAREER award in 2009. She has coordinated the NIST TAC Knowledge Base Population task since 2010, and has served as Program Committee Co-Chair of many conferences, including NAACL-HLT 2018.
Representation Learning for Logical Reasoning in Knowledge Graphs
Learning low-dimensional embeddings of knowledge graphs is a powerful approach for predicting unobserved or missing relations between entities. However, an open challenge in this area is developing techniques that can go beyond single-edge prediction and handle more complex multi-hop logical queries, which may involve multiple unobserved edges, entities, and variables. In this talk we present a framework to efficiently answer multi-hop logical queries on knowledge graphs. Our main insight is that queries can be embedded as boxes (i.e., hyper-rectangles), where the set of points inside the box corresponds to the set of answer entities of the query. We show that conjunctions and disjunctions can be naturally represented as intersections and unions of boxes. We demonstrate the effectiveness of our approach on large KGs and show its robustness in the presence of noise and missing relations.
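To make the box intuition concrete, here is a minimal geometric sketch (in the actual model, centers and offsets are learned vectors and the intersection operator is a small neural network; the exact geometric intersection and the example queries below are illustrative only):

```python
# Minimal geometric sketch of box-embedding queries. A box is a center plus
# per-dimension non-negative half-widths; an entity answers a query if its
# embedding lies inside the query's box.

class Box:
    def __init__(self, center, offset):
        self.center = list(center)
        self.offset = list(offset)  # non-negative half-widths per dimension

    def contains(self, point):
        """True if the point lies inside the box in every dimension."""
        return all(abs(p - c) <= o
                   for p, c, o in zip(point, self.center, self.offset))

    def intersect(self, other):
        """Conjunction of two queries: the overlap of their boxes."""
        lo = [max(c1 - o1, c2 - o2) for c1, o1, c2, o2
              in zip(self.center, self.offset, other.center, other.offset)]
        hi = [min(c1 + o1, c2 + o2) for c1, o1, c2, o2
              in zip(self.center, self.offset, other.center, other.offset)]
        # Disjoint boxes collapse to a zero-volume box (no answers).
        return Box([(l + h) / 2 for l, h in zip(lo, hi)],
                   [max((h - l) / 2, 0.0) for l, h in zip(lo, hi)])

q1 = Box([0.0, 0.0], [2.0, 2.0])   # e.g. entities "located in Europe"
q2 = Box([1.0, 1.0], [2.0, 2.0])   # e.g. entities that "hosted the Olympics"
both = q1.intersect(q2)            # entities satisfying both conditions
print(both.contains([1.0, 1.0]))   # → True
print(both.contains([-1.5, 0.0]))  # → False
```

The appeal of the geometry is that a conjunctive query shrinks the answer region monotonically, mirroring how adding conditions shrinks the answer set.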
Jure Leskovec is an Associate Professor of Computer Science at Stanford University, Chief Scientist at Pinterest, and an investigator at the Chan Zuckerberg Biohub. His research focuses on machine learning and data mining with graphs, a general language for describing social, technological, and biological systems. Computation over massive data is at the heart of his research and has applications in computer science, social sciences, marketing, and biomedicine. This research has won several awards, including the Lagrange Prize, a Microsoft Research Faculty Fellowship, an Alfred P. Sloan Fellowship, and numerous best paper and test-of-time awards. Leskovec received his bachelor’s degree in computer science from the University of Ljubljana, Slovenia, earned his PhD in machine learning from Carnegie Mellon University, and completed postdoctoral training at Cornell University.
University of California San Diego
Text Processing for Learning and Automation of Data Science
How can text processing models be used to help self-directed students learn the skills they need to be effective data scientists, for example the basics of probability theory? How can text processing models be used to automate mundane data-wrangling tasks and improve the efficiency of data scientists? In this talk, I will discuss these questions and our work on the NLP systems we are building to answer them.
Ndapa Nakashole is an Assistant Professor at the University of California, San Diego, where she teaches and carries out research on Statistical Natural Language Processing. Before that, she was a postdoctoral scholar at Carnegie Mellon University. She obtained her PhD from Saarland University and the Max Planck Institute for Informatics, Germany. She completed undergraduate studies in Computer Science at the University of Cape Town, South Africa.
University of Wisconsin-Madison
Automating Data Quality Management
Data quality management is a bottleneck in modern analytics as high-effort tasks such as data validation and cleaning are essential to obtain accurate results. This talk describes how to use machine learning to automate routine data quality management tasks. I will first introduce Probabilistic Unclean Databases (PUDs), a formal probabilistic framework to describe the quality of structured data and demonstrate how data validation and cleaning correspond to learning and inference problems over structured data distributions. I will then show how the PUDs framework forms the basis of the HoloClean framework, a state-of-the-art ML-based solution to automate data quality management for structured data. Finally, I will close with a discussion on lessons learned from HoloClean with particular emphasis on when accurate, automated data cleaning is feasible.
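A drastically simplified illustration of the cleaning-as-inference idea (this toy is not the HoloClean algorithm, which learns a full probabilistic model over attributes and integrity constraints; the function and data below are invented for this sketch): pick the most probable value for a suspect cell given co-occurrence statistics in the rest of the table.

```python
from collections import Counter

# Toy sketch of data cleaning as probabilistic inference: repair a suspect
# 'state' cell by choosing the value most frequently co-occurring with the
# same 'city' elsewhere in the table.

def repair_state(rows, city):
    """Infer the most likely state for a city from co-occurrence counts."""
    votes = Counter(state for c, state in rows if c == city)
    return votes.most_common(1)[0][0]

rows = [("Madison", "WI"), ("Madison", "WI"),
        ("Madison", "AL"),   # likely an entry error
        ("Chicago", "IL")]
print(repair_state(rows, "Madison"))  # → 'WI'
```

The point of the probabilistic framing is precisely that such majority votes, constraint violations, and external signals can all be combined into one principled inference problem rather than hand-written rules.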
Theodoros (Theo) Rekatsinas is an Assistant Professor in the Department of Computer Sciences at the University of Wisconsin-Madison, where he is a member of the Database Group. He is also a co-founder of inductiv, a startup focused on automating data quality operations for analytical pipelines. Theo earned his Ph.D. in Computer Science from the University of Maryland and was a Moore Data Postdoctoral Fellow at Stanford University. His research interests are in data management, with a focus on data integration, data cleaning, and uncertain data. Theo’s work has been recognized with an Amazon Research Award in 2017, a Best Paper Award at SDM 2015, and the Larry S. Davis Doctoral Dissertation Award in 2015.
Facebook / CMU
Learning to live with BERT
Large, pre-trained language models (LMs) like BERT produce high quality, general purpose representations of word(piece)s in context. Unfortunately, training and deploying these models comes at a high computational cost, limiting their development and use to a small set of institutions with access to substantial computational resources, while potentially accelerating climate change with their unprecedented energy requirements. In this talk I’ll characterize the inefficiencies of LM training and decoding, survey recent techniques for scaling down large pre-trained language models, and identify potential exciting research directions with the goal of enabling a broader array of researchers and practitioners to benefit from these powerful models, while remaining mindful of the environmental impact of our work.
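One widely used family of scaling-down techniques in this space is knowledge distillation, where a small student model is trained to match the temperature-softened output distribution of a large teacher. A generic sketch of the softened targets (the talk surveys several approaches; this example is not BERT-specific, and the logit values are made up):

```python
import math

# Knowledge-distillation sketch: raising the softmax temperature flattens
# the teacher's output distribution, exposing "dark knowledge" about the
# relative plausibility of non-argmax classes for the student to match.

def softmax(logits, temperature=1.0):
    """Softmax over logits scaled by a temperature."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

teacher_logits = [4.0, 1.0, 0.2]
hard = softmax(teacher_logits, temperature=1.0)  # sharply peaked
soft = softmax(teacher_logits, temperature=4.0)  # softened targets
print([round(p, 3) for p in soft])
```

Training the student against the soft targets (plus the true labels) lets a much smaller network approach the teacher's accuracy at a fraction of the inference cost.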
Emma Strubell is a Visiting Researcher at Facebook AI Research, and Assistant Professor in the Language Technologies Institute at Carnegie Mellon University. Her research aims to provide fast and robust natural language processing to the diversity of academic and industrial investigators eager to pull insight and decision support from massive text data in many domains. Toward this end she works at the intersection of natural language understanding, machine learning, and deep learning methods cognizant of modern tensor processing hardware. Her research has been recognized with best paper awards at ACL 2015 and EMNLP 2018.
Building and mining a heterogenous biomedical knowledge graph
The biomedical research community is incredibly productive, producing over one million new publications per year. However, the knowledge contained in those publications usually remains in unstructured free text, or is fragmented across unconnected data silos. Here, I will describe recent efforts to integrate biomedical knowledge into large, heterogeneous knowledge graphs, and to mine those knowledge graphs to identify novel, testable hypotheses.
Andrew is a Professor at the Scripps Research Institute in the Department of Integrative Structural and Computational Biology (ISCB). His research focuses on building and applying bioinformatics infrastructure for biomedical discovery. His research has a particular emphasis on leveraging crowdsourcing for genetics and genomics. Representative projects include the Gene Wiki, BioGPS, MyGene.Info, and Mark2Cure, each of which engages the crowd to help organize biomedical knowledge. These resources are collectively used millions of times every month by members of the research community, by students, and by the general public.
A long view on Identity
Google’s Knowledge Graph is a durable collection of entities. As new things are learned over time about an entity, those “facts” are added to the entity. This long term accumulation of knowledge is central to the value of KG. To make this growth strategy work, the entities must be easily distinguishable from one another and stable in what they represent. But how should the boundaries between each entity be determined? Moreover, what is the right granularity of categories and relations that should be applied to these entities? There are many options for how the world could be cleaved ontologically, but experience with a large stable knowledge graph has shown that pragmatically some criteria may matter more than others. And yet, in some cases, the decision might not be as important as we thought.
Jamie manages the Schema Team for Google’s Knowledge Graph. The team’s responsibilities include extending KG’s underlying semantic representation, growing coverage of the ontology and enforcing semantic policy. He joined Google following the acquisition of Metaweb Technologies where he was the Minister of Information, helping organize data in Freebase and evangelizing semantic representation to web developers. Prior to Metaweb, Jamie worked in enterprise software as CTO of Determine Software and before that started one of the first ISPs in San Francisco. He is co-author of the O’Reilly book, “Programming the Semantic Web.” Jamie has a PhD from Harvard University and earned his bachelor’s degree from Colorado College.
Johns Hopkins University
Embracing uncertainty as the target of prediction
The Information Extraction and Computational Semantics communities are largely dominated by resources and models that strive for extracting what an observation categorically supports as a true prediction. E.g., “this image is of a CAT”, or “that token sequence refers to an ORGANIZATION”, or “this sentence ENTAILS that other sentence”. As humans we recognize that observations can be ambiguous as to what predictions they evince, but we seem to forget that when building datasets, and then blindly aim to reproduce those annotations when building models. I will discuss a series of projects that explore annotating and modeling subjective likelihood assessments, with a focus on tasks such as semantic parsing and textual entailment. For example, the sentence “Someone is walking a dog in a park” may be interpreted as strong evidence for, “The dog is alive”, weak evidence for, “The sun is shining”, and cast doubt on (but not strictly “contradict”), “The park is on fire”. While our work has concentrated on text, the point applies broadly: how often have you done an image captcha and had hesitation on whether that one picture contained a “bridge”? Let’s agree that sometimes the right answer is “maybe”.
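One minimal way to operationalize the "maybe" (the numbers and aggregation rule below are hypothetical, for illustration only, not from the speaker's datasets): instead of collapsing annotator judgments into a single categorical label, keep a scalar likelihood target, e.g. the mean of 0–100 slider ratings rescaled to [0, 1].

```python
# Toy sketch: graded human judgments averaged into a subjective-probability
# target, rather than a forced ENTAILS / NEUTRAL / CONTRADICTS label.

def scalar_target(ratings):
    """Average 0-100 slider ratings into a [0, 1] likelihood target."""
    return sum(ratings) / (100.0 * len(ratings))

# Premise: "Someone is walking a dog in a park"
hypotheses = {
    "The dog is alive":    [95, 100, 90],  # strong evidence for
    "The sun is shining":  [60, 55, 70],   # weak evidence for
    "The park is on fire": [5, 10, 0],     # doubtful, but not a contradiction
}
for h, ratings in hypotheses.items():
    print(f"{h}: {scalar_target(ratings):.2f}")
```

A model trained to regress these targets can then express graded confidence directly, instead of being penalized for honest uncertainty.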
Benjamin Van Durme is an Associate Professor of Computer Science at Johns Hopkins University and a Principal Research Scientist at Microsoft Semantic Machines. His research focuses on resources and models for natural language understanding. His observations on the problem of reporting bias in commonsense knowledge acquisition were recognized with a best paper award at AKBC 2013.
Georgia Institute of Technology
Language Understanding in Social Context
Over the last few decades, natural language processing (NLP) has had increasing success and produced industrial applications like search and personal assistants. Despite being sufficient to enable these applications, current NLP systems largely ignore the social part of language, e.g., who says it, in what context, and for what goals. My research combines NLP, linguistics, and social science to study how people use language in different social settings for their social goals, with implications for developing systems that facilitate human-human and human-machine communication. In this talk, I will illustrate my research with two specific studies. The first studies what makes language persuasive, introducing a semi-supervised neural network to recognize persuasion strategies in loan requests on crowdfunding platforms, and neural encoder-decoder systems that automatically transform inappropriately subjective framing into a neutral point of view. The second focuses on modeling how people seek and offer support via language in online cancer support communities, and on building interventions to support patient communication. Through these examples, I show how we can accurately and efficiently build better language technologies for social contexts.
Diyi Yang is an assistant professor in the School of Interactive Computing at Georgia Tech, where she is also affiliated with the Machine Learning Center (ML@GT). She is interested in natural language processing (e.g., language generation, semantics, discourse) and computational social science. Diyi received her PhD from the Language Technologies Institute at Carnegie Mellon University, and her bachelor’s degree from Shanghai Jiao Tong University, China. Her work has been published at leading NLP/HCI conferences and has resulted in multiple award-winning papers at EMNLP 2015 and SIGCHI 2019. She has served as an Area Chair for the ACL, EMNLP, NAACL, CIKM, and CSCW conferences.
University of Washington / Facebook
Denoising Sequence-to-Sequence Pre-training
Denoising auto-encoders can be pre-trained at very large scale by noising and then reconstructing any input text. Existing methods, based on variations of masked language models, have transformed the field and now provide the de facto initialization to be fine-tuned for nearly every task. In this talk, I will present our work on sequence-to-sequence pre-training that allows arbitrary noising, by simply learning to translate any corrupted text back to the original with standard Transformer-based neural machine translation architectures. I will show that the resulting monolingual (BART) and multilingual (mBART) models are highly effective for a wide range of discrimination and generation tasks, including question answering, summarization, and machine translation. A key contribution of our generalized noising is that we can replicate other pre-training schemes within the BART framework, to better measure which factors most influence end-task performance, as I will describe. Finally, I will highlight many of the ways BART is already being used by other researchers, and discuss opportunities to further push models that pre-train for generating and understanding text in many languages.
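The noising side of this setup can be sketched in a few lines (an illustrative toy, not the actual BART implementation: BART samples span lengths from a Poisson distribution, while the fixed spans and helper name below are invented for clarity). Sampled token spans are replaced by a single mask token, and the seq2seq model is trained to reconstruct the original text:

```python
# Illustrative sketch of text-infilling noise: each chosen span of tokens is
# replaced with a single <mask> token, so the model must predict both the
# content and the length of the missing span during reconstruction.

def infill_noise(tokens, spans):
    """Replace each (start, length) span with one <mask> token."""
    out, i = [], 0
    for start, length in sorted(spans):
        out.extend(tokens[i:start])  # copy tokens up to the span
        out.append("<mask>")         # the whole span becomes one mask
        i = start + length           # skip over the removed span
    out.extend(tokens[i:])           # copy the remaining tail
    return out

tokens = "the quick brown fox jumps over the lazy dog".split()
noised = infill_noise(tokens, [(1, 2), (6, 1)])
print(" ".join(noised))  # → "the <mask> fox jumps over <mask> lazy dog"
```

Because any corruption function fits this translate-back-to-original framing, the same architecture can subsume masking, deletion, rotation, and sentence shuffling as special cases.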
Luke Zettlemoyer is a Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, and a Research Scientist at Facebook. His research focuses on empirical methods for natural language semantics, and involves designing machine learning algorithms; introducing new tasks and datasets; and, most recently, studying how to best develop self-supervision signals for text. Honors include multiple paper awards, a PECASE award, and an Allen Distinguished Investigator Award. Luke received his PhD from MIT and was a postdoc at the University of Edinburgh.