Enterprise Alexandria: Online High-Precision Enterprise Knowledge Base Construction with Typed Entities

John Winn, Matteo Venanzi, Tom Minka, Ivan Korostelev, John Guiver, Elena Pochernina, Pavel Mishkov, Alex Spengler, Denise Wilkins, Sian Lindley, Richard Banks, Sam Webster, Yordan Zaykov.



An automated system, with users in the loop, for incrementally extracting multi-typed entities from private enterprise data, including emails, calendar events and documents.
We present Enterprise Alexandria, one of the core AI technologies behind Microsoft Viva Topics. Enterprise Alexandria is a new system for automatically constructing a knowledge base of high-precision, typed entities from private enterprise data such as emails, documents and intranet pages. Built as an extension of Alexandria [Winn et al., 2019], the key novelty of Enterprise Alexandria is its ability to process both the textual information and the structured metadata available in each document in an online learning fashion, while making use of any manual curation that has happened in the interim. This task is performed entirely eyes-off to respect the privacy of users and the restricted access to their documents. The knowledge discovery process uses a probabilistic program that defines how each data item is generated from a set of unknown typed entities. Using probabilistic inference, Enterprise Alexandria can jointly discover a large set of entities with custom types specific to the organization. Experiments on three real-world datasets show that the system outperforms alternative methods and works effectively at large scale.
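To make the idea of inferring typed entities from a generative process concrete, the following is a minimal toy sketch and not the model used in the paper. It assumes a tiny set of candidate entities, a hypothetical set of text templates per type (`TEMPLATES`), and a uniform choice among templates; all names and types here are illustrative assumptions. Given an observed text snippet, Bayes' rule yields a posterior over which typed entity could have generated it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    type_name: str  # hypothetical custom type, e.g. "Project" or "Team"
    name: str


# Hypothetical templates: how an entity of each type may appear in text.
TEMPLATES = {
    "Project": ["{name}", "{name} project", "project {name}"],
    "Team": ["{name}", "{name} team", "team {name}"],
}


def likelihood(text: str, entity: Entity) -> float:
    """P(text | entity), assuming a uniform choice among the type's templates."""
    templates = TEMPLATES[entity.type_name]
    rendered = [t.format(name=entity.name) for t in templates]
    return rendered.count(text) / len(templates)


def posterior(text: str, prior: dict) -> dict:
    """Bayes' rule over candidate entities; returns {} if nothing explains the text."""
    joint = {e: p * likelihood(text, e) for e, p in prior.items()}
    total = sum(joint.values())
    if total == 0:
        return {}
    return {e: v / total for e, v in joint.items() if v > 0}


project = Entity("Project", "Alexandria")
team = Entity("Team", "Alexandria")
prior = {project: 0.75, team: 0.25}

ambiguous = posterior("Alexandria", prior)       # both types remain plausible
specific = posterior("Alexandria team", prior)   # only the Team entity fits
```

In this toy setting a bare name leaves the prior unchanged because both types render it equally often, while a type-specific phrase concentrates the posterior on a single typed entity. A real system would additionally need to discover the entities, types and templates themselves rather than take them as given.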


title={Enterprise Alexandria: Online High-Precision Enterprise Knowledge Base Construction with Typed Entities},
author={John Winn and Matteo Venanzi and Tom Minka and Ivan Korostelev and John Guiver and Elena Pochernina and Pavel Mishkov and Alex Spengler and Denise Wilkins and Sian Lindley and Richard Banks and Sam Webster and Yordan Zaykov},
booktitle={3rd Conference on Automated Knowledge Base Construction},