Predicting Institution Hierarchies with Set-based Models

Derek Tam, Nicholas Monath, Ari Kobren, Andrew McCallum



Predicting hierarchies of institutions by modeling set operations over tokens
The hierarchical structure of research organizations plays a pivotal role in science-of-science research as well as in tools that track research achievements and output. However, this structure is not consistently documented for all institutions in the world, motivating the need for automated construction methods. In this paper, we present a new task and model for predicting sub-institution/super-institution relationships from the institutions' name strings. The crux of our model is that it leverages learned, permutation-invariant representations of various token subsets of the institution name strings. Our model outperforms or matches non-set-based models and baselines. We also create a dataset for training and evaluating models for this task, based on the publicly available relationships in the Global Research Identifier Database.
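
To make the idea of set operations over tokens concrete, the following is a minimal illustrative sketch, not the model described in the paper. It assumes the token subsets of interest are the intersection and the two set differences of a candidate pair of names, and it replaces the learned token representations with a hash-based stand-in so that the example runs without training. All function names (`tokens`, `token_subsets`, `embed_token`, `pool`) are hypothetical.

```python
import hashlib


def tokens(name):
    """Lowercase a name, drop commas, and return its tokens as an unordered set."""
    return frozenset(name.lower().replace(",", " ").split())


def token_subsets(sub_name, super_name):
    """Split the tokens of a candidate (sub, super) pair into three subsets."""
    a, b = tokens(sub_name), tokens(super_name)
    return {
        "shared": a & b,      # tokens that appear in both names
        "only_sub": a - b,    # tokens unique to the candidate sub-institution
        "only_super": b - a,  # tokens unique to the candidate super-institution
    }


def embed_token(tok, dim=4):
    """Stand-in for a learned embedding (assumption): integers from a hash."""
    digest = hashlib.sha256(tok.encode("utf-8")).digest()
    return [digest[i] for i in range(dim)]


def pool(token_set, dim=4):
    """Sum-pool token vectors; a sum does not depend on token order."""
    vec = [0] * dim
    for tok in token_set:
        for i, value in enumerate(embed_token(tok, dim)):
            vec[i] += value
    return vec


subsets = token_subsets(
    "Department of Physics, University of Oxford", "University of Oxford"
)
features = {key: pool(value) for key, value in subsets.items()}
```

In a trained model, the pooled vector for each subset would come from learned parameters and be passed to a scoring function; here `features` only shows the shape of the input such a scorer might receive.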


title={Predicting Institution Hierarchies with Set-based Models},
author={Derek Tam and Nicholas Monath and Ari Kobren and Andrew McCallum},
booktitle={Automated Knowledge Base Construction},