Semi-supervised Ensemble Learning with Weak Supervision for Biomedical Relationship Extraction

Antonios Minas Krasakis, Evangelos Kanoulas, George Tsatsaronis



We propose and apply a meta-learning methodology based on weak supervision for combining semi-supervised and ensemble learning on the task of biomedical relationship extraction.
Natural language understanding research has recently shifted towards complex Machine Learning and Deep Learning algorithms. Such models often significantly outperform their simpler counterparts. However, their performance relies on large amounts of labeled data, which are rarely available. To tackle this problem, we propose a methodology for extending training datasets to arbitrarily large sizes and training complex, data-hungry models using weak supervision. We apply this methodology to biomedical relation extraction, a task for which training datasets are excessively time-consuming and expensive to create, yet which has a major impact on downstream applications such as drug discovery. We demonstrate in two small-scale controlled experiments that our method consistently enhances the performance of an LSTM network, with performance improvements comparable to those obtained from hand-labeled training data. Finally, we discuss the optimal setting for applying weak supervision using this methodology.
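
As an illustration of the general idea of weak supervision only, the sketch below turns the noisy votes of several base learners on unlabeled candidates into probabilistic training labels. It is not the authors' method: the unweighted average, the function names `aggregate_votes` and `weak_label_dataset`, and the toy `vote_matrix` are all assumptions made for this example, and the abstract does not specify how votes are actually combined.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical convention: each base learner votes 1 (relation present),
# 0 (no relation), or None (abstains) on an unlabeled candidate.
Vote = Optional[int]


def aggregate_votes(votes: Sequence[Vote]) -> Optional[float]:
    """Average the non-abstaining votes into a probabilistic label.

    Returns None when every learner abstained. A plain unweighted
    average is an assumption here, standing in for whatever learned
    combination the real system uses.
    """
    cast = [v for v in votes if v is not None]
    if not cast:
        return None
    return sum(cast) / len(cast)


def weak_label_dataset(vote_matrix: Sequence[Sequence[Vote]]) -> List[Tuple[int, float]]:
    """Return (candidate index, probabilistic label) for covered candidates."""
    labeled = []
    for idx, votes in enumerate(vote_matrix):
        p = aggregate_votes(votes)
        if p is not None:  # drop candidates no learner voted on
            labeled.append((idx, p))
    return labeled


# Toy votes from three hypothetical base learners on four candidates.
vote_matrix = [
    [1, 1, None],
    [0, 1, 0],
    [None, None, None],
    [0, 0, 0],
]
weak_labels = weak_label_dataset(vote_matrix)
```

The resulting probabilistic labels could then serve as soft targets when training a larger model such as an LSTM, which is the role weak supervision plays in the abstract's description.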


Venue: Automated Knowledge Base Construction (AKBC)