Extracting Keywords from Publication Abstracts for an Automated Researcher Recommendation System
In: Digitale Welt (Proceedings of the First International Symposium on Applied Artificial Intelligence in Conjunction with DIGICON)
This paper presents an automated keyword assignment system for scientific abstracts. The system is applied to paper abstracts collected in a local publication database and is used to drive a researcher recommendation system. Problems such as low data volume and missing keywords are discussed. To address them, training is performed on an extended dataset built from large online publication databases. Label imbalance in the dataset is also examined more closely. Ten multi-label classification algorithms for assigning keywords from a given catalogue to a scientific abstract are compared. Binary relevance as the transformation method with LightGBM as the classifier yields the best results. Random oversampling before the training phase additionally increases the F1 score by around 5-6%.
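To make the combination of binary relevance and random oversampling concrete, the following is a minimal sketch and not the implementation used in the paper. Binary relevance trains one independent binary classifier per keyword in the catalogue; here each per-keyword training set is balanced by duplicating randomly chosen minority-class samples before fitting. Applying the oversampling per keyword is an assumption, as are all names (`random_oversample`, `TokenVoteClassifier`, `fit_binary_relevance`, `predict_keywords`) and the toy data. `TokenVoteClassifier` is a deliberately trivial stand-in so that the example runs with the standard library only; in practice it would be replaced by a real classifier such as LightGBM operating on vectorised abstracts.

```python
import random
from collections import Counter


def random_oversample(X, y, rng):
    """Duplicate random minority-class samples until both classes are equally frequent."""
    counts = Counter(y)
    if len(counts) < 2:
        return list(X), list(y)  # only one class present, nothing to balance
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [x for x, label in zip(X, y) if label == minority]
    extra = [rng.choice(pool) for _ in range(deficit)]
    return list(X) + extra, list(y) + [minority] * deficit


class TokenVoteClassifier:
    """Toy binary classifier (hypothetical stand-in for e.g. LightGBM)."""

    def fit(self, X, y):
        # Each token votes +1 per positive document and -1 per negative document.
        self.weights = Counter()
        for tokens, label in zip(X, y):
            for token in set(tokens):
                self.weights[token] += 1 if label == 1 else -1
        return self

    def predict(self, tokens):
        score = sum(self.weights[token] for token in set(tokens))
        return 1 if score > 0 else 0


def fit_binary_relevance(X, Y, catalogue, seed=0):
    """Train one binary classifier per keyword, oversampling each binary problem."""
    rng = random.Random(seed)
    models = {}
    for keyword in catalogue:
        # Binary relevance: reduce the multi-label target to "has this keyword or not".
        y = [1 if keyword in assigned else 0 for assigned in Y]
        X_bal, y_bal = random_oversample(X, y, rng)
        models[keyword] = TokenVoteClassifier().fit(X_bal, y_bal)
    return models


def predict_keywords(models, tokens):
    """Return every catalogue keyword whose classifier votes positive."""
    return {kw for kw, model in models.items() if model.predict(tokens) == 1}


# Toy data: tokenised abstracts and their assigned keywords (illustrative only).
abstracts = [
    "gradient boosting for text classification".split(),
    "boosting trees classification benchmark".split(),
    "robot path planning with sensors".split(),
    "sensor fusion for mobile robot navigation".split(),
    "text classification for robot command parsing".split(),
]
keywords = [
    {"machine learning"},
    {"machine learning"},
    {"robotics"},
    {"robotics"},
    {"machine learning", "robotics"},
]
catalogue = ["machine learning", "robotics"]

models = fit_binary_relevance(abstracts, keywords, catalogue)
print(predict_keywords(models, "boosting classification".split()))
```

Because each keyword is handled as a separate binary problem, the class ratio can be corrected independently for rare and frequent keywords, which is one plausible way to combine oversampling with binary relevance.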