
nlp-pucrs/biomedical-eval


biomedical-eval

Intrinsic and Extrinsic Evaluation of the Quality of Biomedical Embeddings in Different Languages

Authors: Paula M. Franceschini, Henrique D. P. dos Santos and Renata Vieira

Abstract: Lately, language models have been applied to several tasks in biomedical natural language processing. Some public language models are available online, each built with different corpora. In this paper, we evaluate different public word embedding models trained with both general and biomedical corpora for English and Portuguese. We present intrinsic evaluations based on semantic analogies that use word pairs extracted from the MeSH biomedical thesaurus and also from benchmarks that are available for general-domain evaluation. For extrinsic evaluations, we rely on a classification task over Electronic Health Records. Our experiments show that biomedical embeddings can better capture semantics for biomedical analogies in both languages. On the other hand, for extrinsic evaluation, based on classification tasks using the language models, larger general textual corpora appeared equally or more effective.

Keywords: Biomedical Embeddings, MeSH thesaurus, Multilanguage Evaluation

Full Text, BibTeX
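The intrinsic evaluation mentioned in the abstract relies on semantic analogies between word pairs. As a rough illustration of how such an analogy test can be scored, the sketch below uses the common vector-offset approach (answer `a : b :: c : ?` with the word closest to `b - a + c` by cosine similarity). This is an assumption for illustration, not necessarily the exact procedure used in the paper. The function names, the toy vectors, and the example word pairs are all hypothetical and do not come from MeSH or from any of the evaluated models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method.

    Returns the vocabulary word (excluding a, b, c) whose vector is
    closest, by cosine similarity, to emb[b] - emb[a] + emb[c].
    """
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))


def analogy_accuracy(emb, questions):
    """Fraction of (a, b, c, expected) questions answered correctly.

    Questions containing out-of-vocabulary words are skipped.
    """
    covered = [q for q in questions if all(w in emb for w in q)]
    if not covered:
        return 0.0
    correct = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in covered)
    return correct / len(covered)


# Toy 3-d vectors invented for illustration (not from a real model).
toy = {
    "heart": [1.0, 0.0, 0.0],
    "kidney": [0.0, 1.0, 0.0],
    "cardiology": [1.0, 0.0, 1.0],
    "nephrology": [0.0, 1.0, 1.0],
    "aspirin": [0.5, 0.5, -1.0],
}

# Hypothetical analogy questions; the last one is skipped (unknown words).
questions = [
    ("heart", "cardiology", "kidney", "nephrology"),
    ("kidney", "nephrology", "heart", "cardiology"),
    ("heart", "cardiology", "liver", "hepatology"),
]

print(solve_analogy(toy, "heart", "cardiology", "kidney"))
print(analogy_accuracy(toy, questions))
```

With real embeddings, `emb` would be replaced by a mapping from words to the vectors of the model under evaluation, and `questions` by the analogy benchmark for the language being tested.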

Online Experiments

Run our experiments online with Binder


PUCRS A.I. in HealthCare

This project belongs to GIAS at PUCRS, Brazil

