25/07/2020

What makes a top-performing precision medicine search engine? Tracing main system features in a systematic way

Erik Faessler, Michel Oleynik, Udo Hahn

Keywords: precision medicine, trec, search engine evaluation

Abstract: From 2017 to 2019 the Text REtrieval Conference (TREC) held a challenge task on precision medicine using documents from medical publications (PubMed) and clinical trials. Despite lots of performance measurements carried out in these evaluation campaigns, the scientific community is still pretty unsure about the impact individual system features and their weights have on the overall system performance. In order to overcome this explanatory gap, we first determined optimal feature configurations using the Sequential Model-based Algorithm Configuration (SMAC) program and applied its output to a BM25-based search engine. We then ran an ablation study to systematically assess the individual contributions of relevant system features: BM25 parameters, query type and weighting schema, query expansion, stop word filtering, and keyword boosting. For evaluation, we employed the gold standard data from the three TREC Precision Medicine (TREC-PM) installments to evaluate the effectiveness of different features using the commonly shared infNDCG metric.

The video of this talk cannot be embedded. You can watch it here:
https://dl.acm.org/doi/10.1145/3397271.3401048#sec-supp
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at SIGIR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd

Similar Papers