22/09/2020

On target item sampling in offline recommender system evaluation

Rocı́o Cañamares, Pablo Castells

Keywords: experimental design, offline evaluation, discriminative power, evaluation bias, metrics, target items

Abstract: Target selection is a basic yet often implicit decision in the configuration of offline recommendation experiments. In this paper we research the impact of target sampling on the outcome of comparative recommender system evaluation. Specifically, we undertake a detailed analysis considering the informativeness and consistency of experiments across the target size axis. We find that comparative evaluation using reduced target sets contradicts in many cases the corresponding outcome using large targets, and we provide a principled explanation for these disagreements. We further seek to determine which among the contradicting results may be more reliable. Through comparison to unbiased evaluation, we find that minimum target sets incur in substantial distortion in pairwise system comparisons, while maximum sets may not be ideal either, and better options may lie in between the extremes. We further find means for informing the target size setting in the common case where unbiased evaluation is not possible, by an assessment of the discriminative power of evaluation, that remarkably aligns with the agreement with unbiased evaluation.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at RECSYS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers