29/06/2020

RTPTorrent: An open-source dataset for evaluating regression test prioritization

Toni Mattis, Patrick Rein, Falco Dürsch, Robert Hirschfeld

Keywords: Regression Test Prioritization, Dataset, Java, GitHub, TravisCI

Abstract: The software engineering practice of automated testing helps programmers find defects earlier during development. With growing software projects and longer-running test suites, frequency and immediacy of feedback decline, thereby making defects harder to repair. Regression test prioritization (RTP) is concerned with running relevant tests earlier to lower the costs of defect localization and to improve feedback. Finding representative data to evaluate RTP techniques is non-trivial, as most software is published without failing tests. In this work, we systematically survey a wide range of RTP literature regarding whether their datasets use real or synthetic defects or tests, whether they are publicly available, and whether datasets are reused. We observed that some datasets are reused; however, many studies examine only a few projects, and these rarely resemble real-world development activity. In light of these threats to ecological validity, we describe the construction and characteristics of a new dataset, named RTPTorrent, based on 20 open-source Java programs. Our dataset allows researchers to evaluate prioritization heuristics based on version control meta-data, source code, and test results from fine-grained, automated builds over 9 years of development history. We provide reproducible baselines for initial comparisons and make all data publicly available. We see this as a step towards better reproducibility, ecological validity, and long-term availability of studied software in the field of test prioritization.

Talk and the respective paper are published at the MSR 2020 virtual conference.

