02/02/2021

Towards Balanced Defect Prediction with Better Information Propagation

Xianda Zheng, Yuan-Fang Li, Huan Gao, Yuncheng Hua, Guilin Qi

Keywords:

Abstract: Defect prediction, the task of predicting the presence of defects in source code artifacts, has broad application in software development. Defect prediction faces two major challenges, label scarcity, where only a small percentage of code artifacts are labeled, and data imbalance, where the majority of labeled artifacts are non-defective. Moreover, current defect prediction methods ignore the impact of information propagation among code artifacts and this negligence leads to performance degradation. In this paper, we propose DPCAG, a novel model to address the above three issues. We treat code artifacts as nodes in a graph, and learn to propagate influence among neighboring nodes iteratively in an EM framework. DPCAG dynamically adjusts the contributions of each node and selects high-confidence nodes for data augmentation. Experimental results on real-world benchmark datasets show that DPCAG improves performance compare to the state-of-the-art models. In particular, DPCAG achieves substantial performance superiority when measured by Matthews Correlation Coefficient (MCC), a metric that is widely acknowledged to be the most suitable for imbalanced data.

The video of this talk cannot be embedded. You can watch it here:
https://slideslive.com/38948795
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at AAAI 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers