A machine learning approach for vulnerability curation

29/06/2020

Yang Chen, Andrew E. Santosa, Ang Ming Yi, Abhishek Sharma, Asankhaya Sharma, David Lo

Keywords: machine learning, open-source software, application security, classifiers ensemble, self-training

Abstract: Software composition analysis depends on a database of open-source library vulnerabilities, curated by security researchers using various sources, such as bug tracking systems, commits, and mailing lists. We report the design and implementation of a machine learning system to help the curation by automatically predicting the vulnerability-relatedness of each data item. It supports a complete pipeline from data collection, model training and prediction, to the validation of new models before deployment. It is executed iteratively to generate better models as new input data become available. We use self-training to significantly and automatically increase the size of the training dataset, opportunistically maximizing the improvement in the models’ quality at each iteration. We devised a new deployment stability metric to evaluate the quality of the new models before deployment into production, which helped to discover an error. We experimentally evaluate the improvement in the performance of the models in one iteration, with 27.59

The talk and the respective paper were published at the MSR 2020 virtual conference.

Similar Papers

29/06/2020

Using others’ tests to identify breaking updates

Suhaib Mujahid, Rabe Abdalkareem, Emad Shihab, Shane McIntosh

Keywords: Software Testing, Software Quality, Node.js, Empirical Studies, Software Ecosystems, JavaScript

03/05/2021

Generating Adversarial Computer Programs using Optimized Obfuscations

Shashank Srikant, Sijia Liu, Tamara Mitrovska, Shiyu Chang, Quanfu Fan, Gaoyuan Zhang, Una-May O'Reilly

Keywords: Models for code, Differentiable program generator, Combinatorial optimization, Program obfuscation, Adversarial computer programs, Machine Learning (ML) for Programming Languages (PL)/Software Engineering (SE)

18/11/2020

Learning code changes by exploiting bidirectional converting deviation

Jia-Wei Mi, Shu-Ting Shi, Ming Li

16/11/2020

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Keywords: bootstrapping, fine-tuning parsers, theoretical analysis, text-to-sql problem

02/02/2021

Transfer Learning for Efficient Iterative Safety Validation

Anthony Corso, Mykel J. Kochenderfer

06/12/2021

Self-Supervised Bug Detection and Repair

Miltiadis Allamanis, Henry Jackson-Flux, Marc Brockschmidt

Keywords: machine learning, self-supervised learning

26/08/2020

Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery

Zepeng Huo, Arash PakBin, Xiaohan Chen, Nathan Hurley, Ye Yuan, Xiaoning Qian, Zhangyang Wang, Shuai Huang, Bobak Mortazavi

25/04/2020

Building and Validating a Scale for Secure Software Development Self-Efficacy

Daniel Votipka, Desiree Abrokwa, Michelle Mazurek

Keywords: secure development, scale development

03/05/2021

Latent Skill Planning for Exploration and Transfer

Kevin Xie, Homanga Bharadhwaj, Danijar Hafner, Animesh Garg, Florian Shkurti

Keywords: Partial Amortization, Model Predictive Control, Planning, Mutual Information, Skill Discovery, World Models, Model-Based Reinforcement Learning

06/12/2020

Probabilistic Active Meta-Learning

Jean Kaddour, Steindor Saemundsson, Marc Deisenroth

05/04/2021

Amazon SageMaker Debugger: A System for Real-Time Insights into Machine Learning Model Training

Nathalie Rauschmayr, Vikas Kumar, Rahul Huilgol, Andrea Olgiati, Satadal Bhattacharjee, Nihal Harish, Vandana Kannan, Amol Lele, Anirudh Acharya, Jared Nielsen, Lakshmi Ramakrishnan, Ishan Bhatt, Kohen Chia, Neelesh Dodda, Zhihan Li, Jiacheng Gu, Miyoung Choi, Balajee Nagarajan Nagarajan, Jeffrey Geevarghese, Denis Davydenko, Sifei Li, Lu Huang, Edward Kim, Tyler Hill, Krishnaram Kenthapadi

02/02/2021

Self-Progressing Robust Training

Minhao Cheng, Pin-Yu Chen, Sijia Liu, Shiyu Chang, Cho-Jui Hsieh, Payel Das

06/12/2021

Meta-Learning the Search Distribution of Black-Box Random Search Based Adversarial Attacks

Maksym Yatsura, Jan Metzen, Matthias Hein

Keywords: robustness, adversarial robustness and security, meta learning

04/07/2020

A Transformer-based Approach for Source Code Summarization

Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Keywords: Source Summarization, summarization, ablation studies, Transformer-based Approach

06/12/2021

Encoding Robustness to Image Style via Adversarial Feature Perturbations

Manli Shu, Zuxuan Wu, Micah Goldblum, Tom Goldstein

Keywords: deep learning, machine learning, robustness, adversarial robustness and security, domain adaptation

18/07/2021

Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment

Philip Ball, Cong Lu, Jack Parker-Holder, Stephen Roberts

Keywords: Reinforcement Learning and Planning

22/09/2020

BETA-rec: Build, evaluate and tune automated recommender systems

Zaiqiao Meng, Richard McCreadie, Craig Macdonald, Iadh Ounis, Siwei Liu, Yaxiong Wu, Xi Wang, Shangsong Liang, Yucheng Liang, Guangtao Zeng, Junhua Liang, Qiang Zhang

Keywords: Recommender Systems, Open-source, Toolkit, Framework

12/07/2020

Graph-based, Self-Supervised Program Repair from Diagnostic Feedback

Michihiro Yasunaga, Percy Liang

Keywords: Applications - Language, Speech and Dialog

03/05/2021

Large Batch Simulation for Deep Reinforcement Learning

Brennan Shacklett, Erik Wijmans, Aleksei Petrenko, Manolis Savva, Dhruv Batra, Vladlen Koltun, Kayvon Fatahalian

Keywords: reinforcement learning, simulation

18/07/2021

Model Performance Scaling with Multiple Data Sources

Tatsunori Hashimoto

Keywords: Algorithms, Supervised Learning

06/12/2020

Submodular Meta-Learning

Arman Adibi, Aryan Mokhtari, Hamed Hassani

02/02/2021

Towards Balanced Defect Prediction with Better Information Propagation

Xianda Zheng, Yuan-Fang Li, Huan Gao, Yuncheng Hua, Guilin Qi

18/07/2021

Improved OOD Generalization via Adversarial Training and Pretraining

Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhiming Ma

Keywords: Theory, Deep learning Theory

12/09/2020

On Tractable XAI Queries based on Compiled Representations

Gilles Audemard, Frédéric Koriche, Pierre Marquis

Keywords: Explainable AI-General, Knowledge representation languages-General

25/07/2020

On understanding data worker interaction behaviors

Lei Han, Tianwa Chen, Gianluca Demartini, Marta Indulska, Shazia Sadiq

Keywords: interaction behavior, search pattern, data curation

12/07/2020

Optimizing Data Usage via Differentiable Rewards

Xinyi Wang, Hieu Pham, Paul Michel, Antonios Anastasopoulos, Jaime Carbonell, Graham Neubig

Keywords: Deep Learning - Algorithms

03/05/2021

Parrot: Data-Driven Behavioral Priors for Reinforcement Learning

Avi Singh, Huihan Liu, Gaoyue Zhou, Albert Yu, Nicholas Rhinehart, Sergey Levine

Keywords: reinforcement learning, imitation learning

06/12/2021

Training Neural Networks with Fixed Sparse Masks

Yi-Lin Sung, Varun Nair, Colin Raffel

Keywords: deep learning, transfer learning

29/06/2020

RTPTorrent: An open-source dataset for evaluating regression test prioritization

Toni Mattis, Patrick Rein, Falco Dürsch, Robert Hirschfeld

Keywords: Regression Test Prioritization, Dataset, Java, GitHub, TravisCI

29/06/2020

Capture the feature flag: Detecting feature flags in open-source

Jens Meinicke, Juan Hoyos, Bogdan Vasilescu, Christian Kästner

23/06/2021

RbSyn: Type- and Effect-Guided Program Synthesis

Sankha Narayan Guria, Jeffrey S. Foster, David Van Horn

Keywords: program synthesis, type and effect systems, Ruby

19/10/2020

ReQue: A configurable workflow and dataset collection for query refinement

Mahtab Tamannaee, Hossein Fani, Fattane Zarrinkalam, Jamil Samouh, Samad Paydar, Ebrahim Bagheri

Keywords: gold standard dataset, query refinement, reproducibility

18/07/2021

Scalable Normalizing Flows for Permutation Invariant Densities

Marin Biloš, Stephan Günnemann

Keywords: Deep Learning, Generative Models

25/04/2020

TRACTUS: Understanding and Supporting Source Code Experimentation in Hypothesis-Driven Data Science

Krishna Subramanian, Johannes Maas, Jan Borchers

Keywords: data science, programming ide, exploratory programming, information visualization, observational study

06/12/2020

Conditioning and Processing: Techniques to Improve Information-Theoretic Generalization Bounds

Hassan Hafez-Kolahi, Zeinab Golgooni, Shohreh Kasaei, Mahdieh Soleymani

04/07/2020

Simple and Effective Retrieve-Edit-Rerank Text Generation

Nabil Hossain, Marjan Ghazvininejad, Luke Zettlemoyer

Keywords: Retrieve-Edit-Rerank Generation, candidate selection, Retrieve-and-edit methods, post-generation approach

12/08/2020

Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

Ahmed Salem, Apratim Bhattacharya, Michael Backes, Mario Fritz, Yang Zhang

16/11/2020

AxCell: Automatic Extraction of Results from Machine Learning Papers

Marcin Kardas, Piotr Czapla, Pontus Stenetorp, Sebastian Ruder, Sebastian Riedel, Ross Taylor, Robert Stojnic

Keywords: machine learning, table subtask, extraction, results extraction

06/12/2020

Information-theoretic Task Selection for Meta-Reinforcement Learning

Ricardo Luna Gutierrez, Matteo Leonetti
