Improved automatic summarization of subroutines via attention to file context

29/06/2020

Sakib Haque, Alexander LeClair, Lingfei Wu, Collin McMillan

Keywords: neural networks, natural language processing, documentation generation, source code summarization, artificial intelligence

Abstract: Software documentation largely consists of short, natural language summaries of the subroutines in the software. These summaries help programmers quickly understand what a subroutine does without having to read the source code themselves. The task of writing these descriptions is called "source code summarization" and has been a target of research for several years. Recently, AI-based approaches have superseded older, heuristic-based approaches. Yet, to date, these AI-based approaches assume that all the content needed to predict summaries is inside the subroutine itself. This assumption limits performance because many subroutines cannot be understood without surrounding context. In this paper, we present an approach that models the file context of subroutines (i.e., other subroutines in the same file) and uses an attention mechanism to find words and concepts to use in summaries. We show in an experiment that our approach extends and improves several recent baselines.

The talk and the respective paper were published at the MSR 2020 virtual conference.

Similar Papers

19/08/2021

Graph-Augmented Code Summarization in Computational Notebooks

April Wang, Dakuo Wang, Xuye Liu, Lingfei Wu

Keywords: Natural Language Processing, General, General

9:06

15/06/2020

SCAF: A speculation-aware collaborative dependence analysis framework

Sotiris Apostolakis, Ziyang Xu, Zujun Tan, Greg Chan, Simone Campanoni, David I. August

Keywords: speculation, collaboration, dependence analysis

16:16

06/12/2021

Pipeline Combinators for Gradual AutoML

Guillaume Baudart, Martin Hirzel, Kiran Kate, Parikshit Ram, Avi Shinnar, Jason Tsay

Keywords: machine learning

14:25

06/12/2021

Neural Program Generation Modulo Static Analysis

Rohan Mukherjee, Yeming Wen, Dipak Chaudhari, Thomas Reps, Swarat Chaudhuri, Christopher Jermaine

Keywords: deep learning, transformers, generative model

14:58

29/06/2020

RTPTorrent: An open-source dataset for evaluating regression test prioritization

Toni Mattis, Patrick Rein, Falco Dürsch, Robert Hirschfeld

Keywords: Regression Test Prioritization, Dataset, Java, GitHub, TravisCI

14:57

15/11/2020

Designing Types for R, Empirically

Alexi Turcotte, Aviral Goel, Filip Křikava, Jan Vitek

Keywords: R, dynamic languages, type declarations

16:04

19/01/2020

Partial Type Constructors: Or, Making Ad Hoc Datatypes Less Ad Hoc

Mark Jones, J. Garrett Morris, Richard A. Eisenberg

Keywords: Type constructors, Parametric polymorphism

21:37

15/11/2020

Programming with a Read-Eval-Synth Loop

Hila Peleg, Roi Gabay, Shachar Itzhaky, Eran Yahav

Keywords: read-eval-print loops, specification predicates, program synthesis

13:08

03/05/2021

Language-Agnostic Representation Learning of Source Code from Structure and Context

Daniel Zügner, Tobias Kirschstein, Michele Catasta, Jure Leskovec, Stephan Günnemann

Keywords: code summarization, machine learning for code

4:34

29/06/2020

An empirical study on the impact of deimplicitization on comprehension in programs using application frameworks

Jürgen Cito, Jiasi Shen, Martin Rinard

4:27

05/04/2021

A Deep Learning Based Cost Model for Automatic Code Optimization

Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham LEGHETTAS, Kamel Abdous, Taha Arbaoui, Karima BENATCHBA, Saman Amarasinghe

20:18

05/04/2021

A Deep Learning Based Cost Model for Automatic Code Optimization

Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham LEGHETTAS, Kamel Abdous, Taha Arbaoui, Karima BENATCHBA, Saman Amarasinghe

5:18

25/04/2020

ScrAPIr: Making Web Data APIs Accessible to End Users

Tarfah Alrashed, Jumana Almahmoud, Amy Zhang, David Karger

Keywords: web apis, api description language, web scraping

13:20

12/07/2020

Learning and Evaluating Contextual Embedding of Source Code

Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi

Keywords: Representation Learning

12:51

18/07/2021

TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer

Berkay Berabi, Jingxuan He, Veselin Raychev, Martin Vechev

Keywords: Applications

5:27

29/06/2020

Embedding java classes with Code2vec: Improvements from variable obfuscation

Rhys Compton, Eibe Frank, Panos Patros, Abigail Koay

Keywords: code2vec, machine learning, code obfuscation, source code, neural networks

14:20

15/11/2020

Digging for Fold: Synthesis-Aided API Discovery for Haskell

Michael B. James, Zheng Guo, Ziteng Wang, Shivani Doshi, Hila Peleg, Ranjit Jhala, Nadia Polikarpova

Keywords: Program Synthesis, Type Inference, Human-Computer Interaction

16:01

03/05/2021

BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration

Augustus Odena, Kensen Shi, David Bieber, Rishabh Singh, Charles Sutton, Hanjun Dai

Keywords: Program Synthesis

10:26

02/02/2021

Code Completion by Modeling Flattened Abstract Syntax Trees as Graphs

Yanlin Wang, Hui Li

14:31

02/02/2021

Deep Just-In-Time Inconsistency Detection Between Comments and Source Code

Sheena Panthaplackel, Junyi Jessy Li, Milos Gligoric, Raymond J. Mooney

18:05

19/10/2020

Feature extraction for large-scale text collections

Luke Gallagher, Antonio Mallia, J. Shane Culpepper, Torsten Suel, B. Barla Cambazoglu

Keywords: clueweb, feature index, feature extraction, feature repository, lambdamart, ltr, learning to rank, feature importance

9:41

06/12/2021

PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair

Zimin Chen, Vincent J Hellendoorn, Pascal Lamblin, Petros Maniatis, Pierre-Antoine Manzagol, Daniel Tarlow, Subhodeep Moitra

Keywords: deep learning, machine learning, transformers, graph learning

5:59

23/06/2021

Incremental Whole-Program Analysis in Datalog with Lattices

Tamás Szabó, Sebastian Erdweg, Gábor Bergmann

Keywords: Static Analysis, Incremental Computing, Datalog

22:53

19/01/2020

Interaction Trees: Representing Recursive and Impure Programs in Coq

Li-yao Xia, Yannick Zakowski, Paul He, Chung-Kil Hur, Gregory Malecha, Benjamin C. Pierce, Steve Zdancewic

Keywords: coinduction, Coq, monads, compiler correctness

25:54

03/05/2021

Concept Learners for Few-Shot Learning

Kaidi Cao, Maria Brbic, Jure Leskovec

Keywords: few-shot learning, meta learning

4:55

19/01/2020

Trace Types and Denotational Semantics for Sound Programmable Inference in Probabilistic Languages

Alexander K. Lew, Marco Cusumano-Towner, Benjamin Sherman, Michael Carbin, Vikash Mansinghka

Keywords: Probabilistic programming, programmable inference, type systems

19:55

23/06/2021

CompCertO: Compiling Certified Open C Components

Jérémie Koenig, Zhong Shao

Keywords: Compositional Compiler Correctness, Game Semantics, Simulation Convention, Language Interface

24:57

16/11/2020

How Much Knowledge Can You Pack Into the Parameters of a Language Model?

Adam Roberts, Colin Raffel, Noam Shazeer

Keywords: fine-tuning models, neural models, open-domain systems, model size

7:31

15/06/2020

Semantic code search via equational reasoning

Varot Premtoon, James Koppel, Armando Solar-Lezama

Keywords: equational reasoning, code search

16:29

20/08/2020

Kinds are Calling Conventions

Paul Downen, Zena M. Ariola, Simon Peyton Jones, Richard A. Eisenberg

Keywords: representation, type systems, arity, levity, polymorphism

15:00

16/11/2020

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Keywords: bootstrapping, fine-tuning parsers, theoretical analysis, text-to-sql problem

11:49

15/11/2020

Precise Inference of Expressive Units of Measurement Types

Tongtong Xiang, Jeff Y. Luo, Werner Dietl

Keywords: Scientific computing, Pluggable type system, Dimensional analysis, Units of measurements, Type inference

13:39

29/06/2020

Visualization of methods changeability based on VCS data

Sergey Svitkov, Timofey Bryksin

4:34

06/12/2020

PyGlove: Symbolic Programming for Automated Machine Learning

Daiyi Peng, Xuanyi Dong, Esteban Real, Mingxing Tan, Yifeng Lu, Gabriel Bender, Hanxiao Liu, Adam Kraft, Chen Liang, Quoc V Le

3:17

03/05/2021

Generating Adversarial Computer Programs using Optimized Obfuscations

Shashank Srikant, Sijia Liu, Tamara Mitrovska, Shiyu Chang, Quanfu Fan, Gaoyuan Zhang, Una-May O'Reilly

Keywords: Models for code, Differentiable program generator, Combinatorial optimization, Program obfuscation, Adversarial computer programs, Machine Learning (ML) for Programming Languages (PL)/Software Engineering (SE)

6:27

04/07/2020

A Transformer-based Approach for Source Code Summarization

Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Keywords: Source Summarization, summarization, ablation studies, Transformer-based Approach

6:14

15/11/2020

A Systematic Approach to Deriving Incremental Type Checkers

André Pacak, Sebastian Erdweg, Tamás Szabó

Keywords: datalog, incremental type checking, type system transformation

16:35

29/06/2020

How often do single-statement bugs occur? The ManySStuBs4J dataset

Rafael-Michael Karampatsis, Charles Sutton

6:16

12/09/2020

High-level Programming via Generalized Planning and LTL Synthesis

Blai Bonet, Giuseppe De Giacomo, Hector Geffner, Fabio Patrizi, Sasha Rubin

Keywords: Reasoning about actions and change, action languages-General

12:34

15/11/2020

Feedback-Driven Semi-supervised Synthesis of Program Transformations

Xiang Gao, Shraddha Barke, Arjun Radhakrishna, Gustavo Soares, Sumit Gulwani, Alan Leung, Nachiappan Nagappan, Ashish Tiwari

Keywords: Program transformation, Program synthesis, Refactoring, Programming by Example

15:43