06/12/2021

Integrating Tree Path in Transformer for Code Representation

Han Peng, Ge Li, Wenhan Wang, Yunfei Zhao, Zhi Jin

Keywords: machine learning, transformers

Abstract: Learning distributed representations of source code requires modelling its syntax and semantics. Recent state-of-the-art models leverage highly structured source code representations, such as syntax trees and the paths therein. In this paper, we investigate two representative path encoding methods from previous work and integrate them into the attention module of the Transformer. Drawing inspiration from positional encoding, we modify it to incorporate these path encodings. Specifically, we encode both the pairwise path between tokens of source code and the path from each token's leaf node to the root of the syntax tree. We explore the interaction between these two kinds of paths by integrating them into a unified Transformer framework. Our detailed empirical study of path encoding methods leads to a novel state-of-the-art representation model, TPTrans, which outperforms strong baselines. Extensive experiments and ablation studies on code summarization across four different languages demonstrate the effectiveness of our approaches. We release our code at \url{https://github.com/AwdHanPeng/TPTrans}.
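To make the abstract's two ingredients concrete, here is a minimal PyTorch sketch of (a) encoding a tree path (a sequence of AST node types) into a vector with an RNN, and (b) injecting pairwise path encodings into attention as a relative bias, in the spirit of relative positional encoding. All names here (`PathEncoder`, `path_biased_attention`, `rel_path_enc`) are illustrative assumptions, not the actual TPTrans implementation; consult the released code for the authors' exact formulation.

```python
import math
import torch
import torch.nn as nn

class PathEncoder(nn.Module):
    """Encodes a tree path (sequence of AST node-type ids) into one vector with a GRU.
    Hypothetical sketch; names and details are not from the TPTrans repository."""
    def __init__(self, num_node_types: int, d_model: int):
        super().__init__()
        self.embed = nn.Embedding(num_node_types, d_model)
        self.gru = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, paths: torch.Tensor) -> torch.Tensor:
        # paths: (..., path_len) integer node-type ids along the path
        shape = paths.shape
        flat = paths.reshape(-1, shape[-1])            # (N, path_len)
        _, h = self.gru(self.embed(flat))              # h: (1, N, d_model)
        return h.squeeze(0).reshape(*shape[:-1], -1)   # (..., d_model)

def path_biased_attention(q, k, v, rel_path_enc):
    """Single-head attention with a relative path bias.

    q, k, v:       (batch, seq, d)
    rel_path_enc:  (batch, seq, seq, d), the encoding of the pairwise tree
                   path between token i and token j (e.g. from PathEncoder).
    """
    d = q.size(-1)
    content = torch.matmul(q, k.transpose(-2, -1))               # (b, s, s)
    # q_i . r_ij: dot each query with the encoding of its path to every key
    path_bias = torch.einsum('bid,bijd->bij', q, rel_path_enc)   # (b, s, s)
    attn = torch.softmax((content + path_bias) / math.sqrt(d), dim=-1)
    return torch.matmul(attn, v)

# The leaf-to-root path of each token can act like an absolute positional
# signal added to the input embedding (again, an assumed simplification):
#   x = token_embed(tokens) + root_path_encoder(root_paths)
#   where root_paths has shape (batch, seq, path_len).
```

Note that materializing all pairwise path encodings costs O(seq^2) memory per layer, so a practical implementation would batch, truncate, or cache paths; the released TPTrans code is the authoritative reference for how this is handled.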

The talk and the paper were presented at the NeurIPS 2021 virtual conference.
