06/12/2021

UniDoc: Unified Pretraining Framework for Document Understanding

Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Nikolaos Barmpalios, Ani Nenkova, Tong Sun

Keywords: self-supervised learning, transformers

Abstract: Document intelligence automates the extraction of information from documents and supports many business applications. Recent self-supervised learning methods trained on large-scale unlabeled document datasets have opened promising directions for reducing annotation effort. However, most existing document pretraining methods are still language-dominated. We present UniDoc, a new unified pretraining framework for document understanding. UniDoc is designed to support most document understanding tasks, extending the Transformer to take multimodal embeddings as input. Each input element is composed of words and visual features from a semantic region of the input document image. An important feature of UniDoc is that it learns a generic representation by making use of three self-supervised losses, encouraging the representation to model sentences, learn similarities, and align modalities. Extensive empirical analysis demonstrates that the pretraining procedure learns better joint representations and leads to improvements in downstream tasks.
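To make the multimodal input construction concrete, below is a minimal PyTorch sketch of how word embeddings might be fused with the visual features of each word's enclosing region before entering the Transformer, plus a weighted combination of the three pretraining losses. The class name, the 768/2048 dimensions, and the equal loss weights are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class MultimodalEmbedding(nn.Module):
    """Fuses word embeddings with the visual feature of the semantic
    region each word belongs to (hypothetical names and dimensions)."""
    def __init__(self, vocab_size=30522, dim=768, visual_dim=2048):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)
        # Project region-level visual features into the model dimension.
        self.visual_proj = nn.Linear(visual_dim, dim)

    def forward(self, token_ids, region_feats):
        # token_ids: (batch, seq_len)
        # region_feats: (batch, seq_len, visual_dim), i.e. each token is
        # paired with the feature of the region it appears in.
        return self.word_emb(token_ids) + self.visual_proj(region_feats)

def unidoc_loss(l_sentence, l_similarity, l_alignment, w=(1.0, 1.0, 1.0)):
    """Combined objective over the three self-supervised losses
    (sentence modeling, similarity, modality alignment); the weights
    here are placeholders."""
    return w[0] * l_sentence + w[1] * l_similarity + w[2] * l_alignment

# Usage: produce Transformer-ready multimodal token embeddings.
emb = MultimodalEmbedding()
tokens = torch.randint(0, 30522, (2, 16))
regions = torch.randn(2, 16, 2048)
x = emb(tokens, regions)  # (2, 16, 768)
```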

Published at NeurIPS 2021 (virtual conference).
