MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

04/07/2020

MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

Jiaao Chen, Zichao Yang, Diyi Yang

Keywords: Semi-Supervised Classification, text classification, data augmentation, supervision

Abstract Paper Similar Papers

Abstract: This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. TMix creates a large amount of augmented training samples by interpolating text in hidden space. Moreover, we leverage recent advances in data augmentation to guess low-entropy labels for unlabeled data, hence making them as easy to use as labeled data. By mixing labeled, unlabeled and augmented data, MixText significantly outperformed current pre-trained and fined-tuned models and other state-of-the-art semi-supervised learning methods on several text classification benchmarks. The improvement is especially prominent when supervision is extremely limited. We have publicly released our code at https://github.com/GT-SALT/MixText.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ACL 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

03/05/2021

$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Kibok Lee, Yian Zhu, Kihyuk Sohn and
Chun-Liang Li, Jinwoo Shin, Honglak Lee

Keywords Paper

self-supervised learning, unsupervised representation learning, data augmentation, MixUp, contrastive representation learning

0

0

0

0

5:04

16/11/2020

Local Additivity Based Data Augmentation for Semi-supervised NER

Jiaao Chen, Zhenghui Wang, Ran Tian and
Zichao Yang, Diyi Yang

Keywords Paper

named recognition, deep understanding, semi-supervised ner, entity learning

0

0

0

0

11:18

14/06/2020

TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning

Zhongjie Yu, Lin Chen, Zhongwei Cheng, Jiebo Luo

Keywords Paper

few-shot learning, semi-supervised learning, meta-learning

0

0

0

0

1:01

14/06/2020

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen and
Shai Mazor, Roee Litman

Keywords Paper

gan, semi-supervised, domain-adaptation, handwriting, generative, unlabeled, transfer learning, ocr, text, augmentation

0

0

0

0

1:01

06/12/2020

Unsupervised Data Augmentation for Consistency Training

Qizhe Xie, Zihang Dai, Eduard Hovy and
Thang Luong, Quoc V Le

Keywords Paper

0

0

0

0

3:29

16/11/2020

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords Paper

nlp applications, fine-tuning, meta-learning problem, supervised tasks

0

0

0

0

11:49

06/12/2020

Unsupervised Semantic Aggregation and Deformable Template Matching for Semi-Supervised Learning

Tao Han, Junyu Gao, Yuan Yuan, Qi Wang

Keywords Paper

0

0

0

0

3:22

18/07/2021

Self-supervised and Supervised Joint Training for Resource-rich Machine Translation

Yong Cheng, Wei Wang, Lu Jiang, Wolfgang Macherey

Keywords Paper

Applications, Natural Language Processing

0

0

0

0

5:21

22/11/2021

One-Shot Deep Model for End-to-End Multi-Person Activity Recognition

Shuhei Tarashima

Keywords Paper

Group Activity Recognition, Action Recognition, Multi-Object Tracking, Multi-task Learning

0

0

0

0

2:50

02/02/2021

SALNet: Semi-supervised Few-Shot Text Classification with Attention-based Lexicon Construction

Ju-Hyoung Lee, Sang-Ki Ko, Yo-Sub Han

Keywords Paper

0

0

0

0

15:28

16/11/2020

Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning

Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Keywords Paper

document-level translation, translations, document-level model, selection module

0

0

0

0

11:36

16/11/2020

Plug and Play Autoencoders for Conditional Text Generation

Florian Mai, Nikolaos Pappas, Ivan Montero and
Noah A. Smith, James Henderson

Keywords Paper

conditional tasks, style transfer, style tasks, text autoencoders

0

0

0

0

9:23

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

Keywords Paper

0

0

0

0

11:27

14/06/2020

Semi-Supervised Semantic Segmentation With Cross-Consistency Training

Yassine Ouali, Céline Hudelot, Myriam Tami

Keywords Paper

semantic segmentation, semi-supervised learning, consistency training, semi-supervised semantic segmentation

0

0

0

0

1:01

14/06/2020

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Keywords Paper

data augmentation, text recognition, joint training

0

0

0

0

0:59

16/11/2020

Partially-Aligned Data-to-Text Generation with Distant Supervision

Zihao Fu, Bei Shi, Wai Lam and
Lidong Bing, Zhiyuan Liu

Keywords Paper

data-to-text task, generation task, dataset problem, over-generation problem

0

0

0

0

11:58

16/11/2020

Cross-Thought for Sentence Encoder Pre-training

Shuohang Wang, Yuwei Fang, Siqi Sun and
Zhe Gan, Yu Cheng, Jingjing Liu, Jing Jiang

Keywords Paper

pre-training encoder, large-scale tasks, question answering, predicting words

0

0

0

0

12:06

16/11/2020

DAGA: Data Augmentation with a Generation Approach forLow-resource Tagging Tasks

Bosheng Ding, Linlin Liu, Lidong Bing and
Canasai Kruengkrai, Thien Hai Nguyen, Shafiq Joty, Luo Si, Chunyan Miao

Keywords Paper

machine learning, generalization, low-resource tasks, named recognition

0

0

0

0

11:09

14/06/2020

OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by learning to unfold

Mohamed Yousef, Tom E. Bishop

Keywords Paper

text recognition, weakly supervised, handwriting recognition, convolutional neural network fully convolutional, ctc

0

0

0

0

1:00

08/12/2020

Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks

Lichao Sun, Congying Xia, Wenpeng Yin and
Tingting Liang, Philip Yu, Lifang He

Keywords Paper

0

0

0

0

9:52

05/12/2020

Self-supervised learning for pairwise data refinement

Gustavo Hernandez Abrego, Bowen Liang, Wei Wang and
Zarana Parekh, Yinfei Yang, Yunhsuan Sung

Keywords Paper

0

0

0

0

15:17

02/02/2021

Learning a Few-shot Embedding Model with Contrastive Learning

Chen Liu, Yanwei Fu, Chengming Xu and
Siqian Yang, Jilin Li, Chengjie Wang, Li Zhang

Keywords Paper

0

0

0

0

15:02

03/05/2021

IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning

Manli Zhang, Jianhong Zhang, Zhiwu Lu and
Tao Xiang, Mingyu Ding, Songfang Huang

Keywords Paper

self-supervised learning, few-shot learning, episode-level pretext task

0

0

0

0

5:03

19/08/2021

ALaSca: an Automated approach for Large-Scale Lexical Substitution

Caterina Lacerra, Tommaso Pasini, Rocco Tripodi, Roberto Navigli

Keywords Paper

Natural Language Processing, Natural Language Semantics, Resources and Evaluation

0

0

0

0

14:27

06/12/2021

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

Junnan Li, Ramprasaath Selvaraju, Akhilesh Gotmare and
Shafiq Joty, Caiming Xiong, Steven Chu Hong Hoi

Keywords Paper

transformers, vision, representation learning

0

0

0

0

9:40

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords Paper

Cross-Lingual Classification, sentiment classification, unsupervised system, classification

0

0

0

0

12:23

05/01/2021

Few-Shot Learning via Feature Hallucination With Variational Inference

Qinxuan Luo, Lingfeng Wang, Jingguo Lv and
Shiming Xiang, Chunhong Pan

Keywords Paper

0

0

0

0

4:56

06/12/2020

DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs

yaxing wang, Lu Yu, Joost van de Weijer

Keywords Paper

Algorithms -> Online Learning, Optimization -> Stochastic Optimization

0

0

0

0

3:23

16/11/2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining

Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Keywords Paper

nlp tasks, fine-tuning, learning process, multi-domain tasks

0

0

0

0

9:58

04/07/2020

Towards Unsupervised Language Understanding and Generation by Joint Dual Learning

Shang-Yu Su, Chao-Wei Huang, Yun-Nung Chen

Keywords Paper

Unsupervised Understanding, Unsupervised Generation, natural understanding, natural generation

0

0

0

0

8:23

02/02/2021

Deterministic Mini-batch Sequencing for Training Deep Neural Networks

Subhankar Banerjee, Shayok Chakraborty

Keywords Paper

0

0

0

0

16:00

07/09/2020

Meta-RetinaNet for Few-shot Object Detection

Shaoqi Li, Wenfeng Song, Shuai Li and
Aimin Hao, Hong Qin

Keywords Paper

Few shot, object detection, meta-learning, Meta-RetinaNet, Balanced Loss, coefficient vector

0

0

0

0

8:51

04/07/2020

A Transformer-based Approach for Source Code Summarization

Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Keywords Paper

Source Summarization, summarization, ablation studies, Transformer-based Approach

0

0

0

0

6:14

16/11/2020

KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation

Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang

Keywords Paper

data-to-text generation, data-to-text tasks, fully-supervised setting, pre-training learning

0

0

0

0

11:10

04/07/2020

IMoJIE: Iterative Memory-Based Joint Open Information Extraction

Keshav Kolluru, Samarth Aggarwal, Vipul Rathore and
Mausam -, Soumen Chakrabarti

Keywords Paper

Iterative Extraction, Open Extraction, IMoJIE, Iterative

0

0

0

0

9:31

16/11/2020

CSP:Code-Switching Pre-training for Neural Machine Translation

Zhen Yang, Bojie Hu, Ambyera Han and
Shen Huang, Qi Ju

Keywords Paper

neural nmt, lexicon induction, unsupervised nmt, pre-training method

0

0

0

0

10:10

02/02/2021

Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation

Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao

Keywords Paper

0

0

0

0

14:20

19/08/2021

Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation

Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu and
Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang

Keywords Paper

Natural Language Processing, Machine Translation, Natural Language Generation, NLP Applications and Tools

0

0

0

0

13:53

04/07/2020

Using Context in Neural Machine Translation Training Objectives

Danielle Saunders, Felix Stahlberg, Bill Byrne

Keywords Paper

Neural training, NMT training, document-level training, NMT objective

0

0

0

0

6:48

18/07/2021

Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification

Bo Pang, Ying Nian Wu

Keywords Paper

Algorithms, Unsupervised Learning

0

0

0

0

5:17