TNT: Text Normalization based Pre-training of Transformers for Content Moderation

16/11/2020

Fei Tan, Yifan Hu, Changwei Hu, Keqian Li, Kevin Yen

Keywords: content moderation, text manipulation, masked recovery, hate task

Abstract: In this work, we present TNT (Text Normalization based pre-training of Transformers), a new language pre-training model for content moderation. Inspired by the masking strategy and text normalization, TNT learns language representations by training transformers to reconstruct text corrupted by four operation types typically seen in text manipulation: substitution, transposition, deletion, and insertion. Furthermore, the normalization involves predicting both operation types and token labels, which lets TNT learn from more challenging tasks than the standard task of masked word recovery. Experiments demonstrate that TNT outperforms strong baselines on the hate speech classification task. Additional text normalization experiments and case studies suggest that TNT is a potential new approach to misspelling correction.
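
The four operation types named in the abstract can be illustrated with a small data-corruption sketch. This is only an illustration under stated assumptions, not the authors' implementation: it assumes character-level edits inside whitespace-separated tokens, and the names `OPS`, `corrupt`, `make_example`, and the `rate` parameter are hypothetical. Each corrupted token is paired with an operation-type label and the original token as its recovery label, mirroring the two prediction targets the abstract mentions.

```python
import random
import string

# The four manipulation types listed in the abstract.
OPS = ("substitution", "transposition", "deletion", "insertion")


def corrupt(word: str, op: str, pos: int, char: str = "*") -> str:
    """Apply one character-level manipulation to `word` at index `pos`.

    Assumption: edits are made on characters inside a token.
    """
    if op == "substitution":   # replace the character at pos
        return word[:pos] + char + word[pos + 1:]
    if op == "transposition":  # swap the characters at pos and pos + 1
        return word[:pos] + word[pos + 1] + word[pos] + word[pos + 2:]
    if op == "deletion":       # drop the character at pos
        return word[:pos] + word[pos + 1:]
    if op == "insertion":      # insert char before pos
        return word[:pos] + char + word[pos:]
    raise ValueError(f"unknown operation: {op}")


def make_example(tokens, rng, rate=0.15):
    """Return (noisy_tokens, op_labels, token_labels) for one sentence.

    `rate` is a hypothetical corruption probability per token.
    """
    noisy, op_labels, token_labels = [], [], []
    for tok in tokens:
        if len(tok) >= 2 and rng.random() < rate:
            op = rng.choice(OPS)
            pos = rng.randrange(len(tok) - 1)  # valid index for all four ops
            new_char = rng.choice(string.ascii_lowercase)
            noisy.append(corrupt(tok, op, pos, new_char))
            op_labels.append(op)
        else:
            noisy.append(tok)
            op_labels.append("none")
        token_labels.append(tok)  # recovery target is always the clean token
    return noisy, op_labels, token_labels


if __name__ == "__main__":
    rng = random.Random(0)
    print(make_example("this is a harmless sentence".split(), rng, rate=0.5))
```

A model trained on such pairs would take the noisy tokens as input and predict both label sequences; how TNT actually samples positions and combines the two losses is not stated in the abstract.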

The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

19/04/2021

Adv-OLM: Generating textual adversaries via OLM

Vijit Malik, Ashwani Bhat, Ashutosh Modi

7:04

16/11/2020

Attention is Not Only a Weight: Analyzing Transformers with Vector Norms

Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

Keywords: natural processing, norm-based analyses, word alignment, transformers

11:51

16/11/2020

Assessing Phrasal Representation and Composition in Transformers

Lang Yu, Allyson Ettinger

Keywords: nlp tasks, systematic representations, deep models, phrasal representations

11:33

25/07/2020

Learning discriminative joint embeddings for efficient face and voice association

Rui Wang, Xin Liu, Yiu-ming Cheung, Kai Cheng, Nannan Wang, Wentao Fan

Keywords: bi-directional ranking constraint, face-voice association, cross-modal verification, discriminative joint embedding

8:33

03/05/2021

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords: adversarial training, QA, NLI, BERT, information theory, adversarial robustness

5:21

06/12/2021

ProTo: Program-Guided Transformer for Program-Guided Tasks

Zelin Zhao, Karan Samel, Binghong Chen, Le Song

Keywords: transformers

8:24

04/07/2020

Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering

Changmao Li, Jinho D. Choi

Keywords: Span-based Answering, language tasks, token- modeling, utterance prediction

4:48

25/07/2020

Think beyond the word: Understanding the implied textual meaning by digesting context, local, and noise

Guoxiu He, Zhe Gao, Zhuoren Jiang, Yangyang Kang, Changlong Sun, Xiaozhong Liu, Wei Lu

Keywords: deep neural networks, text classification, semantic representation, implied textual meaning

19:57

03/05/2021

Random Feature Attention

Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah Smith, Lingpeng Kong

Keywords: machine translation, transformers, language modeling, Attention

10:20

04/07/2020

Span Selection Pre-training for Question Answering

Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avi Sil

Keywords: Question Answering, language tasks, Next Prediction, pre-training task

13:16

08/12/2020

ContraCAT: Contrastive Coreference Analytical Templates for Machine Translation

Dario Stojanovski, Benno Krojer, Denis Peskov, Alexander Fraser

14:09

06/12/2021

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

Yufei Xu, Qiming Zhang, Jing Zhang, Dacheng Tao

Keywords: machine learning, transformers, vision

10:16

19/04/2021

ENPAR: enhancing entity and entity pair representations for joint entity relation extraction

Yijun Wang, Changzhi Sun, Yuanbin Wu, Hao Zhou, Lei Li, Junchi Yan

7:23

03/05/2021

HyperGrid Transformers: Towards A Single Model for Multiple Tasks

Yi Tay, Zhe Zhao, Dara Bahri, Donald Metzler, Da-Cheng Juan

Keywords: Transformers, Multi-Task Learning

5:14

05/12/2020

Unsupervised KB-to-text generation with auxiliary triple extraction using dual learning

Zihao Fu, Bei Shi, Lidong Bing, Wai Lam

15:01

19/04/2021

Enconter: Entity constrained progressive sequence generation via insertion-based transformer

Lee Hsun Hsieh, Yang-Yin Lee, Ee-Peng Lim

11:28

12/07/2020

Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon

Keywords: Applications - Language, Speech and Dialog

13:55

16/11/2020

Learning a natural-language to LTL executable semantic parser for grounded robotics

Christopher Wang, Candace Ross, Yen-Ling Kuo, Boris Katz, Andrei Barbu

5:01

16/11/2020

Partially-Aligned Data-to-Text Generation with Distant Supervision

Zihao Fu, Bei Shi, Wai Lam, Lidong Bing, Zhiyuan Liu

Keywords: data-to-text task, generation task, dataset problem, over-generation problem

11:58

02/02/2021

Segatron: Segment-Aware Transformer for Language Modeling and Understanding

He Bai, Peng Shi, Jimmy Lin, Yuqing Xie, Luchen Tan, Kun Xiong, Wen Gao, Ming Li

13:39

16/11/2020

Adversarial Attack and Defense of Structured Prediction Models

Wenjuan Han, Liwen Zhang, Yong Jiang, Kewei Tu

Keywords: adversarial attacks, classification problems, structured tasks, nlp tasks

11:06

16/11/2020

On the Ability and Limitations of Transformers to Recognize Formal Languages

Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Keywords: nlp tasks, construction, transformers, lstms

11:27

06/12/2021

Grounding Spatio-Temporal Language with Transformers

Tristan Karch, Laetitia Teodorescu, Katja Hofmann, Clément Moulin-Frier, Pierre-Yves Oudeyer

Keywords: transformers

7:25

16/11/2020

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords: language-agnostic retrieval, cross-lingual tasks, cross-lingual retrieval, alignment

12:07

06/12/2020

Language Through a Prism: A Spectral Approach for Multiscale Language Representations

Alex Tamkin, Dan Jurafsky, Noah Goodman

3:34

04/07/2020

Distilling Knowledge Learned in BERT for Text Generation

Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu

Keywords: Text Generation, language tasks, language generation, generation tasks

10:41

12/07/2020

Working Memory Graphs

Ricky Loynd, Roland Fernandez, Asli Celikyilmaz, Adith Swaminathan, Matthew Hausknecht

Keywords: Reinforcement Learning - Deep RL

15:36

03/05/2021

Rethinking Positional Encoding in Language Pre-training

Guolin Ke, Di He, Tie-Yan Liu

Keywords: Natural Language Processing, Pre-training

4:49

06/12/2020

MPNet: Masked and Permuted Pre-training for Language Understanding

Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu

3:23

08/12/2020

Bayesian Methods for Semi-supervised Text Annotation

Kristian Miok, Gregor Pirs, Marko Robnik-Sikonja

11:18

02/02/2021

Nyströmformer: A Nyström-based Algorithm for Approximating Self-Attention

Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh

17:26

04/07/2020

Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation

Junliang Guo, Linli Xu, Enhong Chen

Keywords: Non-Autoregressive Translation, natural tasks, non-autoregressive translation (NAT), non-autoregressive

10:47

03/05/2021

Semantic Re-tuning with Contrastive Tension

Fredrik Carlsson, Amaru C Gyllensten, Evangelia Gogoulou, Erik Y Hellqvist, Magnus Sahlgren

Keywords: Fine-tuning, Pre-training, Sentence Representations, Sentence Embeddings, Language Modelling, Semantic Textual Similarity, Transformers

5:07

16/11/2020

PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation

Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

Keywords: natural generation, language tasks, generative answering, conversational generation

11:02

16/11/2020

Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

Sanyuan Chen, Yutai Hou, Yiming Cui, Wanxiang Che, Ting Liu, Xiangzhan Yu

Keywords: pretraining, pretraining tasks, learning tasks, fine-tuning bert-large

10:52

08/12/2020

SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP

Katsuki Chousa, Masaaki Nagata, Masaaki Nishino

14:39

02/02/2021

Inverse Reinforcement Learning with Natural Language Goals

Li Zhou, Kevin Small

19:59

08/12/2020

CharBERT: Character-aware Pre-trained Language Model

Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu

14:20

16/11/2020

Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation

Minki Kang, Moonsu Han, Sung Ju Hwang

Keywords: self-supervised pre-training, question answering, task, reinforcement learning

12:00

02/02/2021

Learning to Attack Real-World Models for Person Re-identification via Virtual-Guided Meta-Learning

Fengxiang Yang, Zhun Zhong, Hong Liu, Zheng Wang, Zhiming Luo, Shaozi Li, Nicu Sebe, Shin'ichi Satoh

14:19