Coding Textual Inputs Boosts the Accuracy of Neural Networks

16/11/2020

Abdul Rafae Khan, Jia Xu, Weiwei Sun

Keywords: natural tasks, nlp, neural-network-based systems, machine translation

Abstract: Natural Language Processing (NLP) tasks are usually performed word by word on textual inputs. We can use arbitrary symbols to represent the linguistic meaning of a word and use these symbols as inputs. As "alternatives" to a text representation, we introduce Soundex, Metaphone, NYSIIS, and logograms to NLP, and develop fixed-output-length coding and its extension using Huffman coding. Each of these codings combines different character/digit sequences and constructs a new vocabulary based on codewords. We find that integrating these codewords with text provides, through redundancy, more reliable inputs to neural-network-based NLP systems than text-alone inputs. Experiments demonstrate that our approach outperforms state-of-the-art models on machine translation, language modeling, and part-of-speech tagging. The source code is available at https://github.com/abdulrafae/coding_nmt.
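
To make the idea of a codeword input concrete, the following is a minimal sketch of one of the codings named in the abstract, standard American Soundex, written independently of the authors' repository. The `augment` helper and its `<sep>` token are hypothetical: the abstract does not say how codewords are integrated with the text, so appending them after a separator is only one plausible arrangement, shown here for illustration.

```python
# Standard American Soundex letter groups (vowels, h, w, y carry no digit).
SOUNDEX_DIGITS = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word: str) -> str:
    """Return the 4-character Soundex codeword of `word` ("" if no letters)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    code = first.upper()
    prev = SOUNDEX_DIGITS.get(first, "")
    for ch in letters[1:]:
        digit = SOUNDEX_DIGITS.get(ch, "")
        # Emit a digit only when it differs from the previous letter's digit.
        if digit and digit != prev:
            code += digit
        # 'h' and 'w' do not separate letters that share a digit; vowels do.
        if ch not in "hw":
            prev = digit
    # Pad with zeros, then truncate to one letter plus three digits.
    return (code + "000")[:4]


def augment(tokens, sep="<sep>"):
    """Hypothetical integration: append each token's codeword after `sep`."""
    return tokens + [sep] + [soundex(t) for t in tokens]


if __name__ == "__main__":
    print(soundex("Robert"), soundex("Rupert"))  # R163 R163
    print(augment(["Robert", "Rupert"]))
```

Words that sound alike, such as "Robert" and "Rupert", collapse to the same codeword, so the codeword sequence is a coarser, redundant view of the same sentence.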

This talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

19/04/2021

Generating syntactically controlled paraphrases without using annotated parallel pairs

Kuan-Hao Huang, Kai-Wei Chang

10:41

26/04/2020

Encoding word order in complex embeddings

Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, Jakob Grue Simonsen

Keywords: word embedding, complex-valued neural network, position embedding

4:51

16/11/2020

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

Weijia Xu, Batool Haider, Saab Mansour

Keywords: natural understanding, natural, nlu, goal-oriented systems

9:46

26/04/2020

A Latent Morphology Model for Open-Vocabulary Neural Machine Translation

Duygu Ataman, Wilker Aziz, Alexandra Birch

Keywords: neural machine translation, low-resource languages, latent-variable models

5:10

26/04/2020

Neural Machine Translation with Universal Visual Representation

Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Keywords: Neural Machine Translation, Visual Representation, Multimodal Machine Translation, Language Representation

4:50

16/11/2020

Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss”

Michele Bevilacqua, Marco Maru, Roberto Navigli

Keywords: generative modeling, definition modeling, discriminative tasks, word disambiguation

11:49

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu, Luxi Xing, Heng Yu, Weihua Luo

Keywords: hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

3:51

01/07/2020

Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages

Tyler A. Chang, Anna Rafferty

4:00

19/08/2021

Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation

Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu, Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang

Keywords: Natural Language Processing, Machine Translation, Natural Language Generation, NLP Applications and Tools

13:53

19/08/2021

Exemplification Modeling: Can You Give Me an Example, Please?

Edoardo Barba, Luigi Procopio, Caterina Lacerra, Tommaso Pasini, Roberto Navigli

Keywords: Natural Language Processing, Natural Language Semantics, Resources and Evaluation

14:47

01/07/2020

Neural Multi-task Text Normalization and Sanitization with Pointer-Generator

Hoang Nguyen, Sandro Cavallari

9:16

16/11/2020

With More Contexts Comes Better Performance: Contextualized Sense Embeddings for All-Round Word Sense Disambiguation

Bianca Scarlini, Tommaso Pasini, Roberto Navigli

Keywords: natural processing, english task, word-in-context task, contextualized embeddings

12:11

01/07/2020

Supertagging with CCG primitives

Aditya Bhargava, Gerald Penn

5:00

16/11/2020

Zero-Shot Crosslingual Sentence Simplification

Jonathan Mallinson, Rico Sennrich, Mirella Lapata

Keywords: sentence simplification, translation, simplification, encoder-decoder models

10:34

03/05/2021

Rethinking Positional Encoding in Language Pre-training

Guolin Ke, Di He, Tie-Yan Liu

Keywords: Natural Language Processing, Pre-training

4:49

02/02/2021

C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

Yutai Hou, Sanyuan Chen, Wanxiang Che, Cheng Chen, Ting Liu

15:01

04/07/2020

Learning Source Phrase Representations for Neural Machine Translation

Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang

Keywords: Neural Translation, WMT tasks, Learning Representations, Transformer model

7:18

12/07/2020

Recurrent Hierarchical Topic-Guided RNN for Language Generation

Dandan Guo, Bo Chen, Ruiying Lu, Mingyuan Zhou

Keywords: Probabilistic Inference - Models and Probabilistic Programming

16:05

02/02/2021

Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance

Guanhua Chen, Yun Chen, Victor O.K. Li

15:33

02/02/2021

Object Relation Attention for Image Paragraph Captioning

Li-Chuan Yang, Chih-Yuan Yang, Jane Yung-jen Hsu

15:03

03/05/2021

Structured Prediction as Translation between Augmented Natural Languages

Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

Keywords: sequence to sequence, structured prediction, language models, transfer learning, few-shot learning, multi-task learning, generative modeling

12:16

26/04/2020

Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention

Chen Zhao, Chenyan Xiong, Corby Rosset, Xia Song, Paul Bennett, Saurabh Tiwary

Keywords: Transformer-XH, multi-hop QA, fact verification, extra hop attention, structured modeling

5:03

16/11/2020

Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT

Rik van Noord, Antonio Toral, Johan Bos

Keywords: discourse parsing, analysis, character-level representations, character representations

11:26

04/07/2020

Neural Syntactic Preordering for Controlled Paraphrase Generation

Tanya Goyal, Greg Durrett

Keywords: Controlled Generation, Paraphrasing sentences, machine translation, Neural Preordering

11:37

02/02/2021

LIREx: Augmenting Language Inference with Relevant Explanations

Xinyan Zhao, V.G. Vinod Vydiswaran

18:56

01/07/2020

A Cross-Task Analysis of Text Span Representations

Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel

5:27

16/11/2020

Uncertainty-Aware Semantic Augmentation for Neural Machine Translation

Xiangpeng Wei, Heng Yu, Yue Hu, Rongxiang Weng, Luxi Xing, Weihua Luo

Keywords: sequence-to-sequence task, nmt, inference, translation tasks

11:11

16/11/2020

T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack

Boxin Wang, Hengzhi Pei, Boyuan Pan, Qian Chen, Shuohang Wang, Bo Li

Keywords: adversarial generation, nlp tasks, sentiment analysis, qa

11:59

16/11/2020

A Bilingual Generative Transformer for Semantic Sentence Embedding

John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick

Keywords: source separation, semantic encoding, data distributions, unsupervised evaluations

14:32

02/02/2021

Guiding Non-Autoregressive Neural Machine Translation Decoding with Reordering Information

Qiu Ran, Yankai Lin, Peng Li, Jie Zhou

14:24

04/07/2020

Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order

Yi Liao, Xin Jiang, Qun Liu

Keywords: Autoregressive Generation, natural tasks, natural generation, natural NLG

12:11

04/07/2020

On the Linguistic Representational Power of Neural Machine Translation Models

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass

Keywords: Linguistic Models, natural processing, artificial intelligence, translating languages

19:17

08/12/2020

CharBERT: Character-aware Pre-trained Language Model

Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu

14:20

19/04/2021

NLQuAD: A non-factoid long question answering data set

Amir Soleimani, Christof Monz, Marcel Worring

9:01

06/12/2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiangyang Li, Edward Lin, Tie-Yan Liu

13:44

04/07/2020

Self-Attention with Cross-Lingual Position Representation

Liang Ding, Longyue Wang, Dacheng Tao

Keywords: natural tasks, WMT'17 tasks, Cross-Lingual Representation, Position encoding

7:46

16/11/2020

Asking without Telling: Exploring Latent Ontologies in Contextual Representations

Julian Michael, Jan A. Botha, Ian Tenney

Keywords: pretrained encoders, elmo, bert, latent learning

12:45

16/11/2020

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords: language-agnostic retrieval, cross-lingual tasks, cross-lingual retrieval, alignment

12:07

19/08/2021

Keep the Structure: A Latent Shift-Reduce Parser for Semantic Parsing

Yuntao Li, Bei Chen, Qian Liu, Yan Gao, Jian-Guang Lou, Yan Zhang, Dongmei Zhang

Keywords: Natural Language Processing, Natural Language Semantics

12:37

02/02/2021

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps

Qi Zhu, Chenyu Gao, Peng Wang, Qi Wu

15:58