The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

04/07/2020

Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao

Keywords: Natural Understanding, NLU tasks, classification, regression

Abstract: We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
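
The knowledge distillation mentioned in the abstract can be illustrated with a generic soft-target loss. The sketch below is not MT-DNN's API: the function names, the temperature value, and the simple averaging over tasks are all assumptions made for illustration, and it uses only the Python standard library rather than PyTorch.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's softened distribution against the teacher's."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


def multi_task_distillation_loss(pairs, temperature=2.0):
    """Average the soft-target loss over (student_logits, teacher_logits) pairs.

    Each pair is assumed to come from a different task (hypothetical setup).
    """
    losses = [soft_target_loss(s, t, temperature) for s, t in pairs]
    return sum(losses) / len(losses)
```

For a fixed teacher, the loss is smallest when the student's softened distribution matches the teacher's, which is what pushes a smaller student model toward the behavior of a larger teacher.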

The talk and the corresponding paper were published at the ACL 2020 virtual conference.

Similar Papers

04/07/2020

Generalizing Natural Language Analysis through Span-relation Representations

Zhengbao Jiang, Wei Xu, Jun Araki, Graham Neubig

Keywords: Natural Analysis, Natural processing, dependency parsing, semantic labeling

8:30

02/02/2021

A General Class of Transfer Learning Regression without Implementation Cost

Shunya Minami, Song Liu, Stephen Wu, Kenji Fukumizu, Ryo Yoshida

14:13

04/07/2020

SyntaxGym: An Online Platform for Targeted Evaluation of Language Models

Jon Gauthier, Jennifer Hu, Ethan Wilcox, Peng Qian, Roger Levy

Keywords: Targeted Models, syntactic evaluations, evaluations, computational community

11:55

26/04/2020

Gradients as Features for Deep Representation Learning

Fangzhou Mu, Yingyu Liang, Yin Li

Keywords: representation learning, gradient features, deep learning

5:07

03/05/2021

MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space

Tsz Him Cheung, Dit-Yan Yeung

Keywords: automated data augmentation, deep learning, data augmentation, latent space

5:11

16/11/2020

Improving Neural Topic Models using Knowledge Distillation

Alexander Miserlis Hoyle, Pranav Goel, Philip Resnik

Keywords: topic models, knowledge distillation, probabilistic models, pretrained transformers

10:37

06/12/2020

Boosting Adversarial Training with Hypersphere Embedding

Tianyu Pang, Xiao Yang, Yinpeng Dong, Kun Xu, Jun Zhu, Hang Su

2:59

06/12/2020

Counterexample-Guided Learning of Monotonic Neural Networks

Aishwarya Sivaraman, Golnoosh Farnadi, Todd Millstein, Guy Van den Broeck

3:22

06/12/2020

Network-to-Network Translation with Conditional Invertible Neural Networks

Robin Rombach, Patrick Esser, Bjorn Ommer

3:25

03/05/2021

A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention

Grégoire Mialon, Dexiong Chen, Alexandre d'Aspremont, Julien Mairal

Keywords: attention, bioinformatics, transformers, optimal transport, kernel methods

5:29

30/11/2020

MTNAS: Search Multi-Task Networks for Autonomous Driving

Hao Liu, Dong Li, JinZhang Peng, Qingjie Zhao, Lu Tian, Yi Shan

9:06

26/04/2020

Compositional languages emerge in a neural iterated learning model

Yi Ren, Shangmin Guo, Matthieu Labeau, Shay B. Cohen, Simon Kirby

Keywords: Compositionality, Multi-agent, Emergent language, Iterated learning

5:07

06/12/2020

PLLay: Efficient Topological Layer based on Persistent Landscapes

Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Kim, Frederic Chazal, Larry Wasserman

3:09

04/07/2020

Deep Contextualized Self-training for Low Resource Dependency Parsing

Guy Rotman, Roi Reichart

Keywords: Low Parsing, sequence tasks, Deep Self-training, Neural parsing

11:41

14/06/2020

PointAugment: An Auto-Augmentation Framework for Point Cloud Classification

Ruihui Li, Xianzhi Li, Pheng-Ann Heng, Chi-Wing Fu

Keywords: auto-augmentation framework, point cloud processing, sample-aware, jointly optimizing, classification

5:01

19/10/2020

Dimension relation modeling for click-through rate prediction

Zihao Zhao, Zhiwei Fang, Yong Li, Changping Peng, Yongjun Bao, Weipeng Yan

Keywords: recommendation, deep learning, neural networks

6:18

05/04/2021

Larq Compute Engine: Design, Benchmark and Deploy State-of-the-Art Binarized Neural Networks

Tom Bannink, Adam Hillier, Lukas Geiger, Tim de Bruin, Leon Overweel, Jelmer Neeven, Koen Helwegen

22:15

16/11/2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining

Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Keywords: nlp tasks, fine-tuning, learning process, multi-domain tasks

9:58

02/02/2021

Exploiting Behavioral Consistence for Universal User Representation

Jie Gu, Feng Wang, Qinghui Sun, Zhiquan Ye, Xiaoxiao Xu, Jingmin Chen, Jun Zhang

14:06

12/07/2020

Operation-Aware Soft Channel Pruning using Differentiable Masks

Minsoo Kang, Bohyung Han

Keywords: Applications - Computer Vision

14:56

03/05/2021

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, Zhifeng Chen

5:07

16/11/2020

Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification

Prithviraj Sen, Marina Danilevsky, Yunyao Li, Siddhartha Brahma, Matthias Boehm, Laura Chiticariu, Rajasekar Krishnamurthy

Keywords: interpretability models, sentence classification, le, human-machine models

9:42

04/07/2020

TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing

Ziqing Yang, Yiming Cui, Zhipeng Chen, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu

Keywords: Natural Processing, supervised tasks, text classification, reading comprehension

10:36

06/12/2021

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords: deep learning, interpretability

14:05

06/12/2021

Topographic VAEs learn Equivariant Capsules

T. Anderson Keller, Max Welling

Keywords: deep learning, generative model, graph learning

9:58

22/11/2021

Single-Modal Entropy based Active Learning for Visual Question Answering

Dong-Jin Kim, Jae Won Cho, Jinsoo Choi, Yunjae Jung, In So Kweon

Keywords: Visual Question Answering, Vision and Language, Active Learning

2:42

02/02/2021

Self-Progressing Robust Training

Minhao Cheng, Pin-Yu Chen, Sijia Liu, Shiyu Chang, Cho-Jui Hsieh, Payel Das

14:34

18/07/2021

Leveraging Language to Learn Program Abstractions and Search Heuristics

Catherine Wong, Kevin Ellis, Josh Tenenbaum, Jacob Andreas

Keywords: Algorithms, Multimodal Learning

5:18

02/02/2021

Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha

20:15

02/02/2021

DenserNet: Weakly Supervised Visual Localization Using Multi-Scale Feature Aggregation

Dongfang Liu, Yiming Cui, Liqi Yan, Christos Mousas, Baijian Yang, Yingjie Chen

16:15

19/04/2021

Bootstrapping relation extractors using syntactic search by examples

Matan Eyal, Asaf Amrami, Hillel Taub-Tabib, Yoav Goldberg

9:55

26/04/2020

Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base

William W. Cohen, Haitian Sun, R. Alex Hofer, Matthew Siegler

Keywords: question-answering, knowledge base completion, neuro-symbolic reasoning, multihop reasoning

5:05

22/09/2020

AutoRec: An automated recommender system

Ting-Hsiang Wang, Xia Hu, Haifeng Jin, Qingquan Song, Xiaotian Han, Zirui Liu

Keywords: Recommender Systems, Hyperparameter Tuning, Neural Architecture Search, Automated Machine Learning, Model Search

3:18

23/08/2020

AutoML pipeline selection: Efficiently navigating the combinatorial space

Chengrun Yang, Jicong Fan, Ziyang Wu, Madeleine Udell

Keywords: pipeline search, greedy algorithms, experiment design, AutoML, tensor decomposition, submodular optimization, meta-learning

13:40

06/12/2020

Deep Imitation Learning for Bimanual Robotic Manipulation

Fan Xie, Alexander Chowdhury, Clara De Paolis Kaluza, Linfeng Zhao, Lawson Wong, Rose Yu

3:12

02/02/2021

Representing the Unification of Text Featurization using a Context-Free Grammar

Doruk Kilitcioglu, Serdar Kadioglu

15:28

15/11/2020

Formulog: Datalog for SMT-Based Static Analysis

Aaron Bembenek, Michael Greenberg, Stephen Chong

Keywords: Datalog, SMT solving

15:05

18/07/2021

XOR-CD: Linearly Convergent Constrained Structure Generation

Fan Ding, Jianzhu Ma, Jinbo Xu, Yexiang Xue

Keywords: Probabilistic Methods

5:14

26/04/2020

Reducing Transformer Depth on Demand with Structured Dropout

Angela Fan, Edouard Grave, Armand Joulin

Keywords: reduction, regularization, pruning, dropout, transformer

5:01

06/12/2020

MATE: Plugging in Model Awareness to Task Embedding for Meta Learning

Xiaohan Chen, Zhangyang Wang, Siyu Tang, Krikamol Muandet

3:19