Multiplicative Interactions and Where to Find Them

26/04/2020

Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero, Yee Whye Teh, Tim Harley, Razvan Pascanu

Keywords: multiplicative interactions, hypernetworks, attention

Abstract: We explore the role of multiplicative interaction as a unifying framework to describe a range of classical and modern neural network architectural motifs, such as gating, attention layers, hypernetworks, and dynamic convolutions amongst others. Multiplicative interaction layers as primitive operations have a long-established presence in the literature, though this is often not emphasized and is thus under-appreciated. We begin by showing that such layers strictly enrich the representable function classes of neural networks. We conjecture that multiplicative interactions offer a particularly powerful inductive bias when fusing multiple streams of information or when conditional computation is required. We therefore argue that they should be considered in many situations where multiple compute or information paths need to be combined, in place of the simple and oft-used concatenation operation. Finally, we back up our claims and demonstrate the potential of multiplicative interactions by applying them in large-scale complex RL and sequence modelling tasks, where their use allows us to deliver state-of-the-art results, and thereby provides new evidence in support of multiplicative interactions playing a more prominent role when designing new neural network architectures.
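
To make the contrast with concatenation concrete, here is a minimal sketch in plain Python. It assumes a bilinear form f(x, z) = z^T W x + z^T U + V x + b for the multiplicative layer; the abstract does not state the parameterisation, so this form, the function names, and the toy weights are illustrative assumptions rather than the authors' implementation.

```python
def concat_layer(x, z, weights, bias):
    """Baseline: one linear layer applied to the concatenation [x; z]."""
    xz = x + z  # list concatenation of the two input streams
    return [sum(w * v for w, v in zip(row, xz)) + b
            for row, b in zip(weights, bias)]


def multiplicative_layer(x, z, W, U, V, b):
    """Assumed form: f(x, z) = z^T W x + z^T U + V x + b.

    Shapes (as nested lists):
      W: (len(z), n_out, len(x))   3D tensor coupling z and x
      U: (len(z), n_out)           term that depends on z only
      V: (n_out, len(x))           term that depends on x only
      b: (n_out,)                  bias
    """
    out = []
    for k in range(len(b)):
        bilinear = sum(z[i] * W[i][k][j] * x[j]
                       for i in range(len(z)) for j in range(len(x)))
        from_z = sum(z[i] * U[i][k] for i in range(len(z)))
        from_x = sum(V[k][j] * x[j] for j in range(len(x)))
        out.append(bilinear + from_z + from_x + b[k])
    return out


# Toy weights chosen by hand: with W = 1 and everything else 0,
# the multiplicative layer computes the product x * z exactly.
product = multiplicative_layer([3.0], [4.0], [[[1.0]]], [[0.0]], [[0.0]], [0.0])
print(product)  # [12.0]

# A single linear layer on [x; z] can only produce a weighted sum of the inputs.
weighted_sum = concat_layer([3.0], [4.0], [[1.0, 1.0]], [0.0])
print(weighted_sum)  # [7.0]
```

The product of two inputs is the simplest function that the multiplicative form represents with a single layer, whereas a single linear layer on the concatenated input yields only a weighted sum of x and z.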

The talk and the corresponding paper were published at the ICLR 2020 virtual conference.

Similar Papers

05/01/2021

Meta Module Network for Compositional Visual Reasoning

Wenhu Chen, Zhe Gan, Linjie Li, Yu Cheng, William Wang, Jingjing Liu

5:13

12/07/2020

Forecasting sequential data using Consistent Koopman Autoencoders

Omri Azencot, N. Benjamin Erichson, Vanessa Lin, Michael Mahoney

Keywords: Sequential, Network, and Time-Series Modeling

14:06

12/07/2020

Representing Unordered Data Using Multiset Automata and Complex Numbers

Justin DeBenedetto, David Chiang

Keywords: General Machine Learning Techniques

15:10

06/12/2020

Collegial Ensembles

Etai Littwin, Ben Myara, Sima Sabah, Joshua Susskind, Shuangfei Zhai, Oren Golan

3:17

06/12/2021

Rethinking Neural Operations for Diverse Tasks

Nicholas Roberts, Mikhail Khodak, Tri Dao, Liam Li, Christopher Ré, Ameet S Talwalkar

Keywords: deep learning, machine learning

10:26

02/02/2021

Meta-Transfer Learning for Low-Resource Abstractive Summarization

Yi-Syuan Chen, Hong-Han Shuai

19:10

05/01/2021

MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution

Armin Mehri, Parichehr B. Ardakani, Angel D. Sappa

4:57

06/12/2021

Learning Transferable Adversarial Perturbations

Krishna kanth Nakka, Mathieu Salzmann

Keywords: deep learning, optimization, adversarial robustness and security

12:00

16/11/2020

Counterfactual Generator: A Weakly-Supervised Method for Named Entity Recognition

Xiangji Zeng, Yunliang Li, Yuchen Zhai, Yin Zhang

Keywords: named recognition, neural models, counterfactual generator, structural model

10:20

26/04/2020

Counterfactuals uncover the modular structure of deep generative models

Michel Besserve, Arash Mehrjou, Rémy Sun, Bernhard Schölkopf

Keywords: generative models, causality, counterfactuals, representation learning, disentanglement, generalization, unsupervised learning

5:42

03/05/2021

Generalized Multimodal ELBO

Thomas Sutter, Imant Daunhawer, Julia E Vogt

Keywords: self-supervised, generative learning, ELBO, VAE, Multimodal

5:15

16/11/2020

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

Keywords: document-level translation, document-level systems, context-aware architecture, transformer

6:36

26/04/2020

Computation Reallocation for Object Detection

Feng Liang, Chen Lin, Ronghao Guo, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang

Keywords: Neural Architecture Search, Object Detection

5:29

02/02/2021

Revealing Hidden Preconditions and Effects of Compound HTN Planning Tasks – A Complexity Analysis

Conny Olz, Susanne Biundo, Pascal Bercher

18:29

03/05/2021

Neural Delay Differential Equations

Qunxi Zhu, Yao Guo, Wei Lin

Keywords: Delay differential equations, neural networks

4:57

06/12/2021

Improving Compositionality of Neural Networks by Decoding Representations to Inputs

Mike Wu, Noah Goodman, Stefano Ermon

Keywords: deep learning, machine learning, adversarial robustness and security, generative model

12:36

14/06/2020

Dataless Model Selection With the Deep Frame Potential

Calvin Murdock, Simon Lucey

Keywords: deep learning, sparse approximation theory, deep network architectures, model selection, sparsity, mutual coherence

5:00

19/10/2020

Flexible IR pipelines with capreolus

Andrew Yates, Kevin Martin Jose, Xinyu Zhang, Jimmy Lin

Keywords: neural information retrieval, retrieval pipeline, ad hoc ranking

10:00

08/12/2020

E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks

Nikolaos Stylianou, Ioannis Vlahavas

8:49

04/11/2020

Ansor: Generating High-Performance Tensor Programs for Deep Learning

Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica

20:10

02/02/2021

End-to-end Semantic Role Labeling with Neural Transition-based Model

Hao Fei, Meishan Zhang, Bobo Li, Donghong Ji

18:47

16/11/2020

Neural Topic Modeling with Cycle-Consistent Adversarial Training

Xuemeng Hu, Rui Wang, Deyu Zhou, Yuxuan Xiong

Keywords: neural modeling, deep models, adversarial-neural model, adversarially network

9:57

03/05/2021

More or Less: When and How to Build Convolutional Neural Network Ensembles

Abdul Wasay, Stratos Idreos

Keywords: empirical study, ensemble learning, computer vision, machine learning systems

4:39

06/12/2020

Incorporating Interpretable Output Constraints in Bayesian Neural Networks

Wanqian Yang, Lars Lorch, Moritz Graule, Himabindu Lakkaraju, Finale Doshi-Velez

3:02

18/07/2021

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

Jianfei Chen, Lianmin Zheng, Zhewei Yao, Dequan Wang, Ion Stoica, Michael Mahoney, Joseph E Gonzalez

Keywords: Algorithms, Large Scale Learning

18:54

04/07/2020

Differentiable Window for Dynamic Local Attention

Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li

Keywords: Dynamic Attention, dynamic selection, NLP tasks, machine translation

9:51

12/07/2020

Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup

Jang-Hyun Kim, Wonho Choo, Hyun Oh Song

Keywords: Deep Learning - Algorithms

13:40

04/07/2020

A Formal Hierarchy of RNN Architectures

William Merrill, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav

Keywords: Formal Architectures, RNN architectures, weighted machine, LSTM

9:55

12/07/2020

Deep Molecular Programming: A Natural Implementation of Binary-Weight ReLU Neural Networks

Marko Vasic, Cameron Chalk, Sarfraz Khurshid, David Soloveichik

Keywords: Deep Learning - Theory

14:25

16/11/2020

On the Sparsity of Neural Machine Translation Models

Yong Wang, Longyue Wang, Victor Li, Zhaopeng Tu

Keywords: nmt architectures, over-parameterization, underutilization resources, redundant parameters

6:18

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords: deep learning, information redundancy, pruning

14:48

07/09/2020

Automated Search for Resource-Efficient Branched Multi-Task Networks

David Brüggemann, Menelaos Kanakis, Stamatios Georgoulis, Luc Van Gool

Keywords: multi task, neural architecture search, resource efficient networks, dense prediction, encoder branching, proxyless resource loss, differentiable search space, branched networks, tree-like networks, Gumbel-Softmax

8:31

06/12/2020

On 1/n neural representation and robustness

Josue Nassar, Piotr Sokol, Sueyeon Chung, Daniel D Harris, Memming Park

3:14

06/12/2021

Adaptive wavelet distillation from neural networks through interpretations

Wooseok Ha, Chandan Singh, Francois Lanusse, Srigokul Upadhyayula, Bin Yu

Keywords: deep learning, interpretability

14:56

03/05/2021

A Design Space Study for LISTA and Beyond

Tianjian Meng, Xiaohan Chen, Yifan Jiang, Zhangyang Wang

5:50

12/07/2020

Augmenting Continuous Time Bayesian Networks with Clocks

Nicolai Engelmann, Dominik Linzner, Heinz Koeppl

Keywords: Probabilistic Inference - Models and Probabilistic Programming

14:49

06/12/2021

GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training

Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein

Keywords: deep learning, transformers, vision

13:17

18/07/2021

UnICORNN: A recurrent model for learning very long time dependencies

T. Konstantin Rusch, Siddhartha Mishra

Keywords: Deep Learning, Architectures

6:17

07/09/2020

Towards Convolutional Neural Networks Compression via Global&Progressive Product Quantization

Weihan Chen, Peisong Wang, Jian Cheng

Keywords: convolutional neural network compression, product quantization

5:03

23/08/2020

Spectrum-guided adversarial disparity learning

Zhe Liu, Lina Yao, Lei Bai, Xianzhi Wang, Can Wang

Keywords: adversarial autoencoder, generative models, intraclass variability, activity recognition

14:30