What Makes Training Multi-Modal Classification Networks Hard?

14/06/2020

What Makes Training Multi-Modal Classification Networks Hard?

Weiyao Wang, Du Tran, Matt Feiszli

Keywords: video classification, multi-modal, overfitting, action recognition, acoustic event detection

Abstract Paper Similar Papers

Abstract: Consider end-to-end training of a multi-modal vs. a uni-modal network on a task with multiple input modalities: the multi-modal network receives more information, so it should match or outperform its uni-modal counterpart. In our experiments, however, we observe the opposite: the best uni-modal network can outperform the multi-modal network. This observation is consistent across different combinations of modalities and on different tasks and benchmarks for video classifications. This paper identifies two main causes for this performance drop: first, multi-modal networks are often prone to overfitting due to increased capacity. Second, different modalities overfit and generalize at different rates, so training them jointly with a single optimization strategy is sub-optimal. We address these two problems with a technique we call Gradient-Blending, which computes an optimal blending of modalities based on their overfitting behaviors. We demonstrate that Gradient Blending outperforms widely-used baselines for avoiding overfitting and achieves state-of-the-art accuracy on various tasks including human action recognition, ego-centric action recognition, and acoustic event detection.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at CVPR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

05/01/2021

Adaptive Streaming of 360-Degree Videos With Reinforcement Learning

Sohee Park, Minh Hoai, Arani Bhattacharya, Samir R. Das

Keywords Paper

0

0

0

0

4:51

06/12/2021

Improved Transformer for High-Resolution GANs

Long Zhao, Zizhao Zhang, Ting Chen and
Dimitris Metaxas, Han Zhang

Keywords Paper

transformers, generative model

0

0

0

0

12:11

22/11/2021

Parameter Efficient Dynamic Convolution via Tensor Decomposition

Zejiang Hou, Sun-Yuan Kung

Keywords Paper

dynamic convolution, input-dependent reparameterization, parameter efficiency, tensor decomposition

0

0

0

0

3:58

05/01/2021

Exploiting the Redundancy in Convolutional Filters for Parameter Reduction

Kumara Kahatapitiya, Ranga Rodrigo

Keywords Paper

0

0

0

0

5:10

03/05/2021

VA-RED$^2$: Video Adaptive Redundancy Reduction

Bowen Pan, Rameswar Panda, Camilo L Fosco and
Chung-Ching Lin, Alex J Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris

Keywords Paper

0

0

0

0

5:02

06/12/2021

Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

Woochul Kang, Daeyeon Kim

Keywords Paper

deep learning, machine learning, vision

0

0

0

0

13:17

02/02/2021

Distribution Adaptive INT8 Quantization for Training CNNs

Kang Zhao, Sida Huang, Pan Pan and
Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

Keywords Paper

0

0

0

0

16:42

26/04/2020

Network Deconvolution

Chengxi Ye, Matthew Evanusa, Hua He and
Anton Mitrokhin, Tom Goldstein, James A. Yorke, Cornelia Fermuller, Yiannis Aloimonos

Keywords Paper

convolutional networks, network deconvolution, whitening

0

0

0

0

4:59

26/04/2020

Learning Nearly Decomposable Value Functions Via Communication Minimization

Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

Keywords Paper

Multi-agent reinforcement learning, Nearly decomposable value function, Minimized communication

0

0

0

0

5:00

02/02/2021

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

Keywords Paper

0

0

0

0

14:30

22/11/2021

Knowing What, Where and When to Look: Video Action modelling with Attention

Juan-Manuel Perez-Rua, Brais Martinez, Xiatian Zhu and
Antoine S Toisoul, Victor A Escorcia, Tao Xiang

Keywords Paper

Action recognition, Fine-grained action, video attention, Spatial attention, Channel attention, Temporal attention, Spatio-temporal attention, Feature refinement

0

0

0

0

2:46

14/06/2020

Factorized Higher-Order CNNs With an Application to Spatio-Temporal Emotion Estimation

Jean Kossaifi, Antoine Toisoul, Adrian Bulat and
Yannis Panagakis, Timothy M. Hospedales, Maja Pantic

Keywords Paper

tensor methods, deep learning, spatiotemporal, emotion, cnn, tensor decomposition, low-rank, valence, arousal

0

0

0

0

1:01

06/12/2020

When Do Neural Networks Outperform Kernel Methods?

Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari

Keywords Paper

0

0

0

0

3:16

22/11/2021

DISCO: accurate Discrete Scale Convolutions

Ivan Sosnovik, Artem Moskalev, Arnold W.M. Smeulders

Keywords Paper

equivariance, symmetry, invariance, scale, convolutions, dilation, tracking, image classification

0

0

0

0

8:38

03/05/2021

Early Stopping in Deep Networks: Double Descent and How to Eliminate it

Reinhard Heckel, Fatih Furkan Yilmaz

Keywords Paper

early stopping, double descent

0

0

0

0

4:37

02/02/2021

DIBS: Diversity Inducing Information Bottleneck in Model Ensembles

Samarth Sinha, Homanga Bharadhwaj, Anirudh Goyal and
Hugo Larochelle, Animesh Garg, Florian Shkurti

Keywords Paper

0

0

0

0

16:26

14/06/2020

EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning

Lin Wang, Tae-Kyun Kim, Kuk-Jin Yoon

Keywords Paper

event-based vision, image super-resolution, image restoration, image reconstruction, unsupervised and adversarial learning

0

0

0

0

1:03

06/12/2021

Early Convolutions Help Transformers See Better

Tete Xiao, Piotr Dollar, Mannat Singh and
Eric Mintun, Trevor Darrell, Ross B Girshick

Keywords Paper

deep learning, optimization, transformers

0

0

0

0

9:23

22/11/2021

Fine-grained Multi-Modal Self-Supervised Learning

Duo Wang, Salah Karout

Keywords Paper

self-supervised learning, multi-modal learning

0

0

0

0

2:46

18/07/2021

Towards Better Robust Generalization with Shift Consistency Regularization

Shufei Zhang, Zhuang Qian, Kaizhu Huang and
Qiufeng Wang, Rui Zhang, Xinping Yi

Keywords Paper

Algorithms, Adversarial Examples

0

0

0

0

5:44

03/05/2021

Self-Supervised Learning of Compressed Video Representations

Youngjae Yu, Sangho Lee, Gunhee Kim, Yale Song

Keywords Paper

self-supervised learning, Compressed videos

0

0

0

0

4:34

06/12/2020

Sparse Graphical Memory for Robust Planning

Scott Emmons, Ajay Jain, Misha Laskin and
Thanard Kurutach, Pieter Abbeel, Deepak Pathak

Keywords Paper

0

0

0

0

3:23

26/04/2020

SpikeGrad: An ANN-equivalent Computation Model for Implementing Backpropagation with Spikes

Johannes C. Thiele, Olivier Bichler, Antoine Dupret

Keywords Paper

spiking neural network, neuromorphic engineering, backpropagation

0

0

0

0

5:21

14/06/2020

MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks

Animesh Karnewar, Oliver Wang

Keywords Paper

msg-gan, multi-scale gradients, generative adversarial network, high-res image synthesis, gan training stability

0

0

0

0

1:00

06/12/2020

Convolutional Tensor-Train LSTM for Spatio-Temporal Learning

Jiahao Su, Wonmin Byeon, Jean Kossaifi and
Furong Huang, Jan Kautz, Anima Anandkumar

Keywords Paper

0

0

0

0

3:29

06/12/2021

Revisiting ResNets: Improved Training and Scaling Strategies

Irwan Bello, William Fedus, Xianzhi Du and
Ekin Dogus Cubuk, Aravind Srinivas, Tsung-Yi Lin, Jonathon Shlens, Barret Zoph

Keywords Paper

machine learning, vision, semi-supervised learning

0

0

0

0

13:59

06/12/2020

Triple descent and the two kinds of overfitting: where & why do they appear?

Stéphane d'Ascoli, Levent Sagun, Giulio Biroli

Keywords Paper

Algorithms -> Active Learning; Algorithms -> Classification; Algorithms -> Ranking and Preference Learning, Theory -> Learning Theory

0

0

0

0

3:28

14/06/2020

Forward and Backward Information Retention for Accurate Binary Neural Networks

Haotong Qin, Ruihao Gong, Xianglong Liu and
Mingzhu Shen, Ziran Wei, Fengwei Yu, Jingkuan Song

Keywords Paper

model compression, binary neural networks, deep learning, quantization, computer vision

0

0

0

0

1:00

18/07/2021

Self Normalizing Flows

T. Anderson Keller, Jorn Peters, Priyank Jaini and
Emiel Hoogeboom, Patrick Forré, Max Welling

Keywords Paper

Deep Learning, Generative Models

0

1

1

0

4:24

14/06/2020

GAN Compression: Efficient Architectures for Interactive Conditional GANs

Muyang Li, Ji Lin, Yaoyao Ding and
Zhijian Liu, Jun-Yan Zhu, Song Han

Keywords Paper

generative adversarial networks, model compression, distillation, neural architecture search, image and video synthesis

0

0

0

0

1:00

14/06/2020

Deep Unfolding Network for Image Super-Resolution

Kai Zhang, Luc Van Gool, Radu Timofte

Keywords Paper

super-resolution, unfolding, degradation model, gaussian kernel, deblurring

0

0

0

0

1:01

14/06/2020

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution

Yong Guo, Jian Chen, Jingdong Wang and
Qi Chen, Jiezhang Cao, Zeshuai Deng, Yanwu Xu, Mingkui Tan

Keywords Paper

computer vision, image super-resolution, dual regression scheme, closed-loop

0

0

0

0

1:01

06/12/2021

Shared Independent Component Analysis for Multi-Subject Neuroimaging

Hugo Richard, Pierre Ablin, Bertrand Thirion and
Alexandre Gramfort, Aapo Hyvarinen

Keywords Paper

representation learning

0

0

0

0

14:21

05/01/2021

DynaVSR: Dynamic Adaptive Blind Video Super-Resolution

Suyoung Lee, Myungsub Choi, Kyoung Mu Lee

Keywords Paper

0

0

0

0

4:56

06/12/2021

Network-to-Network Regularization: Enforcing Occam's Razor to Improve Generalization

Rohan Ghosh, Mehul Motani

Keywords Paper

theory, deep learning, machine learning

0

0

0

0

14:07

26/04/2020

Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness

Tianyu Pang, Kun Xu, Yinpeng Dong and
Chao Du, Ning Chen, Jun Zhu

Keywords Paper

Trustworthy Machine Learning, Adversarial Robustness, Training Objective, Sample Density

0

0

0

0

4:10

06/12/2021

Representation Learning Beyond Linear Prediction Functions

Ziping Xu, Ambuj Tewari

Keywords Paper

theory, deep learning, optimization, representation learning, few shot learning

0

0

0

0

11:00

13/04/2021

Faster & more reliable tuning of neural networks: Bayesian optimization with importance sampling

Setareh Ariafar, Zelda Mariet, Dana Brooks and
Jennifer Dy, Jasper Snoek

Keywords Paper

0

0

0

0

3:01

18/07/2021

Bayesian Attention Belief Networks

Shujian Zhang, Xinjie Fan, Bo Chen, Mingyuan Zhou

Keywords Paper

, Applications, Program Understanding and Generation, Deep Learning, Bayesian Deep Learning

0

0

0

0

4:28

12/07/2020

Decoupled Greedy Learning of CNNs

Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon

Keywords Paper

Optimization - Large Scale, Parallel and Distributed

0

0

0

0

16:04