Mish: A Self Regularized Non-Monotonic Activation Function

07/09/2020

Mish: A Self Regularized Non-Monotonic Activation Function

Diganta Misra

Keywords: activation functions, non-linear dynamics, loss landscapes

Abstract Paper Code Similar Papers

Abstract: We propose Mish, a novel self-regularized non-monotonic activation function which can be mathematically defined as: f(x) = xtanh(softplus(x)). As activation functions play a crucial role in the performance and training dynamics in neural networks, we validated experimentally on several well-known benchmarks against the best combinations of architectures and activation functions. We also observe that data augmentation techniques have a favorable effect on benchmarks like ImageNet-1k and MS-COCO across multiple architectures. For example, Mish outperformed Leaky ReLU on YOLOv4 with a CSP-DarkNet-53 backbone on average precision (AP-50 val) by 2.1% in MS-COCO object detection and ReLU on ResNet-50 on ImageNet-1k in Top-1 accuracy by 1% while keeping all other network parameters and hyperparameters constant. Furthermore, we explore the mathematical formulation of Mish in relation with the Swish family of functions and propose an intuitive understanding on how the first derivative behavior may be acting as a regularizer helping the optimization of deep neural networks. Code is publicly available at https://github.com/digantamisra98/Mish.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at BMVC 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

22/11/2021

Model Composition: Can Multiple Neural Networks Be Combined into a Single Network Using Only Unlabeled Data?

Amin Banitalebi-Dehkordi, Xinyu Kang, Yong Zhang

Keywords Paper

Model Composition, Combining Neural Networks, Pseudo Label, Self Training, Label Aggregation, Combining Models

0

0

0

0

2:58

03/05/2021

Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets?

Zhiyuan Li, Yi Zhang, Sanjeev Arora

Keywords Paper

equivariance, fully-connected, sample complexity separation, convolutional neural networks

0

0

0

0

15:18

06/12/2021

Robust and Decomposable Average Precision for Image Retrieval

Elias Ramzi, Nicolas THOME, Clément Rambour and
Nicolas Audebert, Xavier Bitot

Keywords Paper

deep learning

0

0

0

0

8:13

22/11/2021

SamplingAug: On the Importance of Patch Sampling Augmentation for Single Image Super-Resolution

Shizun Wang, Ming Lu, Kaixin Chen and
Jiaming Liu, Xiaoqi Li, Chuang Zhang, Ming Wu

Keywords Paper

Super-Resolution, Patch Sampling

0

0

0

0

2:18

22/11/2021

Adaptive End-to-End Budgeted Network Learning via Inverse Scale Space

Zuyuan Zhong, Chen Liu, Yanwei Fu

Keywords Paper

deep learning, network architecture, growing network, budgeted network learning, pruning

0

0

0

0

2:58

03/05/2021

BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction

Yuhang Li, Ruihao Gong, Xu Tan and
Yang Yang, Peng Hu, Qi Zhang, fengwei yu, Wei Wang, Shi Gu

Keywords Paper

Second-order analysis, Mixed Precision, Post Training Quantization

0

0

0

0

4:36

06/12/2021

Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness

Zifeng Wang, Tong Jian, Aria Masoomi and
Stratis Ioannidis, Jennifer Dy

Keywords Paper

deep learning, robustness, adversarial robustness and security

0

0

0

0

13:49

06/12/2020

Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming

Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin and
Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy Liang, Pushmeet Kohli

Keywords Paper

0

0

0

0

3:23

06/12/2021

Generic Neural Architecture Search via Regression

Yuhong Li, Cong Hao, Pan Li and
Jinjun Xiong, Deming Chen

Keywords Paper

deep learning, machine learning, self-supervised learning, vision, language

0

0

0

0

14:56

03/05/2021

DrNAS: Dirichlet Neural Architecture Search

Xiangning Chen, Ruochen Wang, Minhao Cheng and
Xiaocheng Tang, Cho-Jui Hsieh

Keywords Paper

0

0

0

0

5:00

06/12/2021

Post-Training Quantization for Vision Transformer

Zhenhua Liu, Yunhe Wang, Kai Han and
Wei Zhang, Siwei Ma, Wen Gao

Keywords Paper

deep learning, transformers, vision

0

0

0

0

5:52

03/05/2021

BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization

Huanrui Yang, Lin Duan, Yiran Chen, Hai Li

Keywords Paper

DNN compression, bit-level sparsity, Mixed-precision quantization

0

0

0

0

4:58

06/12/2021

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization

Yusuke Iwasawa, Yutaka Matsuo

Keywords Paper

deep learning, optimization, transformers, domain adaptation

0

0

0

0

13:50

18/07/2021

Sparsifying Networks via Subdifferential Inclusion

Sagar Verma, Jean-Christophe Pesquet

Keywords Paper

Optimization, Convex Optimization

0

0

0

0

5:10

06/12/2021

Channel Permutations for N:M Sparsity

Jeff Pool, Chong Yu

Keywords Paper

optimization

0

0

0

0

12:41

12/07/2020

Second-Order Provable Defenses against Adversarial Attacks

Sahil Singla, Soheil Feizi

Keywords Paper

Adversarial Examples

0

0

0

0

12:45

06/12/2020

Semi-Supervised Neural Architecture Search

Renqian Luo, Xu Tan, Rui Wang and
Tao Qin, Enhong Chen, Tie-Yan Liu

Keywords Paper

0

0

0

0

3:20

06/12/2020

MMA Regularization: Decorrelating Weights of Neural Networks by Maximizing the Minimal Angles

Zhennan Wang, Canqun Xiang, Wenbin Zou, Chen Xu

Keywords Paper

0

0

0

0

3:23

14/06/2020

Generalized Zero-Shot Learning via Over-Complete Distribution

Rohit Keshari, Richa Singh, Mayank Vatsa

Keywords Paper

deep learning, zero-shot leaning, cvae, triplet loss, center loss

0

0

0

0

0:50

18/07/2021

Understanding self-supervised learning dynamics without contrastive pairs

Yuandong Tian, Xinlei Chen, Surya Ganguli

Keywords Paper

Deep Learning, Optimization for Deep Networks

0

0

0

0

18:16

14/06/2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Tianzhe Wang, Kuan Wang, Han Cai and
Ji Lin, Zhijian Liu, Hanrui Wang, Yujun Lin, Song Han

Keywords Paper

efficiency, model compression, joint design, neural architecture search, channel pruning, mixed-precision quantization

0

0

0

0

1:00

06/12/2020

Efficient Exact Verification of Binarized Neural Networks

Kai Jia, Martin Rinard

Keywords Paper

0

0

0

0

3:20

26/04/2020

NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search

Xuanyi Dong, Yi Yang

Keywords Paper

Neural Architecture Search, AutoML, Benchmark

0

0

0

0

4:56

06/12/2020

RepPoints v2: Verification Meets Regression for Object Detection

Yihong Chen, Zheng Zhang, Yue Cao and
Liwei Wang, Stephen Lin, Han Hu

Keywords Paper

, Theory -> Learning Theory

0

0

0

0

3:22

26/08/2020

Lipschitz Continuous Autoencoders in Application to Anomaly Detection

Young-geun Kim, Yongchan Kwon, Hyunwoong Chang, Myunghee Cho Paik

Keywords Paper

0

0

0

0

10:24

06/12/2020

ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding

Yibo Yang, Hongyang Li, Shan You and
Fei Wang, Chen Qian, Zhouchen Lin

Keywords Paper

0

0

0

0

3:19

03/05/2021

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

Aojun Zhou, Yukun Ma, Junnan Zhu and
Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, Hongsheng Li

Keywords Paper

sparsity, efficient training and inference.

0

0

0

0

5:09

06/12/2021

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

Sheng Liu, Xiao Li, Yuexiang Zhai and
Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Keywords Paper

deep learning, machine learning, robustness, generative model

0

0

0

0

6:45

02/02/2021

Adversarial Turing Patterns from Cellular Automata

Nurislam Tursynbek, Ilya Vilkoviskiy, Maria Sindeeva, Ivan Oseledets

Keywords Paper

0

0

0

0

14:50

02/02/2021

DPFPS: Dynamic and Progressive Filter Pruning for Compressing Convolutional Neural Networks from Scratch

Xiaofeng Ruan, Yufan Liu, Bing Li and
Chunfeng Yuan, Weiming Hu

Keywords Paper

0

0

0

0

14:38

26/04/2020

Training binary neural networks with real-to-binary convolutions

Brais Martinez, Jing Yang, Adrian Bulat, Georgios Tzimiropoulos

Keywords Paper

binary networks

0

0

0

0

4:41

14/06/2020

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

Keywords Paper

neural architecture search, nas, differentiable, hardware aware, latency, reinforcement learning

0

0

0

0

4:59

05/01/2021

Spike-Thrift: Towards Energy-Efficient Deep Spiking Neural Networks by Limiting Spiking Activity via Attention-Guided Compression

Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel

Keywords Paper

0

0

0

0

5:22

06/12/2021

Algorithmic stability and generalization of an unsupervised feature selection algorithm

xinxing wu, Qiang Cheng

Keywords Paper

deep learning

0

0

0

0

12:41

06/12/2020

Adversarial robustness via robust low rank representations

Pranjal Awasthi, Himanshu Jain, Ankit Singh Rawat, Aravindan Vijayaraghavan

Keywords Paper

0

0

0

1

3:14

06/12/2021

On learning sparse vectors from mixture of responses

Nikita Polyanskii

Keywords Paper

generative model

0

0

0

0

10:55

12/07/2020

Laplacian Regularized Few-Shot Learning

Imtiaz Ziko, Jose Dolz, Eric Granger, Ismail Ben Ayed

Keywords Paper

Unsupervised and Semi-Supervised Learning

0

0

0

0

15:12

14/06/2020

GreedyNAS: Towards Fast One-Shot NAS With Greedy Supernet

Shan You, Tao Huang, Mingmin Yang and
Fei Wang, Chen Qian, Changshui Zhang

Keywords Paper

neural architecture search, supernet, one-shot nas, single path, greedy algorithm, exploration and exploitation, searching efficiency

0

0

0

0

1:01

12/07/2020

Neural Kernels Without Tangents

Vaishaal Shankar, Alex Fang, Wenshuo Guo and
Sara Fridovich-Keil, Jonathan Ragan-Kelley, Ludwig Schmidt, Benjamin Recht

Keywords Paper

General Machine Learning Techniques

0

0

0

0

14:27

15/06/2020

Learning nonlinear loop invariants with gated continuous logic networks

Jianan Yao, Gabriel Ryan, Justin Wong and
Suman Jana, Ronghui Gu

Keywords Paper

Loop Invariant Inference, Continuous Logic Networks, Program Verification

0

0

0

0

14:18