Improved Transformer for High-Resolution GANs

06/12/2021

Long Zhao, Zizhao Zhang, Ting Chen, Dimitris Metaxas, Han Zhang

Keywords: transformers, generative model

Abstract: Attention-based models, exemplified by the Transformer, can effectively model long-range dependencies, but the quadratic complexity of the self-attention operation makes them difficult to adopt for high-resolution image generation based on Generative Adversarial Networks (GANs). In this paper, we introduce two key ingredients to the Transformer to address this challenge. First, in the low-resolution stages of the generative process, standard global self-attention is replaced with the proposed multi-axis blocked self-attention, which allows efficient mixing of local and global attention. Second, in the high-resolution stages, we drop self-attention and keep only multi-layer perceptrons, reminiscent of implicit neural functions. To further improve performance, we introduce an additional self-modulation component based on cross-attention. The resulting model, denoted HiT, has nearly linear computational complexity with respect to image size and thus scales directly to synthesizing high-definition images. Our experiments show that HiT achieves state-of-the-art FID scores of 31.87 and 2.95 on unconditional ImageNet $128 \times 128$ and FFHQ $256 \times 256$, respectively, with reasonable throughput. We believe HiT is an important milestone for GAN generators that are completely free of convolutions.
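To make the idea of blocked attention along two axes more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the `block` parameter, the absence of learned projections, and the choice to give each axis half of the channels are all assumptions made for illustration. It splits a feature map into non-overlapping blocks, attends within each block (local) and across blocks at the same in-block position (a sparse form of global attention), and counts query-key pairs against plain global attention.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(tokens):
    """Plain self-attention over the second-to-last axis.
    Illustrative only: queries, keys and values are the tokens themselves."""
    scale = tokens.shape[-1] ** -0.5
    scores = tokens @ np.swapaxes(tokens, -1, -2) * scale
    return softmax(scores, axis=-1) @ tokens

def to_blocks(x, block):
    """(H, W, C) -> (num_blocks, block*block, C), non-overlapping windows."""
    h, w, c = x.shape
    gh, gw = h // block, w // block
    x = x.reshape(gh, block, gw, block, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(gh * gw, block * block, c)

def from_blocks(blocks, h, w, block):
    """Inverse of to_blocks."""
    gh, gw = h // block, w // block
    c = blocks.shape[-1]
    x = blocks.reshape(gh, gw, block, block, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h, w, c)

def multi_axis_attention(x, block):
    """Hypothetical two-axis blocked attention (assumed channel split)."""
    h, w, c = x.shape
    blocks = to_blocks(x, block)              # (G, P, C)
    half = c // 2
    # Axis 1: tokens inside the same block attend to each other (local).
    local = attend(blocks[..., :half])
    # Axis 0: tokens at the same in-block position attend across blocks.
    across = attend(blocks[..., half:].transpose(1, 0, 2)).transpose(1, 0, 2)
    return from_blocks(np.concatenate([local, across], axis=-1), h, w, block)

def attention_pairs(h, w, block):
    """Number of query-key pairs for global vs. two-axis blocked attention."""
    n = h * w
    p = block * block      # tokens per block
    g = n // p             # number of blocks
    return {"global": n * n, "multi_axis": n * (p + g)}

x = np.random.default_rng(0).normal(size=(32, 32, 8))
y = multi_axis_attention(x, block=8)
pairs = attention_pairs(32, 32, block=8)
```

Under these assumptions, a 32 x 32 map with 8 x 8 blocks has each token attend to 64 + 16 positions instead of 1024, which is where the savings over global attention come from.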

The talk and the corresponding paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

05/01/2021

A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition

Ayush Srivastava, Oshin Dutta, Jigyasa Gupta, Sumeet Agarwal, Prathosh AP

4:29

14/06/2020

GAN Compression: Efficient Architectures for Interactive Conditional GANs

Muyang Li, Ji Lin, Yaoyao Ding, Zhijian Liu, Jun-Yan Zhu, Song Han

Keywords: generative adversarial networks, model compression, distillation, neural architecture search, image and video synthesis

1:00

06/12/2021

Early Convolutions Help Transformers See Better

Tete Xiao, Piotr Dollar, Mannat Singh, Eric Mintun, Trevor Darrell, Ross B Girshick

Keywords: deep learning, optimization, transformers

9:23

06/12/2021

Global Filter Networks for Image Classification

Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, Jie Zhou

Keywords: machine learning, robustness, transformers, vision

9:28

14/06/2020

Forward and Backward Information Retention for Accurate Binary Neural Networks

Haotong Qin, Ruihao Gong, Xianglong Liu, Mingzhu Shen, Ziran Wei, Fengwei Yu, Jingkuan Song

Keywords: model compression, binary neural networks, deep learning, quantization, computer vision

1:00

06/12/2020

RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference

Oindrila Saha, Aditya Kusupati, Harsha Simhadri, Manik Varma, Prateek Jain

3:30

26/04/2020

Network Deconvolution

Chengxi Ye, Matthew Evanusa, Hua He, Anton Mitrokhin, Tom Goldstein, James A. Yorke, Cornelia Fermuller, Yiannis Aloimonos

Keywords: convolutional networks, network deconvolution, whitening

4:59

06/12/2021

Efficient Equivariant Network

Lingshen He, Yuxuan Chen, Zhengyang Shen, Yiming Dong, Yisen Wang, Zhouchen Lin

Keywords: deep learning, vision

8:20

06/12/2021

Gaussian Kernel Mixture Network for Single Image Defocus Deblurring

Yuhui Quan, Zicong Wu, Hui Ji

Keywords: deep learning

13:56

14/06/2020

Factorized Higher-Order CNNs With an Application to Spatio-Temporal Emotion Estimation

Jean Kossaifi, Antoine Toisoul, Adrian Bulat, Yannis Panagakis, Timothy M. Hospedales, Maja Pantic

Keywords: tensor methods, deep learning, spatiotemporal, emotion, cnn, tensor decomposition, low-rank, valence, arousal

1:01

14/06/2020

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

Xiaoyu Xiang, Yapeng Tian, Yulun Zhang, Yun Fu, Jan P. Allebach, Chenliang Xu

Keywords: space-time video super-resolution, high-resolution, slow motion, one-stage, fast and accurate, feature temporal interpolation, deformable convlstm, temporal alignment, temporal aggregation, video restoration

1:00

14/06/2020

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu

Keywords: channel attention, efficient, adaptive 1d convolution, deep cnns, image classification, object detection, instance segmentation

0:57

06/12/2021

SOFT: Softmax-free Transformer with Linear Complexity

Jiachen Lu, Jinghan Yao, Junge Zhang, Xiatian Zhu, Hang Xu, Weiguo Gao, Chunjing Xu, Tao Xiang, Li Zhang

Keywords: robustness, transformers, language

8:04

14/06/2020

Non-Local Neural Networks With Grouped Bilinear Attentional Transforms

Lu Chi, Zehuan Yuan, Yadong Mu, Changhu Wang

Keywords: attention, non-local, bilinear, image classification, video classification, grouped, data-adaptive

1:01

02/02/2021

TRQ: Ternary Neural Networks With Residual Quantization

Yue Li, Wenrui Ding, Chunlei Liu, Baochang Zhang, Guodong Guo

15:21

05/01/2021

Exploiting the Redundancy in Convolutional Filters for Parameter Reduction

Kumara Kahatapitiya, Ranga Rodrigo


5:10

06/12/2020

Unfolding the Alternating Optimization for Blind Super Resolution

LOG luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan

3:16

14/06/2020

Meta-Transfer Learning for Zero-Shot Super-Resolution

Jae Woong Soh, Sunwoo Cho, Nam Ik Cho

Keywords: zero-shot super-resolution, meta learning, transfer learning

0:59

03/05/2021

Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation

Biao Zhang, Ankur Bapna, Rico Sennrich, Orhan Firat

Keywords: multilingual transformer, multilingual translation, language-specific modeling, conditional computation

15:04

06/12/2020

ARMA Nets: Expanding Receptive Field for Dense Prediction

Jiahao Su, Shiqi Wang, Furong Huang


3:36

02/02/2021

Frequency Consistent Adaptation for Real World Super Resolution

Xiaozhong Ji, Guangpin Tao, Yun Cao, Ying Tai, Tong Lu, Chengjie Wang, Jilin Li, Feiyue Huang

14:32

14/06/2020

What Makes Training Multi-Modal Classification Networks Hard?

Weiyao Wang, Du Tran, Matt Feiszli

Keywords: video classification, multi-modal, overfitting, action recognition, acoustic event detection

1:01

14/06/2020

Efficient Dynamic Scene Deblurring Using Spatially Variant Deconvolution Network With Optical Flow Guided Training

Yuan Yuan, Wei Su, Dandan Ma

Keywords: dynamic scene deblurring, deconvolution neural network, bi-directional optical flow, deformable convolution, deep learning, image restoration

0:57

14/06/2020

A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection

Yongri Piao, Zhengkun Rong, Miao Zhang, Weisong Ren, Huchuan Lu

Keywords: rgb-d, salient object detection, knowledge distillation, attention, computer vision, cnn

1:00

22/11/2021

Knowing What, Where and When to Look: Video Action modelling with Attention

Juan-Manuel Perez-Rua, Brais Martinez, Xiatian Zhu, Antoine S Toisoul, Victor A Escorcia, Tao Xiang

Keywords: Action recognition, Fine-grained action, video attention, Spatial attention, Channel attention, Temporal attention, Spatio-temporal attention, Feature refinement

2:46

22/11/2021

Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation

Bin Ren, Hao Tang, Nicu Sebe

Keywords: cross view, MLP, image translation, image generation

9:58

02/02/2021

Longitudinal Deep Kernel Gaussian Process Regression

Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant G Honavar


16:27

06/12/2020

Sparse Graphical Memory for Robust Planning

Scott Emmons, Ajay Jain, Misha Laskin, Thanard Kurutach, Pieter Abbeel, Deepak Pathak

3:23

06/12/2021

Group Equivariant Subsampling

Jin Xu, Hyunjik Kim, Thomas Rainforth, Yee Teh

Keywords: deep learning, representation learning

11:27

06/12/2021

Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

Woochul Kang, Daeyeon Kim

Keywords: deep learning, machine learning, vision

13:17

26/08/2020

A principled approach for generating adversarial images under non-smooth dissimilarity metrics

Aram-Alexandre Pooladian, Chris Finlay, Tim Hoheisel, Adam Oberman


14:46

03/05/2021

CT-Net: Channel Tensorization Network for Video Classification

Kunchang Li, Xianhang Li, Yali Wang, Jun Wang, Yu Qiao

Keywords: 3D Convolution, Video Classification, Channel Tensorization

4:59

06/12/2020

Convolutional Tensor-Train LSTM for Spatio-Temporal Learning

Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Anima Anandkumar

3:29

03/05/2021

AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights

Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha

Keywords: effective learning rate, normalize layer, scale-invariant weights, momentum optimizer

5:16

02/02/2021

Continuous Self-Attention Models with Neural ODE Networks

Jing Zhang, Peng Zhang, Baiwen Kong, Junqiu Wei, Xin Jiang

15:25

22/11/2021

Parameter Efficient Dynamic Convolution via Tensor Decomposition

Zejiang Hou, Sun-Yuan Kung

Keywords: dynamic convolution, input-dependent reparameterization, parameter efficiency, tensor decomposition

3:58

05/01/2021

OverNet: Lightweight Multi-Scale Super-Resolution With Overscaling Network

Parichehr Behjati, Pau Rodriguez, Armin Mehri, Isabelle Hupont, Carles Fernandez Tena, Jordi Gonzalez

4:24

06/12/2021

Spectrum-to-Kernel Translation for Accurate Blind Image Super-Resolution

Guangpin Tao, Xiaozhong Ji, Wenzhuo Wang, Shuo Chen, Chuming Lin, Yun Cao, Tong Lu, Donghao Luo, Ying Tai

Keywords: deep learning, optimization, vision, generative model

12:00

30/11/2020

Lossless Image Compression Using a Multi-Scale Progressive Statistical Model

Honglei Zhang, Francesco Cricri, Hamed R. Tavakoli, Nannan Zou, Emre Aksu, Miska M. Hannuksela

9:33

06/12/2021

Self-Adaptable Point Processes with Nonparametric Time Decays

Zhimeng Pan, Zheng Wang, Jeff M Phillips, Shandian Zhe

Keywords: deep learning, kernel methods

10:01