Transformer-based Monocular Depth Estimation with Attention Supervision

22/11/2021

Wenjie Chang, Yueyi Zhang, Zhiwei Xiong

Keywords: transformer, depth estimation

Abstract: The Transformer, which excels at capturing long-range dependencies, has shown strong performance in a variety of computer vision tasks. In this paper, we propose a hybrid network with a Transformer-based encoder and a CNN-based decoder for monocular depth estimation. The encoder follows the architecture of the classical Vision Transformer. To better exploit the potential of the Transformer encoder, we introduce Attention Supervision to the Transformer layer, which enhances its representational ability. The down-sampling operations before the Transformer encoder degrade the details in the predicted depth map. We therefore devise an Attention-based Up-sample Block and deploy it to compensate for the texture features. Experiments on both indoor and outdoor datasets demonstrate that the proposed method achieves state-of-the-art performance in both quantitative and qualitative evaluations. The source code and trained models can be downloaded at https://github.com/WJ-Chang-42/ASTransformer.
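To make the down-sampling remark concrete, the following is a minimal sketch of the token-grid arithmetic for a Vision Transformer style patch embedding. It assumes non-overlapping square patches of size 16 (a common ViT configuration) and an illustrative 480x640 input; neither value is stated in the abstract, and the function name `patch_grid` is hypothetical rather than taken from the authors' code.

```python
def patch_grid(height, width, patch_size=16):
    """Return (rows, cols, tokens, pixels_per_token) for non-overlapping patches.

    patch_size=16 is an assumption for illustration, not a value from the paper.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by the patch size")
    rows = height // patch_size
    cols = width // patch_size
    # Each token summarizes patch_size * patch_size pixels, so fine texture
    # inside a patch is no longer spatially resolved by the encoder.
    return rows, cols, rows * cols, patch_size * patch_size


rows, cols, tokens, pixels_per_token = patch_grid(480, 640)
print(rows, cols, tokens, pixels_per_token)  # 30 40 1200 256
```

Under these assumptions the encoder sees a 30x40 grid, so the decoder has to recover a 16-fold increase in resolution along each axis, which is the kind of gap an up-sampling block is meant to bridge.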

The talk and the respective paper are published at the BMVC 2021 virtual conference.

Similar Papers

06/12/2021

Post-Training Quantization for Vision Transformer

Zhenhua Liu, Yunhe Wang, Kai Han, Wei Zhang, Siwei Ma, Wen Gao

Keywords: deep learning, transformers, vision

5:52

18/07/2021

Generative Adversarial Transformers

Drew A. Hudson, Larry Zitnick

Keywords: Deep Learning, Architectures

5:15

14/06/2020

A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection

Yongri Piao, Zhengkun Rong, Miao Zhang, Weisong Ren, Huchuan Lu

Keywords: rgb-d, salient object detection, knowledge distillation, attention, computer vision, cnn

1:00

14/06/2020

Hierarchical Scene Coordinate Classification and Regression for Visual Localization

Xiaotian Li, Shuzhe Wang, Yi Zhao, Jakob Verbeek, Juho Kannala

Keywords: visual localization, camera relocalization, scene coordinate regression

1:01

14/06/2020

Image Demoireing with Learnable Bandpass Filters

Bolun Zheng, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis

Keywords: image demoireing, learnable bandpass filter, advanced sobel loss, multiscale supervising, degradation model, implicit dct, local tone mapping, global tone mapping, multiscale cnn, aim2019 demoireing challenge

1:00

14/06/2020

RDCFace: Radial Distortion Correction for Face Recognition

He Zhao, Xianghua Ying, Yongjie Shi, Xin Tong, Jingsi Wen, Hongbin Zha

Keywords: radial distortion correction, face recognition, spatial transformer network, cascaded network, fisheye camera, wide-angle camera

1:00

06/12/2021

Shape Registration in the Time of Transformers

Giovanni Trappolini, Luca Cosmo, Luca Moschella, Riccardo Marin, Simone Melzi, Emanuele Rodolà

Keywords: deep learning, reinforcement learning and planning, transformers

10:17

06/12/2020

CoMIR: Contrastive Multimodal Image Representation for Registration

Nicolas Pielawski, Elisabeth Wetzer, Johan Öfverstedt, Jiahao Lu, Carolina Wählby, Joakim Lindblad, Natasa Sladoje

2:55

22/11/2021

Repaint: Improving the Generalization of Down-Stream Visual Tasks by Generating Multiple Instances of Training Examples

Amin Banitalebi-Dehkordi, Yong Zhang

Keywords: Texture Bias, Repaint, Image Generation, Semantic Synthesis, Down-Stream Task, VAE-GAN

2:45

06/12/2020

GramGAN: Deep 3D Texture Synthesis From 2D Exemplars

Tiziano Portenier, Siavash Arjomand Bigdeli, Orcun Goksel

3:17

30/11/2020

HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

Axel Barroso-Laguna, Yannick Verdie, Benjamin Busam, Krystian Mikolajczyk

10:03

03/05/2021

$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, Honglak Lee

Keywords: self-supervised learning, unsupervised representation learning, data augmentation, MixUp, contrastive representation learning

5:04

05/01/2021

Task-Assisted Domain Adaptation With Anchor Tasks

Zhizhong Li, Linjie Luo, Sergey Tulyakov, Qieyun Dai, Derek Hoiem

5:02

14/06/2020

Variational Context-Deformable ConvNets for Indoor Scene Parsing

Zhitong Xiong, Yuan Yuan, Nianhui Guo, Qi Wang

Keywords: rgb-d, semantic segmentation, spatial-context, adaptive receptive-field, scene parsing, deformable convolution networks

1:01

14/06/2020

An Investigation Into the Stochasticity of Batch Whitening

Lei Huang, Lei Zhao, Yi Zhou, Fan Zhu, Li Liu, Ling Shao

Keywords: batch normalization, whitening, stochasticity analysis, conditioning, optimization, generalization, stochastic noise, deep learning, gans, classification

5:00

06/12/2021

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

Yufei Xu, Qiming Zhang, Jing Zhang, Dacheng Tao

Keywords: machine learning, transformers, vision

10:16

14/06/2020

Analyzing and Improving the Image Quality of StyleGAN

Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila

Keywords: generative modeling, image synthesis, representation learning

1:01

14/06/2020

Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation

Myeongjin Kim, Hyeran Byun

Keywords: domain adaptation, segmentation, texture

1:01

06/12/2021

Adversarial Reweighting for Partial Domain Adaptation

Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu

Keywords: domain adaptation

9:03

14/06/2020

Single Image Reflection Removal With Physically-Based Training Images

Soomin Kim, Yuchi Huo, Sung-Eui Yoon

Keywords: reflection removal, physical-based rendering, deep learning, layer decomposition, image processing

4:56

06/12/2021

Grounding inductive biases in natural images: invariance stems from variations in data

Diane Bouchacourt, Mark Ibrahim, Ari Morcos

Keywords: machine learning, transformers

14:19

14/06/2020

MaskGAN: Towards Diverse and Interactive Facial Image Manipulation

Cheng-Han Lee, Ziwei Liu, Lingyun Wu, Ping Luo

Keywords: facial image manipulation, face segmentation, image synthesis, generative adversarial network

1:00

14/06/2020

Adaptive Fractional Dilated Convolution Network for Image Aesthetics Assessment

Qiuyu Chen, Wei Zhang, Ning Zhou, Peng Lei, Yi Xu, Yu Zheng, Jianping Fan

Keywords: image aesthetics assessment, kernel embedding, adaptive convolution, parameter-free, aspect ratio

1:01

08/12/2020

Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks

Lichao Sun, Congying Xia, Wenpeng Yin, Tingting Liang, Philip Yu, Lifang He

9:52

14/06/2020

AugFPN: Improving Multi-Scale Feature Learning for Object Detection

Chaoxu Guo, Bin Fan, Qian Zhang, Shiming Xiang, Chunhong Pan

Keywords: object detection, augfpn, consistent supervision, residual feature augmentation, soft roi selection

1:00

06/12/2021

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis

Xudong Xu, Xingang Pan, Dahua Lin, Bo Dai

Keywords: generative model

6:47

22/11/2021

SIR-SRGAN: Super-Resolution Generative Adversarial Networks with Self-Interpolation Ranker

Jun-Hong Huang, Hai-Kun Wang, Zhi-Wu Liao

Keywords: SISR, Ranker

2:02

02/02/2021

Depth Privileged Object Detection in Indoor Scenes via Deformation Hallucination

Zhijie Zhang, Yan Liu, Junjie Chen, Li Niu, Liqing Zhang

16:32

06/12/2020

Memory-Efficient Learning of Stable Linear Dynamical Systems for Prediction and Control

Giorgos Mamakoukas, Orest Xherija, Todd Murphey

Keywords: Optimization -> Non-Convex Optimization, Optimization -> Stochastic Optimization

3:13

14/06/2020

Learning Texture Transformer Network for Image Super-Resolution

Fuzhi Yang, Huan Yang, Jianlong Fu, Hongtao Lu, Baining Guo

Keywords: super-resolution, reference, image enhancement

1:01

14/06/2020

Image Search With Text Feedback by Visiolinguistic Attention Learning

Yanbei Chen, Shaogang Gong, Loris Bazzani

Keywords: vision and language, image search, text feedback, attention mechanism, transformer, multimodal learning, representation learning, composition, image retrieval, interactive image search

1:00

14/06/2020

Normal Assisted Stereo Depth Estimation

Uday Kusupati, Shuo Cheng, Rui Chen, Hao Su

Keywords: multi view stereo, 3d vision, deep learning, depth estimation, surface normal estimation, cost volume, cost aggregation, auxiliary supervision

1:01

14/06/2020

Total Deep Variation for Linear Inverse Problems

Erich Kobler, Alexander Effland, Karl Kunisch, Thomas Pock

Keywords: inverse problem, variational method, deep learning, convolutional neural network, optimal control problem, gradient flow, image denoising, single image super-resolution, magnetic resonance imaging, computed tomography

4:56

14/06/2020

3DRegNet: A Deep Neural Network for 3D Point Registration

G. Dias Pais, Srikumar Ramalingam, Venu Madhav Govindu, Jacinto C. Nascimento, Rama Chellappa, Pedro Miraldo

Keywords: 3d registration, deep learning, pose regression, classification

0:59

06/12/2020

Shared Space Transfer Learning for analyzing multi-site fMRI data

Tony Yousefnezhad, Alessandro Selvitella, Daoqiang Zhang, Andrew Greenshaw, Russell Greiner

3:06

18/07/2021

Markpainting: Adversarial Machine Learning meets Inpainting

David G Khachaturov, Ilia Shumailov, Yiren Zhao, Nicolas Papernot, Ross Anderson

Keywords: Social Aspects of Machine Learning, Privacy, Anonymity, and Security

4:28

22/11/2021

Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation

Bin Ren, Hao Tang, Nicu Sebe

Keywords: cross view, MLP, image translation, image generation

9:58

07/09/2020

DESC: Domain Adaptation for Depth Estimation via Semantic Consistency

Adrian Lopez-Rodriguez, Krystian Mikolajczyk

Keywords: domain adaptation, depth estimation, monocular, depth, domain, KITTI, Virtual KITTI

9:58

14/06/2020

Learning Fused Pixel and Feature-Based View Reconstructions for Light Fields

Jinglei Shi, Xiaoran Jiang, Christine Guillemot

Keywords: light field, view synthesis, feature-based reconstruction, pixel-based reconstruction, deep learning, angular super-resolution

4:56

26/04/2020

Image-guided Neural Object Rendering

Justus Thies, Michael Zollhöfer, Christian Theobalt, Marc Stamminger, Matthias Nießner

Keywords: Neural Rendering, Neural Image Synthesis

4:41