SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition

Abstract: Arbitrary text appearance poses a great challenge in scene text recognition. Existing works mostly address the problem in terms of shape distortion, including perspective distortion, line curvature, and other style variations. Rectification (i.e., spatial transformers) as a preprocessing stage is one popular approach and has been extensively studied. However, chromatic difficulties in complex scenes have received little attention. In this work, we introduce a new learnable geometric-unrelated rectification, the Structure-Preserving Inner Offset Network (SPIN), which allows color manipulation of the source data within the network. This differentiable module can be inserted before any recognition architecture to ease the downstream task, giving neural networks the ability to actively transform input intensity rather than only perform spatial rectification. It can also serve as a complementary module to known spatial transformations and work with them both independently and collaboratively. Extensive experiments show that the proposed transformation outperforms existing rectification networks and achieves performance comparable to the state of the art.

Chengwei Zhang, Yunlu Xu, Zhanzhan Cheng, Shiliang Pu, Yi Niu, Fei Wu, Futai Zou

Similar Papers

Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation

Hang Gao, Xizhou Zhu, Stephen Lin, Jifeng Dai

Keywords: Effective Receptive Fields, Deformation Modeling, Dynamic Inference

Counterfactuals uncover the modular structure of deep generative models

Michel Besserve, Arash Mehrjou, Rémy Sun, Bernhard Schölkopf

Keywords: generative models, causality, counterfactuals, representation learning, disentanglement, generalization, unsupervised learning

U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation

Junho Kim, Minjae Kim, Hyeonwoo Kang, Kwang Hee Lee

Keywords: Image-to-Image Translation, Generative Attentional Networks, Adaptive Layer-Instance Normalization

Hierarchical Scene Coordinate Classification and Regression for Visual Localization

Xiaotian Li, Shuzhe Wang, Yi Zhao, Jakob Verbeek, Juho Kannala

Keywords: visual localization, camera relocalization, scene coordinate regression

Geometry Processing with Neural Fields

Guandao Yang, Serge Belongie, Bharath Hariharan, Vladlen Koltun

Unsupervised Intra-Domain Adaptation for Semantic Segmentation Through Self-Supervision

Fei Pan, Inkyu Shin, Francois Rameau, Seokju Lee, In So Kweon

Keywords: domain adaptation, semantic segmentation, self-supervised learning, unsupervised learning, transfer learning

STEFANN: Scene Text Editor Using Font Adaptive Neural Network

Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal

Keywords: scene image, scene text editor, font adaptive, font generation, font color transfer, single observation, computer vision, deep learning

DeepVoxels++: Enhancing the Fidelity of Novel View Synthesis from 3D Voxel Embeddings

Tong He, John Collomosse, Hailin Jin, Stefano Soatto

Robust and Generalizable Visual Representation Learning via Random Convolutions

Zhenlin Xu, Deyi Liu, Junlin Yang, Colin Raffel, Marc Niethammer

Keywords: robustness, domain generalization, representation learning, data augmentation

Tensor Component Analysis for Interpreting the Latent Space of GANs

James Oldfield, Markos Georgopoulos, Yannis Panagakis, Mihalis A Nicolaou, Ioannis Patras

Keywords: GANs, interpretable directions, image editing, image synthesis, tensor methods

Evolving Normalization-Activation Layers

Hanxiao Liu, Andy Brock, Karen Simonyan, Quoc V Le

Fashion Editing With Adversarial Parsing Learning

Haoye Dong, Xiaodan Liang, Yixuan Zhang, Xujie Zhang, Xiaohui Shen, Zhenyu Xie, Bowen Wu, Jian Yin

Keywords: fashion editing, image generation, image synthesis, GAN, generative adversarial network, image manipulation, human parsing, segmentation, image editing, virtual try-on

Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition

Mark Boss, Varun Jampani, Raphael Braun, Ce Liu, Jonathan Barron, Hendrik PA Lensch

Keywords: vision, graph learning

Swapping Autoencoder for Deep Image Manipulation

Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei Efros, Richard Zhang

Learning bijective feature maps for linear ICA

Alexander Camuto, Matthew Willetts, Chris Holmes, Brooks Paige, Stephen Roberts

Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation

Bin Ren, Hao Tang, Nicu Sebe

Keywords: cross view, MLP, image translation, image generation

Non-Local Neural Networks With Grouped Bilinear Attentional Transforms

Lu Chi, Zehuan Yuan, Yadong Mu, Changhu Wang

Keywords: attention, non-local, bilinear, image classification, video classification, grouped, data-adaptive

Learning Deformable Tetrahedral Meshes for 3D Reconstruction

Jun Gao, Wenzheng Chen, Tommy Xiang, Alec Jacobson, Morgan McGuire, Sanja Fidler

Keywords: Applications -> Computer Vision; Deep Learning -> CNN Architectures, Applications -> Body Pose, Face, and Gesture Analysis

Progressive Open-Domain Response Generation with Multiple Controllable Attributes

Haiqin Yang, Xiaoyuan Yao, Yiqun Duan, Jianping Shen, Jie Zhong, Kun Zhang

Keywords: Machine Learning, Learning Generative Models, Dialogue

Unsupervised Learning for Intrinsic Image Decomposition From a Single Image

Yunfei Liu, Yu Li, Shaodi You, Feng Lu

Keywords: intrinsic image decomposition, unsupervised learning, distribution, priors, independence constraint, physical consistency constraint

Efficient Dynamic Scene Deblurring Using Spatially Variant Deconvolution Network With Optical Flow Guided Training
