Abstract:
A big, diverse and balanced training data is the key to the success of deep neural network training. However, existing publicly available datasets used in facial landmark localization are usually much smaller than those for other computer vision tasks. To mitigate this issue, this paper presents a novel Separable Batch Normalization (SepBN) method. Different from the classical BN layer, the proposed SepBN module learns multiple sets of mapping parameters to adaptively scale and shift the normalized feature maps via a feed-forward attention mechanism. The channels of an input tensor are divided into several groups and different mapping parameter combinations are calculated for each group according to the attention weights to improve the parameter utilization. The experimental results obtained on several well-known benchmarking datasets demonstrate the effectiveness and merits of the proposed method.