Abstract:
In recent years, convolutional neural networks have achieved great success in a wide range of visual applications. However, these networks typically incur high storage and computation costs, which prohibits their deployment in resource-limited applications. In this paper, we introduce Global & Progressive Product Quantization (G&P PQ), an end-to-end product-quantization-based network compression method that merges the separate quantization and fine-tuning processes into a consistent training framework. Compared to existing two-stage methods, we avoid the time-consuming process of choosing layer-wise fine-tuning hyperparameters, and by quantizing globally and progressively we enable the network to learn complex dependencies among layers. To validate its effectiveness, we benchmark G&P PQ on ResNet-like architectures for image classification and demonstrate a state-of-the-art trade-off between model size and accuracy across extensive compression configurations.
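To make the underlying building block concrete, the sketch below shows plain product quantization of a single weight matrix in NumPy: each row is split into subvectors, a small codebook is learned per subspace with k-means, and weights are stored as codeword indices. This is a minimal illustration of standard post-hoc PQ only, not the paper's end-to-end G&P PQ training procedure; all function names and parameter values (`num_subspaces`, `num_codewords`, etc.) are our own, chosen for illustration.

```python
import numpy as np

def product_quantize(W, num_subspaces=4, num_codewords=16, iters=20, seed=0):
    """Quantize weight matrix W (out_dim x in_dim) with product quantization.

    Each row is split into `num_subspaces` subvectors; a codebook of
    `num_codewords` centroids is learned per subspace with plain k-means,
    and every subvector is replaced by the index of its nearest codeword.
    """
    rng = np.random.default_rng(seed)
    out_dim, in_dim = W.shape
    assert in_dim % num_subspaces == 0, "in_dim must split evenly"
    d = in_dim // num_subspaces

    codebooks = np.empty((num_subspaces, num_codewords, d))
    codes = np.empty((out_dim, num_subspaces), dtype=np.int64)

    for s in range(num_subspaces):
        X = W[:, s * d:(s + 1) * d]  # subvectors for subspace s
        # Initialize centroids from random rows, then run Lloyd's k-means.
        C = X[rng.choice(out_dim, num_codewords, replace=False)]
        for _ in range(iters):
            dist = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            assign = dist.argmin(1)
            for k in range(num_codewords):
                members = X[assign == k]
                if len(members):
                    C[k] = members.mean(0)
        codebooks[s], codes[:, s] = C, assign
    return codebooks, codes

def reconstruct(codebooks, codes):
    """Rebuild an approximate weight matrix from codebooks and codes."""
    parts = [codebooks[s][codes[:, s]] for s in range(codebooks.shape[0])]
    return np.concatenate(parts, axis=1)

# Usage: quantize a toy 64x32 layer and check the reconstruction error.
W = np.random.default_rng(1).standard_normal((64, 32))
books, codes = product_quantize(W)
W_hat = reconstruct(books, codes)
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

Storing indices plus small codebooks is what yields the compression: here each row costs `num_subspaces` small integers instead of `in_dim` floats. What G&P PQ adds on top of this baseline, per the abstract, is learning such quantization end-to-end, globally across layers, and progressively during training.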