Abstract:
Models trained on synthetic images often face degraded generalization to real data. To remedy such domain gaps, synthetic training in domain generalization and adaptation typically starts from ImageNet-pretrained models, whose representations are learned from real images. However, the role of this ImageNet representation is seldom discussed, despite common practices that implicitly leverage this knowledge to maintain generalization ability. An example is the careful hand tuning of learning rates across different network layers, which is laborious and does not scale. We treat this as a learning-without-forgetting problem and propose a learning-to-optimize (L2O) method to automate the selection of layer-wise learning rates. With comprehensive experiments, we demonstrate that the proposed method can significantly improve synthetic-to-real generalization performance without seeing or training on real data, while also benefiting downstream tasks such as domain adaptation.
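As an illustration of the manual practice the abstract refers to (not the paper's proposed method), the sketch below shows hand-tuned, layer-wise learning rates assigned to an ImageNet-pretrained backbone via optimizer parameter groups in PyTorch. The layer names follow torchvision's ResNet-50, and the specific rate values are illustrative assumptions; the proposed L2O method would replace such hand-picked values with rates produced by a learned policy.

```python
# Minimal sketch of hand-tuned layer-wise learning rates (illustrative values).
# Early layers get small rates to preserve ImageNet knowledge; later layers
# adapt faster to the synthetic training data.
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)  # ImageNet-pretrained weights

param_groups = [
    # Stem: keep the learning rate small to retain low-level real-image features.
    {"params": list(model.conv1.parameters()) + list(model.bn1.parameters()), "lr": 1e-4},
    {"params": model.layer1.parameters(), "lr": 1e-4},
    {"params": model.layer2.parameters(), "lr": 1e-4},
    {"params": model.layer3.parameters(), "lr": 1e-3},
    {"params": model.layer4.parameters(), "lr": 1e-3},
    # Task head: largest rate, since it must fit the synthetic task from scratch.
    {"params": model.fc.parameters(), "lr": 1e-2},
]
optimizer = torch.optim.SGD(param_groups, lr=1e-3, momentum=0.9)
```

Choosing these per-layer values by hand requires trial and error for every new architecture and dataset, which is exactly the non-scalable tuning the abstract argues should be automated.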