Abstract:
Domain adaptation aims to bridge the gap between source and target data drawn from different distributions. Most related works seek either a latent space in which source and target data share the same distribution, or a transformation of the source distribution to match the target one. In this paper, we introduce an original scenario in which the model previously trained on source data is directly reused on target data, requiring only a transformation from the target domain to the source domain. As a first approach to this problem, we propose a greedy coordinate-wise transformation leveraging optimal transport. Beyond being fully independent of the model initially learned on the source data, the resulting transformation has three assets: it is scalable, interpretable, and agnostic to feature types (continuous and/or categorical). Our procedure is numerically evaluated on various real datasets, including domain adaptation benchmarks as well as a challenging fraud detection dataset with highly imbalanced classes. Interestingly, we observe that transforming a small subset of the target features yields accuracies competitive with those of "classical" domain adaptation methods.
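For intuition, here is a minimal sketch of the continuous-feature case only: in one dimension, optimal transport with a squared cost reduces to a monotone quantile mapping, so each target feature can be mapped to the source domain independently. The function and variable names below are illustrative assumptions, not the paper's exact greedy procedure (which also handles categorical features and selects which features to transform).

```python
import numpy as np

def coordinatewise_ot_map(X_target, X_source):
    """Illustrative sketch: map each continuous target feature to the
    source domain via 1D optimal transport, i.e. the quantile mapping
    T(x) = F_source^{-1}(F_target(x)). Names and shapes are assumptions."""
    X_mapped = np.empty_like(X_target, dtype=float)
    n_t = X_target.shape[0]
    for j in range(X_target.shape[1]):
        tgt_sorted = np.sort(X_target[:, j])
        # Empirical target CDF value of each target point for feature j
        ranks = np.searchsorted(tgt_sorted, X_target[:, j], side="right") / n_t
        # Inverse empirical source CDF evaluated at those quantile levels
        X_mapped[:, j] = np.quantile(X_source[:, j], np.clip(ranks, 0.0, 1.0))
    return X_mapped

# Usage sketch: reuse a model trained on source data directly on mapped target data,
# e.g. clf.predict(coordinatewise_ot_map(X_target, X_source))
```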