Heterogeneous Domain Adaptation with Positive and Unlabeled Data
Heterogeneous unsupervised domain adaptation (HUDA) is the most challenging domain adaptation setting where the feature space differs between source and target domains, and the target domain has only unlabeled data. Existing HUDA methods assume that both positive and negative examples are available in the source domain, which may not be satisfied in some real applications. This paper addresses a new challenging setting called positive and unlabeled heterogeneous domain adaptation (PU-HDA), a HUDA setting where the source domain only has positives. PU-HDA can also be viewed as an extension of PU learning where the positive and unlabeled examples are sampled from different domains. A naive combination of existing HUDA and PU learning methods is ineffective in PU-HDA due to the gap in label distribution between the source and target domains. To overcome this issue, we propose a novel method, positive-adversarial domain adaptation (PADA), which can predict likely positive examples from the unlabeled target data and simultaneously align the feature spaces to reduce the distribution divergence between the whole source data and the likely positive target data. PADA achieves this by a unified adversarial training framework for learning a classifier to predict positive examples and a feature transformer to transform the target feature space to that of the source. Specifically, they are both trained to fool a common discriminator that determines whether the likely positive examples are from the target or source domain. We experimentally show that PADA outperforms several baseline methods, such as the naive combination of HUDA and PU learning.
READ FULL TEXT