Heterogeneous Unsupervised Cross-domain Transfer Learning
Transfer learning addresses the problem of how to leverage previously acquired knowledge (a source domain) to improve the efficiency of learning in a new domain (the target domain). Although transfer learning has been widely researched over the last decade, existing work suffers from two restrictions: 1) the feature spaces of the domains must be homogeneous; and 2) the target domain must have at least a few labeled instances. These restrictions significantly limit transfer learning models when transferring knowledge across domains, especially in the big data era. To break through both bottlenecks, a theorem for reliable unsupervised knowledge transfer is proposed to avoid negative transfer, and a Grassmann manifold is applied to measure the distance between heterogeneous feature spaces. Based on this theorem and the Grassmann manifold, this study proposes two heterogeneous unsupervised knowledge transfer (HeUKT) models, termed RLG and GLG. RLG uses a linear monotonic map (LMM) to reliably project the two heterogeneous feature spaces onto a latent feature space and applies a geodesic flow kernel (GFK) model to transfer knowledge between the two projected domains. GLG optimizes the LMM to achieve the highest possible accuracy while guaranteeing that the geometric properties of the domains remain unchanged during the transfer process. To test the overall effectiveness of the two models, this paper reorganizes five public datasets into ten heterogeneous cross-domain tasks across three application fields: credit assessment, text classification, and cancer detection. Extensive experiments demonstrate that the proposed models deliver superior performance over current benchmarks, and that these HeUKT models are a promising way to give computers the associative ability to judge unknown things using related, known knowledge.
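For intuition on the Grassmann-manifold distance mentioned above: the gap between two feature subspaces can be measured through the principal angles between their orthonormal bases, and the geodesic distance is the Euclidean norm of those angles. The sketch below is a minimal illustration of that standard computation, not the authors' implementation; the function name `grassmann_distance` and the randomly generated bases are assumptions for the example.

```python
import numpy as np

def grassmann_distance(A, B):
    """Geodesic distance on the Grassmann manifold between the subspaces
    spanned by the columns of A and B, computed via principal angles.

    A, B: (d, k) matrices with orthonormal columns (e.g. PCA bases).
    """
    # Singular values of A^T B are the cosines of the principal angles.
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    s = np.clip(s, -1.0, 1.0)      # guard against rounding error
    theta = np.arccos(s)           # principal angles in [0, pi/2]
    return np.linalg.norm(theta)   # geodesic (arc-length) distance

# Toy usage: stand-ins for source/target feature-subspace bases.
rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.normal(size=(50, 5)))
B, _ = np.linalg.qr(rng.normal(size=(50, 5)))
print(grassmann_distance(A, B))
```

The same principal angles underpin the GFK model, which integrates over the geodesic path connecting the two subspaces to build a domain-invariant kernel.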