Unsupervised Heterogeneous Coupling Learning for Categorical Representation
Chengzhang Zhu, Longbing Cao, Jianpin Yin, TPAMI
This work addresses the challenges that deep models face in learning categorical representations on small to large yet complex categorical data with various coupling relationships. We show the power of shallow unsupervised learning of such couplings.
Complex categorical data is often hierarchically coupled, with heterogeneous relationships between attributes and attribute values as well as couplings between objects. Such value-to-object couplings are heterogeneous, with complementary and inconsistent interactions and distributions. The limited research on unlabeled categorical data representation ignores the heterogeneous and hierarchical couplings, underestimates data characteristics and complexities, and overuses redundant information. Deep representation learning of unlabeled categorical data is challenging: it overlooks such value-to-object couplings and their complementarity and inconsistency, and it requires large data, disentanglement, and high computational power. This work introduces a shallow but powerful UNsupervised heTerogeneous couplIng lEarning (UNTIE) approach for representing coupled categorical data by untying the interactions between couplings and revealing the heterogeneous distributions embedded in each type of coupling. UNTIE is efficiently optimized w.r.t. a kernel k-means objective function for unsupervised representation learning of heterogeneous and hierarchical value-to-object couplings. Theoretical analysis shows that UNTIE can represent categorical data with maximal separability while effectively representing heterogeneous couplings and disclosing their roles in categorical data. The UNTIE-learned representations achieve significant performance improvements over state-of-the-art categorical representations and deep representation models on 25 categorical data sets with diversified characteristics.
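To make the phrase "kernel k-means objective" concrete, here is a minimal sketch of plain kernel k-means on integer-coded categorical data. It is not the UNTIE method: UNTIE learns heterogeneous coupling-based kernels, whereas this sketch assumes a single fixed overlap (simple matching) kernel. The names `overlap_kernel` and `kernel_kmeans`, the toy data `X`, and the deterministic round-robin initialization are all illustrative assumptions.

```python
import numpy as np

def overlap_kernel(X):
    """Assumed stand-in kernel: fraction of attributes on which two objects agree."""
    X = np.asarray(X)
    # (n, 1, d) == (1, n, d) -> (n, n, d) match indicators, averaged over attributes
    return (X[:, None, :] == X[None, :, :]).mean(axis=2)

def kernel_kmeans(K, n_clusters, n_iter=50):
    """Plain kernel k-means on a precomputed kernel matrix K (n x n)."""
    n = K.shape[0]
    # Deterministic round-robin start (an assumption made for reproducibility;
    # practical implementations usually use several random restarts).
    labels = np.arange(n) % n_clusters
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            size = members.sum()
            if size == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mu_c||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, members].sum(axis=1) / size
                          + K[np.ix_(members, members)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels

# Toy data: 6 objects, 3 categorical attributes, values coded as nominal integers.
X = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 2],
    [1, 1, 3],
    [1, 1, 2],
])

K = overlap_kernel(X)
labels = kernel_kmeans(K, n_clusters=2)
print(labels)
```

On this toy input the first three objects end up in one cluster and the last three in the other. Swapping `overlap_kernel` for a learned kernel is where a coupling-aware method would differ from this baseline.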
Chengzhang Zhu, Longbing Cao and Jianpin Yin. Unsupervised Heterogeneous Coupling Learning for Categorical Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
Coupling and interaction learning
More research work on learning complex couplings and interactions: https://datasciences.org/coupling-learning/