The Data Science Lab
since 2005

SepDiff: Self-Encoding Parameter Diffusion for Learning Latent Semantics
Zhangkai Wu, Xuhui Fan, Jin Li, Zhilin Zhao, Hui Chen, Longbing Cao. KDD Research Track, 2025.

The recently proposed Bayesian Flow Networks (BFNs) show great potential for modeling parameter spaces via a diffusion process, offering a unified strategy for handling continuous and discrete data. However, these parameter diffusion models cannot learn high-level semantic representations from the parameter space, since common encoders, which encode data into one static representation, cannot capture semantic changes in the parameters. This motivates a new direction: learning the semantic representations hidden in parameter spaces to characterize noisy data. Accordingly, we propose a representation learning framework named SepDiff, which operates in the parameter space to obtain parameter-wise latent semantics that exhibit progressive structure. Specifically, SepDiff introduces a self-encoder that learns latent semantics directly from parameters rather than from observations. The encoder is then integrated into the parameter diffusion model, enabling representation learning across various formats of observations. Mutual information terms further promote the disentanglement of the latent semantics while capturing meaningful semantics. We illustrate seven representation learning tasks in SepDiff by expanding this parameter diffusion model, and extensive quantitative experiments demonstrate the superior effectiveness of SepDiff in learning parameter representations.
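To make the core idea concrete, below is a minimal PyTorch sketch of what "self-encoding parameters" could look like: an encoder that consumes the (noised) distribution parameters at a diffusion step, rather than the raw observations, and a denoiser conditioned on the resulting latent. Everything here is our own illustrative assumption, not the paper's implementation: the module names, the toy Gaussian noising of the parameters, and the KL regularizer standing in for the paper's mutual-information terms.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfEncoder(nn.Module):
    """Encodes the *parameters* theta_t at diffusion step t,
    rather than the raw observations x (hypothetical sketch)."""
    def __init__(self, param_dim: int, latent_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(param_dim + 1, hidden),   # +1 for the time step t
            nn.SiLU(),
            nn.Linear(hidden, 2 * latent_dim),  # Gaussian mean and log-variance
        )

    def forward(self, theta_t, t):
        h = self.net(torch.cat([theta_t, t], dim=-1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return z, mu, logvar

class ParamDenoiser(nn.Module):
    """Predicts refined parameters conditioned on the latent z."""
    def __init__(self, param_dim: int, latent_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(param_dim + latent_dim + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, param_dim),
        )

    def forward(self, theta_t, z, t):
        return self.net(torch.cat([theta_t, z, t], dim=-1))

# Toy training step on synthetic "parameters".
param_dim, latent_dim, batch = 32, 8, 64
encoder = SelfEncoder(param_dim, latent_dim)
denoiser = ParamDenoiser(param_dim, latent_dim)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(denoiser.parameters()), lr=1e-3
)

theta_clean = torch.randn(batch, param_dim)   # stand-in for clean parameters
t = torch.rand(batch, 1)                      # diffusion time in [0, 1]
theta_t = theta_clean + torch.randn_like(theta_clean) * t  # noised parameters

z, mu, logvar = encoder(theta_t, t)           # latent from parameters, not data
theta_hat = denoiser(theta_t, z, t)

recon = F.mse_loss(theta_hat, theta_clean)
# Simple KL regularizer on the latent; a stand-in for the paper's
# mutual-information terms, which we do not reproduce here.
kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
loss = recon + 1e-2 * kl
opt.zero_grad(); loss.backward(); opt.step()
```

The point of the sketch is only the data flow: the encoder never sees the observations, so its latent tracks how the parameters themselves evolve across diffusion steps, which is what allows the representation to capture the progressive structure the abstract describes.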

About us
School of Computing, Faculty of Science and Engineering, Macquarie University, Australia
Macquarie University Frontier AI Research Centre
Level 3, 3 Innovation Road, Macquarie University, NSW 2109, Australia
Tel: +61-2-9850 9583
Staff: firstname.surname(a)mq.edu.au
Students: firstname.surname(a)student.mq.edu.au
Contacts@datasciences.org