The Data Science Lab
since 2005
Trong Dinh Thac Do. Non-IID Latent Variable Models, PhD thesis, Feb 2019

A latent variable model (LVM) is a statistical model that aims to uncover hidden information behind data. LVMs have been widely used in real-world applications such as community detection, link prediction, and recommender systems. However, LVMs face significant challenges in modeling complex relations because they assume that the data are independent and identically distributed (IID). Real-world data, in contrast, are often coupled in terms of object attributes, object relations, or even hidden variable relations. For example, in social networks, users who report a similar 'age', 'location', and 'high school' are often friends. Non-IID learning therefore has the potential to describe such hierarchical relations in real-world data, which are typically not independent and identically distributed (non-IID).
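For reference, the IID assumption mentioned above can be written compactly. The notation here is generic and is an assumption for illustration, not taken from the thesis: $x_1, \dots, x_N$ are the observations and $\theta$ denotes the model parameters.

```latex
% IID assumption: the joint likelihood factorizes over observations.
p(x_1, \dots, x_N \mid \theta) = \prod_{i=1}^{N} p(x_i \mid \theta)
```

When the data are non-IID, this factorization does not hold in general, because the observations may be coupled (for example, through shared attributes or relations between objects) or may follow different distributions.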

In this thesis, we are interested in determining the relations behind observations and hidden variables in LVMs. More specifically, we focus on coupling relations in non-IID data for several LVMs, including the Latent Class Model (LCM), the Latent Feature Model (LFM), and Latent Factor Model-Matrix Factorization (LFM-MF). In particular, we aim to model the following relations: (1) relations between attributes in observed data (e.g., user/item metadata such as the 'location' of a user or the 'genre' of a movie); (2) relations between different sources of observed data (e.g., metadata and users' friendships); and (3) relations between latent variables in LVMs. We also apply Bayesian nonparametric (BNP) techniques to the proposed models to automatically tune the number of latent variables for efficient computation.
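To illustrate how a Bayesian nonparametric prior can let the number of latent classes adapt to the data rather than being fixed in advance, here is a minimal sketch of the Chinese restaurant process (CRP), a standard BNP construction. This is a generic textbook example and not the model proposed in the thesis; the function name `sample_crp` and the parameters `alpha` and `seed` are hypothetical choices made for this illustration.

```python
import random


def sample_crp(n_items, alpha, seed=0):
    """Assign n_items to clusters with a Chinese restaurant process.

    alpha > 0 is the concentration parameter: larger values make
    opening a new cluster more likely. (Illustrative sketch only.)
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of items currently in cluster k
    assignments = []  # assignments[i] = cluster index of item i
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    for n in (10, 100, 1000):
        _, counts = sample_crp(n, alpha=1.0)
        print(n, "items ->", len(counts), "clusters")
```

The number of clusters is not specified anywhere in the code; it emerges from the sampling process and tends to grow slowly as more items arrive, which is the property that makes priors of this kind useful for choosing the number of latent variables automatically.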

Furthermore, to work with large and sparse data, we introduce several methods for better inference in the proposed models. Empirical analysis shows that the proposed models significantly outperform state-of-the-art models in the same family. Together with the improved optimization techniques (i.e., BNP and the inference methods), the proposed models show potential for online modeling of large, sparse data.
