The Data Science Lab
since 2005
Yanshan Xiao. SVM-based Instance Learning in Complex Data, PhD thesis, Jul 2011

Traditional supervised learning rests on several assumptions: (1) each training label is associated with a single instance; (2) both the positive class and the negative class are available for learning the classifier; (3) the concept underlying the data is stable and does not change over time. However, these assumptions do not always hold. This thesis focuses on complex data that does not satisfy all of these assumptions. The complex data studied in this thesis includes multiple-instance data, streaming Web page data, and positive and unlabelled data.

To cope with these challenges, this thesis designs novel algorithms for such complex data. Firstly, it proposes a novel multiple-instance learning (MIL) method, named SMILE (Similarity-based Multiple-Instance LEarning), which introduces similarity weights for each instance in the positive bags. Secondly, it presents a novel multi-instance streaming learning framework to classify Web pages in time-evolving data streams. Thirdly, it puts forward a robust positive and unlabelled (PU) learning approach, RPUL, which associates each undetermined instance with two instance weights that indicate the probability of the instance belonging to the positive class and the negative class, respectively. Substantial experiments comparing the proposed approaches with traditional methods show that they are able to cope with these challenges.

Keywords: Support Vector Machine, Multiple-Instance Learning, Web Page Stream Classification, Positive and Unlabelled Instance Learning.
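The idea of giving each undetermined instance two weights can be illustrated with a small sketch. This is not the RPUL formulation from the thesis, which the abstract does not spell out. It is a generic weighted linear SVM trained by subgradient descent on the hinge loss, in which every undetermined instance is entered twice: once as a positive example and once as a negative example, each copy scaled by an assumed probability. The function names, the toy data, and the probability values `p_pos` are all hypothetical.

```python
import random


def build_weighted_examples(positives, undetermined):
    """Turn PU data into (x, label, weight) triples.

    positives: list of feature tuples known to be positive (weight 1.0).
    undetermined: list of (x, p_pos) pairs, where p_pos is an ASSUMED
    estimate of the probability that x is positive. Each undetermined
    instance yields two triples: a positive copy weighted by p_pos and
    a negative copy weighted by 1 - p_pos.
    """
    examples = [(x, 1, 1.0) for x in positives]
    for x, p_pos in undetermined:
        examples.append((x, 1, p_pos))
        examples.append((x, -1, 1.0 - p_pos))
    return examples


def train_weighted_linear_svm(examples, dim, epochs=300, lr=0.01,
                              reg=0.01, seed=0):
    """Subgradient descent on the weighted hinge loss with L2 shrinkage."""
    rng = random.Random(seed)
    examples = list(examples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(examples)
        for x, y, weight in examples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularisation: shrink w slightly at every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: step scaled by the instance weight.
                w = [wi + lr * weight * y * xi for wi, xi in zip(w, x)]
                b += lr * weight * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the sign of the decision function."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy data (hypothetical): three labelled positives and four
# undetermined instances with assumed positive-class probabilities.
positives = [(2.0, 2.0), (2.5, 1.5), (1.5, 2.5)]
undetermined = [
    ((2.0, 1.8), 0.9),
    ((-2.0, -2.0), 0.1),
    ((-1.5, -2.5), 0.1),
    ((-2.5, -1.5), 0.05),
]

examples = build_weighted_examples(positives, undetermined)
w, b = train_weighted_linear_svm(examples, dim=2)
```

Calling `predict(w, b, x)` on a new point then returns its predicted class. Entering each undetermined instance under both labels lets an uncertain instance pull the decision boundary only in proportion to the confidence assigned to each label, so a wrong guess about a single instance has limited effect on the classifier.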

About us
School of Computing, Faculty of Science and Engineering, Macquarie University, Australia
Level 3, 4 Research Park Drive, Macquarie University, NSW 2109, Australia
Tel: +61-2-9850 9583
Staff: firstname.surname(a)mq.edu.au
Students: firstname.surname(a)student.mq.edu.au
Contacts@datasciences.org