Longbing Cao, Philip S Yu, Guansong Pang
Abstract
Complex behaviors are widely seen in artificial and natural intelligent systems, including the internet, social and online networks, multi-agent systems, and brain systems. An in-depth understanding of complex behaviors is increasingly recognized as a crucial means of disclosing the interior driving forces, causes and business impact behind many challenging issues. However, traditional behavior modeling mainly relies on qualitative methods from behavioral science and social science perspectives. The so-called behavior analysis in data analytics and learning often focuses on human demographic and business usage data, in which behavior-oriented elements are hidden in routinely collected transactional data. As a result, it is ineffective or even impossible to deeply scrutinize native behavior intention, lifecycle, dynamics and impact on complex problems and business issues. In this tutorial, we will present an overview of behavior analytics, then review and discuss state-of-the-art and newly emerged techniques for complex behavior analytics. These cover high impact behavior sequence analysis, impact-oriented combined behavior analysis, high utility behavior analysis, nonoccurring behavior analysis, coupled/group/collective behavior analysis, statistical modeling of coupled behaviors, probabilistic modeling of sparse rating behaviors, understanding behavior choice and attraction, behavior analysis with recurrent networks, behavior analysis in visual data, and behavior learning from demonstrations. We will show that in-depth behavior analytics creates new opportunities, directions and means for the learning and analysis of complex behaviors in both physical and virtual organizations.
Outline
The tutorial includes the following contents:
- Overview of Behavior Informatics : the qualitative analytics addresses the task of behavior reasoning and verification, while the quantitative research targets behavior learning, analysis and evaluation;
- What is Behavior : we present an abstract behavior model, which captures both intrinsic and contextual properties of behaviors from both subjective and objective perspectives, and an overview of behavior informatics;
- High Impact and High Utility Behavior Analysis : algorithms and case studies are discussed to identify behaviors associated with high impact and utility;
- Nonoccurring Behavior Analysis : algorithms and case studies for efficiently mining negative behavior sequences are explored;
- Coupled/Group/Collective Behavior Analysis : algorithms and case studies are provided for identifying and analyzing group/community behaviors and anomalies;
- Statistical Modeling of Coupled Behaviors : statistical models are presented that incorporate explicit and implicit couplings between attributes for modeling behaviors;
- Understanding Behavior Drivers : choice modeling, attraction learning and deep models will be introduced to learn user choice and attraction, the sequential interactions between session/context-based users and items, and the recommendation of next-best items;
- Behavior Analysis with Recurrent Networks : we will introduce how deep recurrent networks are used to address behavior sequence analysis problems using several representative cases;
- Behavior Analysis in Visual Data : traditional approaches and recently emerged deep learning-based approaches for learning visual behaviors are introduced;
- Behavior Learning from Demonstrations : we will introduce several advanced imitation learning approaches, which focus on imitating human behaviors from a set of demonstration data;
- Challenges and Prospects : open issues and potential are discussed for complex behavior modeling, analysis and mining in terms of both qualitative and quantitative aspects.
Tutors’ short biography
Longbing Cao holds a PhD in Pattern Recognition and Intelligent Systems from the Chinese Academy of Sciences and another PhD in Computing Science from UTS. He has authored some 300 publications, four monographs, and four edited books in the past 15 years. He has been working on data science and analytics research, education, development, and enterprise applications since he was a CTO and then joined UTS. Motivated by significant and common real-world challenges, he has been leading the team to develop theories, tools and applications for new areas including non-IID learning, actionable knowledge discovery, behavior informatics, and complex intelligent systems, in addition to issues generally concerned in artificial intelligence, knowledge discovery, machine learning, and their enterprise applications. In data science and analytics, he initiated the Data Science and Knowledge Discovery lab at UTS in 2007, the Advanced Analytics Institute in 2011, the degrees Master of Analytics (Research) and PhD in Analytics in 2011, which are recognized as the world's first degrees in data science, the IEEE Task Force on Data Science and Advanced Analytics (DSAA) and IEEE Task Force on Behavior, Economic and Socio-cultural Computing in 2013, the IEEE Conference on Data Science and Advanced Analytics (DSAA), the ACM SIGKDD Australia and New Zealand Chapter in 2014, and the International Journal of Data Science and Analytics with Springer in 2015. He has served as program and general chair of conferences such as KDD2015. In enterprise data science innovation, his team has successfully delivered many large projects for government and business organizations in over 10 domains including finance/capital markets, banking, health and car insurance, health, telco, recommendation, online business, education, and the public sector including ATO, DFS, DHS, DIBP and IP Australia, resulting in billions of dollars in savings and mentions in government, industry, media and OECD reports.
In 2013, AAI was the only organization specially mentioned in the Government’s first big data paper: Big Data Strategy – Issues Paper.
Philip S Yu is a professor in the Department of Computer Science at the University of Illinois at Chicago and also holds the Wexler Chair in Information Technology. His main research interests include data mining (especially graph/network mining), social networks, privacy-preserving data publishing, data streams, database systems, and Internet applications and technologies. He has published more than 650 papers in refereed journals and conferences. He holds or has applied for more than 300 US patents. He is a Fellow of the ACM and the IEEE. He is the Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data. He is on the steering committee of the IEEE Conference on Data Mining and ACM Conference on Information and Knowledge Management and was a member of the IEEE Task Force on Data Science and Advanced Analytics and IEEE Data Engineering’s steering committees. He has also served as an associate editor and as conference chair and program chair of many conferences. He is one of the key researchers in founding and promoting the field of heterogeneous information network analysis, and he has collaborated with many researchers in designing novel techniques to address heterogeneous information networks and their applications in real-world scenarios.
Guansong Pang is a Ph.D. candidate in the Advanced Analytics Institute at the University of Technology Sydney. Before joining AAI, he received his Master of Philosophy degree in artificial intelligence from Monash University, Australia. His research interests include data mining, non-IID outlier detection and feature selection. He has published more than 10 papers in refereed conferences and journals, such as AAAI, IJCAI, ICDM, CIKM, Data Mining and Knowledge Discovery Journal (DMKD), Journal of Artificial Intelligence Research (JAIR), and Information Processing & Management Journal (IP&M). He has served the community as a program committee member of IJCAI 2017-2018, AAAI 2017-2018, PAKDD2017-2018, and CIKM2017, and as a reviewer for IEEE Transactions on Knowledge and Data Engineering, Information Retrieval Journal, IEEE Intelligent Systems, and International Journal of Data Science and Analytics. He has delivered several tutorials/seminars about outlier detection to different universities and groups, and conference presentations at KDD, AAAI, IJCAI, ICDM and CIKM.
Intended Audience
Anyone interested in representing, modeling, analyzing, learning, managing and utilizing behaviors in varied domains, for example, human behaviors, web systems, online networks, multi-agent systems, human brain systems, visual systems, and games, will find this tutorial helpful. While no specific knowledge is required from the audience, people who are familiar with data mining and machine learning will find it easier to understand the algorithms and case studies introduced in this tutorial.
Related Tutorials
- Longbing Cao, Philip S. Yu, Can Wang. Behavior Informatics: Modeling, Analysis and Mining of Complex Behaviors, IJCAI 2013. Beijing, China. August 3-9, 2013. http://ijcai13.org/program/tutorial/TC2
- Longbing Cao, Philip S. Yu, Guansong Pang, Chengzhang Zhu. Non-IID Learning, KDD 2017. Halifax, Canada. August 13-17, 2017. http://www.kdd.org/kdd2017/tutorials
- Longbing Cao. Learning Non-IID Big Data, CIKM 2014. Shanghai, China. November 3-7, 2014. http://cikm2014.fudan.edu.cn/index.php
- Longbing Cao. Non-IIDness Learning in Big Data, PAKDD 2014. Tainan, Taiwan. 13-16 May, 2014. http://140.116.164.190/~pakddweb/index.php?page=Tutorials
The IJCAI13 tutorial presents a holistic view of behavior informatics, covering both behavior representation and behavior analytics. The PAKDD2014, CIKM2014, and KDD2017 tutorials were among the most popular tutorials at those conferences. They mainly focus on non-IID data learning and also include a short overview of coupled (group and collective) behavior analysis.
Important references
- Longbing Cao, Philip S Yu (Eds). Behavior Computing: Modeling, Analysis, Mining and Decision, Springer, 2012.
- Trong Dinh Thac Do and Longbing Cao. Metadata‐dependent Infinite Poisson Factorization for Efficiently Modelling Sparse and Large Matrices in Recommendation, IJCAI2018.
- Liang Hu, Songlei Jian, Longbing Cao, Qingkui Chen. Interpretable Recommendation via Attraction Modeling: Learning Multilevel Attractiveness over Multimodal Movie Contents, IJCAI2018.
- Lizhen Wang, Xuguang Bao, Longbing Cao. Interactive Probabilistic Post-mining of User-preferred Spatial Co-location Patterns, ICDE2018
- Chengzhang Zhu, Longbing Cao, Qiang Liu, Jianpin Yin and Vipin Kumar. Heterogeneous Metric Learning of Categorical Data with Hierarchical Couplings. IEEE Transactions on Knowledge and Data Engineering, DOI: 10.1109/TKDE.2018.2791525, 2018.
- Songlei Jian, Liang Hu, Longbing Cao, and Kai Lu. Metric-based Auto-Instructor for Learning Mixed Data Representation. AAAI2018.
- Trong Dinh Thac Do, Longbing Cao. Coupled Poisson Factorization Integrated with User/Item Metadata for Modeling Popular and Sparse Ratings in Scalable Recommendation. AAAI2018.
- Shoujin Wang, Liang Hu, Longbing Cao, Xiaoshui Huang, Defu Lian and Wei Liu. Attention‐based Transactional Context Embedding for Next‐Item Recommendation. AAAI2018.
- Guansong Pang, Longbing Cao, Ling Chen, Defu Lian and Huan Liu. Sparse Modeling-based Sequential Ensemble Learning for Effective Outlier Detection in High-dimensional Numeric Data. AAAI2018.
- Longbing Cao. Behavior Informatics to Discover Behavior Insight for Active and Tailored Client Management. KDD2017 (Industry invited talks), 2017.
- Defu Lian, Rui Liu, Yong Ge, Kai Zheng, Xing Xie and Longbing Cao. Discrete Content-aware Matrix Factorization. KDD2017.
- Guansong Pang, Hongzuo Xu, Longbing Cao and Wentao Zhao. Selective Value Coupling Learning for Detecting Outliers in High-Dimensional Categorical Data. CIKM2017.
- Jia Xu, Wei Wei, Longbing Cao. Copula-Based High Dimensional Cross-Market Dependence Modeling, DSAA2017, 734-743
- Liang Hu, Longbing Cao, Jian Cao, Zhipeng Gu, Guandong Xu, Jie Wang. Improving the Quality of Recommendations for Users and Items in the Tail of Distribution. ACM Trans. Info Sys., 2017.
- Herath, S., Harandi, M., & Porikli, F. (2017). Going deeper into action recognition: A survey. Image and vision computing, 60, 4-21.
- Hussein, A., Gaber, M. M., Elyan, E., & Jayne, C. (2017). Imitation learning: A survey of learning methods. ACM Computing Surveys (CSUR), 50(2), 21.
- Longbing Cao, Xiangjun Dong and Zhigang Zheng. e-NSP: Efficient Negative Sequential Pattern Mining. Artificial Intelligence, 235: 156-182, http://dx.doi.org/10.1016/j.artint.2016.03.001, 2016.
- Guansong Pang, Longbing Cao, Ling Chen. Outlier Detection in Complex Categorical Data by Modelling the Feature Value Couplings. IJCAI2016.
- Guansong Pang, Longbing Cao, Ling Chen, Huan Liu. Unsupervised Feature Selection for Outlier Detection by Modelling Hierarchical Value-Feature Couplings. ICDM2016.
- Zhigang Zheng, Wei Wei, Chunming Liu, Wei Cao, Longbing Cao, Maninder Bhatia. An effective contrast sequential pattern mining approach to taxpayer behavior analysis, World Wide Web 19(4): 633-651 (2016).
- Bin Shen, Longbing Cao, Min Yao, Yunjun Gao. Mining preferred navigation patterns by consolidating both selection and time preferences, World Wide Web 19(5): 979-1007 (2016).
- Philippe Fournier-Viger, Cheng-Wei Wu, Vincent S. Tseng, Longbing Cao, Roger Nkambou. Mining Partially-Ordered Sequential Rules Common to Multiple Sequences, IEEE Trans. Knowledge and Data Engineering, 27(8): 2203-2216 (2015).
- Longbing Cao. Coupling Learning of Complex Interactions, Information Processing and Management, 51(2): 167-186 (2015).
- Jingyu Shao, Junfu Yin, Wei Liu, Longbing Cao. Mining actionable combined patterns of high utility and frequency. DSAA 2015: 1-10.
- Can Wang, Longbing Cao, Chi-Hung Chi. Formalization and Verification of Group Behavior Interactions. IEEE T. Systems, Man, and Cybernetics: Systems 45(8): 1109-1124 (2015).
- Longbing Cao, Philip S. Yu, Vipin Kumar. Nonoccurring Behavior Analytics: A New Area. IEEE Intelligent Systems 30(6): 4-11 (2015).
- Wei Cao, Longbing Cao. Financial Crisis Forecasting via Coupled Market State Analysis, IEEE Intelligent Systems, 30(2): 18-25 (2015).
- Can Wang, Dong, Xiangjun; Zhou, Fei; Longbing Cao, Chi, Chi-Hung. Coupled Attribute Similarity Learning on Categorical Data, IEEE Transactions on Neural Networks and Learning Systems, 26(4): 781-797 (2015).
- Longbing Cao. Behavior Informatics: A New Perspective. IEEE Intelligent Systems (Trends and Controversies), 29(4): 62-80, 2014.
- Longbing Cao and Thorsten Joachims. Behavior Computing, IEEE Intelligent Systems, 29(4): 62-66, 2014.
- Longbing Cao, Yu, Philip S; Motoda, Hiroshi; Williams, Graham. Special issue on behavior computing (editorial), Knowledge and Information Systems, 37(2): 245-249, 2013.
- Longbing Cao. Combined Mining: Analyzing Object and Pattern Relations for Discovering and Constructing Complex but Actionable Patterns, WIREs Data Mining and Knowledge Discovery, 3(2): 140-155, 2013.
- Can Wang, Zhong She, Longbing Cao. Coupled Attribute Analysis on Numerical Data, IJCAI 2013.
- Can Wang, Zhong She, Longbing Cao. Coupled Clustering Ensemble: Incorporating Coupling Relationships Both between Base Clusterings and Objects, ICDE 2013.
- Jinjiu Li, Can Wang, Longbing Cao, Philip S. Yu. Efficient Selection of Globally Optimal Rules on Large Imbalanced Data Based on Rule Coverage Relationship Analysis, SDM 2013.
- Fangfang Li, Guandong Xu, Longbing Cao, Zhendong Niu. Coupled Group-based Matrix Factorization for Recommender System, WISE 2013.
- Xin Cheng, Duoqian Miao, Can Wang, Longbing Cao. Coupled Term-Term Relation Analysis for Document Clustering, IJCNN2013.
- Wei Cao, Longbing Cao, Yin Song. Coupled Market Behavior Based Financial Crisis Detection, IJCNN2013.
- Junfu Yin, Zhigang Zheng, Longbing Cao, Yin Song, Wei Wei. Efficiently Mining Top-K High Utility Sequential Patterns, ICDM2013: 1259-1264.
- Junfu Yin, Zhigang Zheng, Longbing Cao. USpan: An Efficient Algorithm for Mining High Utility Sequential Patterns, KDD 2012, 660-668.
- Cao, L., Ou, Y., Yu, P.S. Coupled Behavior Analysis with Applications, IEEE Transactions on Knowledge and Data Engineering, 24 (8): 1378-1392, 2012.
- Song, Y., Cao, L. et al. Coupled Behavior Analysis for Capturing Coupling Relationships in Group-based Market Manipulations. KDD 2012, 976-984, 2012.
- Yin Song and Longbing Cao. Graph-based Coupled Behavior Analysis: A Case Study on Detecting Collaborative Manipulations in Stock Markets, IJCNN 2012, 1-8, 2012.
- Wei Wei, Jinjiu Li, Longbing Cao, Yuming Ou, Jiahang Chen, Effective Detection of Sophisticated Online Banking Fraud in Extremely Imbalanced Data, World Wide Web Journal, 1-27, 2012.
- Longbing Cao, Huaifeng Zhang, Yanchang Zhao, Dan Luo, Chengqi Zhang. Combined Mining: Discovering Informative Knowledge in Complex Data, IEEE Trans. SMC Part B.
- Can Wang, Mingchun Wang, Zhong She, Longbing Cao. CD: A Coupled Discretization Algorithm, PAKDD 2012.
- Can Wang, Longbing Cao, Minchun Wang, Jinjiu Li, Wei Wei, Yuming Ou. Coupled Nominal Similarity in Unsupervised Learning, CIKM 2011.
- Yanshan Xiao, Bo Liu, Jie Yin, Longbing Cao, Chengqi Zhang. Similarity-Based Approach for Positive and Unlabeled Learning, IJCAI 2011, 1577-1582.
- Wang, C., Cao, L. Formalization and Verification of Group Behavior Interactions, UTS/AAI Technical Report, 2011.
- Wang, H., Kläser, A., Schmid, C., & Liu, C. L. (2011). Action recognition by dense trajectories. In CVPR (pp. 3169-3176).
- Longbing Cao, Yuming Ou, Philip S YU, Gang Wei. Detecting Abnormal Coupled Sequences and Sequence Changes in Group-based Manipulative Trading Behaviors, KDD 2010, 85-94.
- Longbing Cao. In-depth Behavior Understanding and Use: the Behavior Informatics Approach, Information Sciences, 180(17): 3067-3085, 2010.
- Huaifeng Zhang, Yanchang Zhao, Longbing Cao, Chengqi Zhang and Hans Bohlscheid. Customer Activity Sequence Classification for Debt Prevention in Social Security, J. Comput. Sci. & Technol., 24(6): 1000-1009, 2009.
- Zhigang Zheng, Yanchang Zhao, Ziye Zuo, Longbing Cao, Huaifeng Zhang, Chengqi Zhang. An Efficient GA-Based Algorithm for Mining Negative Sequential Patterns, PAKDD 2010, 262-273.
- Yanchang Zhao, Huaifeng Zhang, Longbing Cao, Jian Pei, Shanshan Wu, Chengqi Zhang and Hans Bohlscheid, Debt Detection in Social Security by Sequence Classification Using Both Positive and Negative Patterns, ECML-PKDD 2009, 648-663, 2009.
- Yeffet, L., & Wolf, L. (2009). Local trinary patterns for human action recognition. In ICCV (pp. 492-497).
- Yanchang Zhao, Huaifeng Zhang, Longbing Cao, Chengqi Zhang and Hans Bohlscheid. Mining Both Positive and Negative Impact-Oriented Sequential Rules From Transactional Data, PAKDD 2009, 656-663.
- Yanchang Zhao, Huaifeng Zhang, Longbing Cao, and Chengqi Zhang, Efficient Mining of Event-Targeted Negative Sequential Rules, WI 2008, 336-342.
- Longbing Cao, Yanchang Zhao, Chengqi Zhang. Mining Impact-Targeted Activity Patterns in Imbalanced Data, IEEE Trans. on Knowledge and Data Engineering, 20(8): 1053-1066, 2008.
- Longbing Cao and Ou, Yuming. Market Microstructure Patterns Powering Trading and Surveillance Agents, Journal of Universal Computer Science, 14(14): 2288-2308, 2008.
- Longbing Cao, Yanchang Zhao, Chengqi Zhang, Huaifeng Zhang. Activity Mining: from Activities to Actions, International Journal of Information Technology & Decision Making, 7(2): 259-273, 2008.
- Laptev, I., Marszalek, M., Schmid, C., & Rozenfeld, B. (2008). Learning realistic human actions from movies. In CVPR (pp. 1-8).