Introduction
Social security analytics, or social security data mining (SSDM), refers to the application of data mining, knowledge discovery, and machine learning to social security and social welfare objectives, problems, and data. SSDM aims to explore challenging issues in social welfare systems, policies, operations, compliance, and practice, and to deliver actionable systems, tools, and evidence that strengthen government service objectives, improve service quality and policy making, enhance the efficiency and effectiveness of social security and social welfare policies and processes, and detect and prevent overpayments, fraud, and noncompliance.
Social welfare is a core function of citizen-centered and human-centered governments and organizations. Every year, a significant proportion of government budgets and taxpayers' money is committed to supporting what is generally over one-third of a nation's population. The quality of social security services, policies, and operations deeply affects the lives of many ordinary people. In modern history, several regional and global financial crises have been related to or induced by social welfare disasters.
Research Topics
The research topics include but are not limited to the following areas:
- Earnings analysis: reviewing, estimating and predicting clients' income, and detecting undeclared earnings in order to predict accurate payment entitlements;
- Overpayment detection: detecting and predicting overpayments to clients and their causes, recommending tailored repayment arrangements, etc.;
- Debt recovery: analyzing debt profiles and the factors driving debt formation, detecting and predicting debt occurrences, and recommending strategies and actions to prevent debts and intervene in them;
- Fraud detection: detecting and predicting internal staff-related fraud and client fraud, analyzing the factors causing fraud, and recommending actions and strategies to prevent fraud and intervene in it;
- Client risk modeling: modeling and predicting client risk of under-declaration of income, non-declaration of changes in circumstances, manipulation of entitlement and eligibility, etc.;
- Customer interaction analysis: modeling and optimizing sequential interactions and communications between clients and staff over various channels, including call centres, over-the-counter service, electronic communications, letters, etc.;
- Service quality evaluation: quantifying and evaluating the quality of social welfare services through different channels, recommending strategies to improve services and service objectives;
- Policy evaluation: quantifying and evaluating the effect of new policies, policy changes, and existing policies, recommending strategies to improve policies or policy effect.
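As a concrete illustration of the earnings analysis topic above, the following is a minimal sketch of flagging possible undeclared earnings by comparing the income a client declared with income reported by an external source. It is not the method used in our projects. The record schema (`EarningsRecord`), the function `flag_undeclared_earnings`, the `tolerance` threshold, and the sample figures are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EarningsRecord:
    """One client's income for a single payment period (hypothetical schema)."""
    client_id: str
    declared_income: float  # income the client reported to the agency
    reported_income: float  # income from an external source, e.g. an employer


def flag_undeclared_earnings(records, tolerance=50.0):
    """Return (client_id, gap) pairs where externally reported income exceeds
    declared income by more than `tolerance`, largest gap first.

    The tolerance value is an assumption for illustration, not a real policy rule.
    """
    flagged = []
    for rec in records:
        gap = rec.reported_income - rec.declared_income
        if gap > tolerance:
            flagged.append((rec.client_id, gap))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up sample data for three clients.
records = [
    EarningsRecord("A", declared_income=400.0, reported_income=420.0),
    EarningsRecord("B", declared_income=300.0, reported_income=900.0),
    EarningsRecord("C", declared_income=500.0, reported_income=650.0),
]
flagged = flag_undeclared_earnings(records)
print(flagged)
```

A fixed threshold like this is only a starting point. In practice, a flagged gap would more plausibly serve as one input feature to a predictive model than as a decision in itself.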
The following diagram summarizes some of the research topics and areas for SSDM.
Our Experience and References
Since 2004, we’ve led and investigated many projects on social security analytics funded by the Australian Department of Human Services, Centrelink, and the Australian Research Council. Some of our projects addressed billions of dollars of overpayments, leading to millions of dollars in savings. Algorithms and systems were delivered to the government.
[1] Longbing Cao. Social Security and Social Welfare Data Mining: An Overview, IEEE Trans. SMC Part C, 42(6): 837-853 (2012).
[2] Longbing Cao, Yanchang Zhao, Chengqi Zhang. Mining Impact-Targeted Activity Patterns in Imbalanced Data, IEEE Trans. on Knowledge and Data Engineering, 20(8): 1053-1066, 2008.
[3] Yanchang Zhao, Huaifeng Zhang, Shanshan Wu, Jian Pei, Longbing Cao, Chengqi Zhang and Hans Bohlscheid. Debt Detection in Social Security by Sequence Classification Using Both Positive and Negative Patterns, ECML/PKDD2009, 648-663, 2009.
[4] Huaifeng Zhang, Yanchang Zhao, Longbing Cao, Chengqi Zhang and Hans Bohlscheid. Rare Class Association Rule Mining with Multiple Imbalanced Attributes, Rare Association Rule Mining and Knowledge Discovery: Technologies for Infrequent and Critical Event Detection, Information Science Reference, 2009.
[5] Yanchang Zhao, Huaifeng Zhang, Longbing Cao, Hans Bohlscheid, Yuming Ou, Chengqi Zhang. Data Mining Applications in Social Security, Data Mining for Business Applications, Springer, pp. 81-96, 2009.
[6] Huaifeng Zhang, Yanchang Zhao, Longbing Cao, Chengqi Zhang. Class Association Rule Mining with Multiple Imbalanced Attributes, AI2007, LNCS4830, 827-831, 2007.
[7] Yanchang Zhao, Huaifeng Zhang, Longbing Cao, Chengqi Zhang and Hans Bohlscheid. Mining Both Positive and Negative Impact-Oriented Sequential Rules From Transactional Data, PAKDD2009, pp. 656-663.
[8] Customer Activity Sequence Classification for Debt Prevention in Social Security, Journal of Computer Science and Technology, 24(6): 1000-1009 (2009).
[9] Yanchang Zhao, Huaifeng Zhang, Longbing Cao, Chengqi Zhang and Hans Bohlscheid. Efficient Mining of Event-Oriented Negative Sequential Rules, WI 08, pp. 336-342.
[10] Yanchang Zhao, Huaifeng Zhang, Longbing Cao, Chengqi Zhang. Combined Pattern Mining: from Learned Rules to Actionable Knowledge, LNCS 5360/2008, 393-403, 2008.
[11] Huaifeng Zhang, Yanchang Zhao, Longbing Cao and Chengqi Zhang. Combined Association Rule Mining, PAKDD2008.
[12] Yanchang Zhao, Longbing Cao, Yvonne Morrow, Yuming Ou, Jiarui Ni, and Chengqi Zhang. Discovering Debtor Patterns of Centrelink Customers. AusDM2006.
Please also refer to the social security data mining website for more information about the relevant concepts, projects, research activities, and publications.