Below, we list some of the terms selected from the book: Domain Driven Data Mining.
Actionability measures the ability of a pattern to prompt a user to take concrete
actions to his/her advantage in the real world. The pattern satisfies both technical
and business performance needs from both objective and subjective perspectives.
It particularly measures the ability to suggest business decision-making actions.
Actionable knowledge discovery is an iterative optimization process toward the
actionable pattern set, considering the surrounding business environment and problem
states. It is a closed-loop, iterative refinement process: multiple rounds of feedback,
iteration and refinement are involved in understanding the data, the resources, the
roles and utilization of relevant intelligence, the presentation of patterns, the delivery
specification, and knowledge validation.
Actionable knowledge delivery aims to deliver knowledge that is of solid foundation,
business-friendly, and can be taken over by business people for decision making
seamlessly. During the process and iterations of actionable knowledge
discovery, understanding and deliverables are progressively improved and enhanced
toward the final deliverables satisfying user and business needs and supporting direct
decision-making actions. Its main objective is to enhance the actionability
of identified patterns for smart problem-solving.
Actionable pattern satisfies both technical and business interestingness needs, is
business-friendly and understandable, reflects user preferences and business needs,
and can be seamlessly taken over by business people for taking decision-making actions.
Actionable patterns can support business problem-solving by taking actions
recommended by the pattern, and correspondingly transform the problem status
from an initially non-optimal status to a greatly improved one.
Agent-driven data mining (ADDM) refers to the contributions made by multiagents
for enhancing data mining tasks. ADDM can contribute to the problem solving
of many data mining issues, e.g., agent-based data mining infrastructure and architecture,
agent-based interactive mining, agent-based user interaction, automated
pattern mining, agent-based distributed data mining, multi-agent dynamic mining,
multi-agent mobility mining, agent-based multiple data source mining, agent-based
peer-to-peer data mining, and multi-agent web mining.
Agent mining, namely agents and data mining interaction and integration, is a new
research area that draws on the respective strengths of both agents and data mining to
handle either critical challenges facing an individual party or mutual issues. Agent mining
studies the methodologies, principles, techniques and applications of the integration
and interaction between agents and data mining, as well as the community that
focuses on the study of agent mining. The interaction and integration between agents
and data mining are comprehensive, multiple dimensional, and inter-disciplinary.
Business interestingness Business interestingness of a pattern is determined from
domain-oriented personal, social, economic, user preference and/or psychoanalytic
aspects. It consists of both subjective and objective aspects.
Closed-loop mining The discovery of patterns proceeds through a process with closed-loop
feedback and iterations. Actionable knowledge discovery in a constraint-based
context is more likely to be a closed-loop rather than an open process. A closed-loop
process indicates that the outputs of data mining are fed back to change relevant
parameters or factors in particular stages. The feedback and change effect may be
embodied through analyzing and adjusting the relationships between outputs and
particular parameters and factors, and eventually tuning the parameters and factors
accordingly.
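The feedback idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "mining" step is a simple frequent-item count, the fed-back output is the number of patterns found, and the tuned parameter is an absolute support count. The function names, the target range and the step size of one are hypothetical choices, not taken from the book.

```python
from collections import Counter

def mine_frequent_items(transactions, min_count):
    """Return the items that occur in at least min_count transactions."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item for item, c in counts.items() if c >= min_count}

def closed_loop_mining(transactions, min_count, target=(2, 3), max_rounds=20):
    """Feed the size of the mined pattern set back into the support threshold.

    target is a hypothetical (low, high) range for the number of patterns
    that is considered useful; it stands in for any output-based criterion.
    """
    low, high = target
    patterns = set()
    for _ in range(max_rounds):
        patterns = mine_frequent_items(transactions, min_count)
        if len(patterns) < low and min_count > 1:
            min_count -= 1      # too few patterns: relax the threshold
        elif len(patterns) > high:
            min_count += 1      # too many patterns: tighten the threshold
        else:
            break               # output is acceptable, or cannot be relaxed further
    return patterns, min_count

transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["a", "d"],
]
patterns, final_min_count = closed_loop_mining(transactions, min_count=4)
```

Starting from a threshold of 4, the loop relaxes the parameter twice before the output falls inside the target range. In a real setting the feedback signal would more likely come from business evaluation of the patterns than from a simple count.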
Cluster pattern consists of more than two patterns that are correlated with each other in terms of
a pattern merging method G into a cluster. Atomic patterns are combined in terms of
certain relationships from the structural (for instance, Peer-to-Peer relation, Master-Slave
relation) or timeframe (for example, Independent relation, Concurrent relation,
Sequential relation, or Hybrid relation) perspectives.
Constraint refers to conditions applied on or involved in the process of actionable
knowledge discovery and delivery, including domain constraints, data constraints,
interestingness constraints, and deliverable constraints.
Contrast pattern results from the mining process in which one considers the mining
of patterns/models that contrast two or more datasets, classes, conditions, time
periods, and so forth. It captures the situations or contexts (the conditional contrast
bases) where small changes in patterns to the base make big differences in matching
datasets.
Coupled sequence refers to multiple sequences of itemsets, which are coupled
with each other in terms of certain relationships. An example is the trade sequence,
buy sequence and sell sequence in stock markets, in which they are coupled in terms
of trading mechanisms, trading rules and investment intention etc.
Combined association rule consists of association rules identified from multiple
datasets, which are combined into one combined pattern in terms of a certain relationship.
Combined association cluster is a set of combined association rules based on a
combined rule pair, where the rules in the cluster share the same underlying pattern
but have different additional pattern increments on the left side.
Combined association pair consists of a pair of association rules.
Combined mining is a two to multi-step data mining and post-analysis procedure,
consisting of mining atomic patterns, merging atomic pattern sets into a combined
pattern set, or merging dataset-specific combined patterns into a higher-level
combined pattern set. It directly analyzes complex data from multiple sources or
with heterogeneous features such as covering demographics, behavior and business
impacts. The aim of combined mining is to identify more informative knowledge
that can provide an informative and comprehensive presentation of problem solutions.
The deliverables of combined mining are combined patterns.
Combined pattern consists of multiple components, a pair or cluster of atomic
patterns, identified in individual sources or based on individual methods. As a result
of combined mining, the delivery of combined patterns presents an in-depth and
more comprehensive indication for taking decision-making actions, which makes the
patterns more informative and actionable than patterns composed of a single aspect
only, or identified by a single method.
Data constraint refers to constraints on particular data, which may be embodied in terms of aspects
such as very large volume, ill-structure, multimedia, diversity, high dimensionality,
high frequency and density, distribution and privacy, dynamics and changes.
Data intelligence reveals interesting stories and/or indicators hidden in data about
a business problem. The intelligence of data emerges in the form of interesting
patterns and actionable knowledge. It consists of multiple levels of data intelligence,
namely explicit intelligence, implicit intelligence, syntactic intelligence, and semantic
intelligence.
Decremental cluster pattern also called decremental pattern cluster, is a special
cluster of combined patterns, within which a former atomic pattern has an additional
pattern increment compared to its next adjacent constituent pattern.
Decremental pair pattern also called decremental pattern pair, is a pair of combined
patterns which are paired in terms of a certain relationship, within which the
first atomic pattern has a pattern increment part compared to the second constituent.
Deliverable constraint refers to conditions on deliverables: business rules,
processes, information flow, presentation, etc. may need to be integrated into the
domain environment. For instance, learned patterns can be converted into operationalizable
business rules for business people's use.
Derivative pattern is a derived pattern on top of an underlying pattern, namely by
appending additional pattern components on to the base pattern. When it is applied
to the impact-oriented pattern, the extension leads to the difference between the
outcomes of the constituent patterns. The derivative relationship can be unordered
or ordered.
Discriminative pattern, or discriminating pattern, refers to a pattern that draws
distinctions from other candidates, usually in terms of differences in class,
category, significance, or impact. Its opposite form is often called
indiscriminative pattern.
Domain constraint includes the domain and characteristics of a problem, domain
terminology, specific business process, policies and regulations, particular user profiling
and favorite deliverables.
Domain driven data mining, also called Domain Driven Actionable Knowledge
Delivery, builds on top of the traditional data-centered pattern mining framework and refers
to the set of methodologies, frameworks, approaches, techniques, tools and systems
that involve human, domain, organizational and social, and network and web factors
in the environment, for the discovery and delivery of actionable knowledge.
Domain factor consists of the involvement of domain knowledge and experts, the
consideration of constraints, and the development of in-depth patterns, which are
essential for filtering subtle concerns while capturing incisive issues.
Domain intelligence refers to the intelligence that emerges from the involvement
of domain factors and resources in pattern mining, which wrap not only a problem
but its target data and environment. Domain intelligence is embodied through
its involvement in the KDD process, modeling and systems. It consists of qualitative
and quantitative domain intelligence.
Dynamic chart is a pattern presentation method, which presents the dynamics of
sequential patterns, activity interaction, and impact change, and the formation of
associated pairs and clusters in terms of pattern interestingness.
Emerging patterns are sets of items whose frequency changes significantly from
one dataset to another. They describe significant changes (differences or trends) between
two classes of data.
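A common way to quantify "changes significantly" is the growth rate, that is, the ratio of an itemset's support in one dataset to its support in the other. The sketch below assumes that measure; the threshold value and the function names are illustrative, and the toy datasets are made up for the example.

```python
def support(itemset, dataset):
    """Fraction of transactions in dataset that contain every item of itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= set(t) for t in dataset) / len(dataset)

def growth_rate(itemset, d1, d2):
    """Support in d2 divided by support in d1 (infinite if absent from d1)."""
    s1, s2 = support(itemset, d1), support(itemset, d2)
    if s1 == 0:
        return float("inf") if s2 > 0 else 0.0
    return s2 / s1

def is_emerging(itemset, d1, d2, min_growth=2.0):
    """Assumed criterion: the growth rate from d1 to d2 reaches min_growth."""
    return growth_rate(itemset, d1, d2) >= min_growth

# Two toy datasets, for instance two classes or two time periods.
d1 = [{"a", "b"}, {"a"}, {"b"}, {"c"}]
d2 = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]

rate_ab = growth_rate({"a", "b"}, d1, d2)   # support rises from 0.25 to 0.75
```

Here the itemset {a, b} is three times as frequent in the second dataset, so it would count as emerging under any threshold up to 3.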
General pattern refers to the pattern mined based on technical significance associated
with the algorithm used.
Human intelligence refers to (1) explicit or direct involvement of human knowledge
or a human as a problem-solving constituent, etc., and (2) implicit or indirect
involvement of human knowledge or a human as a system component.
Impact-oriented pattern An impact-oriented pattern consists of two components,
namely the left-hand itemsets and the right-hand target impact associated with the
left-hand itemsets. It means that the occurrence of the left-hand itemsets likely results
in the impact defined on the right-hand side.
Impact-reversed pattern An impact-reversed pattern consists of an underlying activity
pattern and a derivative pattern with an incremental component. In the reversal from one
pattern's impact (T1) to the other's (T2), the extra itemset plays an important
role.
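As a rough illustration of the two entries above, the following Python sketch represents an impact-oriented pattern as a left-hand itemset plus a target impact, and then searches for impact reversals. It assumes patterns are plain itemsets (the book also discusses activity sequences), and the class and function names are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

class ImpactPattern(NamedTuple):
    items: FrozenSet[str]   # left-hand itemset
    impact: str             # right-hand target impact, e.g. "T1"

def find_impact_reversals(
    patterns: List[ImpactPattern],
) -> List[Tuple[ImpactPattern, ImpactPattern, FrozenSet[str]]]:
    """Return (underlying, derivative, increment) triples whose impacts differ.

    A derivative pattern is taken to be one whose itemset strictly
    contains the underlying pattern's itemset.
    """
    reversals = []
    for base in patterns:
        for derived in patterns:
            if base.items < derived.items and base.impact != derived.impact:
                increment = derived.items - base.items
                reversals.append((base, derived, increment))
    return reversals

p1 = ImpactPattern(frozenset({"A", "B"}), "T1")
p2 = ImpactPattern(frozenset({"A", "B", "C"}), "T2")
p3 = ImpactPattern(frozenset({"A", "D"}), "T1")

reversals = find_impact_reversals([p1, p2, p3])
```

In this toy input, appending the extra item C to {A, B} changes the impact from T1 to T2, so C is reported as the increment responsible for the reversal.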
Incremental cluster pattern also called incremental pattern cluster, is a cluster of
combined patterns coupled in terms of certain relationships, within which additional
pattern increments are appended to every previously adjacent constituent pattern.
Incremental pair pattern also called incremental pattern pair, is a pair of combined
patterns which are paired in terms of a certain relationship, within which the
second atomic pattern has an additional pattern increment part compared to the first
constituent. An example is a contrast pattern consisting of an underlying pattern and
a derivative pattern.
In-depth pattern also called deep pattern, uncovers not only appearance dynamics
and rules but also inside driving forces, reflects not only technical concerns but also
business expectations, and discloses not only generic knowledge but also something
that can support straightforward decision-making actions. It is a pattern actionable
in the business world. In-depth pattern is either filtered and summarized in terms of
business expectations on top of general pattern(s), or itself discloses deep data intelligence.
In-depth pattern mining discovers more interesting and actionable patterns
from a domain-specific perspective.
Interestingness measures the significance of a pattern learned on a dataset through
a certain method. The pattern interestingness is specified in terms of technical interestingness
and business interestingness, from both objective and subjective perspectives.
Interestingness constraint determines what makes a rule, pattern or finding
more interesting than another.
Intelligence meta-synthesis involves, synthesizes and uses ubiquitous intelligence
surrounding actionable knowledge discovery and delivery in complex data and environment.
Knowledge actionability Given a pattern P, its actionable capability is described
as the degree to which it can satisfy both technical interestingness and business
interestingness. If both technical and business interestingness, or a hybrid interestingness
measure integrating both aspects, are satisfied, P is called an actionable pattern.
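One simple reading of this definition is a pair of thresholds, one per kind of interestingness. The sketch below assumes both measures are already available as scores on a common 0 to 1 scale; the scores, thresholds and names are hypothetical, and a hybrid measure would replace the two comparisons with a single one.

```python
from dataclasses import dataclass

@dataclass
class Pattern:
    name: str
    technical: float   # technical interestingness score (assumed 0..1 scale)
    business: float    # business interestingness score (assumed 0..1 scale)

def is_actionable(p: Pattern, tech_min: float, biz_min: float) -> bool:
    """A pattern is treated as actionable only if it meets both thresholds."""
    return p.technical >= tech_min and p.business >= biz_min

candidates = [
    Pattern("p1", technical=0.8, business=0.7),   # satisfies both
    Pattern("p2", technical=0.9, business=0.2),   # technically strong only
    Pattern("p3", technical=0.3, business=0.9),   # business-relevant only
]

actionable = [p for p in candidates if is_actionable(p, tech_min=0.5, biz_min=0.5)]
```

Only the first candidate passes: a pattern that is statistically strong but of little business value, or the reverse, is filtered out.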
Market microstructure data refers to the data acquired in capital markets, which
is produced in terms of the theory of market microstructure and trading rules. Market
microstructure data presents special data complexities, such as high frequency, high
density, massive quantity, data stream, time series, multiple coupled sequences, etc.
Market microstructure pattern refers to the pattern learned on market microstructure
data.
Multi-feature combined mining is a kind of combined mining which learns patterns
by involving multiple feature sets, usually heterogeneous. For instance, a combined pattern
may consist of demographic features, business policy-related features,
and customer behavioral data.
Multi-method combined mining is a kind of combined mining which learns by involving
multiple data mining methods. It consists of serial multi-method combined
mining, parallel multi-method combined mining, and closed-loop multi-method
combined mining.
Multi-source combined mining is a kind of combined mining which learns patterns
by involving multiple data sets, usually distributed and heterogeneous.
Network intelligence refers to the intelligence that emerges from both web and
broad-based network information, facilities, services and processing surrounding a
data mining problem and system. It involves both web intelligence and broad-based
network intelligence.
Objective business interestingness measures to what extent the findings satisfy
business needs and user preferences based on the objective criteria.
Objective technical interestingness is embodied by measures capturing the complexities
of a pattern and its statistical significance. It could be a set of criteria.
Organizational factor refers to many aspects existing in an organization, such as
organizational goals, actors, roles, structures, behavior, evolution, dynamics, interaction,
process, organizational/business regulation and convention, workflow and
actors surrounding a real-world data mining problem.
Organizational intelligence refers to the intelligence that emerges from involving
organization-oriented factors and resources into pattern mining. The organizational
intelligence is embodied through its involvement in the KDD process, modeling and
systems.
Pair pattern consists of two atomic patterns that are correlated with each other in
terms of a pattern merging method into a pair.
Pattern summarization is a process of data mining, which summarizes learned
patterns into higher-level patterns.
Pattern merging is a process of data mining, which merges multiple relevant patterns
into one or a set of combined patterns. For instance, local patterns from corresponding
data miners are merged into global pattern sets, atomic pattern
sets are merged into a combined pattern set, or dataset-specific combined patterns are merged into
a higher-level combined pattern set.
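The first case, merging local patterns from several data miners into a global set, might look like the sketch below. The merging rule used here (keep a pattern if enough miners report it) is an assumption chosen for simplicity; the book does not prescribe it, and the names are hypothetical.

```python
from collections import Counter

def merge_local_patterns(local_sets, min_sources=1):
    """Merge per-miner pattern sets into one global set.

    A pattern is kept if at least min_sources miners found it
    (an assumed rule; with min_sources=1 this is a plain union).
    """
    votes = Counter()
    for patterns in local_sets:
        votes.update(set(patterns))   # each miner votes once per pattern
    return {p for p, n in votes.items() if n >= min_sources}

ab, c, d = frozenset({"a", "b"}), frozenset({"c"}), frozenset({"d"})
miner_1 = {ab, c}
miner_2 = {ab, d}
miner_3 = {ab, c}

global_patterns = merge_local_patterns([miner_1, miner_2, miner_3], min_sources=2)
```

With a minimum of two sources, the pattern found by only one miner is dropped and the other two are promoted to the global set.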
Pattern increment refers to the additional component on top of an underlying pattern
(a prefix or postfix) to form a derivative pattern, or an incremental pattern. For
instance, with an underlying pattern U, different pattern increments V1, V2, ..., Vn
may be added to U to form different derivative patterns (U, V1), (U, V2), ..., (U, Vn).
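The construction in this entry can be sketched as follows, treating patterns as ordered tuples of items so that the underlying pattern can sit either before or after the increment. The tuple representation, the function name and the `underlying_as` parameter are illustrative assumptions.

```python
def derive_patterns(underlying, increments, underlying_as="prefix"):
    """Combine one underlying pattern with each increment in turn.

    underlying_as="prefix"  places the underlying pattern first;
    underlying_as="postfix" places it last.
    """
    base = tuple(underlying)
    derived = []
    for inc in increments:
        inc = tuple(inc)
        if underlying_as == "prefix":
            derived.append(base + inc)
        elif underlying_as == "postfix":
            derived.append(inc + base)
        else:
            raise ValueError("underlying_as must be 'prefix' or 'postfix'")
    return derived

U = ("a", "b")
increments = [("c",), ("d", "e")]

as_prefix = derive_patterns(U, increments, underlying_as="prefix")
as_postfix = derive_patterns(U, increments, underlying_as="postfix")
```

Each increment yields one derivative pattern, so n increments on the same base give n derivative patterns that can later be compared, for example by their impacts.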
Pattern interaction refers to the process and protocol in which patterns interact
with each other to form certain new patterns. Cluster patterns and pair patterns may
result from pattern interaction. Many pattern interaction mechanisms can be created,
for instance, pattern clustering and classification of patterns.
Pattern impact refers to the business impact associated with a pattern or a set of
patterns. For instance, a frequent sequence is likely associated with the occurrence
of government debt; here, government debt is the impact.
Post analysis refers to techniques that are used to post-process learned patterns,
for instance, to prune rules, reduce redundancy, summarize learned rules, merge
patterns, match expected patterns by similarity difference, and extract actions
from learned rules.
Post mining refers to the pattern mining on learned patterns, or on learned patterns
combined with additional data. The main difference between post analysis and post
mining is whether another round of pattern mining process is conducted on the
learned pattern set or not.
Reverse pattern, also called impact-reversed pattern, is a pattern corresponding
to another pattern, which triggers the impact change from one to another, usually
opposite impact.
Social factor refers to aspects related to human social intelligence such as social
cognition, emotional intelligence, consensus construction, and group decision;
animat/agent-based social intelligence aspects such as swarm/collective intelligence
aspects, behavior/group dynamics aspects, as well as many common aspects such as
collective interaction, social behavior network, social interaction rules, protocols,
norms, trust and reputation, and privacy, risk, and security in a social context, etc.
Social intelligence refers to the intelligence that emerges from the group interactions,
behaviors and corresponding regulation surrounding a data mining problem.
Social intelligence covers both human social intelligence and animat/agent-based
social intelligence.
Subjective business interestingness measures business and user concerns from
the subjective perspectives such as psychoanalytic factors.
Subjective technical interestingness focuses on and is based on technical means, and
recognizes to what extent a pattern is of interest to a particular technical method.
Technical interestingness The technical interestingness of a pattern is highly dependent
on certain technical measures specified for a data mining method. Technical
interestingness is further measured in terms of objective technical measures and
subjective technical measures.
Ubiquitous intelligence refers to the emergence of intelligence from many related
aspects surrounding a data mining task, such as in-depth data intelligence, domain
intelligence, human intelligence, network and web intelligence, and/or organizational
and social intelligence. Real-world data mining applications often involve
multiple aspects of intelligence; a key task for actionable knowledge discovery and
delivery is to synthesize such ubiquitous intelligence. For this, methodologies, techniques
and tools for intelligence meta-synthesis in domain driven data mining are
necessary. The theory of M-computing, M-interaction and M-space provides a solution
for this purpose.
Underlying pattern, also called base pattern, is the base of a combined pattern, on top
of which new pattern(s) is (are) generated. An underlying pattern may be taken as
prefix or postfix of a derivative pattern.