TRANSCRIPT
Designing Information Fusion Processes to Exploit Human, Contextual, and Sensor Surveillance Data for Decision Support
James Llinas, Research Professor, Director (Emeritus)
Center for Multisource Information Fusion
State University of New York at Buffalo
Buffalo, New York, [email protected]
NATO ASI Prediction and Recognition of Piracy Efforts Using Collaborative Human-Centric Information Systems, Salamanca, Spain (19-30 September, 2011)
Data Fusion Node: a common representation for all Data Fusion processes
Data Fusion Tree: a common representation for all Data Fusion architectures
Integration of Data Fusion and Resource Management Trees: a common representation for all information system architectures
Fusion Node Paradigm
[Diagram: SOURCES or prior fusion nodes feed DATA ALIGNMENT, then DATA ASSOCIATION, then STATE ESTIMATION & PREDICTION, out to the USER or next fusion node; RESOURCE MGT CONTROLS and SOURCE/SENSOR STATUS complete the loop]
[Diagram: two example architectures built from the node paradigm. Legend: F = Fusion Node, M = Management Node. The first tree interleaves fusion (F) and management (M) nodes across Sensors 1-3 and Resources x and y; the second is a tree of fusion nodes spanning Sensors 1-5]
Framing Design Impacts: The Data Fusion Node as a Core Construct*
* Steinberg et al, 1999
Fusion Node Paradigm
[Diagram: SOURCES or prior fusion nodes feed DATA ALIGNMENT (COMMON REFERENCING), then DATA ASSOCIATION, then STATE ESTIMATION & PREDICTION, out to the USER or next fusion node; RESOURCE MGT CONTROLS and SOURCE/SENSOR STATUS complete the loop]
Human Role in Surveillance
• In many modern surveillance environments, humans can play an important role as Observer
– Quality of observation is essentially uncalibrated
• Errors, false alarms, and biases are not very well known or quantified
– Reporting is in either uncontrolled or possibly controlled language
• Must deal with Linguistic Uncertainty
• Humans provide not only "raw" observational information such as features, etc. (though they normally report at the Entity Level*), but can also provide judgments regarding:
– Relationships between entities
– Estimates of intangible states (emotions, intent)

* Affects and bounds choices in the point of fusion; e.g., Raw Data Fusion may be infeasible
The Soft Front-end Input
[Diagram: an unconstrained vocabulary (possibly in different languages) with its semantics passes through language processing and automated text extraction (computational linguistics, NLP) to produce semantic graphs; the typical atomic, raw data input (digitized) is the phrase-level entity. Perception and cognition are affected by stress and OpTempo]
One Immediate Issue: Linguistic Uncertainty*
Five components*:
1. Vagueness/vague predicates (multi-valued logics)
2. Context dependence (of (1) and other terms)
3. Ambiguity: multiple word meanings
4. Underspecificity: unbounded interpretation ("rainy days ahead")
5. Indeterminacy of theoretical terms, i.e., aging of terminology
And: coreference resolution, etc., in NLP; often NOT addressed

* Regan, H.M., et al., "A Taxonomy and Treatment of Uncertainty for Ecology and Conservation Biology," Ecological Applications, 12(2), 2002, pp. 618-628; Dwivedi, A., et al., "Handling Uncertainties: Using Probability Theory to Possibility Theory," Indian Institute of Technology Kanpur, India, 2006
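Vague predicates of this kind are often modeled possibilistically with membership functions. A minimal sketch (the trapezoidal shape and the "near" breakpoints are illustrative assumptions, not from the lecture):

```python
def trapezoid(x, a, b, c, d):
    """Possibility that x satisfies a vague predicate, modeled as a
    trapezoidal membership function: rising over [a, b], fully
    possible over [b, c], falling over [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# "Near" (in meters): fully possible out to 100 m, fading by 300 m.
near = lambda r: trapezoid(r, -1.0, 0.0, 100.0, 300.0)
```

A report of "near the checkpoint" then maps to a possibility distribution over range rather than a point value.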
Source Characterization: No Generalizable Calibration Models
[Diagram, soft-data chain: Real-World Truth is degraded by perceptual and cognitive errors in observation (ε1), error in oral expression (ε2), error in audio capture (ε3), error in audio-to-text conversion (ε4), and error in text extraction/conversion (ε5); the resulting Soft Data passes to Common Referencing and Data Association.
Hard-data chain: a calibration (truth) target yields Pd (observation parameters) with statistically qualified errors; the resulting Hard Data passes to Common Referencing and Data Association]
Idiosyncrasies of Human Location and Time Reporting

Method                        Location Estimate                               Time Estimate
Absolute Reference            Stated map coordinates                          Stated clock time
Relative to Observer          Relative to report location stamp               Relative to report time stamp
                              (e.g., "200 m west of my position"): (ρ, θ)     (e.g., "20 minutes ago")
Relative to Reference Entity  Relative to designated (possibly mobile)        Relative to designated event
                              object (e.g., "to the left of the bend in       (e.g., "5 minutes after x
                              the road", "50 m south of the APV column"):     entered the building")
                              (ρ, θ)

Time alignment is very tough; any message can have past, present, and future tenses mixed, creating complex "OOSM" (out-of-sequence measurement) problems.
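The observer-relative entries above can be resolved to absolute references when the report carries location and time stamps. A minimal sketch, assuming a flat local map frame and compass bearings (both assumptions):

```python
import math

def resolve_relative_position(observer_xy, rho, theta_deg):
    """Convert an observer-relative report (range rho, bearing theta)
    to absolute map coordinates, e.g. "200 m west of my position".
    Bearing is measured clockwise from north, as in a compass report."""
    x0, y0 = observer_xy
    theta = math.radians(theta_deg)
    return (x0 + rho * math.sin(theta), y0 + rho * math.cos(theta))

def resolve_relative_time(report_timestamp, offset_seconds):
    """Convert "20 minutes ago" to absolute time via the report stamp."""
    return report_timestamp - offset_seconds

# "200 m west of my position", observer at (1000, 500): bearing 270 deg.
pos = resolve_relative_position((1000.0, 500.0), 200.0, 270.0)
```

Reports relative to a mobile reference entity would additionally require an estimate of that entity's own state, which is itself uncertain.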
Human Data Input: Impacts to Information Fusion Process Design
• Detection processing: hidden, not accessible, unknown; Detection Fusion probably infeasible
• Preprocessing: requires a good, efficient text extraction capability; GIGO problem; investment decision re NLP
• Common Referencing: many complexities; e.g., mixed uncertainties, with linguistic data in possibilistic form and sensor data in probabilistic form; regarding time, any given observational report can contain all three tenses of language
– Major out-of-sequence data handling aspects
– Uncertainty normalization; language vagueness ~ possibilistic
• Data Association: also many complexities; both the strategy for scoring Semantic Similarity and the Assignment problem formulation have major issues
• State Estimation: methods to connect Soft data to estimation algorithms are under study
Guiding Principles for Uncertainty Transformations
• There is no single best way to execute these transformations; some transformational framework is needed that constrains the formalism of the transformation, such as the Principles of:
– Probability/Possibility Consistency
– Insufficient Reason
– Information Invariance
– Preference Preservation
– Symmetry Preservation
– Ignorance Preservation
• These principles provide a basis to "preserve" something across the transformation; each one provides a different approach
• Said otherwise, the result of a transformation from one representation to another is a type of "best estimate" of the alternate representation for a given value of the input form: an estimate consistent with, or framed by, the "Principle" applied
Impact to Common Referencing: Uncertainty Transformation Schemes Required

Fundamental Basis for Transformations: Preservation Principle

Example (Reflexive):

-Σ(i=1..n) p_i log2 p_i = N(π) + D(π)

• The expression on the left-hand side of this equation represents Shannon entropy, and the one on the right side represents the total possibilistic uncertainty: the sum of nonspecificity N(π) and discord D(π)
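One concrete instance is the ratio-scale probability-to-possibility transformation, which preserves ordering (preference preservation) and yields a normalized possibility distribution; the choice of this particular transformation here is illustrative, not the one prescribed by the slides:

```python
def prob_to_poss(p):
    """Ratio-scale probability-to-possibility transformation:
    with probabilities sorted in descending order, pi_i = p_i / p_max.
    The result is normalized (max possibility = 1) and preserves the
    preference ordering of the input probabilities."""
    p_sorted = sorted(p, reverse=True)
    pmax = p_sorted[0]
    return [pi / pmax for pi in p_sorted]

poss = prob_to_poss([0.2, 0.5, 0.3])
```

Other preservation principles (e.g., information invariance) lead to different transformations with different fidelity trade-offs.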
State Estimation: The Need for a Discovery-Oriented Approach in Irregular Warfare Domains
• Irregular Warfare domains, such as Counter-Insurgency and Piracy, have weak a priori deductive knowledge foundations
– This results from a lack of fundamental action models, only brief historical data, and other factors
• They require inferencing and state estimation to be learning- and discovery-based

Modeling Human Behavior: Modeling Strategies*
* Behavioral Modeling and Simulation: From Individuals to Societies, Greg L. Zacharias, Jean MacMillan, and Susan B. Van Hemel (Eds.), Committee on Organizational Modeling from Individuals to Societies, National Research Council, 2008

Human Behavior Modeling Classification
In spite of the broad range of analytical methods applied, success has been marginal*
Behavior and Relationship Discovery via Inexact Graph-Matching
[Diagram: streaming multiple messages pass through text extraction, common referencing, and data association into a batched and cumulative ontologically-enhanced data graph (in graphical forms); a set of query graphs depicting complex relationships of interest, derived from the Commander's Intelligence Requirements, drives query-based statistical relational learning over that graph]
Soft Fusion/Association Problem Framework
[Diagram: an unknown, dynamic, real-world COIN use case, with no a priori dynamic world model, produces streaming multi-message SOFT data from multiple "dismounts" reporting observational data. Automated NLP and text extraction yield minimally qualified observational data; intelligent message batching feeds message-batch data association and cumulative message-batch association over successive batch windows; contextual and ontological enhancement, via a domain ontology, produces cumulating observational, ontological, and contextual evidence]

Graphical Structure of a "message"
An [Observation-Context-Ontology] Evidential Element (not a point observation a la hard sensing)
[Diagram: observed and asserted data are linked to the domain ontology, with multiple relationships, disconnected semantic fragments, and synonyms]
What are the Associable Quanta of this Evidence? WORD, PHRASE, or SENTENCE?

Design of the Association Process
[Diagram: graphs from Human Observer 1 and Human Observer 2 enter a graph search: pick a node/arc and search the other graph for associable elements (e.g., exploit the ontology). Hypotheses are self-generated by node/arc content and scored via semantic similarity, with scores that account for uncertainty; effective semantic scoring has interdependencies with the text and semantic operations. Hypothesis evaluation is by high-dimensional assignment problem solution (apply a modern assignment problem solution), yielding a good assignment solution and graph merging. Output: a qualified message pair]
Abstraction Level Trade Space for Some Specific Techniques

Sentence-Level (Raga & Raga)
• Uses Random Indexing, a computational/geometric technique for computing the semantic closeness of two sentences given training data, and Instance Learning, supervised machine-learning algorithms that compare new problems with training data in memory
Advantages:
• Ability to train the algorithm easily with limited and truthed data
• Easy implementation and inexpensive
Disadvantages:
• High-dimensional spaces (high computational cost and complexity)
• Metrics may not be accurate for long sentences
• Contextual meaning of words may not be accurate
• Overall results show about 70% agreement with human judgment
• The notion of training data for the unexpected topics, etc., of COIN is problematical

Phrase-Level (Porzel et al.)
• Exploits a domain/problem-specific ontology; scores phrase similarity to all feasible ontological concepts via a Dijkstra shortest-path score; scores the cheapest via a separate algorithm
Advantages:
• Improves the ability to measure semantic closeness of recognized phrases to truth utterances
Disadvantages:
• An a priori mapping of the entire lexicon of the domain to the ontology of the domain must first be developed
• Domain extensibility unclear
• The notion of training data for the unexpected topics, etc., of COIN is problematical

Short-Text Segment (Metzler et al.)
Advantages:
• Somewhat easier to process computationally than phrase-level
• Hybrid approach using lexical, stemmed, and context-aided representations
Disadvantages:
• The lexical approach alone is weak (vocabulary mismatch problem)
• Precision-Recall performance not very good
• Best performance requires a complex hybrid representational and scoring approach

Word-Level (Various)
Advantages:
• Easy to calculate similarity metrics; inexpensive; computationally efficient
Disadvantages:
• Sacrifices some degree of semantic and linguistic coherence
Word-Based Approach: Extensive Literature Survey
Similarity Metric Performance Survey in a WordNet Application*

Method           Type                  Correlation (to Human Judgment)
Rada             Path Length           0.59
Wu               Path Length           0.74
Li               Path Length           0.82
Leacock          Path Length           0.82
Richardson       Path Length           0.63
Resnik           Information Content   0.79
Lin              Information Content   0.82
Lord             Information Content   0.79
Jiang & Conrath  Information Content   0.83

The Jiang-Conrath metric was chosen for the alpha version of the DA process.

* Varelas, G., Voutsakis, E., Raftopoulou, P., Petrakis, E., and Milios, E., "Semantic Similarity Methods in WordNet and their Application to Information Retrieval on the Web," 7th ACM International Workshop on Web Information and Data Management (WIDM 2005), Bremen, Germany (2005), pp. 10-16
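A Rada-style path-length metric can be sketched over a toy is-a taxonomy (the taxonomy and the 1/(1 + path length) scaling are illustrative assumptions; a real implementation would run over WordNet):

```python
from collections import deque

# Toy is-a taxonomy (hypothetical; stands in for WordNet).
TAXONOMY = {
    "vessel": ["boat", "ship"],
    "boat": ["sailboat", "skiff"],
    "ship": ["tanker", "freighter"],
}

def _adjacency(tax):
    """Build an undirected adjacency map over the is-a links."""
    adj = {}
    for parent, children in tax.items():
        for c in children:
            adj.setdefault(parent, set()).add(c)
            adj.setdefault(c, set()).add(parent)
    return adj

def path_similarity(a, b, tax=TAXONOMY):
    """Rada-style similarity: 1 / (1 + shortest path length) between
    two concepts in the hierarchy; 0.0 if no path exists."""
    if a == b:
        return 1.0
    adj = _adjacency(tax)
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, d = frontier.popleft()
        for nxt in adj.get(node, ()):
            if nxt == b:
                return 1.0 / (1 + d + 1)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return 0.0
```

Information-content metrics such as Jiang-Conrath additionally weight each link by corpus statistics, which is why they outperform raw path length in the survey above.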
Hypothesis Selection: Assignment Solution Aspects
• Formation of the Assignment matrix
– Formed by smart search, based on type
– Uncertainty effect on scoring: uncertainty-based scoring
• Challenges
– Size of the matrix
– Nodes with no association
– Non-square matrices
• The Hungarian Algorithm is used in the alpha version to determine the best matches between nodes and arcs
– Solves the problem in polynomial time; computational complexity is O(n³)
– Objective: maximize similarity scores between messages, given the scores are above a specified threshold
– Highly suited and efficient for dense problems

Bellur, U. and Kulkarni, R., "Improved Matchmaking Algorithm for Semantic Web Services Based on Bipartite Graph Matching," ICWS 2007, pp. 86-93
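The assignment step can be illustrated with a brute-force stand-in for the Hungarian algorithm (exhaustive search is exponential and only for illustration; the threshold gate mirrors the "scores above a specified threshold" objective):

```python
from itertools import permutations

def best_assignment(scores, threshold=0.0):
    """Exhaustive stand-in for the Hungarian algorithm: find the
    one-to-one assignment of rows (nodes/arcs of message 1) to columns
    (message 2) maximizing total similarity. Pairs scoring at or below
    `threshold` are left unassigned, handling "nodes with no
    association". Hungarian solves the same problem in O(n^3)."""
    n = len(scores)
    best, best_total = [], float("-inf")
    for perm in permutations(range(n)):
        pairs = [(i, j) for i, j in enumerate(perm)
                 if scores[i][j] > threshold]
        total = sum(scores[i][j] for i, j in pairs)
        if total > best_total:
            best_total, best = total, pairs
    return best, best_total

scores = [[0.9, 0.1, 0.0],
          [0.2, 0.8, 0.3],
          [0.0, 0.4, 0.7]]
assignment, total = best_assignment(scores, threshold=0.25)
```

Non-square matrices, noted as a challenge above, are typically handled by padding with dummy rows or columns before solving.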
Alternative Association Approach Problem Formulation
• Assume we have some measure of node-to-node and edge-to-edge similarity
• Can view this as two coupled assignment problems:
– Nodes to nodes
– Edges to edges
• Subject to only assigning an edge to an edge if the endpoints of the two edges agree
– Example: N1 -> E1 -> N2 and N3 -> E2 -> N4
– Can only assign E1 to E2 if also assigning (N1 to N3 and N2 to N4) OR (N1 to N4 and N2 to N3)
[Diagram: the two admissible orientations for mapping edge E1 = (N1, N2) onto edge E2 = (N3, N4)]

Technical Approach: Integer Program Formulation, with bipartite, edge, and transitivity constraints
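The endpoint-agreement rule above can be sketched as a feasibility check on a candidate edge assignment given a node assignment (names and data shapes are illustrative):

```python
def edge_assignment_consistent(node_map, e1, e2):
    """Endpoint-consistency constraint from the coupled formulation:
    edge e1 = (N1, N2) may be assigned to e2 = (N3, N4) only if the
    node assignment maps {N1, N2} onto {N3, N4} in either orientation,
    since (N1->N3, N2->N4) and (N1->N4, N2->N3) both agree."""
    a, b = e1
    c, d = e2
    return {node_map.get(a), node_map.get(b)} == {c, d}

node_map = {"N1": "N3", "N2": "N4"}
ok = edge_assignment_consistent(node_map, ("N1", "N2"), ("N3", "N4"))
bad = edge_assignment_consistent(node_map, ("N1", "N2"), ("N3", "N5"))
```

In the integer-program form this check becomes a linear transitivity constraint coupling the edge-assignment variables to the node-assignment variables.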
The Nature of Contextual Information
• Contextual information is information that surrounds the domain of primary interest and focus; such data can affect the generation of a comprehensive state estimate or influence the ability to correctly understand a focal state estimate
– A kind of constraint set that confines the formulation of an estimate and/or the understanding of an interpretation or estimate
• We acknowledge that we do not know the true state of the world with certainty, and therefore we can only have an estimate of Context
• Context itself may have uncertain components
• Consideration of Context must include degrees of fidelity, granularity, and precision, and contextual dynamics (e.g., weather, other)
Roles for Contextual Information
• Support the development of hybrid estimation algorithms that incorporate contextual influences within the estimation process: the "A Priori" case
• Support the understanding of estimates by examining consistency with contextual influences; augment consistent estimates accordingly, and flag inconsistent estimates: the "A Posteriori" case
• Support the augmentation of evidential (observational) data by enriching such data with relevant contextual information

General Notions Regarding Exploitation of Contextual Information in Fusion Processes
• The "A Priori" case: to the extent possible, exploit contextual information at algorithm design-time; this often requires formulation of a hybrid approach over the streaming observational data
• The "A Posteriori" case: for additional contextual information unable to be integrated into algorithm designs, exploit it after fused-estimate formation as a means for:
– Consistency checking of formed fusion estimates (with the additional contextual information)
– Enhancing understanding of the formed estimates/hypotheses
[Diagram, adaptive reasoning loop: external knowledge and contextual knowledge are filtered for relevance against the user's current situation estimate and the task at hand, then extracted and reformatted ("proceduralized context") into a representation injected into the inferencing process; in some sense these are "validity rules" (~constraints) for the task-wise inferencing. Context-defining elements may be static or dynamic; a dynamic, situation-dependent context-defining loop couples observational sampling, estimation, and revised situational interpretation. Open issues: what is available? Middleware requirement]
Context-based Information Retrieval (CBIR): Automated Relevant Context Enhancement
Role: Evidential Enhancement
[Diagram: a patrol report plus Priority Intelligence Requirements drive context-based information retrieval (CBIR) with forward inference over Background Knowledge Sources (BKS) such as previous situations, ResearchCyc axioms, insurgent data, NGA GNS, and contextual databases, yielding enhanced evidential information]

"ConTracker": one example of an approach to contextual exploitation in tracking*
* George, J., Crassidis, J., and Singh, T., "Threat Assessment Using Context-Based Tracking in a Maritime Environment," International Conference on Information Fusion, Seattle, WA, USA, July 2009
One View of the L1 Problem: Hybrid Solution Required

Traditional L1 Filter Operations
[Diagram: a recursive estimator maps current measurements and the current state to the future state; the required knowledge base comprises observational (sensor) models and exploits (requires) an a priori object dynamic model]

Modified L1 Filter Operations
[Diagram: as above, plus an a priori defined Relevant Contextual Information DB and knowledge associated with adjunct processes: context-based observation validation (Context Info Set #1) and context-based propagation enhancement (Context Info Set #2)]
ConTracker Application: Small Boats in Harbor
• The harbor is contextually rich; how to deal with multiple relevant contextual influences (an extension to maritime applications)
– Sea lanes; water depth; high-value ships; marinas; ASRs; etc.
– Concern for anomalous small-boat behaviors
• US Office of Naval Research Program (2008-09): "Modeling and Using Context in Data Fusion"*
– Norfolk/Hampton Roads, VA harbor area
– Use cases include small-boat types
– Limited-scope proof of concept
* Teamed with Silver Bullet Solutions, Inc.
ConTracker Design Concepts
• Basic α-β tracker
• Process noise covariance indicative of tracker accuracy
– (Used to propagate the estimation error covariance)
• If the boat follows the predicted model behavior closely, "Q" would be small; if motion is inconsistent with the model (including contextual influences), Q is large
• True Q unknown; estimate via a Multiple Model Adaptive Estimation (MMAE) approach
– Nominate a range of Q pdfs reflective of the expected range of accelerations
– Use a Bayes-based approach [P(q|Y_msmt)] to estimate the best q, given the measurements (residual from each filter using a different pdf)
– Feed back to the α-β tracker for update and propagation
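The Bayes-based MMAE step can be sketched as a model-probability update from each candidate filter's residual, assuming Gaussian likelihoods (a standard MMAE form; the specific interface is an assumption):

```python
import math

def mmae_update(weights, residuals, variances):
    """One Bayes step of Multiple Model Adaptive Estimation: each
    candidate filter (one per hypothesized process-noise level q)
    reports its measurement residual and innovation variance; the
    posterior model probabilities P(q | y) are the priors reweighted
    by the Gaussian likelihood of each residual."""
    likes = [math.exp(-r * r / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)
             for r, s in zip(residuals, variances)]
    post = [w * l for w, l in zip(weights, likes)]
    total = sum(post)
    return [p / total for p in post]

# Three hypothesized q levels; the middle filter's residual fits best.
w = mmae_update([1 / 3, 1 / 3, 1 / 3],
                residuals=[3.0, 0.2, 2.5],
                variances=[1.0, 1.0, 1.0])
```

The weighted-best q (or the full mixture) is then fed back to the α-β tracker for the next update and propagation.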
ConTracker: Context-Aided Tracker
• Selected contextual information is integrated into the target model as trafficability factors at the velocity level
– Reasonable variations in velocity are allowed based on the contextual information; i.e., allowed velocity variations would not be treated as erratic maneuvers, and the tracker follows the variations
– However, perverse variations in velocity (variations not due to trafficability influences) would register as erratic maneuvers (anomalous behavior); a Level 2 hypothesis generator "red flags" the behavior

ConTracker: Basic Ideas
• ConTracker, approximately an α-β tracker, incorporates contextual factors in its track propagation step
• Measurements may reveal actual target motion inconsistent with contextual factors
• MMAE estimates the best conditional process noise covariance given the current target measurement (Bayes), and feeds this back to ConTracker to allow an improved propagation estimate at the next measurement time
• If the level of process noise covariance is exceptional (above a defined threshold), a "Red Flag" anomalous-behavior hypothesis is generated for operator review
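A minimal α-β tracker with a red-flag test can be sketched as follows; here the flag is raised on an innovation threshold as a simplified stand-in for ConTracker's process-noise-covariance test (parameters are illustrative):

```python
def alpha_beta_track(measurements, dt=1.0, alpha=0.85, beta=0.005,
                     flag_threshold=3.0):
    """Minimal alpha-beta tracker over scalar positions. Measurements
    whose innovation (measurement minus prediction) exceeds
    `flag_threshold` are red-flagged for operator review."""
    x, v = measurements[0], 0.0
    flags = []
    for z in measurements[1:]:
        x_pred = x + v * dt          # propagate (context would adjust this)
        r = z - x_pred               # innovation
        flags.append(abs(r) > flag_threshold)
        x = x_pred + alpha * r       # update position estimate
        v = v + (beta / dt) * r      # update velocity estimate
    return x, flags

# Smooth track, then an abrupt jump at the last step.
final_x, flags = alpha_beta_track([0.0, 1.0, 2.0, 3.0, 12.0])
```

In ConTracker proper the flag comes from the MMAE-estimated process noise covariance rather than a raw innovation gate, but the control flow is the same.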
Trafficability of Depth/Draft: Function of Safety Factor (1.5) and Transition (0.8)
[Plot: trafficability (0 to 1) versus the unitless context parameter (Depth/Draft, for example, from 0 to 3); Depth = Draft gives 2% trafficability, Depth = Safety Factor x Draft gives 50% trafficability, and Transition = 0.8 is the width of the band from 2% to 98%]

T_context = (1/2) [ 1 + tanh( 4 (x - safety_factor) / transition ) ], where x is each unitless context parameter (Depth/Draft, for example)
Contextual Factors Algorithms (SME-developed)
• Trafficability is the capability of sea waters to bear traffic
• It refers to the extent to which the waters (each 0.5 km x 0.5 km grid square) will permit continued movement of any or all types of traffic
• We can associate trafficability with the probability that a given vessel type should not be nudged away from a grid square
• Therefore the trafficabilities of independent contexts are multiplied together:

T = T_depth * T_marked-channel * T_high-value-assets * T_anti-shipping

Depth trafficability parameters (unitless context parameter: Depth/Draft):

Vessel Type                  Safety Factor   Transition
Sailboat                     1.5             0.8
Ski boat                     1.1             0.1
Recreational fishing boat    4               2
Tug boat                     1.5             1
T_depth = (1/2) [ 1 + tanh( 4 (Depth/Draft - safety_factor) / transition ) ]

Depth Trafficability
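The trafficability curve can be sketched directly from its stated anchor points (50% at the safety factor, a 2%-to-98% band of width equal to the transition), together with the multiplication of independent contexts; the parameter names are mine:

```python
import math

def trafficability(x, safety_factor, transition):
    """T = 50% at x = safety_factor; T rises from ~2% to ~98% over a
    band of width `transition` centered on the safety factor."""
    return 0.5 * (1.0 + math.tanh(4.0 * (x - safety_factor) / transition))

def total_trafficability(*factors):
    """Independent contexts (depth, marked channel, HVA, ASR) multiply."""
    out = 1.0
    for t in factors:
        out *= t
    return out

# Sailboat depth/draft context: safety factor 1.5, transition 0.8.
t_50 = trafficability(1.5, 1.5, 0.8)   # 50% at Depth = 1.5 * Draft
t_low = trafficability(1.1, 1.5, 0.8)  # ~2% at the lower band edge
t_hi = trafficability(1.9, 1.5, 0.8)   # ~98% at the upper band edge
```

The 4/transition scaling is what places tanh at about ±0.96 (2% and 98%) at the band edges, matching the plotted breakpoints.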
Marked channel trafficability parameters (unitless context parameter: fraction of the grid square within a marked channel):

Vessel Type                  Safety Factor   Transition
Sailboat                     0.3             1
Ski boat                     0.5             0.7
Recreational fishing boat    0.3             1
Tug boat                     0               1

T_marked-channel = (1/2) [ 1 + tanh( 4 (F_within-marked-channel - safety_factor) / transition ) ]

where F_within-marked-channel is the fraction of the grid square within a marked channel (for shipping lanes the complement, 1 - F_within-shipping-lane, is used)

Marked Channel Trafficability
High value assets trafficability parameters (unitless context parameter: HVA density ρ):

Vessel Type                  Safety Factor   Transition
Sailboat                     0.99            0.01
Ski boat                     0.8             0.2
Recreational fishing boat    0.99            0.01
Tug boat                     0               1

T_HVA = (1/2) [ 1 + tanh( 4 (safety_factor - ρ) / transition ) ]

Density (ρ) of High Value Assets is approximated by the number (N) of co-located HVA plus a Gaussian-weighted count of nearby assets:

ρ = N_colocated + Σ_i exp( -R_i² / 2 ), summed over nearby assets i

High Value Assets Trafficability
Anti-shipping reports trafficability parameters (unitless context parameter: ASR density ρ):

Vessel Type                  Safety Factor   Transition
Sailboat                     0.99            0.01
Ski boat                     0.99            0.01
Recreational fishing boat    0.9             0.1
Tug boat                     0               1

T_ASR = (1/2) [ 1 + tanh( 4 (safety_factor - ρ) / transition ) ]

Density (ρ) of Anti-Shipping Reports is approximated by the number (N) of co-located ASRs plus a Gaussian-weighted count of nearby reports:

ρ = N_colocated + Σ_i exp( -R_i² / 2 ), summed over nearby reports i

Anti-Shipping Reports Trafficability
Thus, a key aspect of Contextual Exploitation involves quantifying the influences or constraints on the intended state vector resulting from contextual factors; this will require domain Subject Matter Experts.

Contextually-Influenced Anomalous Behavior Flag
[Plot: boat direction versus time (0-1000 sec); excessive variations in the process noise covariance trigger anomalous behavior flagging]
Hard Sensor Data Processing Flow
[Diagram: sensors S1 ... SN each feed pre-processing, then filtering/query/selection, then association/correlation, then report-level fusion, then translation to graphlet representation, then graph matching and fusion]
Soft Data Processing Flow
[Diagram: sources S1 ... SN ("messages") each feed pre-processing, then filtering/query/selection, then association/correlation, then report-level fusion, then translation to graphlet representation, then the HCI (graphlet to situation display; hypothesis to graphlet), supporting focus of attention on the situation display]

Constraint on the Fusion Point: Fusion at the Semantic Entity Level
Detailed Hard Data Fusion Subsystem
[Diagram: traditional detection, alignment, and feature extraction feed new methods of fusion-based entity identification and tracking, and activity/behavior estimation; fused entity estimates, framed in graphical format, are passed to association]

Detailed Soft Data Fusion Subsystem
[Diagram: digital natural language processing, text extraction, and contextual enhancement, with observational uncertainty insertion, feed hard-soft data association and entity-level hard data integration, supporting streaming evidence accumulation, inferencing, and learning]
Summary
• Incorporating and fusing Soft Data and Contextual Data, along with more traditional Hard Sensor Data, imposes a wide variety of challenges on effective and efficient Fusion Process design
• All core fusion functions (Common Referencing, Data Association, and State Estimation) are impacted
• Not discussed here, but equally important, is framing the Concepts of Employment of such technology
– Overall uncertainty levels are inherently much higher
– New decision-making paradigms, or the use of risk-centric DM paradigms, will likely be required, as contrasted with max-utility-type schemes
• Testing for correctness and performance, as well as effectiveness, will also require new T&E paradigms to be developed
– E.g., notions of Truth are much more complex
Categories of Human Input: Variable Reliability
[Diagram: three reporting categories feed audio-to-text units at the TOC, alongside other (hard) source information:
• Third-party reporting: source credibility ??, report credibility ??
• Direct interaction: source credibility ?, report credibility ??
• Passive observation: report credibility only]
Some Distinctions in Hard and Soft Observational Data

Data Characteristic: Observation sampling rate
Hard: high. Soft: low. Remarks: imposes requirements for adaptive, retrodiction-type processing (i.e., "Out-of-Sequence Measurement" type processing), as well as agile temporal reasoning.

Data Characteristic: Semantic content
Hard: limited to specific, usually singular entities and to entity attributes. Soft: can be conceptually broader than single entities and can include judged relationships. Remarks: imposes requirements to design an automated semantic labeling process, coupled to a rich domain ontology; requires the ability to associate and infer at multiple levels of abstraction.

Data Characteristic: Accuracy, precision
Hard: relatively high, with good repeatability (precision). Soft: broadly low accuracy in attributes, high at the conceptual level. Remarks: imposes requirements for robust Common Referencing and Data Association.

On relationships:
• Totally distinct from hard sensors; philosophy: relations are not directly observable; they require reasoning over the properties of entities
• This line of thought suggests that relations are the result of a process of some type of comparison, i.e. [Brower, 2001], "an act of reasoning"
• Humans can also judge intangibles (e.g., emotional state)

Brower, J. (2001), "Relations without Polyadic Properties: Albert the Great on the Nature and Ontological Status of Relations," Archiv für Geschichte der Philosophie 83: 225-257
Graph-Matching as a Discovery-Based Approach for Soft Data Fusion: US Army Research Lab Project
[Diagram: free text passes through the Attensity Natural Language Processor into RDF; an RDF ontology (a common category schema is needed) turns the observed data graph into an enhanced data graph; dynamic network centrality analysis and dynamic graph-matching operations (data association) compare it against target graphs of interest supplied by intel analysts for the COIN problem, raising an ALERT on a match]