Explanation Infrastructure Supporting
Transparency and Accountability
Deborah L. McGuinness
Co-Director and Senior Research Scientist
Knowledge Systems, AI Laboratory, Stanford University
Joint work with Li Ding, Cynthia Chang, Vasco Furtado, Paulo Pinheiro da Silva
Inference Web is joint work with Richard Fikes, Alyssa Glass, Honglei Zeng, Mayukh Bhaowal, Bill Millar, Dhyanesh Narayan, Priyendra Deshwal, …
TAMI/Portia Privacy and Accountability Workshop
28-29 June 2006, Cambridge, MA USA
Deborah McGuinness, June 2006
General Motivation
• Interoperability – as more systems use varied sources and multiple information manipulation engines, they benefit more from encodings that are shareable and interoperable
• Provenance – if users (humans and agents) are to use and integrate data from unknown, unreliable, or multiple sources, they need provenance metadata for evaluation
• Explanation/Justification – if information has been manipulated (e.g., by sound deduction or by heuristic processes), a trace of the manipulation should be available
• Trust – if some sources are more trustworthy than others, representations should be available to encode, propagate, combine, and (appropriately) display trust values
Provide interoperable knowledge provenance infrastructure that supports explanations of sources, assumptions, learned information, and answers as an enabler for trust.
TAMI Background
• Overall TAMI Ambition: Explore privacy implications of semantic web technologies and determine viability.
• Goal: Assess applicability of a usage limitation model (as opposed to or in addition to a data collection model) to data mining/profiling applications.
• Explore technical challenges of provable accountability with explicit justifications in large scale, heterogeneous information systems (i.e., the Web).
• Develop public policy models to encourage transparent, accountable data mining
Privacy and Explanation
As more online data becomes available and as inference and interoperability increase, privacy protection will have to rely more on usage limitation rules and less on collection limitation rules.
Usage limits depend upon:
• Transparency: knowledge provenance history of sources, data manipulations, and inferences is maintained in an interoperable form. Explanation technology is used to allow examination of knowledge provenance by authorized parties (who may be the general public).
• Accountability: ability to check whether the policies that govern data manipulations and inferences were in fact adhered to. Explanation technology is used to examine inference usage.
Inference Web Infrastructure
[Inference Web] Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers.
[Figure: Inference Web infrastructure diagram. Labels: WWW Toolkit; Proof Markup Language (PML) with Trust, Justification, and Provenance; engines CWM (TAMI), JTP (DAML/NIMD), SPARK (CALO), UIMA (NIMD/Exp Agg), SDS (DAML/SNRC); languages and formats N3, KIF, SPARK-L, Text Analytics, OWL-S/BPEL; tools IW Explainer/Abstractor, IWBase, IWBrowser, IWSearch, IWTrust; functions provenance registration, search engine based publishing, expert friendly visualization, end-user friendly visualization, trust computation.]
PML Ontology
How PML Works
[Figure: PML structure diagram. Regions: Justification Trace; IWBase. Nodes: Question foo:question1 (how is arrest1 classified?); Query foo:query1 (type arrest1 ?x); NodeSet foo:ns1 (hasConclusion …); NodeSet foo:ns2 (hasConclusion …); InferenceStep; InferenceEngine; InferenceRule; Language; Mapping; SourceUsage (usageTime …); Source. Edge labels: isQueryFor, hasAnswer, fromQuery, fromAnswer, isConsequentOf, hasAntecedent, hasInferenceEngine, hasRule, hasVariableMapping, hasLanguage, hasSourceUsage, hasSource, …]
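The relationship between NodeSets and InferenceSteps named on this slide can be sketched in a few lines of Python. This is a simplified illustration, not the PML ontology itself: the class layout, the single step per NodeSet, the `trace` helper, and the sample conclusions are all assumptions made for the example. Only the terms NodeSet, InferenceStep, conclusion, antecedent, rule, and engine come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InferenceStep:
    """One derivation of a conclusion (cf. hasRule, hasInferenceEngine, hasAntecedent)."""
    rule: str
    engine: str
    antecedents: List[str] = field(default_factory=list)  # URIs of antecedent NodeSets


@dataclass
class NodeSet:
    """A conclusion plus the step that justifies it (cf. hasConclusion, isConsequentOf)."""
    uri: str
    conclusion: str
    step: Optional[InferenceStep] = None  # None means no justification is recorded


def trace(uri: str, nodes: Dict[str, NodeSet], depth: int = 0) -> List[str]:
    """Return an indented, depth-first listing of the justification for `uri`.

    Assumes the justification graph is acyclic.
    """
    node = nodes[uri]
    how = f"{node.step.rule} via {node.step.engine}" if node.step else "no justification recorded"
    lines = ["  " * depth + f"{node.conclusion} [{how}]"]
    if node.step:
        for antecedent in node.step.antecedents:
            lines.extend(trace(antecedent, nodes, depth + 1))
    return lines


# Hypothetical data loosely modeled on the slide's foo:ns1 / foo:ns2 labels.
nodes = {
    "foo:ns1": NodeSet("foo:ns1", "arrest1 is a JustifiedArrest",
                       InferenceStep("GMP", "CWM", ["foo:ns2"])),
    "foo:ns2": NodeSet("foo:ns2", "arrest1 has a charge",
                       InferenceStep("Told", "CWM")),
}

for line in trace("foo:ns1", nodes):
    print(line)
```

Walking from a conclusion back through its antecedents is what lets a browser show "why" for any node in the trace.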
TAMI Architecture
[Figure: TAMI architecture diagram. Components: Data Sources; Assertions with provenance; Usage Rules (Laws/Policy); Action; Inference Engine(s) & Truth Maintenance System; Proof Generation Services (Inference Web); Justification Generator (why action does/doesn’t comply with policy).]
Privacy Act background
Privacy Act
Sharing Restrictions
• (b) Conditions of Disclosure.—
  – No agency shall disclose any record which is contained in a system of records
  – by any means of communication to any person or agency …
  – except . . . with the prior written consent of, the individual to whom the record pertains,
  – unless disclosure of the record would be— ….
• (3) for a routine use [5 USC § 552a(b)(3)]
Routine Use
• as defined in subsection (a)(7)
  – “the use of such record for a purpose which is compatible with the purpose for which it was collected;” [5 USC § 552a(a)(7)]
• described under subsection (e)(4)(D)
  – “publish in the Federal Register upon establishment or revision a notice of the existence and character of the system of records, which notice shall include—” [5 USC § 552a(e)(4)]
  – “each routine use of the records contained in the system, including the categories of users and the purpose of such use” [5 USC § 552a(e)(4)(D)]
a ROUTINE USE
a SHARING_PERMISSION
Recipient : [1,∞] Organization
HasDataCategory : DataCategory
HasPurpose : AuthorizedPurpose
…
General Categories
Structured Components
One Simple Description
Sample Code
RU1 a RoutineUse
  ; recipient FBI
  ; category DataCategory3
  ; purpose CounterTerrorism
  ; dc:description "May share where TSA becomes aware of information that may be related to an individual identified in the TSDB." .
DataCategory3 a DataCategory
  ; dc:description "Information about people who are known or reasonably suspected to be or have been engaged in conduct constituting, in preparation for, in aid of, or related to terrorism." .
Can share with FBI
Data about possible terrorists
So they can investigate whether a criminal law has been violated
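To show how such a routine use could drive a sharing decision, here is a minimal Python sketch. It mirrors the fields of the sample (recipient, category, purpose, description), but the class, the `permits` method, and the exact-match comparison are assumptions for illustration. The TAMI prototype expresses its rules in N3 and evaluates them with CWM, not in Python.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RoutineUse:
    """A sharing permission: who may receive which data category, and for what purpose."""
    name: str
    recipients: FrozenSet[str]  # one or more organizations (Recipient : [1,∞])
    data_category: str
    purpose: str
    description: str = ""

    def __post_init__(self) -> None:
        if not self.recipients:
            raise ValueError("a routine use needs at least one recipient")

    def permits(self, recipient: str, data_category: str, purpose: str) -> bool:
        """True when all three parts of the proposed sharing match this routine use."""
        return (recipient in self.recipients
                and data_category == self.data_category
                and purpose == self.purpose)


ru1 = RoutineUse(
    name="RU1",
    recipients=frozenset({"FBI"}),
    data_category="DataCategory3",
    purpose="CounterTerrorism",
    description="May share where TSA becomes aware of information that may be "
                "related to an individual identified in the TSDB.",
)

print(ru1.permits("FBI", "DataCategory3", "CounterTerrorism"))  # True
print(ru1.permits("IRS", "DataCategory3", "CounterTerrorism"))  # False
```

Exact string matching is the simplest possible comparison. A real policy engine would also need to reason over category and purpose hierarchies.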
Explanation
• Why was data sharing xyz acceptable?
  – Using RoutineUse459
    • We shared with the FBI
    • Data belonging to DataCategory3
    • For the purpose of CounterterrorismCriminalLawEnforcement
• Provenance: RoutineUse459 is derived from
  – SORN (System of Records Notice) for “Secure Flight”
  – Published at 70 FR 36319
  – Encoded by DHS Office of the Privacy Officer
Explanation
• Why was data sharing xyz unacceptable?
  – We checked all of the RoutineUses
    • We shared with the IRS
    • Data belonging to DataCategory3
      – IRS is not authorized to receive DataCategory3
    • For the purpose of CounterterrorismCriminalLawEnforcement
• Provenance: all RoutineUses are derived from
  – SORN for “Secure Flight”
  – Published at 70 FR 3620
  – Encoded by DHS Office of the Privacy Officer
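The two explanations above follow one pattern: state what was shared, with whom, and why, then cite the rule that matched (or report that none did) along with the rule's provenance. A rough Python sketch of that pattern follows. The type names, the `explain` function, and the message wording are illustrative assumptions. The actual TAMI explanations are generated from PML justifications produced by CWM.

```python
from typing import List, NamedTuple, Optional


class RoutineUse(NamedTuple):
    name: str
    recipient: str
    data_category: str
    purpose: str
    provenance: str  # where the encoded rule came from


class Sharing(NamedTuple):
    recipient: str
    data_category: str
    purpose: str


def explain(action: Sharing, routine_uses: List[RoutineUse]) -> List[str]:
    """Return explanation lines saying whether `action` matches any routine use."""
    match: Optional[RoutineUse] = next(
        (ru for ru in routine_uses
         if (ru.recipient, ru.data_category, ru.purpose) == tuple(action)),
        None,
    )
    facts = [
        f"We shared with the {action.recipient}",
        f"Data belonging to {action.data_category}",
        f"For the purpose of {action.purpose}",
    ]
    if match is not None:
        return ([f"Acceptable: using {match.name}"] + facts
                + [f"Provenance: {match.name} is derived from {match.provenance}"])
    sources = sorted({ru.provenance for ru in routine_uses})
    return ([f"Unacceptable: no match among the {len(routine_uses)} RoutineUse(s) checked"]
            + facts
            + [f"Provenance: all RoutineUses are derived from {', '.join(sources)}"])


# Hypothetical rule base containing only the routine use named on the slide.
rules = [RoutineUse("RoutineUse459", "FBI", "DataCategory3",
                    "CounterterrorismCriminalLawEnforcement",
                    'SORN for "Secure Flight"')]

for recipient in ("FBI", "IRS"):
    action = Sharing(recipient, "DataCategory3", "CounterterrorismCriminalLawEnforcement")
    print("\n".join(explain(action, rules)), end="\n\n")
```

Returning the provenance in both branches reflects the slides: even a negative answer says which rule set was consulted and where it came from.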
Scenario 3 Examples (nerd level)
IWBrowser - Browse & Debug
IWBrowser - Provenance
IWBrowser - Explanation
A fragment of PML data
<iw:NodeSet rdf:about="#ns__g463">
  <iw:hasConclusion>
    @prefix : <http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#> .
    @prefix data: <http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#> .
    data:arrest-1 a :NotJustifiedArrest; :charge <http://dig.csail.mit.edu/TAMI/law/USC-18-228> .
  </iw:hasConclusion>
  <iw:hasEnglishString>data:arrest-1 [is-a] :NotJustifiedArrest;[and has] :charge <http://dig.csail.mit.edu/TAMI/law/USC-18-228> .</iw:hasEnglishString>
  <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/N3.owl#N3"/>
  <iw:isConsequentOf rdf:resource="#is__g463"/>
</iw:NodeSet>
<iw:InferenceStep rdf:about="#is__g463">
  <iw:hasAntecedent rdf:resource="#ns__g419"/>
  <iw:hasAntecedent rdf:resource="#ns__g420"/>
  <iw:hasAntecedent rdf:resource="#ns__g425"/>
  <iw:hasAntecedent rdf:resource="#ns__g479"/>
  <iw:hasAntecedent rdf:resource="#ns__g502"/>
  <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>
  <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>
  <iw:hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#A</iw:mapFrom>
    <iw:mapTo>http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#arrest-1</iw:mapTo>
  </iw:hasVariableMapping>
  <iw:hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#P</iw:mapFrom>
    <iw:mapTo>http://dig.csail.mit.edu/TAMI/background#ct-criminal-law-enforcement</iw:mapTo>
  </iw:hasVariableMapping>
  <iw:hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#C</iw:mapFrom>
    <iw:mapTo>http://dig.csail.mit.edu/TAMI/law/USC-18-228</iw:mapTo>
  </iw:hasVariableMapping>
</iw:InferenceStep>
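As a rough illustration of how a tool might read an InferenceStep like the one above, the following Python sketch extracts the antecedents, engine, rule, and variable bindings with the standard library XML parser. Several things here are assumptions: the fragment is wrapped in an `rdf:RDF` root with namespace declarations that the slide omits, the `iw` namespace URI is inferred from the `rdf:type` values in the fragment, the sample is shortened, and `summarize_step` is a hypothetical helper. This is plain XML parsing rather than RDF processing, so it only handles this particular serialization.

```python
import xml.etree.ElementTree as ET

# The rdf namespace is standard. The iw namespace is an assumption taken
# from the rdf:type value in the fragment (.../2004/07/iw.owl#Mapping).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
IW = "http://inferenceweb.stanford.edu/2004/07/iw.owl#"

# Shortened copy of the slide's InferenceStep, wrapped so that it parses.
PML = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:iw="{IW}">
  <iw:InferenceStep rdf:about="#is__g463">
    <iw:hasAntecedent rdf:resource="#ns__g419"/>
    <iw:hasAntecedent rdf:resource="#ns__g420"/>
    <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>
    <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>
    <iw:hasVariableMapping rdf:parseType="Resource">
      <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#A</iw:mapFrom>
      <iw:mapTo>http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#arrest-1</iw:mapTo>
    </iw:hasVariableMapping>
  </iw:InferenceStep>
</rdf:RDF>"""


def summarize_step(xml_text: str) -> dict:
    """Collect the parts of the first InferenceStep under the root element."""
    root = ET.fromstring(xml_text)
    step = root.find(f"{{{IW}}}InferenceStep")
    resource = f"{{{RDF}}}resource"
    return {
        "step": step.get(f"{{{RDF}}}about"),
        "antecedents": [a.get(resource) for a in step.findall(f"{{{IW}}}hasAntecedent")],
        "engine": step.find(f"{{{IW}}}hasInferenceEngine").get(resource),
        "rule": step.find(f"{{{IW}}}hasRule").get(resource),
        "bindings": {
            m.findtext(f"{{{IW}}}mapFrom"): m.findtext(f"{{{IW}}}mapTo")
            for m in step.findall(f"{{{IW}}}hasVariableMapping")
        },
    }


info = summarize_step(PML)
print(info["antecedents"])
print(info["bindings"])
```

Note that the `hasConclusion` text on the slide contains raw angle brackets. Those would have to be escaped or wrapped in CDATA before an XML parser would accept the NodeSet.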
Next Generation Browser
Discussion
• Exploring an explainable usage limitation model supported by semantic technologies
• Status:
  – Use case written up and (very recently) encoded
  – Early prototype in place – JUST integrated with CWM with use case 3 … refinement in process
  – Exploiting domain independent explanation tools
  – Exploring options that leverage knowledge of N3 and CWM supporting more useful tools/APIs/interfaces
  – Exploring options that leverage knowledge of law supporting context-sensitive and appropriate presentations
Summary
• Leverage points include:
  – PML: justification, provenance, and trust interlingua
  – IW tools for interactive browsing, summarizing, searching, validation, abstraction, trust representation, propagation, presentation… (domain independent)
  – Experience explaining a wide variety of reasoners (JTP, SNARK, …), task processing engines (SPARK), learners (TAILOR), text analytics (UIMA), web services (SDS/BPEL), …
• Continuing work
  – Identify explanation (representation and presentation) requirements
  – Refining registration and presentation
  – Designing special purpose filters and abstractions
• Impact
  – Interoperable justifications supporting transparency and accountability
  – Has the potential to change publishing law (with markup) as well as presenting judgments (with interactive justifications)
  – Audit trace
More Information
• Links
  – http://iw.stanford.edu/: Inference Web home
  – http://iw.stanford.edu/doc/project/tami/: TAMI@Stanford
  – http://dig.csail.mit.edu/TAMI: TAMI home
• Papers
  – (Inference Web) Deborah L. McGuinness and Paulo Pinheiro da Silva. Explaining Answers from the Semantic Web: The Inference Web Approach. Journal of Web Semantics. Vol. 1, No. 4, pages 397-413, 2004.
  – (PML) Paulo Pinheiro da Silva, Deborah L. McGuinness and Richard Fikes. A Proof Markup Language for Semantic Web Services. Information Systems. Volume 31, Issues 4-5, pages 381-395, 2006.
  – (TAMI) Daniel J. Weitzner, Hal Abelson, Tim Berners-Lee, Chris P. Hanson, Jim Hendler, Lalana Kagal, Deborah L. McGuinness, Gerald J. Sussman, K. Krasnow Waterman. Transparent Accountable Inferencing for Privacy Risk Management. Proceedings of AAAI Spring Symposium on The Semantic Web meets eGovernment. AAAI Press, Stanford University, USA, 2006. Also available as MIT CSAIL Technical Report-2006-007 and Stanford KSL Technical Report KSL-06-03.
Extras
IWBrowser - Browse & Debug
IWBrowser - Provenance
IWBrowser - Explanation
A fragment of PML data
<NodeSet rdf:about="#ns__g208">
  <hasConclusion>
    @prefix : <http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#> .
    @prefix tr: <http://dig.csail.mit.edu/TAMI/cdk/scenario3/rules.n3#> .
    :Arrest05NY2343CRJFC a tr:JustifiedArrest .
  </hasConclusion>
  <hasEnglishString>:Arrest05NY2343CRJFC [is-a] tr:JustifiedArrest .</hasEnglishString>
  <hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/N3.owl#N3"/>
  <isConsequentOf rdf:resource="#is__g208"/>
  <rea:step rdf:resource="tami-scenario3-proof.n3#_g_L23C26"/>
</NodeSet>
<InferenceStep rdf:about="#is__g208">
  <hasAntecedent rdf:resource="#ns__g186"/>
  <hasAntecedent rdf:resource="#ns__g187"/>
  <hasAntecedent rdf:resource="#ns__g188"/>
  <hasAntecedent rdf:resource="#ns__g207"/>
  <hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>
  <hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>
  <hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <mapFrom>http://people.csail.mit.edu/lkagal/tami/tami-scenario3-filter.n3#A</mapFrom>
    <mapTo>http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#Arrest05NY2343CRJFC</mapTo>
  </hasVariableMapping>
  <hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <mapFrom>http://people.csail.mit.edu/lkagal/tami/tami-scenario3-filter.n3#B</mapFrom>
    <mapTo>http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#open-source-search-2-result-1</mapTo>
  </hasVariableMapping>
</InferenceStep>
Next Generation Browser
PML based Explanation:
Adding trust-tab to Wikipedia
The Original Wikipedia Article
Trust Tab
Multiple Trust Tabs
• citation based
• revision history based
Fragments colored according to trust value
Explanation about a fragment (author, trust value)
PML Tab
http://inferenceweb.stanford.edu/2006/02/example1-iw-wiki.owl
fragment
<iw:NodeSet rdf:about="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number">
  <iw:hasConclusion>In mathematics, a natural number is either a positive integer … </iw:hasConclusion>
  <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/English.owl#English"/>
  <iw:isConsequentOf>
    <iw:InferenceStep>
      <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told"/>
      <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CitationTrust.owl#CitationTrust"/>
      <iw:hasSourceUsage>
        <iw:SourceUsage>
          <iw:hasSource>
            <iw:Source rdf:about="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/>
          </iw:hasSource>
        </iw:SourceUsage>
      </iw:hasSourceUsage>
    </iw:InferenceStep>
  </iw:isConsequentOf>
</iw:NodeSet>
fragment trust
<iw:AggregatedTrustRelation>
  <iw:hasTrustingParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/ORG/wikipedia.owl#wikipedia"/>
  <iw:hasTrustedParty rdf:resource="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number"/>
  <iw:hasTrustValue rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.1766</iw:hasTrustValue>
</iw:AggregatedTrustRelation>
author trust
<iw:AggregatedTrustRelation>
  <iw:hasTrustingParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/ORG/wikipedia.owl#wikipedia"/>
  <iw:hasTrustedParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/>
  <iw:hasTrustValue rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.1766</iw:hasTrustValue>
</iw:AggregatedTrustRelation>
PML based Explanation:
Explaining Cognitive Assistants – DARPA PAL Program’s CALO system
Explanation Process
Initial request and answer strategy
<user>: Why are you doing <subtask>?
<system>: I am trying to do <high-level-task> and <subtask> is one subgoal in the process.

Follow-up questions for mixed initiative dialogue
• <user>: Why are you doing <high-level-task>?
• <user>: How did you learn to do <high-level-task> in this way?
• <user>: Why haven’t you completed <subtask> yet?
• <user>: Why is <subtask> a subgoal of <high-level-task>?
• <user>: When will you finish <subtask>?
• <user>: What sources did you use to do <subtask>?
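The answer strategy and follow-up questions above are essentially templates filled in from a task hierarchy. The Python sketch below shows one way that could look. The task names, the `PARENT` table, and both functions are hypothetical and are not taken from the CALO implementation. Only the question and answer wording comes from the slide.

```python
from typing import Dict, List, Optional

# Hypothetical task hierarchy: each task maps to its parent task (None at the top).
PARENT: Dict[str, Optional[str]] = {
    "schedule meeting": None,
    "find free room": "schedule meeting",
}


def why_doing(task: str, parent: Dict[str, Optional[str]]) -> str:
    """Answer "Why are you doing <task>?" using the initial answer strategy."""
    goal = parent[task]
    if goal is None:
        return f"<system>: {task} is a top-level task with no higher-level goal recorded."
    return (f"<system>: I am trying to do {goal} and {task} "
            f"is one subgoal in the process.")


def follow_ups(task: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """List follow-up questions a UI could offer as links after the initial answer."""
    goal = parent[task]
    questions = [
        f"Why haven't you completed {task} yet?",
        f"When will you finish {task}?",
        f"What sources did you use to do {task}?",
    ]
    if goal is not None:
        questions = [
            f"Why are you doing {goal}?",
            f"How did you learn to do {goal} in this way?",
            f"Why is {task} a subgoal of {goal}?",
        ] + questions
    return questions


print(why_doing("find free room", PARENT))
for question in follow_ups("find free room", PARENT):
    print("<user>:", question)
```

Generating the follow-ups from the same hierarchy keeps the dialogue mixed-initiative: the user picks the next question, and the system only offers ones it has the data to answer.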
McGuinness, D.L.; Pinheiro da Silva, P.; Glass, A.; Wolverton, M. Explaining Task Processing in Cognitive Assistants. 2006. Technical Report, KSL-06-06, Knowledge Systems Lab., Stanford.
CALO TM Explainer UI
Initial explanation, with links indicating follow-up queries and alternate strategies.