interpretability, trust and moralityin...

11

INTERPRETABILITY , TRUST , AND MORALITY IN A UTONOMY D AVID D ANKS PHILOSOPHY & PSYCHOLOGY CARNEGIE MELLON

Upload: others

Post on 11-Sep-2020

1 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Page 1: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

INTERPRETABILITY, TRUST, AND

MORALITY IN AUTONOMY

DAVID DANKSPHILOSOPHY & PSYCHOLOGYCARNEGIE MELLON

Page 2: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

¨ When can someone ethically use/deploy an autonomous system?

Page 3: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

What do I mean by ‘autonomy’?

¨ Characterize ‘autonomy’ in terms of capabilities(= context-specific abilities)

¤ Planning (routes, action sequences, …)¤ Learning (environmental statistics, adaptation, …)¤ Deciding (action selection, classification, …)¤ …

¨ Richer & more useful than ‘levels’¤ Though obviously also more complicated…

Page 4: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

Ethical deployment

¨ Necessary condition for ethical deployment: Reasonable belief that the “system” will behave (approximately) as the user intends¤ Agnostic about whether “system” is human or artificial¤ User’s intentions can be quite high-level (“drive safely”)

¨ Alternate formulation: User must trust the “system”¤ In relevant respects, not necessarily all of them

Page 5: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

Varieties of (psychological) trust

1. Reliability/Predictability of Trustee¤ “Behavior”-focused¤ Knowing what the trustee will do¤ Coordination & prediction in known circumstances

2. Understanding the Trustee¤ Belief/value-focused¤ Knowing why the trustee will do¤ Coordination & prediction in novel circumstances

Page 6: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

Trust & ethical deployment

¨ ⇒ If system will use autonomous capabilities, then deployer must have “understanding trust”

Specify desired system behavior before deployment?

“Reliability trust” is sufficient

Yes “Understanding trust” is required

No

Page 7: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

Trust & interpretability

¨ Interpretability is key for “understanding trust”

¨ ⇒ Necessary condition for ethical deployment is: “System behavior for Goal is interpretable by User”¤ Note: Interpretability is not a property of System alone

Page 8: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

Trust & interpretability

¨ Routes to interpretability & “understanding trust”1. Explicit requirement that System plan/learn/decide

similarly to humans2. Have deployment decisions made in collaboration

with (informed) developers3. Extended user experience in many different contexts

Page 9: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

What about system morality?

¨ Observation: Humans typically interpret others’ unethical behavior using “internal” features¤ Conjecture: Non-developers will usually interpret

unethical system behavior as due to an unethical nature

¨ Conjecture: If the developer cares about having an ethical system, then any u-trustworthy system will (mostly) act ethically

Page 10: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

Conclusions

¨ Ethical deployment requires trust (in system)¨ If a system employs autonomous capabilities, then

understanding-trust is required¨ Interpretability is necessary for understanding-trust

¤ Reminder: ‘interpretability’ is a three-argument relation

¨ ⇒ Interpretability is nec. for ethical deployment¤ Though many routes to this type of interpretability

¨ Conjecture: Alternate route to try to achieve ethical system behavior?

Page 11: INTERPRETABILITY, TRUST AND MORALITYIN AUTONOMYqav.cs.ox.ac.uk/.../img/DanksMoralityTrustWorkshop17.pdf · 2017. 7. 20. · Trust & interpretability ¨Routes to interpretability &

Thanks!

Human Factors in Model Interpretability: Industry

Interpretability of Machine Learning Models and ... · Interpretability of Machine Learning Models and Representations: ... measuring the interpretability of machine learning models

INTERPRETABILITY IN ROBINSON’S Qgmferreira/... · 2013. 2. 18. · INTERPRETABILITY IN ROBINSON’S Q FERNANDO FERREIRA AND GILDA FERREIRA Abstract. Edward Nelson published in 1986

i-Algebra: Towards Interactive Interpretability of Deep

Evaluating the Interpretability of Generative Models by

Enhancing interpretability of seismic data with spectral ... · Enhancing interpretability of seismic data with spectral decomposition phase components . Satinder Chopra*+and Kurt

Human Evaluation of Models Built for Interpretability

Rethinking Self-Attention: Towards Interpretability in Neural Parsing · 2020-05-02 · Rethinking Self-Attention: Towards Interpretability in Neural Parsing Khalil Mrini1, Franck

Machine Learning Interpretability with H2O …docs.h2o.ai/driverless-ai/latest-stable/docs/booklets/...Machine Learning Interpretability with H2O Driverless AI by Patrick Hall, Navdeep

DeepCoDA: personalized interpretability for compositional health …proceedings.mlr.press/v119/quinn20a/quinn20a.pdf · 2021. 1. 28. · DeepCoDA: personalized interpretability for

IKONOS spatial resolution and image interpretability characterization.pdf

Interpretability - University of Colorado Boulder

Interpretable Models in Probabilistic Deep Learning · Interpretability of (Probabilistic) Deep Learning Post-hoc interpretability: (humans) can obtain useful information about model’s

Routes to EPR - Digital Health€¦ · 1 Routes to EPR Routes to EPR: trust strategies for electronic patient records and progress towards paperless working June 2013 T: +44 (0)20

Post-hoc Interpretability for Neural NLP: A Survey

Proper Network Interpretability Helps Adversarial ... · Proper Network Interpretability Helps Adversarial Robustness in Classiﬁcation Akhilan Boopathy1 Sijia Liu 2Gaoyuan Zhang

1 An Investigation of Interpretability Techniques for Deep … · 2020-02-24 · An Investigation of Interpretability Techniques for Deep Learning in Predictive Process Analytics

Big Data, Model Complexity, and Interpretability: Machine

Interpretability of Delay-based Congestion

Enhanced Resolution, Imaging, and Interpretability: Dual ... · Enhanced Resolution, Imaging, and Interpretability: Dual-Sensor Towed Streamer Data Examples from around the World

The Interpretability Hypothesis: evidence from wh

Evaluating the Interpretability of the Knowledge ... · Evaluating the Interpretability of the Knowledge Compilation Map: Communicating Logical Statements Effectively Serena Booth1,

Summit: Scaling Deep Learning Interpretability ... · Index Terms—Deep learning interpretability, visual analytics, scalable summarization, attribution graph 1 INTRODUCTION Deep

The Mythos of Model Interpretability - zacklipton.comzacklipton.com/media/papers/mythos-of-model-interpretability... · The Mythos of Model Interpretability ... Transferability

Explainability vs. Interpretability and methods for models

VISUAL INTERPRETABILITY OF DEEP LEARNING ALGORITHMS IN

Network Dissection: Quantifying Interpretability of Deep ...netdissect.csail.mit.edu/final-network-dissection.pdfNetwork Dissection: Quantifying Interpretability of Deep Visual Representations

Interpreting Interpretability: Understanding Data …harmank/Papers/CHI2020...Lipton [39] surveys different crite-ria for assessing interpretability, such as simulatability, as well

Training Deep Neural Networks for Interpretability and

TOWARDS DEEP INTERPRETABILITY MUS ROVER II): LEARNING

Cognitive biases and interpretability of inductively …...Cognitive biases and interpretability of inductively learnt rules Tom a s Kliegr Department of Information and Knowledge

Interpretability Beyond Feature Attribution: Quantitative ... · Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) Been Kim

Interpretability in Deep Learning for Heston Smile

Image Interpretability Evaluation Presentation

Network Dissection: Quantifying Interpretability of Deep Visual … · 2017. 5. 31. · Network Dissection: Quantifying Interpretability of Deep Visual Representations David Bau∗,