TRANSCRIPT
CURRENT TOPICS SESSION: EXPLAINABLE AI
Ricardo Baeza-Yates
Director of Research
Institute for Experiential AI @ Northeastern University
https://users.dcc.uchile.cl/~rbaeza/ | @polarbearby
Explaining Explainable AI
Ricardo Baeza-Yates
Director of Research, Institute for Experiential AI
Northeastern Univ. at Silicon Valley, USA
&
Web Science and Social Computing Group, DTIC, UPF, Catalonia
Big Data Congress, Barcelona, September 2021
Agenda
• What is XAI?
• Why do we need it? Always?
• Interpretability vs. Explainability
• 5 Myths and 1 Warning
• Responsible AI
• Take-Home Messages
Explainable AI
[DARPA XAI program, 2016]
Goals
[DARPA XAI program, 2016]
Explanation Framework
[DARPA XAI program, 2016]
ADM (Automated Decision Making)
Contestability
Predictions
Explanation Example
Types of explanations: descriptions or justifications [Vig et al., 2009]
Why Do We Need XAI?
• Verification of the System
  • Transparency
• Improvement of the System
  • Mismatched objectives
  • Multi-objective trade-offs
  • Some systems are very sensitive
• Learning from the System
  • Causality
• Compliance with legislation
  • Safety
  • Privacy (e.g., GDPR)
• Ethical Issues
  • Transparency
  • Fairness (e.g., gender or race bias)
  • Stupidity
One-pixel attack [Su et al., 2018]
May Also Help Us!
[New Yorker cartoon]
Explaining AI Systems [Bryson, 2019]
1. No explanation (too hard; in many cases it's impossible!)
2. Explain human actions that led to the system's decision (accountability, better understanding)
3. Explaining what inputs resulted in what outputs
   a) Be able to experiment with a black box and see what changes (digital forensics)
   b) Record (secure) logs for later analysis (legal)
4. Seeing exactly how the system works
   a) An explanation of the overall system (e.g., documentation)
   b) Making ML models more transparent
[Slide annotations: options marked ✔ ✔ ("people involved") and ✕]
Do We Always Need XAI? No!
• ADM: Automated Decision Making (not only ML)
• Many automated systems do not have significant consequences for unacceptable results, or
• The problem is sufficiently well studied and validated in real applications that we trust the system's decision
• Examples:
  • Web advertising
  • Optical Character Recognition
  • Postal code sorting
  • Aircraft collision avoidance systems
• Challenge: Which systems can/should be fully automated?
• New proposed EU regulation for AI (April 2021)
Interpretability vs. Explainability
Interpretability is the degree to which a human can understand the cause of a decision and consistently predict the model's result.
• You understand how and why it works
• Sometimes it is not relevant:
  • The system has no significant impact
  • The problem is well studied
  • Too much transparency is dangerous (e.g., gaming a system)
• Explaining is understanding a single decision for a given input
  • Why was my loan application rejected?
• It is often better to use an interpretable model than to try to explain a black box, especially for high-stakes decisions [Rudin, 2019]
• To explain, most of the time you need interpretability
Explanations Might Be Difficult
• What is this?
• Systems are afraid to say "I don't know"
  • "The Last Question", Asimov (1956)
• To abstract you need to forget
  • "Funes, the Memorious", Borges (1942/4)
• You cannot learn what is NOT in the data
  • If it is a cat, why doesn't it have the pointed-ears feature?
• Interpretability does not imply completeness
• Explanations may come from the data, the model, the process, etc.
5 Myths of Explainable AI [Gartner, 2019]
1. Lack of Explainability Is a New Problem Specific to Black-Box AI
2. All Black-Box AI Must Be Explainable and Interpretable
3. Black-Box AI Decisions Can Be Fully Explained
4. Human Decisions Are More Explainable Than Black-Box AI Decisions
5. Explainable AI Can Be Bought
0. There Is a Trade-off Between Accuracy and Explainability
   • Not necessarily [Rudin, 2019]
   • A global interpretable model won the 2018 FICO Recognition Prize
Warning: Explaining Doesn't Mean Less Harm [Babic et al., 2021]
1. The opaque function of the black box remains the basis for the AI decision
2. The explanation of the black box cannot be perfect, because if it were, there would be no difference between the two
3. The explanations provided are post hoc and may differ from the right explanation
LIME: Local Interpretable Model-Agnostic Explanations [Ribeiro et al., 2016]
Impact of local perturbations
EU Proposed AI Regulation (April 2021)
Article 13: Transparency and provision of information to users
1. High-risk AI systems shall be designed and developed in such a way to ensure that their operation is sufficiently transparent to enable users to interpret the system's output and use it appropriately.
Article 14: Human oversight
4. (c) be able to correctly interpret the high-risk AI system's output, taking into account in particular the characteristics of the system and the interpretation tools and methods available;
Responsible AI
It's Complicated (ACM properties)
• Awareness:
  • Autonomy & Integrity
• Data Provenance:
  • Equity & Bias
  • Traceability
  • Access and Redress
  • Quality Assurance
• Completeness:
  • Interpretability
  • Adaptability
  • Scalability
  • Extensibility
  • Interoperability
  • Quality Assurance
• Usability:
  • Efficiency
  • Accessibility
  • Resilience
  • Reproducibility
• Transparency:
  • Explainability
  • Validation & Testing
  • Documentation
  • Auditability
• Responsibility:
  • Privacy, Security & Safety
  • Proportionality, Sustainability
  • Trustworthiness, Accountability
  • Maintenance, Legal compliance
  • Beneficial / Wellbeing
Take-Home Messages
• Design for People First!
• Deep Respect for the Limitations of Our Systems
  • Assumptions, ethical risks, etc.
• Learning from the Past Does Not Mean Reproducing It
• Have an Ethics Board and Enforce a Code of Ethics
• Improve Explainability (repeat 100 times)
  • More evaluation and cross-discipline validation
• Research Best Practices with Humans in Control and Machines in the Loop
  • Better than "Human in the Loop"!
• Check the ethics of your providers and your clients
Contact: [email protected]
ASIST 2012 Book of the Year Award
Questions?
Explanations?
Biased Ad
Biased Questions?
CURRENT TOPICS SESSION: EXPLAINABLE AI
Gonzalo Espinosa
Data Scientist
Lead Ratings www.lead-ratings.com
EXPLAINABLE AI by Gonzalo Espinosa Duelo, Data Scientist at Lead Ratings
We are a Predictive Marketing SaaS.
We improve ROI through predictive models in specific steps of the marketing and sales funnel.
Team:
● 5 Data Scientists with specialized PhD or Master's degrees
● 2 IT experts in Big Data and Cloud Systems
What is Explainable AI?
[Diagram: Data → ML Model → Prediction]
- Predictive models have become sophisticated and difficult to understand.
- Explainable AI (XAI) aims to make our predictions transparent:
  - How do the values of the variables specifically affect each prediction?
  - How do the variables affect the entire set of predictions?
Explainable AI uncovers the black box to make ML models understandable and explainable.
Use Case: Lead Scoring
Client:
● Real estate, +4,000 flats listed/year
● Online client acquisition
● Complex offline sales process
Solution:
● Individualized score and explainability for each contact, based on the probability to sell predicted by an ML model.
[Diagram: Contact → ML Model → 79% → H (High Priority)]
Advantages:
● Prioritize contacts with the highest probability of success to improve overall conversion.
Use Case: Lead Scoring
[Diagram: Contact data (Name: Sandra R., Phone: 654XXX623, Contact location: Barcelona, Property type: Flat, Property class: Sale, Price: 280,000€, Size: 140m², Rooms: 3, Property City: Barcelona, Property Zipcode: 08023, List datetime: 28-06-21, Channel: Habitaclia, Enquiry type: Sites, Comments: Asked…, Max budget: 300,000€, Created at: 02-07-21, …) → ML Model (conversion probability) → 79% → H (High Priority)]
Dimensions: Profile, Product, Origin, Traffic
Roles: AGENT, DATA ANALYST, MARKETER, CHIEF SALES OFFICER
Why does the contact have a high probability to buy?
Use Case: Lead Scoring
[Diagram: Contact → ML Model (conversion probability) → 79% → H (High Priority)]
Roles: AGENT, DATA ANALYST, MARKETER, CHIEF SALES OFFICER
The most important features:
Comments: Asked for a visit | Property Zipcode: 08023 | Property City: Barcelona | Contact Location: Barcelona | Previous leads: 2
Property type: Flat | Property class: Sale | Price: 280,000€ | Size: 140m² | Channel: Habitaclia | Max budget: 300,000€ | Recency: 5 days | …
Use Case: Lead Scoring
[Diagram repeated with the XAI step: the same contact data (Sandra R.) → ML Model (conversion probability) → XAI → the most important features, shown to the AGENT; Contact: 79%, H (High Priority)]
Use Case: Lead Scoring
[Diagram: Contact data (Name: Hector M., Phone: 623XXX662, Contact location: Madrid, Property type: Flat, Property class: Sale, Price: 350,000€, Size: 180m², Property City: Madrid, Property Zipcode: 28011, Rooms: 4, List datetime: 01-06-21, Channel: Direct call, Enquiry type: None, Comments: Very int…, Max budget: 360,000€, Created at: 14-06-21, …) → ML Model (conversion probability) → 79%, High Priority → XAI → the most important features → AGENT]
The most important features:
Channel: Direct call | Price: 350,000€ | Size: 180m² | Property class: Sale | Comments: Very interested
Property City: Madrid | Property Zipcode: 28011 | Property type: Flat | Previous leads: 1 | Enquiry type: None | Recency: 13 days | Rooms: 4 | …
Use Case: Lead Scoring
[Diagram: Contact → ML Model (conversion probability) → 9%, L (Low Priority) → XAI → the most important features → AGENT]
Why does the contact have a low probability to buy?
The most important features:
Phone: - | Comments: - | Property City: Madrid | Contact Location: Valencia | Recency: 60 days
Property Zipcode: 28120 | Property class: Flat | Price: 400,000€ | Channel: Google ads | Enquiry type: Web | Previous leads: 1 | …
Use Case: Lead Scoring
[Feature contribution chart for the 79%, High Priority contact:]
Comments: Asked for a visit   +42
Property Zipcode: 08023       +23
Property City: Barcelona      +16
Contact Location: Barcelona   +13
…
Property Class: Sale          +1
Property type: Flat           +0
Price: 280,000€               +0
Size: 140m²                   +0
Enquiry type: Sites           -1
…
Device: Mobile                -6
Hour: 1am                     -9
[Diagram: Contact → ML Model (conversion probability) → 79% → XAI → AGENT]
Explainable AI estimates the contribution of each variable's value to each prediction (a minimal sketch follows).
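The deck does not name the technique behind these per-contact contributions; as an illustrative sketch under that caveat, contributions like the chart above can be computed with SHAP values from the `shap` package. The model, features, and data below are invented stand-ins for the lead-scoring setup, not Lead Ratings' actual pipeline.

```python
# Hedged sketch: per-prediction feature contributions via SHAP values.
# (shap + scikit-learn are assumptions; the slides name no library.)
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical lead-scoring data: one row per contact, y = converted or not.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "recency_days": rng.integers(0, 90, 500),
    "previous_leads": rng.integers(0, 5, 500),
    "price_eur": rng.integers(100_000, 500_000, 500),
    "asked_for_visit": rng.integers(0, 2, 500),
})
y = (X["asked_for_visit"] == 1) & (X["recency_days"] < 15)

model = GradientBoostingClassifier().fit(X, y)

# TreeExplainer returns one additive contribution per feature per prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # shape: (n contacts, m features)

# Local explanation for one contact: the contributions play the same role
# as the +42 / -9 bars in the chart above.
contact = 0
for name, contrib in sorted(zip(X.columns, shap_values[contact]),
                            key=lambda t: -abs(t[1])):
    print(f"{name:>16}: {contrib:+.3f}")
```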
Use Case: Lead Scoring
What are the most important factors for the predictive model?
[Diagram: Contact → ML Model (conversion probability) → 79% → H (High Priority)]
Roles: AGENT, DATA ANALYST, MARKETER, CHIEF SALES OFFICER
Use Case: Lead Scoring
XAI values: individual local values (n × m)
Aggregation methods for the distribution of contributions:
● Absolute mean/median: general impact on the score.
● Variance: discriminatory power of the variable.
Overall contributions: aggregated contributions to understand the most important variables for the model globally.
Explainable AI provides a global overview of the model (a minimal aggregation sketch follows).
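A minimal sketch of the aggregation step, assuming a hypothetical (n × m) matrix of local contribution values; all numbers and feature names below are invented for illustration.

```python
# Hedged sketch: aggregating an (n x m) matrix of local XAI values into
# the global summaries the slide names (absolute mean and variance).
import numpy as np

def global_importance(local_values: np.ndarray, feature_names: list[str]) -> None:
    abs_mean = np.abs(local_values).mean(axis=0)  # general impact on the score
    variance = local_values.var(axis=0)           # discriminatory power
    for i in np.argsort(-abs_mean):               # most important first
        print(f"{feature_names[i]:>20}  |mean|={abs_mean[i]:6.3f}  var={variance[i]:6.3f}")

# Hypothetical local values for 4 contacts x 3 features (not real data).
values = np.array([[ 0.42, -0.09,  0.00],
                   [ 0.10, -0.30,  0.01],
                   [-0.25,  0.05,  0.00],
                   [ 0.33, -0.12, -0.02]])
global_importance(values, ["asked_for_visit", "recency_days", "rooms"])
```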
Use Case: Lead Scoring
What are the success profiles that I should target?
Roles: AGENT, DATA ANALYST, MARKETER, CHIEF SALES OFFICER
[Diagram: Contact → ML Model (conversion probability) → 79% → H (High Priority)]
Use Case: Lead Scoring
Having specific profiles, which strategies should I implement?
Roles: AGENT, DATA ANALYST, MARKETER, CHIEF SALES OFFICER
[Diagram repeated]
Use Case: Lead Scoring
We apply clustering methods to the individual contributions to obtain distinct profiles.
XAI values: individual local values (n × m). An unsupervised clustering method is applied over a subset of the individual local values.
Dimensions: ● Profile ● Product ● Traffic ● Origin ● Actions / Activity
Objectives:
● Type of profiles (full / success / failure)
● Definition of dimensions
Use Case: Lead Scoring
Unsupervised clustering over 2 dimensions: ● Product ● Origin ● Profile ● Traffic ● Actions / Activity
Objective: success profiles (min. +30% conversion)
Result: 6 success profiles
○ From +32% to +351% over mean conversion
○ Min. 150 leads/month (~5%)
Advantages of XAI clustering:
- XAI values are on the same scale.
- A kind of "supervised clustering": clusters of feature effects.
Explainable AI makes it possible to identify predictive profiles (a minimal clustering sketch follows).
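A minimal sketch of the clustering step. The slides say only "unsupervised clustering", so k-means, the cluster count, and all the data below are assumptions for illustration.

```python
# Hedged sketch: clustering contacts in the space of their local XAI values
# rather than raw features, so clusters group similar *feature effects*.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical (n x m) matrix of local contributions for n contacts.
local_values = rng.normal(size=(600, 8))

kmeans = KMeans(n_clusters=6, n_init=10, random_state=0).fit(local_values)

# Profile each cluster by its conversion vs. the overall mean, mirroring
# the "+32% to +351% over mean conversion" reading on the slide.
converted = rng.random(600) < 0.10          # hypothetical outcomes
overall = converted.mean()
for c in range(6):
    mask = kmeans.labels_ == c
    lift = converted[mask].mean() / overall - 1.0
    print(f"cluster {c}: {mask.sum():3d} leads, conversion lift {lift:+.0%}")
```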
Use Case: Lead Scoring
AGENT
DATA ANALYST
MARKETER
CHIEF SALES OFFICER
Explainable AI offers advantages to different roles within a company.
The Big Challenges of Explainable AI
CHALLENGE 1: Reverse feature engineering
[Diagram: Original Data → feature engineering → ML Model → Prediction; XAI maps back to Explainable Data]
CHALLENGE 2: Selection of relevant and actionable variables for the user
CHALLENGE 3: Scalability
Lead Ratings: Predictive Marketing Solutions
Thank you!
Gonzalo Espinosa Duelo
Data Scientist at Lead Ratings
[email protected]
EXPLAINABLE AI by Gonzalo Espinosa Duelo, Data Scientist at Lead Ratings
CURRENT TOPICS SESSION: EXPLAINABLE AI
Javier Rando
CEO
EXPAI www.linkedin.com/in/javier-rando/
EXPLAINABLE AI
Javier Rando
Co-Founder | CTO
The importance of building Artificial Intelligence that we can trust
Javier Rando
Javier Rando
There are two main problems with the current use of AI
[Diagram: Data collection → Model ("black box") → Prediction → Decision-making]
01 Lack of transparency: suboptimal decisions and reluctance toward digitalization.
02 Generation of bias: unequal behavior for different groups.
Javier Rando
Artificial Intelligence must be transparent and explainable
[Diagram: Data collection → Model → Prediction + Explanation → Decision-making, a transparent process]
01 Greater control: more accurate models and business validation.
02 Generation of trust: transparency is the main source of trust.
03 Better decisions: more information for better decisions.
Javier Rando
The benefits of transparency are many
• Benefits derived from AI: +200% (before vs. now)
• Accuracy: +15-30%
• Monitoring of models in production: -35-50% (before vs. now)
• x3-8 return from time savings and productivity gains
• Corporate social responsibility
Source: IBM study on Explainable AI impact
Javier Rando
Explainable AI as the basis for sustainable digitalization
• Sustainable Development Goals
• Social demands:
  • 66% of consumers consider transparency a main factor when choosing a brand. (Accenture)
  • Employees who trust their company's processes are more motivated. (Forbes)
  • Social movements demanding corporate responsibility.
• Legislation
Javier Rando
How can I obtain explainable Artificial Intelligence?
Transparent models:
• Simple models are explainable
• They can only solve basic problems
• Reduced potential
Post-hoc explanations:
• They can explain any model
• Models for complex problems
• Greater information extraction
Javier Rando
Why use post-hoc explanations?
01 We do not want to give up accuracy
02 Our problem does not admit interpretable models (for example, image or audio classification)
03 We only have access to an already-trained model
Javier Rando
What types of explanations can we generate?
Local:
• They explain a single prediction
• They show whether predictions are correct
• They help decision-making
Global:
• They explain overall behavior
• They illustrate the behavior for groups
• They validate the model's rules
Javier Rando
Global: variable importance (Fisher et al., 2019)
Fisher et al. (2019): https://jmlr.org/papers/v20/18-760.html
01 Select the data of interest
02 Compute the model's loss on its predictions → AI model → LOSS (error)
03 Choose the variable y to study
04 Permute the values of variable y
05 Compute the loss on the updated data → AI model → LOSS' (error)
This process can be repeated N times to obtain more stable results.
We compute the importance by comparing the difference between the prediction errors:
• If LOSS' >>> LOSS, the variable is important for the prediction.
• If LOSS' ≃ LOSS, the absence of this variable does not affect the model.
A minimal sketch of this procedure follows.
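A minimal Python sketch of the permutation procedure above (Fisher et al. call the underlying idea "model reliance"). scikit-learn, the toy dataset, and the choice of log loss are assumptions for illustration.

```python
# Hedged sketch of permutation feature importance (Fisher et al., 2019):
# permute one column, re-measure the loss, and compare LOSS' against LOSS.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Step 02: baseline LOSS on the data of interest.
base_loss = log_loss(y_test, model.predict_proba(X_test))

rng = np.random.default_rng(0)
for j in range(X_test.shape[1]):           # step 03: choose variable j
    losses = []
    for _ in range(10):                    # repeat N times for stability
        Xp = X_test.copy()
        rng.shuffle(Xp[:, j])              # step 04: permute variable j
        losses.append(log_loss(y_test, model.predict_proba(Xp)))  # step 05: LOSS'
    importance = np.mean(losses) - base_loss  # LOSS' >>> LOSS => important
    print(f"feature {j:2d}: importance {importance:+.4f}")
```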
Javier Rando
LIME: Local Interpretable Model-Agnostic Explanations
Inspired by Hima Lakkaraju's slides: https://www.chilconference.org/tutorial_T04.html
Goal: explain this prediction.
01 Select random samples around xi
02 Get a prediction for each sample
03 Weight each prediction according to its distance to xi
04 Learn a simple linear model
05 Explain the interpretable model!
A minimal sketch of these steps follows.
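A minimal from-scratch sketch of the five LIME steps above. This is not the official `lime` package; the model, sampling scale, kernel width, and data are all assumptions for illustration.

```python
# Hedged, from-scratch sketch of the five LIME steps (Ribeiro et al., 2016).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)
x_i = X[0]                                  # the prediction to explain

rng = np.random.default_rng(0)
# 01: select random samples around x_i.
Z = x_i + rng.normal(scale=0.5, size=(1000, x_i.size))
# 02: black-box prediction for each sample.
p = black_box.predict_proba(Z)[:, 1]
# 03: weight each sample by its proximity to x_i (Gaussian kernel).
d = np.linalg.norm(Z - x_i, axis=1)
w = np.exp(-(d ** 2) / (2 * 0.75 ** 2))
# 04: learn a simple, interpretable linear model on the weighted samples.
local = Ridge(alpha=1.0).fit(Z, p, sample_weight=w)
# 05: the linear coefficients explain the black box around x_i.
for j, coef in enumerate(local.coef_):
    print(f"feature {j}: local effect {coef:+.3f}")
```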
Javier Rando
Success stories of Explainable AI: churn detection
[Diagram: Customer data → AI model → Prediction + Explanation → Retention action]
• Is this customer a churner? Why?
• Who is right?
• Which action is the most appropriate for this customer?
Benefits:
• Fast customer profiling
• Identifies the key factors for retention
• The sales rep can understand the model's reasoning
Javier Rando
XAI as a defense against adversarial attacks
Adversarial attacks are able to change the prediction without affecting the appearance of the input.
Idea: let's ask the model to explain why it chose that class, and decide whether the justification is valid.
[Diagram: Model → Prediction + Explanation → Is the explanation coherent with the prediction?]
01 Adversarial attack
02 Prediction and explanation
03 Validation with XAI
Fidel et al. (2019): https://arxiv.org/pdf/1909.03418.pdf
A minimal sketch of the detection idea follows.
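A minimal sketch of the detection idea, assuming explanation "signatures" (e.g., SHAP values, in the spirit of Fidel et al.) and a simple supervised detector. All data below are invented; the real paper computes signatures from a network's internal layers.

```python
# Hedged sketch of the idea in Fidel et al. (2019): treat an input's
# explanation as a "signature" and train a detector that decides whether
# the signature is coherent with the predicted class.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical explanation signatures for inputs the model classified as
# the same class: clean inputs vs adversarially perturbed ones.
clean_signatures = rng.normal(loc=1.0, size=(200, 16))
adversarial_signatures = rng.normal(loc=0.0, size=(200, 16))

X = np.vstack([clean_signatures, adversarial_signatures])
y = np.array([0] * 200 + [1] * 200)   # 0 = coherent, 1 = adversarial

detector = LogisticRegression(max_iter=1000).fit(X, y)

# At inference time: explain the prediction, then ask the detector whether
# the explanation looks like one the model gives for genuine examples.
new_signature = rng.normal(loc=1.0, size=(1, 16))
print("adversarial?", bool(detector.predict(new_signature)[0]))
```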
Javier Rando
XAI is not the solution to all our problems
Challenges for us as developers:
• How do we collect the data?
• Whom do the data represent?
• How do we evaluate our models?
• How will the model and its explanations be used?
Javier Rando
XAI is not the solution to all our problems
Challenges for XAI:
• How do we evaluate the accuracy of explanations?
• Explanations can also be manipulated! (Slack et al., 2020)
• What is the best way to present explanations?
• Improving the interpretability of explanations