Information Quality (InfoQ)

Prof. Ron Kenett, seminar "Quality Management: Current and Future Directions", 20 December 2016, Afeka, Tel Aviv Academic College of Engineering


Page 1

Information Quality (InfoQ)

Prof. Ron Kenett

Seminar on Quality Management: Current and Future Directions

In cooperation with: [partner logos]

20 December 2016 | Afeka, Tel Aviv Academic College of Engineering

Page 2

Background

The Skill Content of Recent Technical Change: An Empirical Investigation (updated data)

David H. Autor, Frank Levy, and Richard J. Murnane, Quarterly Journal of Economics, 118, 4, 2003, pp. 1279-1334

• Working with New Information

• Solving Unstructured Problems

Page 3

Data

Page 4

From numbers to insights

[Figure labels: numbers, data, statistical analysis, findings, information, insight]

Page 5

Insights

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619681/

Page 6

Insights

Page 7

More insights

Page 8

Missing Data

Page 9

Missing Data

How do you impute the missing data?
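The slide poses the question without prescribing an answer. As one possible illustration only, the sketch below shows mean imputation, one of the simplest options, in Python. The function name `impute_mean` and the sample `readings` are hypothetical and are not taken from the talk.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)  # raises StatisticsError if nothing was observed
    return [fill if v is None else v for v in values]

# Hypothetical measurements with two missing values.
readings = [4.0, None, 6.0, 8.0, None]
imputed = impute_mean(readings)
```

Mean imputation keeps the sample mean unchanged but understates the variability of the data, so whether it is acceptable depends on the analysis goal.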

Page 10

Applied statistics is about meeting the challenge of solving real-world problems with mathematical tools and statistical thinking.

Page 11

Why Research

The creation of new knowledge.

This includes

– new theory

– new methods

– assessing the practical value of methods

– creative application of existing methods

Page 12

Research Motivation

• We find an interesting problem.

– It requires novel use of old ideas.

– It requires completely new ideas.

• We are stimulated by research that we read.

– We have a better solution to the problem.

– We explore the properties of the method.

• We have a customer.

Page 13

Research Topics in Statistics/Quality

SPC

• Profile monitoring

• Image monitoring

• Multivariate SPC

Reliability

• Multi-layer, multi-scale surveillance

• Tests with multiple accelerating factors

• Degradation models

Page 14

Research Topics in Statistics/Quality

Design of Experiments

• Experiments for mixtures/compositional data

• Computer experiments

• Accelerated testing

Statistical Strategy

• Integrated models

• Life cycle views

• Information quality

Page 15

Research Topics in Statistics/Quality

Integrated Models

Page 16

Problem Elicitation

Goal Formulation

Data Collection

Data Analysis

Formulation of Findings

Communication of Findings

Impact Assessment

Operationalization of Findings

Kenett, R.S. (2015) Statistics: A Life Cycle View, Quality Engineering (with discussion), 27(1), pp. 111-129.

Kenett, R.S. & Thyregod, P. (2006) Aspects of statistical consulting not taught by academia, Statistica Neerlandica, 60(3), pp. 396-412.

Academic courses

Unstructured problems

A life cycle view

Research Topics in Statistics/Quality

Page 17

Information Quality

Research Topics in Statistics/Quality

Page 18

Information Quality: The Potential of Data and Analytics to Generate Knowledge

Part I: The Information Quality Framework
1. Introduction to information quality
2. Quality of data, quality of analysis
3. The dimensions of InfoQ
4. InfoQ at the study-design stage
5. InfoQ at the post-data collection stage

Part II: Applications of InfoQ
6. Education
7. Customer surveys
8. Healthcare
9. Risk management
10. Official statistics

Part III: Implementing InfoQ
11. InfoQ and reproducible research
12. InfoQ in review processes of scientific publications
13. Integrating InfoQ into applied statistics and data mining academic programs
14. Information quality support with R
15. Information quality support with MINITAB
16. Information quality support with JMP

Page 19

Information Quality (InfoQ)

InfoQ(f,X,g) = U( f(X|g) )

g: a specific analysis goal

X: the available dataset

f: an empirical analysis method

U: a utility measure

Kenett, R.S. and Shmueli, G. (2014) On Information Quality, Journal of the Royal Statistical Society, Series A (with discussion), 177(1), pp. 3-38. http://ssrn.com/abstract=1464444

The potential of a particular dataset to achieve a particular goal using a given empirical analysis method
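To make the composition U( f(X|g) ) concrete, here is a minimal Python sketch. It only mirrors the structure of the definition: the helper `infoq`, the toy goal, the dataset, the analysis method and the utility function are all hypothetical choices made for illustration, not part of the InfoQ framework itself.

```python
from statistics import mean

def infoq(f, X, g, U):
    """Return U(f(X | g)): the utility of applying method f to data X for goal g."""
    return U(f(X, g))

# Hypothetical goal g: estimate the mean of one variable.
goal = {"variable": "y"}

# Hypothetical dataset X: a list of records.
data = [{"y": 2.0}, {"y": 4.0}, {"y": 6.0}]

def sample_mean(X, g):
    # f(X | g): the analysis is conditioned on the goal, which selects the variable.
    return mean(row[g["variable"]] for row in X)

def utility(result):
    # U: an arbitrary illustrative utility, highest when the estimate equals
    # an assumed reference value of 4.0.
    return 1.0 / (1.0 + abs(result - 4.0))

score = infoq(sample_mean, data, goal, utility)
```

Changing any one of the four components (a different goal, dataset, method or utility) changes the value, which is the point of the definition: InfoQ is a property of the combination, not of the data alone.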


Page 20

InfoQ(f,X,g) = U(f(X|g))

What: g (analysis goal), X (available data), f (data analysis method), U (utility measure)

How: the eight InfoQ dimensions

1. Data resolution

2. Data structure

3. Data integration

4. Temporal relevance

5. Chronology of data and goal

6. Generalizability

7. Operationalization

8. Communication

[Figure labels: Data Quality, Analysis Quality, Information Quality; Goals, Analytic Space, Domain Space, Insights]

Page 21

InfoQ Assessment


Rating-based assessment

1-5 scale or desirability function on each dimension:

InfoQ Score = [d1(Y1) × d2(Y2) × … × d8(Y8)]^(1/8)
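Written in product notation, the score is the geometric mean of the eight per-dimension desirabilities. The bound on each desirability below is an assumption (the usual range of a desirability function), not something stated on the slide.

```latex
\[
\text{InfoQ Score} \;=\; \Bigl( \prod_{i=1}^{8} d_i(Y_i) \Bigr)^{1/8},
\qquad 0 \le d_i(Y_i) \le 1 .
\]
```

Under that assumption the score also lies between 0 and 1, it equals the common value when all eight desirabilities are equal, and a single dimension with desirability 0 drives the whole score to 0.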

Experience from research methods courses:

– Preparing a PhD research proposal (FELU, Univ. of Ljubljana, Sant’Anna School of Advanced Studies, Pisa)

– Post-hoc evaluation of completed studies (CMU, goo.gl/erNPF)

Page 22


#  Dimension                    Note  Value  Index
1  Data resolution                    5      1.0000
2  Data structure                     4      0.7500
3  Data integration                   5      1.0000
4  Temporal relevance                 5      1.0000
5  Generalizability                   3      0.5000
6  Chronology of data and goal        5      1.0000
7  Concept operationalization         2      0.2500
8  Communication                      3      0.5000

InfoQ Score = 0.68 (InfoQ = 68%)
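The score in the table can be reproduced with a few lines of Python. The sketch assumes that each 1-5 rating is mapped to its index linearly, as (rating - 1) / 4; this mapping is inferred from the Value and Index columns rather than stated on the slide, and the names `desirability` and `infoq_score` are hypothetical.

```python
import math

# Ratings copied from the Value column of the table above.
ratings = {
    "Data resolution": 5,
    "Data structure": 4,
    "Data integration": 5,
    "Temporal relevance": 5,
    "Generalizability": 3,
    "Chronology of data and goal": 5,
    "Concept operationalization": 2,
    "Communication": 3,
}

def desirability(rating, low=1, high=5):
    """Assumed linear mapping of a rating on [low, high] to an index on [0, 1]."""
    return (rating - low) / (high - low)

def infoq_score(ratings):
    """Geometric mean of the per-dimension desirability indices."""
    indices = [desirability(r) for r in ratings.values()]
    return math.prod(indices) ** (1 / len(indices))

score = infoq_score(ratings)
print(f"InfoQ Score = {score:.2f}")
```

Because the score is a geometric mean, the low rating on concept operationalization pulls the result down more than it would in an arithmetic average of the same indices (0.75).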

InfoQ(f,X,g) = U(f(X|g))


Page 23

InfoQ at the Study Design Stage

• Primary vs. Secondary Data

• Experimental vs. Observational Data

• Designed Experiments

• Computer Experiments

• Surveys: Pilot the questionnaire, plan initiatives to increase response rates


Fritz Scheuren (2005). "What is a Survey?", American Statistical Association, Washington, D.C. http://www.amstat.org/sections/srms/pamphlet.pdf

Page 24

InfoQ at the Post-Data-Collection Stage

• Data Cleaning

• Preprocessing

• Reweighting

• Bias Adjustment

• Meta-Analysis

• Retrospective Experimental Design Analysis

• Censoring and Truncation

• Surveys: Determine representativeness of returns and decide if responses should be weighted
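As one possible illustration of the reweighting and survey items above, the sketch below computes simple post-stratification weights in Python: each group's weight is its population share divided by its share among respondents. The group names, shares and counts are invented for the example, and this is only one of several weighting schemes.

```python
def poststratification_weights(population_share, respondent_counts):
    """Weight per respondent in each group: population share / respondent share."""
    total = sum(respondent_counts.values())
    return {
        group: population_share[group] / (count / total)
        for group, count in respondent_counts.items()
    }

# Hypothetical survey: the population is split evenly between two age groups,
# but the older group returned most of the questionnaires.
population_share = {"under_40": 0.5, "40_plus": 0.5}
respondent_counts = {"under_40": 20, "40_plus": 80}

weights = poststratification_weights(population_share, respondent_counts)
```

Here the under-represented group receives a weight of about 2.5 and the over-represented group about 0.625, so the weighted sample matches the assumed population split.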


Fritz Scheuren (2005). "What is a Survey?", American Statistical Association, Washington, D.C. http://www.amstat.org/sections/srms/pamphlet.pdf

Page 25

How to design a program in data science/business analytics

Page 26

How to evaluate a program in data science/business analytics

Page 27

Challenges Ahead

• Bridge the gap between theory and applications

• Strengthen the position of statistical thinking in business, industrial, educational and academic applications

• Expand research on information quality (InfoQ)

Page 28

Thank you for your attention
