information quality (infoq) - afekainfoq(f,x,g) = u( f(x|g) ) information quality (infoq) 19 g a...

Post on 30-Jul-2021

2 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

איכות האינפורמציה Information Quality (InfoQ)

רון קנת' פרופ

1

יום העיון :בשיתוף

כיוונים עכשוויים ועתידיים: יום עיון בניהול איכות

אביב -המכללה האקדמית להנדסה בתל-אפקה | 2016דצמבר 20

Background

The Skill Content of Recent Technical Change: An Empirical Investigation (updated data)

David H. Autor, Frank Levy, and Richard J. Murnane, Quarterly Journal of Economics, 118, 4, 2003, pp. 1279-1334

• Working with New Information• Solving Unstructured Problems

2

Data

3

numbers

data

statistical analysis

findings

information

insight

4

From numbers to insights

Insights

5https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619681/

Insights

6

More insights

7

Missing Data

8

Missing Data

9

How do you impute the

missing data?

Applied statistics is about meeting the challenge of

solving real world problems with mathematical tools and

statistical thinking

10

Why Research

The creation of new knowledge.

This includes

– new theory

– new methods

– assessing the practical value of methods

– creative application of existing methods

11

Research Motivation

• We find an interesting problem.

– It requires novel use of old ideas.

– It requires completely new ideas.

• We are stimulated by research that we read.

– We have a better solution to the problem.

– We explore the properties of the method.

• We have a customer.

12

SPC• Profile monitoring• Image monitoring• Multivariate SPC

Reliability• Multi-layer, multi-scale surveillance• Tests with multiple accelerating factors• Degradation Models

Research Topics in Statistics/Quality

13

Design of Experiments• Experiments for mixtures/Compositional Data

• Computer experiments

• Accelerated testing

Statistical Strategy• Integrated models

• Life cycle views

• Information quality

Research Topics in Statistics/Quality

14

15

Research Topics in Statistics/Quality

Integrated Models

Problem Elicitation

GoalFormulation

DataCollection

DataAnalysis

Formulation of Findings

Communicationof Findings

ImpactAssessment

Operationalizationof Findings

Kenett, R.S. (2015) Statistics: A Life Cycle View, Quality Engineering (with discussion), 27(1), pp. 111-129.

Kenett, R.S. & Thyregod, P. (2006) Aspects of statistical consulting not taught by academia, Statistica Neerlandica, 60 (3),396-412..

Academic courses

Unstructured problems

A life cycle view

Research Topics in Statistics/Quality

Information Quality

Research Topics in Statistics/Quality

17

Information Quality: The Potential of Data and Analytics to Generate Knowledge

Part I: The Information Quality Framework1. Introduction to information quality2. Quality of data, quality of analysis3. The dimensions of InfoQ4. InfoQ at the study-design stage5. InfoQ at the post-data collection stagePart II: Applications of InfoQ6. Education7. Customer surveys8. Healthcare9. Risk management10. Official statisticsPart III: Implementing InfoQ11. InfoQ and reproducible research12. InfoQ in review processes of scientific publications13. Integrating InfoQ into applied statistics and data mining academic programs14. Information quality support with R15. Information quality support with MINITAB16. Information quality support with JMP

18

InfoQ(f,X,g) = U( f(X|g) )

Information Quality (InfoQ)

19

g A specific analysis goal

X The available dataset

f An empirical analysis method

U A utility measure

Kenett, R.S. and Shmueli , G. (2014) On Information Quality , Journal of the Royal Statistical Society, Series A (with discussion), Vol. 177, No. 1, pp. 3-38, 2014. http://ssrn.com/abstract=1464444.

The potential of a particular dataset to achieve a particular goal using a given empirical analysis method

Analysis goal

g XAvailable data

fData analysis

method

Utility measure

U

Data

Quality

Information

Quality

Analysis

Quality

1.Data resolution

2.Data structure

3.Data integration

4.Temporal relevance

5.Chronology of data and goal

6.Generalizability

7.Operationalization

8.Communication

Goals

Analytic Space

Domain

Space

Insights

How

Analysis goal

g XAvailable data

fData analysis

method

Utility measure

U

What

InfoQ(f,X,g) = U(f(X|g))

20

InfoQ Assessment

21

Rating-based assessment

1-5 scale or desirability function on each dimension:

InfoQ Score = [d1(Y1) d2(Y2) … d8(Y8)]1/8

Experience from research methods courses:– Preparing a PhD research proposal (FELU, Univ. of

Ljubljana, Sant’Anna School of Advanced Studies, Pisa)

– Post-hoc evaluation of completed studies (CMU, goo.gl/erNPF)

Analysis goal

g XAvailable data

fData analysis

method

Utility measure

U

What

1.Data resolution

2.Data structure

3.Data integration

4.Temporal relevance

5.Chronology of data and goal

6.Generalizability

7.Operationalization

8.Communication

How

# Dimension Note Value Index

1 Data resolution 5 1.0000

2 Data structure 4 0.7500

3 Data integration 5 1.0000

4 Temporal relevance 5 1.0000

5 Generalizability 3 0.5000

6 Chronology of data and goal 5 1.0000

7 Concept operationalization 2 0.2500

8 Communication 3 0.5000

InfoQ Score = 0.68InfoQ=68%22

InfoQ(f,X,g) = U(f(X|g))

Data

Quality

Information

Quality

Analysis

Quality

Goals

Analytic Space

Domain

Space

Insights

InfoQ at the Study Design Stage

• Primary vs. Secondary Data

• Experimental vs. Observational Data

• Designed Experiments

• Computer Experiments

• Surveys: Pilot the questionnaire, plan initiatives to increase response rates

23

Fritz Scheuren (2005). "What is a Survey?", American Statistical Association, Washington, D.Chttp://www.amstat.org/sections/srms/pamphlet.pdf

InfoQ at the Post-Data-Collection Stage

• Data Cleaning

• Preprocessing

• Reweighting

• Bias Adjustment

• Meta-Analysis

• Retrospective Experimental Design Analysis

• Censoring and Truncation

• Surveys: Determine representativeness of returns and decide if responses should be weighted

24

Fritz Scheuren (2005). "What is a Survey?", American Statistical Association, Washington, D.Chttp://www.amstat.org/sections/srms/pamphlet.pdf

25

How to design a program in data science/business analytics

26

How to evaluate a program in data science/business analytics

Challenges Ahead

• Bridge the gap between theory and applications

• Strengthen the position of statistical thinking in business, industrial, educational and academic applications

• Expand research on information quality (InfoQ)

27

Thank you for your attention

28

top related