diagnosing type errors with class - penn state …dbz5017/pub/pldi15-slides.pdfdiagnosing type...

32
Diagnosing Type Errors with Class PLDI 2015 Distinguished Paper Award Danfeng Zhang Andrew C. Myers Cornell University Dimitrios Vytiniotis Simon Peyton-Jones MSR Cambridge

Upload: doque

Post on 12-Apr-2018

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Diagnosing Type Errors with Class

PLDI 2015

Distinguished Paper Award

Danfeng ZhangAndrew C. Myers

Cornell University

Dimitrios VytiniotisSimon Peyton-Jones

MSR Cambridge

Page 2: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

2

“It is a truism that most bugs are detected

only at a great distance from their source.”Mitchell Wand

Finding the source of type errors, POPL’86

Error localization is difficult for ML type systems

Even worse in sophisticated type systems

The Glasgow Haskell Compiler (GHC) Type classes Type families GADTs Type signatures

Page 3: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

3

GHC: Bool is not a numerical type

Actual mistake: ‘==’ should be ‘−’

Error messages are sometimes confusing

Inference Engine

Page 4: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

SHErrLoc: Static Holistic Error Locator

4

Most likely error cause

A general, expressive and accurate error localization method, which handles the highly expressive type system of GHC

Page 5: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

General Error Localization [Zhang&Myers’14]

5

Programs

JifOCaml Others

Based on Bayesian

interpretation

Cannot diagnose Haskell errors

Constraints Analysis

Constraints

The error cause is likely to be• Simple• Able to explain all errors• Not used often on correct paths

General Diagnosis Heuristics

Cause

Page 6: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Haskell Program

Constraints Analysis

ConstraintsA highly expressive

constraint language

A decidable and

efficient constraint

analysis algorithm

Key Contributions

Bayesian reasoning

A Bayesian model that

accounts for the richer

graph representation

fact n = if n == 0 then 1

else n * fac (n == 1)

Cause

Cause

Page 7: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Roadmap

7

Haskell Program

ConstraintsA highly expressive

constraint language

fact n = if n == 0 then 1

else n * fac (n == 1)

Page 8: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Type Checking as Constraint Solving

• ML type system

– Constraint elements: types

– Constraints: type equalities

8

Constructors: Int, Bool, ListVariables:𝛼, 𝛽, 𝛾

𝐸 ∷= 𝛼 𝑐𝑜𝑛 𝐸1, … , 𝐸𝑛𝐼 ∷= 𝐸1 = 𝐸2 𝐶 ∷= 𝑖 𝐼𝑖

Syntax of Constraints

Constraint

Element

Page 9: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Type Classes

9

Instances of a type class, called Num

Intuitively, a set of types

Page 10: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

10

𝐸 ∷= 𝛼 𝑐𝑜𝑛 𝐸1, … , 𝐸𝑛𝐼 ∷= 𝐸1 = 𝐸2 𝑐𝑙𝑎 𝐸1, … , 𝐸𝑛 𝐶 ∷= 𝑖 𝐼𝑖

Syntax of Constraints

Modeling Type Class Constraints

𝐸 ∷= 𝛼 𝑐𝑜𝑛 𝐸1, … , 𝐸𝑛𝐼 ∷= 𝐸1 ≤ 𝐸2 𝐶 ∷= 𝑖 𝐼𝑖𝐼 ∷= 𝐸1 ≤ 𝐸2

A type class is a set of its instances

𝐍𝐮𝐦 𝛼 ≔ 𝛼 ≤ 𝐍𝐮𝐦

𝐸1 = 𝐸2 ≔ 𝐸1 ≤ 𝐸2 ∧ 𝐸2 ≤ 𝐸1

Our constraint language

Page 11: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

11

𝐸 ∷= 𝛼 𝑐𝑜𝑛 𝐸1, … , 𝐸𝑛𝐼 ∷= 𝐸1 = 𝐸2 𝑐𝑙𝑎 𝐸1, … , 𝐸𝑛 𝐶 ∷= 𝑖 𝐼𝑖

Syntax of Constraints

Modeling Type Class Constraints

𝐸 ∷= 𝛼 𝑐𝑜𝑛 𝐸1, … , 𝐸𝑛𝐼 ∷= 𝐸1 ≤ 𝐸2 𝐶 ∷= 𝑖 𝐼𝑖𝐼 ∷= 𝐸1 ≤ 𝐸2

Concise; inequalities directly map to edges in a graph

Our constraint languageA type class is a

set of its instances

Page 12: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Types are Checked Under Hypotheses

• Type signatures and GADTs introduce hypotheses

12

Constraints

a≤ Num ⊢ a≤ Numdouble :: Num a => a -> a

double n = n * 2

Haskell Program

Hypothesis: a is an instance of Num

Constraint hypothesis

Constraints are checked under hypotheses

Page 13: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Types are Checked Under Axioms

• Instance declaration may introduce (global) axioms

13

instance Eq a => Eq [a]

where {...}

Haskell Program

For all a, a is an instance of Eq implies

list of a is an instance of Eq

Int ≤ Eq ∧ ∀𝑎. 𝑎 ≤ Eq ⇒ 𝑎 ≤ Eq ⊢ Int ≤ Eq

Hypothesis Axiom

Constraint example with axioms:

Int ≤ Eq ⇒ Int ≤ Eq

Page 14: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Modeling Hypothesis and Axioms

• Constraints (𝐶): inequalities under quantified axioms

• Quantified axioms (𝑄): implication rules

• Hypotheses (e.g., Int ≤ Num): degenerate axioms

14

𝐸 ∷= 𝛼 𝑐𝑜𝑛 𝐸1, … , 𝐸𝑛 𝑄 ∷= ∀𝑎. 𝑖 𝐼𝑖 ⇒ 𝐼𝐼 ∷= 𝐸1 ≤ 𝐸2 𝐶 ∷= 𝑗 ( 𝑖𝑄𝑖 ⊢ 𝐼𝑗)

Syntax of SHErrLoc Constraints

Page 15: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

The Full Constraint Language

• Also supports

– Functions on constraint elements

– Nested universally and existentially quantified variables

• Is expressive enough to model

– type classes, type families, GADTs, type signatures

15

(refer to the paper for more details)

Page 16: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Haskell Program

Constraints Analysis

ConstraintsA highly expressive

constraint language

A decidable and

efficient constraint

analysis algorithm

Roadmap

fact n = if n == 0 then 1

else n * fac (n == 1)

Cause

Page 17: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Constraint Graph in a Nutshell

• Graph construction (simple case)

– Node: constraint element

– Directed edge: partial ordering

17

Page 18: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

18

Constraint Analysis in a Nutshell

Bool is not an instance of Num

fact n = if n == 0 then 1

else n * fac (n == 1)n == 1

n 0

Num

Page 19: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Limitations of Previous Algorithms[Barrett et al.’00, Melski&Reps’00,Zhang&Myers’10]

19

Int ≤ C ⊢ 𝛼 = Bool ∧ [𝛼] ≤ C

Previous algorithms under-saturates the graph

Satisfiable:

𝛼 = Int

Satisfiable:

𝛼 = Bool

Previous algs only addedges ([Bool] is not in the graph!)

Bool C≰a type class

C

[𝛼]

𝛼 Bool

Page 20: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

New Algorithm

20

Key idea: add new edges and nodes during saturation

New algorithm adds new nodes

Key challenge: naive algorithms either fail to terminate, or under-saturate the graph

Int ≤ C ⊢ 𝛼 = Bool ∧ [𝛼] ≤ C

C

[𝛼]

𝛼 Bool

[Bool]

Page 21: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

New Algorithm in a Nutshell

21

Lemma: the algorithm always terminates

Black node: node before saturationWhite node: added during saturation

Nodes added based on patterns 1. one edge with two black nodes 2. a black/white node

Recursion check: if white node, not added based on the edge in pattern

C

[𝛼]

𝛼 Bool

[Bool]

Page 22: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Constraint Analysis

• The analysis also handles

– Functions on constraint elements

– Hypotheses

– Quantified axioms

• Performance

– Empirically: quadratic in graph size

22

(refer to the paper for more details)

Page 23: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Roadmap

23

Haskell Program

Constraints Analysis

ConstraintsA highly expressive

constraint language

A decidable and

efficient constraint

analysis algorithm

Bayesian reasoning

A Bayesian model that

accounts for the richer

graph representation

fact n = if n == 0 then 1

else n * fac (n == 1)

Cause

Cause

Page 24: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Likelihood Estimation [Zhang&Myers’14]

A ranking metric based on Bayesian reasoning

(𝑃1, 𝑃2 are tunable parameters)

• Simplifying assumption

– Satisfiability of paths are independent

𝑃1 𝐸 𝑃2

1 − 𝑃2

𝑘𝐸 # sat paths using locations in E

Explanation: a set of locations

White nodes break this assumption

Page 25: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

• Observation: some paths using white nodes provide neither positive nor negative evidence

25

Satisfiability depends on edges between 𝛼 and Bool

Lemma: the satisfiability of any redundant path depends on non-redundant paths

Redundant Paths (definition in paper)

C

[𝛼]

𝛼 Bool

[Bool]

Page 26: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

New Ranking Metric

• Intuitively,

• Top candidates returned by an efficient A* algorithm [Zhang&Myers’14]

26

The error cause is likely to be• Simple• Able to explain all errors• Not used often on correct

non-redundant paths

General Diagnosis Heuristics

non-redundant

𝑃1 𝐸 𝑃2

1 − 𝑃2

𝑘𝐸# sat paths use constraints in E

Page 27: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Evaluation

• Implementation

– From Haskell programs to constraints

– SHErrLoc

27

Modified GHC

GHC Constraints

little effort

Constraint Graph

Error Diagnosis

Reports

SHErrLoc

Constraint Translator

SHErrLocConstraints

50 atop 20K+ LOC

~400 LOC~7500 LOC

Page 28: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Evaluation Setup

• Benchmarks

– CE Benchmark: analyzed 77 Haskell programs collected from papers about type-error diagnosis, used in [Chen&Erwig’14]

– Helium benchmark: analyzed 228 programs with type-checking errors, logged by the Helium tool [Hage’14]

• Ground truth

– CE Benchmark: already well-marked

– Helium benchmark: user’s actual fix

• Correctness

– only when the programmer mistake is returned by tools

28

Page 29: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Accuracy on the CE Benchmark

29

Comparison with GHC

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

Comparison with the Helium tool[Heeren et al.’03]

Other tool finds the correct errorSHErrLoc misses the error

Both find the correct error

Both miss the correct error

SHErrLoc finds the correct errorOther tool misses the error

SHErrLoc uses no Haskell-specific heuristics!

Page 30: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Accuracy on the Helium Benchmark

30

Comparison with GHC

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

Other tool finds the correct errorSHErrLoc misses the error

Both find the correct error

Both miss the correct error

SHErrLoc finds the correct errorOther tool misses the error

Comparison with the Helium tool

Page 31: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

Related Work

• General error localization [Zhang&Myers’14]

– Cannot handle the type system of GHC

– Simpler constraints and constraint analysis algorithm

• Program analyses as constraint solving [e.g., Aiken’99, Foster et al.’06]

– No support for hypotheses and axioms

• Diagnosing Haskell error [e.g., Heeren et al’03,Hage&Heeren’07,Chen&Erwig’14]

– Haskell-specific heuristics

– Unable to handle all of the sophisticated features of GHC

31

Page 32: Diagnosing Type Errors with Class - Penn State …dbz5017/pub/pldi15-slides.pdfDiagnosing Type Errors with Class ... efficient constraint analysis algorithm Key Contributions ... instance

SHErrLoc

General, expressive and accurate error localization

– Applies to the highly expressive type system of GHC

– A novel graph-based constraint analysis algorithm

– Bayesian reasoning => more accurate error locations than with existing tools

32

Type classesType familiesGADTsType signatures

GHC

SHErrLoc