ethics in data design - companyname.com · 2014. 10. 9. · social traffic sources for b2b and b2c...

Ethics in data design stat/engl 332

Post on 10-Sep-2020

Page 1: Ethics in data design - CompanyName.com · 2014. 10. 9. · Social Traffic Sources for B2B and B2C Companies Facebook LinkedIn Twitter Source: Eloqua Benchmark Data, 2010 Full year

Ethics in data design
stat/engl 332

What is at stake?

… and for whom?

Statistics often has a PR problem …

• … don’t trust any statistic unless you have falsified it yourself …

• lies, damn lies, and statistics (Mark Twain)

• There are three kinds of lies: lies, damned lies and statistics (Benjamin Disraeli, British PM)

• “How to Lie with Statistics”, Darrell Huff, 1993, paperback

• “Statistics don’t lie, people do” IVN, Science of political polling

… what about Charts?!

• Figures don’t lie

• Figures don’t lie but liars do figure (around 1890)

• Figures often beguile me (Mark Twain)

Trust and ethical use are inseparable … and there are clearly issues

General concepts & issues

• To what extent does the design of the data distort the reader’s interpretation?

• What ethical responsibility does the designer have to the audience?

• How do we define “ethical”? What is not?

Some criteria and concepts

• Ethos. Credibility of design/designer.

• Golden rule. Kant’s categorical imperative.

• Lie factor. Tufte’s formula for measuring truth.

• Emphasis. Data made more (or less) visible.

• Inclusion/exclusion. What’s visualized & what’s not?

• Naturalizing effect. Authority based on convention.

• Pathos appeals. Eliciting appropriate emotional response.
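The lie factor in the list above can be made concrete. Tufte defines it as the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change; a value near 1 means the graphic is proportionate. The sketch below is only an illustration: the function names and the bar values (50 and 60, drawn on an axis starting at 0 or at 40) are hypothetical and do not come from the slides.

```python
def relative_change(first: float, second: float) -> float:
    """Size of an effect as a relative change from first to second."""
    return (second - first) / first


def lie_factor(data_first: float, data_second: float,
               drawn_first: float, drawn_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    return (relative_change(drawn_first, drawn_second)
            / relative_change(data_first, data_second))


def bar_lie_factor(value_a: float, value_b: float, axis_start: float = 0.0) -> float:
    """Lie factor of two bars whose drawn height is value minus axis start."""
    return lie_factor(value_a, value_b, value_a - axis_start, value_b - axis_start)


# Hypothetical values, chosen only to illustrate the formula.
honest = bar_lie_factor(50.0, 60.0, axis_start=0.0)      # axis starts at zero
truncated = bar_lie_factor(50.0, 60.0, axis_start=40.0)  # axis starts at 40

print(f"zero-based axis: lie factor {honest:.1f}")
print(f"truncated axis:  lie factor {truncated:.1f}")
```

With the zero-based axis the drawn change equals the data change; cutting the axis at 40 makes a 20% difference look like a 100% difference.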

Is there anything distorted?

[Figure: page "TUITION AND FEES", October 29, 2010, Office of Institutional Research. The recoverable panels and tables follow.]

ACADEMIC YEAR UNDERGRADUATE RESIDENT TUITION & FEES AT ISU & PEER LAND-GRANT UNIVERSITIES, 2010-2011
(bar chart, vertical axis $0 to $15,000)

IL $13,508; MN $12,203; CA Davis $11,958; MI State $11,670; OH State $9,420; Purdue $9,070; WI $8,987; TX A&M $8,386; AZ $8,237; ISU $6,997; NC State $5,829

ACADEMIC YEAR UNDERGRADUATE RESIDENT TUITION & FEES AT ISU & BORDER STATE LAND-GRANT UNIVERSITIES, 2010-2011
(bar chart, vertical axis $0 to $15,000)

IL $13,508; MN $12,203; WI $8,987; MO $8,501; NE $7,224; ISU $6,997; SD State $6,444

UNDERGRADUATE RESIDENT TUITION & FEES AS A PERCENTAGE OF MEDIAN FAMILY INCOME FOR PEER LAND-GRANT UNIVERSITIES

Institution                 2010-11           2009 Median Family     Tuition & Fees as a
                            Tuition & Fees    Income (by State) 1    Percent of Income
Michigan State U.           $11,670           $56,681                20.6%
U. of Illinois              $13,508           $66,806                20.2%
U. of California-Davis      $11,958           $67,038                17.8%
U. of Minnesota             $12,203           $69,374                17.6%
Ohio State U.               $ 9,420           $57,360                16.4%
Purdue U.                   $ 9,070           $56,432                16.1%
Texas A&M U.                $ 8,386           $56,607                14.8%
U. of Wisconsin             $ 8,987           $62,638                14.3%
U. of Arizona               $ 8,237           $57,855                14.2%
Iowa State U.               $ 6,997           $61,156                11.4%
North Carolina State U.     $ 5,829           $54,288                10.7%

1 Source: U.S. Census Bureau, 2009 American Community Survey

UNDERGRADUATE RESIDENT TUITION & FEES AS A PERCENTAGE OF MEDIAN FAMILY INCOME FOR BORDER STATE LAND-GRANT UNIVERSITIES

Institution                 2010-11           2009 Median Family     Tuition & Fees as a
                            Tuition & Fees    Income (by State) 1    Percent of Income
U. of Illinois              $13,508           $66,806                20.2%
U. of Minnesota             $12,203           $69,374                17.6%
U. of Missouri              $ 8,501           $56,318                15.1%
U. of Wisconsin             $ 8,987           $62,638                14.3%
U. of Nebraska              $ 7,224           $60,102                12.0%
Iowa State U.               $ 6,997           $61,156                11.4%
So. Dakota State U.         $ 6,444           $57,764                11.2%

1 Source: U.S. Census Bureau, 2009 American Community Survey

ISU UNDERGRADUATE ESTIMATED COSTS, 2010-2011
(two pie charts)

                    Resident            Non-Resident
Tuition             $6,102 (32.2%)      $17,668 (58.0%)
Fees                $895 (4.7%)         $895 (2.9%)
Room/Board          $7,472 (39.5%)      $7,472 (24.5%)
Books/Supplies      $1,014 (5.4%)       $1,014 (3.3%)
Personal            $2,280 (12.1%)      $2,280 (7.5%)
Transportation      $624 (3.3%)         $624 (2.0%)
Medical/Dental      $534 (2.8%)         $534 (1.8%)
Total               $18,921             $30,487

ISU ACADEMIC YEAR TOTAL TUITION & MANDATORY FEES

                        2005-06   2006-07   2007-08   2008-09   2009-10   2010-11
Undergraduate
  Resident              $ 5,634   $ 5,860   $ 6,161   $ 6,360   $ 6,651   $ 6,997
  Non-Resident          $15,724   $16,354   $16,919   $17,350   $17,871   $18,563
Graduate
  Resident              $ 6,410   $ 6,666   $ 7,009   $ 7,236   $ 7,565   $ 7,969
  Non-Resident          $16,422   $17,080   $17,669   $18,120   $18,665   $19,397
Veterinary Medicine
  Resident              $12,692   $14,634   $15,391   $15,886   $16,577   $17,519
  Non-Resident          $31,278   $34,972   $36,171   $37,082   $38,155   $39,683

ISU UNDERGRADUATE RESIDENT TUITION & MANDATORY FEES, 1997-1998 through 2010-2011
(chart of Tuition and Mandatory Fees*, vertical axis $1,000 to $8,000, academic years 97-98 through 10-11; values not recoverable)

* Mandatory fees include: building, computer, health, health facilities, student activities, recreation and student services.

Iowa State University Office of Institutional Research, “President’s Council Handout,” 29 October 2010. http://www.ir.iastate.edu/PDFfiles/PCR/1010Handout.pdf
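One way to probe the question "Is there anything distorted?" is to recompute the handout's derived column and compare the two possible rankings. The sketch below takes four rows from the peer land-grant table (2010-11 tuition and fees, 2009 state median family income) and recomputes tuition as a percent of income. The selection of rows and all variable names are choices made for this illustration, not part of the handout.

```python
# (tuition & fees 2010-11, 2009 median family income by state), from the handout
handout = {
    "Michigan State U.": (11670, 56681),
    "U. of Illinois": (13508, 66806),
    "Iowa State U.": (6997, 61156),
    "North Carolina State U.": (5829, 54288),
}


def percent_of_income(tuition: int, income: int) -> float:
    """Tuition and fees as a percentage of median family income, one decimal."""
    return round(100 * tuition / income, 1)


shares = {name: percent_of_income(t, i) for name, (t, i) in handout.items()}

# Ranking by raw dollars versus ranking by share of income.
by_dollars = sorted(handout, key=lambda name: handout[name][0], reverse=True)
by_share = sorted(shares, key=shares.get, reverse=True)

for name in by_share:
    print(f"{name:<26} {shares[name]:>5.1f}%")
print("most expensive in dollars:", by_dollars[0])
print("most expensive as share of income:", by_share[0])
```

The recomputed percentages agree with the printed column for these rows, and the top institution changes depending on which measure the designer chooses to show, which is an inclusion/exclusion decision in the sense of the criteria listed earlier.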

Social Traffic Sources for B2B and B2C Companies

Social Media Referral Traffic: B2B vs. B2C
LinkedIn referral traffic is 16x higher for B2B Companies

[Figure: two pie charts, one for B2B and one for B2C, with segments for Facebook, LinkedIn and Twitter]

        Facebook   LinkedIn   Twitter
B2B     72%        16%        12%
B2C     84%        1%         15%

Source: Eloqua Benchmark Data, 2010 Full year

“40 Must See Charts for the Modern Marketeer”. Oracle, http://www.oracle.com/webfolder/mediaeloqua/documents/40+Charts.pdf
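The "16x higher" headline is worth checking against what the pie charts can support. Assuming the LinkedIn segments are 16% (B2B) and 1% (B2C), which is the reading consistent with the headline, the sketch below compares the shares as a ratio and as a percentage-point gap, and then shows that a ratio of shares says nothing about absolute traffic. The visit totals used in the second part are hypothetical and appear nowhere in the source.

```python
def share_ratio(share_a: float, share_b: float) -> float:
    """How many times larger share_a is than share_b."""
    return share_a / share_b


def point_gap(share_a: float, share_b: float) -> float:
    """Difference between two shares in percentage points."""
    return share_a - share_b


def absolute_visits(share_percent: float, total_visits: float) -> float:
    """Visits from one source, given its share of a total."""
    return share_percent * total_visits / 100


b2b_linkedin, b2c_linkedin = 16.0, 1.0  # assumed reading of the pie charts

ratio = share_ratio(b2b_linkedin, b2c_linkedin)
gap = point_gap(b2b_linkedin, b2c_linkedin)
print(f"ratio of shares: {ratio:.0f}x, gap: {gap:.0f} percentage points")

# Hypothetical totals: suppose B2C sites received 20 times more social visits.
b2b_visits = absolute_visits(b2b_linkedin, 1_000)
b2c_visits = absolute_visits(b2c_linkedin, 20_000)
print(f"absolute LinkedIn visits, B2B vs B2C: {b2b_visits:.0f} vs {b2c_visits:.0f}")
```

Under these made-up totals the B2B sites would receive fewer LinkedIn visits than the B2C sites, even though their LinkedIn share is sixteen times larger. The charts report shares only, so the headline is safest read as a statement about shares.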

modern? credible? engaging?

• alternative design by Romain Vuillemot

What do you see?

• Survival on board the Titanic: which group of survivors was the largest?

• Parallel Sets (Kosara et al, 2006)

• We asked 16 participants in a study to order the number of survivors by class

[Figure: parallel sets plot with axes Survived (Yes, No) and Class (1st, 2nd, 3rd, Crew); count axis from 0 to 2000]

Fig. 2. Parallel sets plot showing the relationship between survival of the sinking of the RMS Titanic and class membership. Class membership and survival are clearly related, but which class had the largest number of survivors?

                 Crew   1st   2nd   3rd
Survivors         212   203   118   178
Non-Survivors     673   122   167   528

Table 1. Survival status and class membership of all persons on board the RMS Titanic. Most survivors were among crew members, followed by first, third, and, lastly, second class passengers.
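The ordering stated in the table caption can be checked directly from the counts. The sketch below uses only the eight numbers from Table 1; the variable names are chosen for this illustration. It also computes survival rates, since the group with the most survivors is not the group with the best chance of survival.

```python
# Counts from Table 1.
survivors = {"Crew": 212, "1st": 203, "2nd": 118, "3rd": 178}
non_survivors = {"Crew": 673, "1st": 122, "2nd": 167, "3rd": 528}

# Order asked of the study participants: number of survivors by class.
order_by_count = sorted(survivors, key=survivors.get, reverse=True)

# Survival rate within each group, a different question from the raw count.
rates = {
    group: survivors[group] / (survivors[group] + non_survivors[group])
    for group in survivors
}
order_by_rate = sorted(rates, key=rates.get, reverse=True)

print("by number of survivors:", order_by_count)
print("by survival rate:      ", order_by_rate)
for group in order_by_rate:
    print(f"  {group:<4} {rates[group]:.1%}")
```

Crew members lead by count but come last by rate, so a reader's answer depends heavily on which quantity the plot makes easiest to see.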

… line-width illusion is a contextual illusion that leads to perceptual distortion in evaluating parallel sets plots. In this paper, we first describe and then quantify this illusion. We also propose and test common angle plots as an alternative method for visualizing multivariate categorical data that helps the audience to avoid the distortional effects of the line width illusion.

2 LINE WIDTH ILLUSIONS

The phenomenon of the line width illusion is known and widely discussed in the statistical graphics literature [7, 34, 35, 28]. It is due to our tendency to assess the distance between curves as the minimal (orthogonal) distance rather than the vertical distance; see the sketch in figure 5 for a visual representation of both.

One of the earliest examples of the line width illusion is shown in figure 3. This chart displays the balance of trade between England and the East Indies as demonstrated by William Playfair in his Commercial and Political Atlas, 1786 [25, 26]. One purpose of this chart is to highlight the difference between imports and exports in a particular year and the pattern of these differences over time. The difference in exports and imports is encoded as the vertical difference between the lines. When observers are asked to sketch out the difference between exports and imports (Cleveland and McGill [7]), they very often miss the steep rise in the difference between the lines in the years between about 1755 and 1765. Figure 4 shows the actual difference between imports and exports.

Fig. 3. Playfair’s chart from the Commercial and Political Atlas (1786) showing the balance of trade between England and the East Indies. In which years was the difference between imports and exports the highest?

Fig. 4. Difference between exports and imports from England to and from the East Indies in the 18th century. The steep rise in the difference around 1760 comes as a surprise to many viewers of the raw data in figure 3.

In the perception literature, this phenomenon is known as part of a group of geometrical optical misperceptions of a context-sensitive nature classified as Müller-Lyer illusions [10, 13]. Interestingly, there seems to be a general agreement that this illusion exists, but a quantification of it is curiously absent from the literature.

The type of chart proposed by Playfair and shown in figure 3 is quite common, particularly in election years, when these kinds of charts are used to enable comparisons of support for different candidates. The recommendation from the literature is to avoid charts in which the audience is asked to do visual subtractions, and to show these differences directly [7, 35, 34].

2.1 Strength of the line width illusion

When visually evaluating lines of thickness greater than one, the line width illusion applies. As above, there is a strong preference for evaluating the width of lines orthogonally to their slopes rather than horizontally (see figure 5), even though it is the horizontal evaluation that would lead us to a correct reading of parallel sets-style displays.

Orthogonal and horizontal line widths, $w_o$ and $w_h$, are related: the orthogonal line width depends on the angle (or, equivalently, the slope) of the line:

$$ w_o = w_h \sin\theta, \qquad (1) $$

where $\theta$ is the angle of the line with respect to the horizontal.
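Equation (1) can be evaluated numerically to see how strongly the perceived (orthogonal) width depends on the angle. The sketch below is an illustration only: the function names, the band geometry (run and rise in drawn units) and the example numbers are assumptions made here, not taken from the paper.

```python
import math


def orthogonal_width(horizontal_width: float, angle_deg: float) -> float:
    """Equation (1): orthogonal width = horizontal width * sin(angle)."""
    return horizontal_width * math.sin(math.radians(angle_deg))


def band_angle_deg(run: float, rise: float) -> float:
    """Angle with respect to the horizontal of a band that travels
    `run` units sideways while covering `rise` units vertically."""
    return math.degrees(math.atan2(rise, abs(run)))


# Two bands encoding the same count, so they share one horizontal width.
horizontal_width = 10.0
steep_angle = band_angle_deg(run=20.0, rise=100.0)     # nearly vertical band
shallow_angle = band_angle_deg(run=200.0, rise=100.0)  # strongly slanted band

steep = orthogonal_width(horizontal_width, steep_angle)
shallow = orthogonal_width(horizontal_width, shallow_angle)

print(f"steep band:   angle {steep_angle:5.1f} deg, orthogonal width {steep:.2f}")
print(f"shallow band: angle {shallow_angle:5.1f} deg, orthogonal width {shallow:.2f}")
```

Both bands stand for the same count, yet the slanted one is less than half as wide when measured orthogonally, which is the width observers tend to judge.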

Fig. 5. Sketch of line width assessments: (a) shows horizontal width, (b) shows width orthogonal to the slope. Survey results in section 4.2 indicate that observers associate line width more with orthogonal width $w_o$ (b) than horizontal width $w_h$ (a).

The perceived slope of a line depends on the aspect ratio of the corresponding plot: changing the height to width ratio of a display will change our perception of the corresponding line widths, if they are not adjusted for the slope [7]. This finding is not new, but the strength of its effect on our perception is surprising, as can be seen in the example of figure 6. Again, survival and class membership on the Titanic is shown; the same parallel sets plot is shown twice in this figure, but with very different aspect ratios: in the plot on the left the number of surviving 3rd class passengers seems to be about twice as big as the number of survivors among crew members, whereas in the plot on …

What do you see?

• 11 out of 16 participants (69%) picked 1st class

• This is unfortunately not right ...

Page 15: Ethics in data design - CompanyName.com · 2014. 10. 9. · Social Traffic Sources for B2B and B2C Companies Facebook LinkedIn Twitter Source: Eloqua Benchmark Data, 2010 Full year

What do you see?• 11 out of 16 participants (69%) picked 1st class

Yes No

1st 2nd 3rd Crew

Yes No

1st 2nd 3rd Crew

Survived

Class

0 500 1000 1500 2000count

Fig. 2. Parallel sets plot showing the relationship between survival ofthe sinking of the HMS Titanic and class membership. Class member-ship and survival are clearly related, but which class had the largestnumber of survivors?

Crew 1st 2nd 3rdSurvivors 212 203 118 178

Non-Survivors 673 122 167 528

Table 1. Survival status and class membership of all persons on boardthe HMS Titanic. Most survivors were among crew members, followedby first, third, and, lastly, second class passengers.

line-width illusion is a contextual illusion that leads to perceptual dis-tortion in evaluating parallel sets plots. In this paper, we first describeand then quantify this illusion. We also propose and test common angleplots as an alternative method for visualizing multivariate categoricaldata that helps the audience to avoid the distortional effects of the linewidth illusion.

2 LINE WIDTH ILLUSIONS

The phenomenon of the line width illusion is known and widely dis-cussed in statistical graphics literature [7, 34, 35, 28]. It is due to ourtendency to assess distance between curves as the minimal (orthogo-nal) distance rather than the vertical distance – see sketch 5 for a visualrepresentation of both.

On of the earliest examples of the line width illusion is shown infigure 3. This chart displays the balance of trade between England andthe East Indies as demonstrated by William Playfair in his Commercialand Political Atlas, 1786 [25, 26]. One purpose of this chart is tohighlight the difference between imports and exports in a particularyear and the pattern of these differences over time. The difference inexports and imports is encoded as the vertical difference between thelines. When observers are asked to sketch out the difference betweenexports and imports (Cleveland and McGill [7]), they very often miss

Fig. 3. Playfair’s chart from the Commercial and Political Atlas (1786)showing the balance of trade between England and the East Indies. Inwhich years was the difference between imports and exports the high-est?

Fig. 4. Difference between exports and imports from England to andfrom the East Indies in the 18th century – the steep rise in the differencearound 1760 comes as a surprise to many viewers of the raw data infigure 3.

the steep rise in the difference between the lines in the years betweenabout 1755 and 1765. Figure 4 shows the actual difference betweenimports and exports.

In the perception literature, this phenomenon is known as part ofa group of geometrical optical misperceptions of a context-sensitivenature classified as Muller-Lyer illusions [10, 13]. Interestingly, thereseems to be a general agreement that this illusion exists, but a quan-tification of it is curiously absent from literature.

The type of chart as shown in figure 3 proposed by Playfair is aquite common occurrence, particularly in election years – where thesekind of charts are used to enable comparisons of support for differentcandidates. The recommendation from literature is to avoid charts inwhich the audience is asked to do visual subtractions, and show thesedifferences directly [7, 35, 34].

2.1 Strength of the line width illusionWhen visually evaluating lines of thickness greater than one, the linewidth illusion applies. As above, there is a strong preference of evalu-ating the width of lines orthogonal to their slopes as opposed to hori-zontally (see figure 5), which would lead us to a correct evaluation ofparallel sets-style displays.

Orthogonal wo and horizontal wh line widths are related – the or-thogonal line width depends on the angle (or, equivalently, the slope)of the line:

wo = wh sinq , (1)

where q is the angle of the line with respect to the horizontal.

a

b

� ��!

a

b�

� ��!

Fig. 5. Sketch of line width assessments: (a) is showing horizontalwidth, (b) shows width orthogonal to the slope. Survey results in section4.2 indicate that observers associate line width more with orthogonalwidth wo (b) than horizontal width wh (a).

The perceived slope of a line depends on the aspect ratio of thecorresponding plot – changing the height to width ratio of a displaywill change our perception of the corresponding line widths, if theyare not adjusted for the slope [7]. This finding is not new, but itsstrength on our perception is surprising, as can be seen in the exampleof figure 6. Again, survival and class membership on the Titanic isshown; the same parallel sets plot is shown twice in this figure, butwith very different aspect ratios: in the plot on the left the numberof surviving 3rd class passengers seems to be about twice as big asthe number of survivors among crew members, whereas in the plot on

• This is unfortunately not right ...

Page 16: Ethics in data design - CompanyName.com · 2014. 10. 9. · Social Traffic Sources for B2B and B2C Companies Facebook LinkedIn Twitter Source: Eloqua Benchmark Data, 2010 Full year

What do you see?• 11 out of 16 participants (69%) picked 1st class

Yes No

1st 2nd 3rd Crew

Yes No

1st 2nd 3rd Crew

Survived

Class

0 500 1000 1500 2000count

Fig. 2. Parallel sets plot showing the relationship between survival ofthe sinking of the HMS Titanic and class membership. Class member-ship and survival are clearly related, but which class had the largestnumber of survivors?

Crew 1st 2nd 3rdSurvivors 212 203 118 178

Non-Survivors 673 122 167 528

Table 1. Survival status and class membership of all persons on boardthe HMS Titanic. Most survivors were among crew members, followedby first, third, and, lastly, second class passengers.

line-width illusion is a contextual illusion that leads to perceptual dis-tortion in evaluating parallel sets plots. In this paper, we first describeand then quantify this illusion. We also propose and test common angleplots as an alternative method for visualizing multivariate categoricaldata that helps the audience to avoid the distortional effects of the linewidth illusion.

2 LINE WIDTH ILLUSIONS

The phenomenon of the line width illusion is known and widely dis-cussed in statistical graphics literature [7, 34, 35, 28]. It is due to ourtendency to assess distance between curves as the minimal (orthogo-nal) distance rather than the vertical distance – see sketch 5 for a visualrepresentation of both.

On of the earliest examples of the line width illusion is shown infigure 3. This chart displays the balance of trade between England andthe East Indies as demonstrated by William Playfair in his Commercialand Political Atlas, 1786 [25, 26]. One purpose of this chart is tohighlight the difference between imports and exports in a particularyear and the pattern of these differences over time. The difference inexports and imports is encoded as the vertical difference between thelines. When observers are asked to sketch out the difference betweenexports and imports (Cleveland and McGill [7]), they very often miss

Fig. 3. Playfair’s chart from the Commercial and Political Atlas (1786)showing the balance of trade between England and the East Indies. Inwhich years was the difference between imports and exports the high-est?

Fig. 4. Difference between exports and imports from England to andfrom the East Indies in the 18th century – the steep rise in the differencearound 1760 comes as a surprise to many viewers of the raw data infigure 3.

the steep rise in the difference between the lines in the years betweenabout 1755 and 1765. Figure 4 shows the actual difference betweenimports and exports.

In the perception literature, this phenomenon is known as part ofa group of geometrical optical misperceptions of a context-sensitivenature classified as Muller-Lyer illusions [10, 13]. Interestingly, thereseems to be a general agreement that this illusion exists, but a quan-tification of it is curiously absent from literature.

The type of chart as shown in figure 3 proposed by Playfair is aquite common occurrence, particularly in election years – where thesekind of charts are used to enable comparisons of support for differentcandidates. The recommendation from literature is to avoid charts inwhich the audience is asked to do visual subtractions, and show thesedifferences directly [7, 35, 34].

2.1 Strength of the line width illusionWhen visually evaluating lines of thickness greater than one, the linewidth illusion applies. As above, there is a strong preference of evalu-ating the width of lines orthogonal to their slopes as opposed to hori-zontally (see figure 5), which would lead us to a correct evaluation ofparallel sets-style displays.

Orthogonal wo and horizontal wh line widths are related – the or-thogonal line width depends on the angle (or, equivalently, the slope)of the line:

wo = wh sinq , (1)

where q is the angle of the line with respect to the horizontal.

Fig. 5. Sketch of line width assessments: (a) shows horizontal width, (b) shows width orthogonal to the slope. Survey results in section 4.2 indicate that observers associate line width more with orthogonal width $w_o$ (b) than with horizontal width $w_h$ (a).

The perceived slope of a line depends on the aspect ratio of the corresponding plot: changing the height-to-width ratio of a display will change our perception of the corresponding line widths if they are not adjusted for the slope [7]. This finding is not new, but the strength of its effect on our perception is surprising, as can be seen in the example of figure 6. Again, survival and class membership on the Titanic are shown; the same parallel sets plot is shown twice in this figure, but with very different aspect ratios: in the plot on the left the number of surviving 3rd class passengers seems to be about twice as big as the number of survivors among crew members, whereas in the plot on

212 Crew members survived as opposed to 203 first class passengers … it’s a small difference … but it doesn’t look small

• This is unfortunately not right ...

Page 17

… and here are the numbers ...

• One of the 16 participants got the order right

[Figure: parallel sets plot of the Titanic data with axes Survived (Yes, No) and Class (1st, 2nd, 3rd, Crew); count scale from 0 to 2000.]

Fig. 2. Parallel sets plot showing the relationship between survival of the sinking of the RMS Titanic and class membership. Class membership and survival are clearly related, but which class had the largest number of survivors?

                Crew   1st   2nd   3rd
Survivors        212   203   118   178
Non-Survivors    673   122   167   528

Table 1. Survival status and class membership of all persons on board the RMS Titanic. Most survivors were among crew members, followed by first, third, and, lastly, second class passengers.
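The ordering stated in the caption can be checked directly from the counts. The following is a minimal sketch that copies the numbers from Table 1; the variable names are illustrative.

```python
# Counts copied from Table 1.
survivors = {"Crew": 212, "1st": 203, "2nd": 118, "3rd": 178}
non_survivors = {"Crew": 673, "1st": 122, "2nd": 167, "3rd": 528}

# Order the classes by absolute number of survivors, largest first.
ranking = sorted(survivors, key=survivors.get, reverse=True)

# Proportion of each class that survived (a different question from the count).
survival_rate = {
    c: survivors[c] / (survivors[c] + non_survivors[c]) for c in survivors
}

print(ranking)
for c in ranking:
    print(f"{c}: {survivors[c]} survivors, {survival_rate[c]:.1%} of the class")
```

Counts and proportions give different orderings: the crew had the most survivors in absolute terms but the lowest survival rate.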

line-width illusion is a contextual illusion that leads to perceptual distortion in evaluating parallel sets plots. In this paper, we first describe and then quantify this illusion. We also propose and test common angle plots as an alternative method for visualizing multivariate categorical data that helps the audience to avoid the distortional effects of the line width illusion.

2 LINE WIDTH ILLUSIONS

The phenomenon of the line width illusion is known and widely discussed in statistical graphics literature [7, 34, 35, 28]. It is due to our tendency to assess distance between curves as the minimal (orthogonal) distance rather than the vertical distance; see the sketch in figure 5 for a visual representation of both.

Page 18

The line width problem

• The area of the line segments encodes the numbers, the horizontal width also encodes it …

• … but participants (we?) seem to pick up on the orthogonal line width

• 10 of the 16 put Crew behind 3rd class

(212 - 178)/212 = 16.0%

Page 19

The line width problem

• The area of the line segments encodes the numbers, the horizontal width also encodes it …

• … but participants (we?) seem to pick up on the orthogonal line width

• 10 of the 16 put Crew behind 3rd class

(212 - 178)/212 = 16.0%

Page 20

The line width problem

• The area of the line segments encodes the numbers, the horizontal width also encodes it …

• … but participants (we?) seem to pick up on the orthogonal line width

• 10 of the 16 put Crew behind 3rd class

(212 - 178)/212 = 16.0%

• 6 of the 16 put Crew last

(212 - 118)/212 = 44.3%
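The two percentages express how far the misplaced class falls short of the crew count. A minimal sketch of that arithmetic, using the survivor counts from Table 1 (212 crew survivors); the helper name is an assumption made for illustration.

```python
# Survivor counts from Table 1.
survivors = {"Crew": 212, "1st": 203, "2nd": 118, "3rd": 178}


def relative_gap(larger: str, smaller: str) -> float:
    """Share by which the `smaller` group falls short of the `larger` group."""
    return (survivors[larger] - survivors[smaller]) / survivors[larger]


# Participants who put Crew behind 3rd class overlooked this gap:
print(f"Crew vs 3rd: {relative_gap('Crew', '3rd'):.1%}")
# Participants who put Crew last overlooked this gap:
print(f"Crew vs 2nd: {relative_gap('Crew', '2nd'):.1%}")
```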

Page 21

alternative approach: Common Angles

• Idea: break line into three connected segments under the same angle

Fig. 8. Lines in hammock plot of Titanic data for survival variable, level yes. Comparing horizontal widths suggests that a greater number of survivors were from third class instead of first, which is inconsistent with the underlying data.

variables, connecting bands between categories are drawn as a combination of a vertical segment, a segment under a pre-specified angle $\theta$, followed by another vertical segment, as sketched out in figure 9.

[Figure: four ribbons, each connecting a point A to a point B; angle labels 15°, 45°, 55° and 65°.]

Fig. 9. Sketch of ribbons under three different angles (from left to right). All ribbons form a connection between points A and B.

[Figure: common angle plot with axes Class (1st, 2nd, 3rd, Crew), Survived (Yes, No), Sex (Female, Male) and Class again; count scale from 0 to 2000.]

Fig. 10. Common angle plot of the Titanic data.

The pre-specified angle $\theta$ (between the line and the horizontal band) is given as at most the angle of the longest connecting line between two categories of neighboring variables. This makes the widths of ribbons comparable without being affected by the distortion, as all ribbons share at least one segment under the same angle.
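The construction can be sketched geometrically. The code below is an illustration under stated assumptions, not the ggparallel implementation: axes are taken to be horizontal bands a distance `gap` apart, each ribbon is described only by its horizontal displacement, and the angled segment is centered between the two vertical stubs (the centering is an assumption). All function names are hypothetical.

```python
import math


def common_angle(gap: float, displacements: list[float]) -> float:
    """Angle (radians, from the horizontal) of the longest connecting line.

    Any angle up to this value lets every ribbon contain one angled segment.
    """
    longest = max(abs(dx) for dx in displacements)
    return math.atan2(gap, longest)


def ribbon_path(x_start: float, x_end: float, gap: float, theta: float):
    """Centerline of one ribbon: vertical stub, angled segment, vertical stub."""
    dx = x_end - x_start
    rise = abs(dx) * math.tan(theta)  # vertical extent of the angled segment
    if rise > gap + 1e-9:
        raise ValueError("theta is too steep for this ribbon")
    stub = max(0.0, (gap - rise) / 2)  # assumed: angled part is centered
    return [(x_start, 0.0), (x_start, stub), (x_end, stub + rise), (x_end, gap)]


theta = common_angle(1.0, [0.5, -2.0, 1.0])
for dx in (0.5, -2.0, 1.0):
    print(ribbon_path(0.0, dx, 1.0, theta))
```

Because every middle segment has the same slope, a fixed horizontal width corresponds to the same orthogonal width on each ribbon, which is what makes the ribbons comparable.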

Figure 10 shows a common angle plot of the same data as the hammock plot. Now, tasks such as ordering levels according to the numbers they represent are not affected by either line width illusion, and 13 out of 18 participants in the survey gave a correct assessment of the number of survivors by class (more details to follow in the next section). We also see that a few more men survived than women, but proportionally the situation is very different: a much higher percentage of women survived than men. While more first class passengers survived than not, the survival chances of second class passengers tilt towards doom. More members of the third class and the crew perished than survived.

Common angle plots, as well as the related methods of hammock plots and parallel sets, are implemented in the package ggparallel, based on the ggplot2 (Wickham [37]) plotting framework in the software R 2.15.1 [27]. The ggparallel package is freely available from CRAN (http://www.r-project.org/). The colors for the plots have been chosen using color schemes suggested by the ColorBrewer project (Harrower et al. [14]), as implemented in the R package RColorBrewer (Neuwirth [24]).

4 USABILITY TESTING

4.1 Study Design

To determine the effectiveness of the common angle plot, we conducted a user study in the form of a survey asking participants to provide responses regarding the structure in two data sets with predominantly categorical variables. The Titanic data includes class, sex, age, and survival status for each person on board the Titanic [9]. The gene data was retrieved from the UCSC Genome Browser (Kent et al. [20]) and includes chromosome location for genes involved in one of three metabolism pathways: steroid biosynthesis, caffeine metabolism and drug metabolism. For each data set, participants were asked to provide responses for three tasks that analysts routinely perform as part of exploratory data analysis:

Task I: simple comparison task, chosen to be unaffected by any illusion. Performance on this task should be comparable across designs.

Task II: simple ordering, involving three pairwise comparisons, some of which are affected by the line width illusion or its reverse.

Task III: more complex ordering task with at least six pairwise comparisons, some of which are affected by either illusion.
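Ordering three items involves three pairwise comparisons and ordering four items involves six. One way to score an ordering response pair by pair is sketched below. The scoring rule and the example response are assumptions made for illustration and are not described in the source; the true order is the one given in the Table 1 caption.

```python
from itertools import combinations


def pairwise_score(response: list[str], truth: list[str]) -> tuple[int, int]:
    """Return (correctly ordered pairs, total pairs) for a ranking response."""
    position = {item: i for i, item in enumerate(response)}
    pairs = list(combinations(truth, 2))  # each pair is listed in its true order
    correct = sum(position[a] < position[b] for a, b in pairs)
    return correct, len(pairs)


truth = ["Crew", "1st", "3rd", "2nd"]     # most to fewest survivors (Table 1)
response = ["1st", "3rd", "Crew", "2nd"]  # hypothetical answer: Crew behind 3rd class
print(pairwise_score(response, truth))    # (4, 6)
```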

The study was conducted in the form of a crossover design (see Table 2): each participant was presented with two out of the three display types, where the first display showed the Titanic data, and the second display showed the gene data set. All participants were asked to answer the same set of questions (see Appendix A) covering tasks I through III for each data set. This design allows for comparisons of display types and tasks while simultaneously adjusting for individuals’ different skill sets and learning effects.
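To make the crossover layout concrete, the sketch below enumerates the ordered pairs of display types a participant could see. It assumes the three display types are parallel sets, hammock plots and common angle plots, and the round-robin assignment is purely hypothetical; the actual allocation is the one in the study's Table 2, which is not reproduced here.

```python
from itertools import permutations

# Assumed display types (named elsewhere in the text).
DESIGNS = ("parallel sets", "hammock", "common angle")

# Ordered pairs: the first design shows the Titanic data, the second the gene data.
SEQUENCES = list(permutations(DESIGNS, 2))


def assign(participant_index: int) -> dict[str, str]:
    """Hypothetical round-robin assignment over the six sequences."""
    first, second = SEQUENCES[participant_index % len(SEQUENCES)]
    return {"Titanic": first, "gene": second}


for i in range(len(SEQUENCES)):
    print(i, assign(i))
```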

At the start of the survey, participants were given a link to a brief tu-torial regarding the different plot types. Not all of the participants fol-lowed this link. The decision to not require participants to go througha thorough training beforehand was conscious. The main goal of ourstudy was to assess performance of the plots based on intuitive eval-uation. We therefore refrained from any coaching on how to evaluateplots in the training material and restricted ourselves to an explanationof the construction.

The choice to show only two of the three possible types of dis-plays to a participant was made to encourage participation by reduc-ing the amount of time needed for its completion. On average, partic-ipants needed 18 minutes to complete the survey. We did not find anysignificant differences between the amount of time needed betweenthe first and the second block of questions (F1,95 = 0.0556 for a p-value of 0.8142), nor were there significant differences in the length oftime taken between the three designs (F2,94 = 0.1909 for a p-value of0.8265).

No personally identifiable information was collected, nor did we offer any compensation for participation in the survey.

4.2 Results

We are investigating four aspects of the experiment in this section: (i) participants’ performance on each task according to the percentage of correct responses, (ii) extent of variability due to subject-specific abilities, (iii) space of answers for the more complex ordering task

12 of 16 participants got the order right
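A count such as 12 of 16 comes from a small sample, so it may help to look at how uncertain the underlying proportion is. The following is a minimal sketch using the Wilson score interval; the choice of interval and the 95% level are assumptions made here, not part of the original analysis.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


low, high = wilson_interval(12, 16)
print(f"{12 / 16:.0%} correct, interval roughly {low:.2f} to {high:.2f}")
```

With only 16 participants the interval is wide, which is one reason to run user tests with as many readers as is practical.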

Page 22

Limitations of human perception

• Human perception is limited

• What can we do?

• Be aware & adjust charts for it

• Do user tests!

Page 23

How ‘ethical’ is it to aggregate individuals?

• human component often gets lost in numbers

Page 24

How ‘ethical’ is it to aggregate individuals?

http://www.encyclopedia-titanica.org/titanic-first-class-passengers/

Page 25

Mini case study: off-campus housing

• What will students want to know?

• How do costs of the options compare?

• What does “cost” include? (apples and oranges!)

• What are the “hidden” costs?

• How safe is this place?

• Who else lives there?

Page 26

Who benefits most from this design?

Page 27

Who benefits most from this design?

Page 28
Page 29
Page 30

Ethical issues are often very subtle, but may still have consequences!

• May be interpreted differently by some readers

• May only affect some readers

• May not be readily apparent to all readers

Page 31

Is there any misrepresentation?

Iowa State University Office of Institutional Research, “President’s Council Handout,” 24 Sept. 2010. http://www.ir.iastate.edu/PDFfiles/PCR/0910Handout.pdf

Page 32

What’s emphasized, what’s not?

Iowa State University Office of Institutional Research, “President’s Council Handout,” 20 May 2011. http://www.ir.iastate.edu/PDFfiles/PCR/0511Handout.pdf

Page 33

Addressing ethical issues in our designs

• Apply Tufte’s “Lie Factor”

• Explain why you’re emphasizing certain data

• Consider “what’s missing” in your display

• Always put yourself in your readers’ shoes when making design decisions (golden rule)

• Get feedback from your readers!
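
Tufte's "Lie Factor" from the list above is the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change. A minimal sketch follows; the function names are assumptions, and the numbers are those of Tufte's well-known fuel economy example (18 to 27.5 miles per gallon, drawn as lines 0.6 and 5.3 inches long).

```python
def relative_change(start, end):
    """Relative change from start to end, e.g. 0.5 for a 50% increase."""
    return (end - start) / start


def lie_factor(data_start, data_end, graphic_start, graphic_end):
    """Effect shown in the graphic divided by the effect in the data."""
    shown = relative_change(graphic_start, graphic_end)
    actual = relative_change(data_start, data_end)
    return shown / actual


lf = lie_factor(data_start=18.0, data_end=27.5, graphic_start=0.6, graphic_end=5.3)
print(f"Lie factor: {lf:.1f}")  # about 14.8
```

A value near 1 means the graphic represents the change faithfully; Tufte treats values above 1.05 or below 0.95 as a sign of substantial distortion.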