icml’2003 minitutorial on research, ‘riting, and reviews€¦ · august 29, 2005 1 experimental...

13
August 29, 2005 1 Experimental Experimental Methodology Methodology Rob Holte University of Alberta [email protected] ICML’2003 Minitutorial on Research, ‘Riting, and Reviews

Upload: others

Post on 11-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20051

ExperimentalExperimentalMethodologyMethodology

Rob HolteUniversity of [email protected]

ICML’2003 Minitutorial on Research, ‘Riting, and Reviews

Page 2: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20052

Experiments serve a purposeExperiments serve a purpose

They provide evidence for claims, designthem accordingly

Choose appropriate test datasets,consider using artificial data

Record measurements directly related toyour claims

Page 3: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20053

Establish a needEstablish a need

Try very simple approaches beforecomplex ones

Try off-the-shelf approaches beforeinventing new ones

Try a wide range of alternatives not justones most similar to yours

Make sure comparisons are fair

Page 4: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20054

Explore limitationsExplore limitations

Under what conditions does your systemwork poorly ? When does it work well ?

What are the sources of variance ? Eliminate as many as possible Explain the rest

Page 5: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20055

Explore anomaliesExplore anomalies Superlinear speedup IDA* on N processors is more than N times

faster than on 1 processor

“…we were surprised to obtain superlinearspeedups on average… our first reactionwas to assume that our sample size was toosmall …” - V. Rao & V. Kumar

Superlinear Speedup in Parallel State-Space SearchTechnical Report AI88-80, 1988CS Dept., U of Texas - Austin

Page 6: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20056

Look at your dataLook at your data4 x-y datasets, all with the same statistics.Are they similar ? Are they linear ?

•mean of the x values = 9.0•mean of the y values = 7.5•equation of the least-squared regression line is: y = 3 + 0.5x•sums of squared errors (about the mean) = 110.0•regression sums of squared errors = 27.5•residual sums of squared errors (about the regression line) = 13.75•correlation coefficient = 0.82•coefficient of determination = 0.67

F.J. Anscombe (1973), "Graphs in Statistical Analysis," American Statistician, 27, 17-21

Page 7: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20057

AnscombeAnscombe datasets plotted datasets plotted

Page 8: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20058

Look at your data, againLook at your data, again

Japanese credit card dataset (UCI) Cross-validation error rate is identical for

C4.5 and 1R

Is their performance the same ?

Page 9: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 20059

Closer analysis revealsCloser analysis reveals……

Error rate is the same only onthe dataset class distribution

•ROC curves•Cost curves•REC curves•Learning curves

C4.5

1R

Page 10: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 200510

Test alternative explanationsTest alternative explanations

Solution Quality (% of optimal)

Combinatorial auction problemsCHC = hill-climbing with a clever new heuristic

87 arb 89 r90N 90 r90P 83 r75P 96 sched 99 match 98 path CHC problem type

Page 11: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 200511

Is CHC better than random HC ?Is CHC better than random HC ?

Percentage of CHC solutionsbetter than random HC solutions

20 arb 6 r90N 7 r90P 63 r75P 100 sched 100 match 100 path % betterproblem type

Page 12: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 200512

Avoid Avoid ““OvertuningOvertuning””

Overtuning = using all your data forsystem development. Final system likelyoverfits.

David Lubinksy, PhD thesis Held out ~10 UCI datasets for final testing

Chris Drummond, PhD thesis Fresh data used for final testing

Page 13: ICML’2003 Minitutorial on Research, ‘Riting, and Reviews€¦ · August 29, 2005 1 Experimental Methodology Rob Holte University of Alberta holte@cs.ualberta.ca ICML’2003 Minitutorial

August 29, 200513

Too obvious to mention ? (no)Too obvious to mention ? (no)

Debug and test your code thoroughly

Keep track of parameter settings, versionsof program and dataset, etc.