
Page 1

Syntactic and Functional Variability of a Million Code Submissions in a Machine Learning MOOC

Jonathan Huang, Chris Piech, Andy Nguyen, Leonidas Guibas

Page 2

How can we efficiently grade 10,000 students?

A variety of assessments

Once you have completed the function, the next step in ex1.m will run computeCost once using θ initialized to zeros, and you will see the cost printed to the screen.

You should expect to see a cost of 32.07.

You should now submit "compute cost" for linear regression with one variable.
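For reference, a minimal sketch of the vectorized squared-error cost this passage describes (not the course's released code; variable names are illustrative):

function J = computeCost(X, y, theta)
  % Squared-error cost for linear regression:
  %   J(theta) = 1/(2*m) * sum((X*theta - y).^2)
  m = length(y);             % number of training examples
  errors = X*theta - y;      % predictions minus targets, all examples at once
  J = (errors' * errors) / (2*m);
end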

2.2.4 Gradient descent

Next, you will implement gradient descent in the file gradientDescent.m. The loop structure has been written for you, and you only need to supply the updates to θ within each iteration.

As you program, make sure you understand what you are trying to optimize and what is being updated. Keep in mind that the cost J(θ) is parameterized by the vector θ, not X and y. That is, we minimize the value of J(θ) by changing the values of the vector θ, not by changing X or y. Refer to the equations in this handout and to the video lectures if you are uncertain.

A good way to verify that gradient descent is working correctly is to look at the value of J(θ) and check that it is decreasing with each step. The starter code for gradientDescent.m calls computeCost on every iteration and prints the cost. Assuming you have implemented gradient descent and computeCost correctly, your value of J(θ) should never increase, and should converge to a steady value by the end of the algorithm.

After you are finished, ex1.m will use your final parameters to plot the linear fit. The result should look something like Figure 2:

Your final values for θ will also be used to make predictions on profits in areas of 35,000 and 70,000 people. Note the way that the following lines in ex1.m use matrix multiplication, rather than explicit summation or looping, to calculate the predictions. This is an example of code vectorization in Octave.

You should now submit gradient descent for linear regression with one variable.

predict1 = [1, 3.5] * theta;
predict2 = [1, 7] * theta;

Prove that if p is prime, then √p is irrational.

Page 3

Complex and informative feedback in MOOCs

Short response (multiple choice):
•  Easy to automate
•  Limited ability to ask expressive questions or require creativity (for now anyway)

Long response (essay questions, coding assignments, proofs):
•  Requires crowdsourcing
•  Can assign complex assignments and provide complex feedback

Page 4

Binary feedback for coding questions

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    theta = theta - alpha*1/m*(X'*(X*theta-y));
    J_history(iter) = computeCost(X, y, theta);
end

Test inputs → test outputs → Correct / Incorrect?

What would a human grader (TA) do?

Linear Regression submission (Homework 1) for Coursera's ML class

Page 5

Complex, Informative Feedback

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    hypo = X*theta;
    newMat = hypo - y;
    trans1 = (X(:,1));
    trans1 = trans1';
    newMat1 = trans1 * newMat;
    temp1 = sum(newMat1);
    temp1 = (temp1 *alpha)/m;
    A = [temp1];
    theta(1) = theta(1) - A;
    trans2 = (X(:,2))';
    newMat2 = trans2*newMat;
    temp2 = sum(newMat2);
    temp2 = (temp2 *alpha)/m;
    B = [temp2];
    theta(2) = theta(2) - B;
    J_history(iter) = computeCost(X, y, theta);
end
theta(1) = theta(1);    % Why??
theta(2) = theta(2);

Better: theta = theta - (alpha/m) * X'*(X*theta-y)

Correctness: Good | Efficiency: Good | Style: Poor | Elegance: Poor

Page 6

ML-class by the numbers

[Bar chart: number of unique users submitting to each of the 42 coding problems (ordered chronologically); optional problems are marked.]

120,000 registered users
25,839 users who submitted code
10,405 users who submitted to every homework assignment

Topics: Linear regression, Logistic Regression, SVMs, Neural Nets, PCA, Nearest Neighbor…

Taught by Andrew Ng

How will Andrew give feedback to 25,000 students (~40,000 code submissions)?

Page 7

Gradient descent for linear regression ~40,000 submissions

(the big picture)

Page 8

Unit test output classes

Each student's gradientDescent submission is run on a shared set of unit test inputs, and the submissions are grouped by the outputs they produce:

Student #1 → [.1, .3, 0]
Student #2 → [.1, .4, 0]
Student #3 → [.1, .3, 0]
Student #4 → [NaN]
Student #5 → [NaN]
Student #6 → [.1, .4, 0]

Page 9

Output based feedback

Output class #1: [.1, .3, 0]
Instructor Feedback: Great job! Keep up the good work!

Output class #2: [.1, .4, 0]
Instructor Feedback: Did you remember to preprocess the data before running the SVD?

Output class #3: [NaN]
Instructor Feedback: You might have accidentally divided by zero when you tried to normalize

How many output classes must Andrew Ng annotate?
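A minimal sketch of this grouping, assuming each submission's unit test output has already been serialized to a string (the hard-coded outputs and feedback strings below are illustrative):

% Group submissions by serialized unit test output, then attach one piece of
% instructor feedback per output class; every student in the class receives it.
outputs = {"[.1, .3, 0]", "[.1, .4, 0]", "[.1, .3, 0]", "[NaN]", "[NaN]", "[.1, .4, 0]"};
classes = containers.Map();                 % output string -> list of student indices
for s = 1:numel(outputs)
  key = outputs{s};
  if classes.isKey(key)
    classes(key) = [classes(key), s];
  else
    classes(key) = s;
  end
end
feedback = containers.Map();
feedback("[.1, .3, 0]") = "Great job! Keep up the good work!";
feedback("[NaN]") = "You might have accidentally divided by zero.";
ks = classes.keys();
for k = 1:numel(ks)
  printf("class %s -> students %s\n", ks{k}, mat2str(classes(ks{k})));
  if feedback.isKey(ks{k})
    printf("  feedback: %s\n", feedback(ks{k}));
  end
end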

Page 10

Functional Variability

[Histogram: number of problems (of 42) by the number of distinct unit test output classes over all submissions to a problem; annotated: Regularized Logistic Regression (i.e., poor Andrew…).]

Problems with over 1000 ways to be wrong!

Of course, not all output classes are born equal…

Page 11

How many students are covered?

[Plot: fraction of students covered by annotating the top-k unit test output classes, as a function of k, for three problems: Return a 5x5 identity matrix, Linear Regression (Gradient Descent), and Regularized Logistic Regression.]

If Andrew gives feedback on 11 output classes, he "covers" 75% of the users.
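The coverage curve itself is easy to compute from the per-class counts; a sketch with hypothetical class sizes:

% Fraction of students covered by annotating only the top-k output classes.
counts = [18000, 5200, 3100, 900, 450, 200];    % hypothetical students per output class
counts = sort(counts, 'descend');               % largest classes first
coverage = cumsum(counts) / sum(counts);        % coverage(k): fraction covered by top-k
for k = 1:numel(coverage)
  printf("top-%d output classes cover %.1f%% of students\n", k, 100*coverage(k));
end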

Page 12

Beyond output classes

Two "correct" implementations of logistic regression:

function [J, grad] = lrCostFunction (theta, X, y, lambda)
  m = length (y);
  J = 0;
  grad = zeros (size (theta));
  for i = 1:m
    J = J + (-y (i) * log (sigmoid (X (i, :) * theta))) - (1 - y (i)) * log (1 - sigmoid (X (i, :) * theta));
  endfor;
  J = J / m;
  for j = 2:size (theta)
    J = J + (lambda * (theta (j) ^ 2) / (2 * m));
  endfor;
  for j = 1:size (theta)
    for i = 1:m
      grad (j) = grad (j) + (sigmoid (X (i, :) * theta) - y (i)) * X (i, j);
    endfor;
    grad (j) = grad (j) / m;
  endfor;
  for j = 2:size (theta)
    grad (j) = grad (j) + (lambda * theta (j)) / m;
  endfor;
  grad = grad (:);

function [J, grad] = lrCostFunction (theta, X, y, lambda)
  m = length (y);
  J = 0;
  grad = zeros (size (theta));
  n = size (X, 2);
  h = sigmoid (X * theta);
  J = ((-y)' * log (h) - (1 - y)' * log (1 - h)) / m;
  theta1 = [0; theta(2:size (theta), :)];
  soma = sum (theta1' * theta1);
  p = (lambda * soma) / (2 * m);
  J = J + p;
  grad = (X' * (h - y) + lambda * theta1) / m;

Output based feedback ignores:
•  vectorization,
•  multiple approaches with the same result,
•  coding style, …

Page 13

From function… to syntax!

function A = warmUpExercise()
  % helloooo!
  A = [];
  A = eye(5);
endfunction

[Figure: the corresponding abstract syntax tree, with nodes FUNCTION:warmUpExercise, STATEMENT_LIST, ASSIGN, TARGET, SOURCE, IDENT:A, CONST:[], IDX_EXPR, IDENT:eye, ARG_LIST, CONST:5, NO_OP:endfunction.]

ASTs ignore:
•  Whitespace
•  Comments
•  Differences in variable names
•  …
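ASTs abstract these details away by construction. As a rough, illustrative textual stand-in for the first two points (the deck's actual comparison is AST-based), stripping comments and collapsing whitespace already makes two trivially different sources compare equal:

% Crude textual canonicalization: drop comments, collapse whitespace.
% (Covers only comments/whitespace; AST comparison also ignores variable names.)
src1 = "function A = warmUpExercise()\n% helloooo!\nA = [];\nA = eye(5);\nendfunction";
src2 = "function A = warmUpExercise()\nA = [];    A = eye(5);\nendfunction";
canon = @(s) strtrim(regexprep(regexprep(s, '%[^\n]*', ''), '\s+', ' '));
disp(strcmp(canon(src1), canon(src2)));   % prints 1: equal once comments/whitespace are ignored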

Page 14

Syntactic Variability

[Histogram: number of problems (of 42) by the number of distinct abstract syntax trees over all submissions to a problem; annotated, again, with Regularized Logistic Regression.]

and again: not all ASTs are born equal…

Page 15

How many students are covered?

[Plot: fraction of students covered by annotating the top-k abstract syntax trees, as a function of k, for Return a 5x5 identity matrix, Linear Regression (Evaluate cost function), and Linear Regression (Gradient Descent).]

If Andrew gives feedback on 20 ASTs, he "covers" ~30% of the users.

Idea: If Alice and Bob have highly similar submissions (albeit not identical), the same feedback can often apply to both students!!

Page 16

Core Subproblem: Code Correspondence

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    h = X*theta;
    theta = theta - alpha*(1/m).*(X'*(h-y));
    J_history(iter) = computeCost(X, y, theta);
end

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    temp1 = (alpha/m) * ( ( ( (theta' * X')' - y )' * X(:,1) ) );
    temp2 = (alpha/m) * ( ( ( (theta' * X')' - y )' * X(:,2) ) );
    theta(1) = theta(1) - temp1;
    theta(2) = theta(2) - temp2;
    fprintf('Theta: %f, %f\n', theta(1), theta(2))
    J_history(iter) = computeCost(X, y, theta);
end

Page 17

Tree matching

FUNCTION (warmUpExercise)
  STATEMENT_LIST
    ASSIGN
      TARGET
        IDENT (A)
      SOURCE
        CONST ([])
    ASSIGN
      TARGET
        IDENT (A)
      SOURCE
        IDX_EXPR
          IDENT (eye)
          ARG_LIST
            CONST (5)
    NO_OP (endfunction)

function A = warmUpExercise()
  A = eye(5);
endfunction

function A = warmUpExercise()
  A = [];
  A = eye(5);
endfunction

See [Shasha et al., '94] for an O(n^4) dynamic programming algorithm.

Page 18

[Figure: matched ASTs for two submitted Octave implementations of gradient descent for one-dimensional linear regression, with the shared boilerplate code marked. Purple: tokens common to both implementations; red/blue: tokens distinct to one of the implementations.]

Page 19

Code webs

•  Codewebs can act as a backbone for:
  –  Clustering
  –  Outlier identification
  –  Automatic suggestions for alternative (better) implementations
  –  Propagating complex and informative feedback/annotations

Code webs are graphs of code submissions with nodes connected to their nearest neighbor implementations.
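A minimal sketch of building such a graph from a pairwise dissimilarity matrix D (the random D and k below are placeholders; in the deck, distances come from AST matching):

% Connect each submission to its k nearest neighbors under a pairwise
% dissimilarity matrix D (D(i,j) = distance between submissions i and j).
n = 6;                           % number of submissions (toy size)
D = rand(n);  D = (D + D') / 2;  % placeholder symmetric distances
D(1:n+1:end) = Inf;              % never link a node to itself
k = 2;
A = false(n);                    % adjacency matrix of the code web
for i = 1:n
  [~, order] = sort(D(i, :));    % neighbors of i, closest first
  A(i, order(1:k)) = true;
end
A = A | A';                      % symmetrize: undirected graph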

Page 20

Labeled code webs!

Gradient descent for linear regression, ~40,000 submissions

Page 21

Interplay between functional and syntactic similarity

[Plot: log probability versus edit distance between pairs of submissions, shown separately for pairs that agree on unit tests and for pairs that disagree on unit tests.]
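The deck's edit distances come from matched ASTs [Shasha et al., '94]; as a simpler stand-in, the same dynamic-programming idea over serialized token sequences looks like this (the tokenization into strings is illustrative):

% Levenshtein edit distance between two token sequences (cell arrays of strings);
% a sequence-level stand-in for tree edit distance between ASTs.
function d = token_edit_distance(a, b)
  na = numel(a);  nb = numel(b);
  D = zeros(na+1, nb+1);
  D(:,1) = (0:na)';              % cost of deleting the first i tokens of a
  D(1,:) = 0:nb;                 % cost of inserting the first j tokens of b
  for i = 1:na
    for j = 1:nb
      subst = ~strcmp(a{i}, b{j});           % 0 if tokens match, 1 otherwise
      D(i+1,j+1) = min([D(i,j+1) + 1, ...    % delete a{i}
                        D(i+1,j) + 1, ...    % insert b{j}
                        D(i,j) + subst]);    % match / substitute
    end
  end
  d = D(na+1, nb+1);
end

% e.g. token_edit_distance({'IDENT:A','ASSIGN','CONST:[]'},
%                          {'IDENT:A','ASSIGN','CONST:5'})   % -> 1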

Page 22

Finding Prototypical Submissions

(we use affinity propagation [Frey, Dueck, 2007])
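A bare-bones sketch of the standard affinity propagation updates (not the authors' code), assuming a precomputed similarity matrix S between submissions, with the diagonal of S holding each submission's "preference" to serve as an exemplar:

% Affinity propagation [Frey & Dueck, 2007]: pick exemplar ("prototype")
% submissions by passing responsibility/availability messages over S.
function [exemplars, assignment] = affinity_propagation(S, lambda, iters)
  n = size(S, 1);
  R = zeros(n);  A = zeros(n);
  for t = 1:iters
    % Responsibilities: r(i,k) = s(i,k) - max_{k' != k} (a(i,k') + s(i,k'))
    AS = A + S;
    [first, idx] = max(AS, [], 2);
    for i = 1:n, AS(i, idx(i)) = -Inf; end
    second = max(AS, [], 2);
    Rnew = S - repmat(first, 1, n);
    for i = 1:n, Rnew(i, idx(i)) = S(i, idx(i)) - second(i); end
    R = lambda*R + (1-lambda)*Rnew;                % damped update
    % Availabilities: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
    Rp = max(R, 0);
    Rp(1:n+1:end) = R(1:n+1:end);                  % keep r(k,k) unclipped
    Anew = repmat(sum(Rp, 1), n, 1) - Rp;
    dA = diag(Anew)';
    Anew = min(Anew, 0);
    Anew(1:n+1:end) = dA;                          % a(k,k) = sum_{i'!=k} max(0, r(i',k))
    A = lambda*A + (1-lambda)*Anew;
  end
  exemplars = find(diag(R) + diag(A) > 0);         % chosen prototypes
  [~, nearest] = max(S(:, exemplars), [], 2);      % assign every submission
  assignment = exemplars(nearest);                 %   to its most similar prototype
end

On a pairwise-distance matrix one would typically set S = -D and put the median similarity on the diagonal to control how many prototypes emerge.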

Page 23

(Correct) Gradient Descent Prototypes

function [theta, J_history] = gradientDescent(X,y,theta,alpha,num_iters)

#1: most common implementation, vectorized, one-line update

m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    theta = theta - alpha*1/m*(X'*(X*theta-y));
    J_history(iter) = computeCost(X, y, theta);
end

#2: vectorized, with temporary variable

m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters,
    delta = X' * (X * theta - y);
    theta = theta - (alpha / m * delta);
    J_history(iter) = computeCost(X, y, theta);
end

#3: unvectorized, synchronous update

m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    t0 = theta(1,1) - ((alpha / m) * sum ((X * theta) - y));
    t1 = theta(2,1) - ((alpha / m) * sum (((X * theta) - y) .* X (:,2)));
    theta = [t0; t1];
    J_history(iter) = computeCost(X, y, theta);
end

Page 24

Finding Outlier Submissions

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    hypo = X*theta;
    newMat = hypo - y;
    trans1 = (X(:,1));
    trans1 = trans1';
    newMat1 = trans1 * newMat;
    temp1 = sum(newMat1);
    temp1 = (temp1 *alpha)/m;
    A = [temp1];
    theta(1) = theta(1) - A;
    trans2 = (X(:,2))';
    newMat2 = trans2*newMat;
    temp2 = sum(newMat2);
    temp2 = (temp2 *alpha)/m;
    B = [temp2];
    theta(2) = theta(2) - B;
    J_history(iter) = computeCost(X, y, theta);
end
theta(1) = theta(1);
theta(2) = theta(2);
end

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    a = 0;
    h = zeros(m,1);
    h = X*theta;
    k = (h-y);
    for i = 1:m,
        a = k(i) + a;
    end;
    a = (alpha*a)/m;
    theta(1) = theta(1) - a;
    a = 0;
    x1 = X(:,2);
    for i = 1:m,
        a = k(i)*x1(i) + a;
    end;
    a = (alpha*a)/m;
    theta(2) = theta(2) - a;
    J_history(iter) = computeCost(X, y, theta);
end
end

Page 25

Finding Outlier Submissions

function [C, sigma] = dataset3Params(X, y, Xval, yval)
C = 1; sigma = 0.3;
mean(double(predictions ~= yval)) %
d = [0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30];
i = 1; j = 1;
global gl_sigma = d(j);
gl_sigma = d(j);
function f = f(x1, x2)
    global gl_sigma;
    f = gaussianKernel(x1, x2, gl_sigma);
end
model = svmTrain(X, y, d(i), @f);
pred = svmPredict(model, Xval);
ans = mean(double(pred ~= yval));
if (i == 1 && j == 1) || ans < res
    res = ans; C = d(i); sigma = d(j);
end
i = 1; j = 2;
global gl_sigma = d(j);
gl_sigma = d(j);
function f = f(x1, x2)
    global gl_sigma;
    f = gaussianKernel(x1, x2, gl_sigma);
end
model = svmTrain(X, y, d(i), @f);
pred = svmPredict(model, Xval);
ans = mean(double(pred ~= yval));
if (i == 1 && j == 1) || ans < res
    res = ans; C = d(i); sigma = d(j);
end
% ... the same block repeats, with only j changed, for j = 3 through 8 ...

Page 26

Page 27