Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment. Annika Wolff and Zdenek Zdrahal, 10th December 2013

DESCRIPTION

Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment. Presentation from 'InFocus: Learner analytics and big data', a CDE technology symposium held at Senate House on 10 December 2013. Conducted by Annika Wolff, Knowledge Media Institute, Open University. Audio of the session and more details can be found at www.cde.london.ac.uk.

TRANSCRIPT

Page 1: In Focus Presentation: Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment

Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment

Annika Wolff and Zdenek Zdrahal, 10th December 2013

Page 2

Student retention

• Struggling students don’t always ask for help – they drop out of the module, or fail and then don’t progress further.

• When timely help is offered, this can make the difference between success and failure.

• It can be hard to know who’s in trouble and where to direct resources.

Page 3

Open University context

[Diagram labels: students, tutors]

Distance learning:
• Content through VLE
• Contact mediated through VLE – how to tell if students are struggling?

Solution: develop predictive models from student data

Page 4

Data sources and data sets

• VLE: learning content, forums, quizzes, …
• Assessment: ongoing assessments, final exam
• Demographic: age, gender, previous study, …

Page 5

Typical VLE clicks

[Chart: clicks on a scale of 0 to 3000, plotted over 1 to 47; series: Students, Tutors]

Page 6

VLE activity (prior to TMA1)

• No VLE activity: 317 students
• 1-20 clicks: 609 students
• 21-80 clicks: 943 students
• 81-150 clicks: 621 students
• 151-300 clicks: 803 students
• 301-600 clicks: 516 students
• > 600 clicks: 355 students
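The click bands on this slide can be reproduced from raw per-student click counts with a simple binning function. The following is a minimal sketch: the name `click_band` and the sample counts are hypothetical, and only the band boundaries come from the slide.

```python
from collections import Counter

# Inclusive upper bounds and labels of the click bands listed on the slide.
BANDS = [
    (0, "No VLE activity"),
    (20, "1-20 clicks"),
    (80, "21-80 clicks"),
    (150, "81-150 clicks"),
    (300, "151-300 clicks"),
    (600, "301-600 clicks"),
]


def click_band(clicks: int) -> str:
    """Return the slide's band label for a student's click count before TMA1."""
    if clicks < 0:
        raise ValueError("click count cannot be negative")
    for upper, label in BANDS:
        if clicks <= upper:
            return label
    return "> 600 clicks"


# Hypothetical click counts for a handful of students (not real OU data).
sample_clicks = [0, 5, 20, 21, 150, 151, 601]
band_counts = Counter(click_band(c) for c in sample_clicks)
```

Counting students per band in this way gives the kind of distribution shown in the list, and the same labels can later serve as a categorical input to a classifier.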

Page 7

Problem specification

• Given:
– Demographic data at the start (may include information about the student’s previous modules studied at the OU and his/her objectives)
– Assessments (TMAs) as they are available during the module
– VLE activities between TMAs
– Conditions the student must satisfy to pass the module

• Goal:
– Identify students at risk of failing the module as early as possible so that OU intervention is meaningful.

Page 8

Comments on problem specification

• OU intervention is meaningful if the cost of the intervention is lower than the expected gain from retaining the student.

• Modelling the problem:

We are here

Page 11

Comments on problem specification

• OU intervention is meaningful if the cost of intervention is lower than the expected gain from retaining the student.

• Modelling the problem:

We are here

History we know Future we can estimate

… and we can influence!

Page 12

Comments on problem specification

• OU intervention is meaningful if the cost of intervention is lower than the expected gain from retaining the student.

• Modelling the problem:

We are here

History we know Future we can estimate

How can we estimate the future? … Based on the student’s history and on the properties of upcoming parts of the module, which are known from previous presentations.
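The cost condition in the first bullet can be written as a simple expected-value check. This is a sketch under stated assumptions: the slides do not define how the expected gain is computed, so the decomposition below (probability of failing, times the probability that the intervention prevents the failure, times the value of a retained student) and all parameter names and numbers are hypothetical.

```python
def expected_gain(p_fail: float, p_rescue: float, value_retained: float) -> float:
    """Expected gain from intervening, under the assumed decomposition.

    p_fail         -- estimated probability that the student fails without help
    p_rescue       -- assumed probability that the intervention prevents the failure
    value_retained -- assumed value of retaining one student (same unit as cost)
    """
    for p in (p_fail, p_rescue):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_fail * p_rescue * value_retained


def intervention_is_meaningful(
    cost: float, p_fail: float, p_rescue: float, value_retained: float
) -> bool:
    """The slide's criterion: intervene only if the cost is below the expected gain."""
    return cost < expected_gain(p_fail, p_rescue, value_retained)


# Hypothetical figures: the same intervention pays off for a high-risk
# student but not for a low-risk one.
high_risk = intervention_is_meaningful(50, p_fail=0.9, p_rescue=0.2, value_retained=1000)
low_risk = intervention_is_meaningful(50, p_fail=0.1, p_rescue=0.2, value_retained=1000)
```

Under this reading, a better estimate of `p_fail` is what the predictive models contribute: it determines where limited intervention resources are worth spending.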

Page 13

Prediction at TMA1

– Why? TMA1 is a good predictor of success or failure

– There is still enough time to intervene

We are here

History we know Future we can affect

Page 14

Building a classifier

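[Diagram: training instances labelled Pass or Fail are used to build the classifier, which then labels new instances as PASS or FAIL. Example split: "Assessment 1 score?" with branches >40% and <40%.]

Decision Tree – first results (no demographics)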
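The single split shown in the diagram (Assessment 1 score above or below 40%) is the simplest form of decision tree. A minimal sketch of how such a split could be learned from labelled training instances and applied to new ones follows. It is a one-level decision stump in plain Python, not the authors' actual model, and the training scores are invented for illustration.

```python
def learn_threshold(scores, passed):
    """Pick the score threshold that misclassifies the fewest training instances.

    Rule being fitted: predict PASS when score >= threshold, otherwise FAIL.
    Ties are resolved in favour of the lowest candidate threshold.
    """
    best_threshold, best_errors = None, len(scores) + 1
    for t in sorted(set(scores)):
        errors = sum((s >= t) != p for s, p in zip(scores, passed))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(score, threshold):
    """Apply the learned split to a new instance."""
    return "PASS" if score >= threshold else "FAIL"


# Hypothetical training instances: Assessment 1 score and final outcome.
train_scores = [10, 25, 35, 38, 42, 55, 63, 71, 88]
train_passed = [False, False, False, False, True, True, True, True, True]

threshold = learn_threshold(train_scores, train_passed)
new_predictions = [predict(s, threshold) for s in (30, 90)]
```

A full decision tree repeats this search recursively over several attributes (for example VLE clicks as well as assessment scores), choosing the best split at each node.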

Page 15

Performance drop (VLE+TMA)

Page 16

Final outcome

Page 17

Naïve Bayes network

Network nodes: Sex, Education, N/C, VLE, TMA1

• Education:
– No formal qualif.
– Lower than A level
– A level
– HE qualif.
– Postgraduate qualif.

• VLE:
– No engagement
– 1-20 clicks
– 21-100 clicks
– 101-800 clicks

• N/C:
– New student
– Continuing student

• Sex:
– Female
– Male

Goal: Calculate the probability of failing at TMA1
• either by not submitting TMA1,
• or by submitting with score < 40.
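To illustrate the goal stated on this slide, the following is a minimal sketch of a naive Bayes classifier over the four categorical inputs (Sex, Education, N/C, VLE) that estimates the probability of failing at TMA1. It assumes add-one (Laplace) smoothing, which the slides do not specify, and the training records are invented for illustration; none of the numbers are OU data.

```python
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[(feature, label)][value] = number of training rows
        self.value_counts = defaultdict(Counter)
        self.values = defaultdict(set)  # distinct values seen per feature
        for row, label in zip(rows, labels):
            for feature, value in row.items():
                self.value_counts[(feature, label)][value] += 1
                self.values[feature].add(value)
        return self

    def predict_proba(self, row):
        """Return P(label | row) for every label, normalised to sum to 1."""
        scores = {}
        for label, n_label in self.class_counts.items():
            score = n_label / self.total  # prior P(label)
            for feature, value in row.items():
                count = self.value_counts[(feature, label)][value]
                # Smoothed estimate of P(value | label)
                score *= (count + 1) / (n_label + len(self.values[feature]))
            scores[label] = score
        norm = sum(scores.values())
        return {label: s / norm for label, s in scores.items()}


# Hypothetical training records (feature values follow the slide's categories).
rows = [
    {"sex": "F", "education": "A level", "nc": "new", "vle": "0"},
    {"sex": "M", "education": "A level", "nc": "new", "vle": "1-20"},
    {"sex": "F", "education": "HE qualif.", "nc": "continuing", "vle": "101-800"},
    {"sex": "M", "education": "HE qualif.", "nc": "continuing", "vle": "21-100"},
    {"sex": "F", "education": "A level", "nc": "continuing", "vle": "101-800"},
    {"sex": "M", "education": "Lower than A level", "nc": "new", "vle": "0"},
]
labels = ["fail", "fail", "pass", "pass", "pass", "fail"]

model = NaiveBayes().fit(rows, labels)
student = {"sex": "F", "education": "A level", "nc": "new"}
p_no_clicks = model.predict_proba({**student, "vle": "0"})["fail"]
p_many_clicks = model.predict_proba({**student, "vle": "101-800"})["fail"]
```

Holding the demographic profile fixed and varying only the VLE band, as in the last two lines, mirrors the comparison made in the demo cases later in the deck.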

Page 18

Predicting final result from TMA1

Diagram labels: TMA1 (TMA1 >= 40, TMA1 < 40), TMA2, TMA7, Final result (Pass/Distinction, Fail), VLE

Prior probabilities: P(Success) = 0.807, P(Fail) = 0.193

Posterior probabilities:
P(Success|TMA1) = 0.858, P(Fail|TMA1) = 0.142
P(Success|~TMA1) = 0.093, P(Fail|~TMA1) = 0.907

Bayes minimum error classifier: if a student fails TMA1, he/she is likely to fail in the final result.
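The Bayes minimum error rule simply picks the outcome with the larger posterior probability. A minimal sketch using the probabilities on this slide follows. The names `POSTERIOR` and `classify` are illustrative, and the code assumes that "TMA1" in the posteriors means a TMA1 score of at least 40 and "~TMA1" means failing TMA1, as the diagram labels suggest. As a consistency check, the law of total probability recovers the share of students who passed TMA1 that these figures imply.

```python
# Posterior probabilities from the slide, keyed by whether the student passed TMA1.
POSTERIOR = {
    True: {"Success": 0.858, "Fail": 0.142},   # TMA1 >= 40
    False: {"Success": 0.093, "Fail": 0.907},  # TMA1 < 40, written ~TMA1 on the slide
}
PRIOR_SUCCESS = 0.807


def classify(tma1_passed: bool) -> str:
    """Bayes minimum error rule: choose the outcome with the highest posterior."""
    posterior = POSTERIOR[tma1_passed]
    return max(posterior, key=posterior.get)


# Law of total probability:
#   P(Success) = P(Success|TMA1) * p + P(Success|~TMA1) * (1 - p)
# Solving for p, the share of students who passed TMA1:
p_tma1_pass = (PRIOR_SUCCESS - POSTERIOR[False]["Success"]) / (
    POSTERIOR[True]["Success"] - POSTERIOR[False]["Success"]
)
```

With the slide's figures, `p_tma1_pass` works out to roughly 0.93. That value is derived here from the priors and posteriors; it is not stated on the slide.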

Page 19

P(Fail|TMA1-score), P(Pass/Dist|TMA1-score)

[Chart: probabilities from 0 to 1 for TMA1 score bands 0-39, 40-59, 60-69, 70-79 and 80-100; series: Fail, Pass/Dist; axis label: TMA1]

Page 20

Predicting final result from TMA1

Diagram labels: Sex, Education, N/C, VLE, TMA1 (TMA1 >= 40, TMA1 < 40), TMA2, TMA7, Final result (Pass/Distinction, Fail)

Prior probabilities: P(Success) = 0.807, P(Fail) = 0.193

Posterior probabilities:
P(Success|TMA1) = 0.858, P(Fail|TMA1) = 0.142
P(Success|~TMA1) = 0.093, P(Fail|~TMA1) = 0.907

Bayes minimum error classifier: if a student fails TMA1, he/she is likely to fail in the final result.

Page 21

Demo Case 1

• Demographic data
– Student fits a certain demographic profile of gender, educational background etc.

Without VLE (network nodes: Sex, Education, N/C, TMA1):
Probability of failing at TMA1 = 18.5%

With VLE (network nodes: Sex, Education, N/C, VLE, TMA1):

Clicks     Probability   Nr of students
0          64%           4
1-20       44%           3
21-100     26%           5
101-800    6.3%          14

Page 22

Demo Case 2

• Demographic data
– Different demographic profile to previous slide

Without VLE (network nodes: Sex, Education, N/C, TMA1):
Probability of failing at TMA1 = 7.7%

With VLE (network nodes: Sex, Education, N/C, VLE, TMA1):

Clicks     Probability   Nr of students
0          39%           35
1-20       22%           74
21-100     11.2%         178
101-800    2.4%          461

Page 23

TMA1? … it might be too late!

Can we predict TMA1 from VLE activities 1 week before the TMA1 deadline? How about 2, 3, … weeks?

We are here

History Future we can affect

Page 24

Dashboard and Chart

Dashboard annotations:
• predicted to fail
• has not engaged with VLE
• average score < 40
• at least one TMA below 40
• has not submitted TMA5

However: has not engaged with VLE, average score = 81.71 !!!

Page 25

Dashboard – new design

Page 26

Conclusions

• In a distance learning context, VLE data is a valuable source of information for prediction.

• Prediction improves as a module progresses, but by then it is too late to intervene!

• We need to optimise methods for early prediction.