Survey of Automated Assessment Approaches for Programming Assignments

Spring, 2012 - Reinventing eTextbook - Virginia Tech – Computer Science. Survey of Automated Assessment Approaches for Programming Assignments. Gayathri Subramanian


Post on 25-Feb-2016


Page 1: Survey of Automated Assessment Approaches for Programming Assignments

Spring, 2012 - Reinventing etextbook - Virginia Tech – Computer Science


Survey of Automated Assessment Approaches for Programming Assignments

Gayathri Subramanian

Page 2: Survey of Automated Assessment Approaches for Programming Assignments


Reference Papers

1. 'A Survey of Automated Assessment Approaches for Programming Assignments' by Kirsti M. Ala-Mutka (1995 – 2005).

2. 'Review of Recent Systems for Automatic Assessment of Programming Assignments' (2005 – 2010).

Page 3: Survey of Automated Assessment Approaches for Programming Assignments


Outline

Introduction and Motivation
Static and Dynamic Assessment Techniques
Features of a Good Automated Assessment System
Automatic vs. Semi-Automatic Approaches
Summative vs. Formative Approaches
Conclusion

Page 4: Survey of Automated Assessment Approaches for Programming Assignments


Motivation

Programming courses are an integral part of Computer Science and Software Engineering curricula.
Proficiency in a programming language is obtained with practice.
Programming courses are large and impose a heavy workload on teachers.
Even small programs typically have a large number of possible execution paths.
Research suggests that it is not possible to grade students' programs consistently and thoroughly without automated assistance.

Programs can be automatically assessed!

Page 5: Survey of Automated Assessment Approaches for Programming Assignments


Motivation (continued)

New automated systems are being created every year.
Many systems share common features.
Systems exist that satisfy most assessment needs.
Far fewer systems are widely adopted than there are papers about them.
A literature survey helps teachers identify the tool they are looking for.

Page 6: Survey of Automated Assessment Approaches for Programming Assignments


What are the features of a program that can be automatically assessed, and which tools support them?

Page 7: Survey of Automated Assessment Approaches for Programming Assignments


Static & Dynamic Assessment of the Code

Programs follow a defined syntax and semantics, which makes automatic assessment feasible.
Automatic assessment extracts some kind of measurement value (justified by the teaching goals) from a program and compares it against the given requirements (or teaching goals).
Some features require execution of the program; others are evaluated statically.
Functionality, efficiency, and testing skills are assessed dynamically.
Coding style, programming errors, software metrics, and design can be assessed statically.

Page 8: Survey of Automated Assessment Approaches for Programming Assignments


What Features of a Program Can Be Automatically Assessed?

Dynamic Assessment:
Functionality
Testing Skills
Efficiency
Language-Specific Features

Page 9: Survey of Automated Assessment Approaches for Programming Assignments


Functionality (Dynamic Assessment)

Functionality is assessed by running the program against test cases [CourseMarker, HOGG, BOSS, Online Judge].
Success depends on the test case design and on the model solution.
Coverage analysis (function, statement, and decision measures) gauges the efficiency of the test cases.
Correlated test cases: a test case is defined with a planned relationship to the program state created during the previous test input [Quiver].
There should be a certain degree of freedom in how the model solution's output is represented [Assyst uses pattern matching; CourseMarker uses regular expressions].
CourseMarker also checks the return status of the program.
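As a rough illustration of the test-case approach described on this slide, the sketch below runs a submitted script once per test case, checks the return status, and matches standard output against a regular expression so that small formatting differences are tolerated. The function name `grade`, the test case format, and the timeout are assumptions made for this example; none of the cited tools is claimed to work this way.

```python
import re
import subprocess
import sys

def grade(program_path, test_cases, timeout=5):
    """Run the program once per test case and match stdout against a regex.

    Each test case is a (stdin_text, expected_pattern) pair (assumed format).
    Returns (number of passed cases, total number of cases).
    """
    passed = 0
    for stdin_text, expected_pattern in test_cases:
        try:
            result = subprocess.run(
                [sys.executable, program_path],
                input=stdin_text, capture_output=True, text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # a hanging program fails this test case
        output_ok = re.fullmatch(expected_pattern, result.stdout.strip()) is not None
        # Both the return status and the output pattern must be acceptable.
        if result.returncode == 0 and output_ok:
            passed += 1
    return passed, len(test_cases)
```

A pattern such as `(?i)sum\s*[:=]?\s*5` accepts "Sum = 5", "sum: 5", and "SUM 5" alike, which is one way to leave students some freedom in output formatting.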

Page 10: Survey of Automated Assessment Approaches for Programming Assignments


Functionality (continued)

The functionality of a single function or method can be assessed [Quiver for Java, Scheme-Robo for Scheme].
Assessing the functionality of a program with a GUI requires a means to monitor actions and responses communicated through the user interface. [JEWL, a language library for Java, provides students with a GUI and lets teachers manage the events of the program and its output actions.]


Page 12: Survey of Automated Assessment Approaches for Programming Assignments


Testing Skills (Dynamic Assessment)

Testing is an essential phase in program development.
Students submit test data sets along with the programming assignment.
Assyst was the first tool that provided assessment of student test data. The assessment was based on code coverage analysis.
Chen (2004) assesses the student test suite by running it against a set of buggy instructor programs.
Web-CAT: when a student submits a test data set, the tool assesses how well it covers the different execution paths.
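The idea of scoring a student's tests against deliberately buggy instructor programs can be sketched as follows. This is a minimal illustration, not Chen's actual system: the exercise (absolute value), the function names, and the convention that a test suite is a callable returning True when all tests pass are all assumptions.

```python
def score_test_suite(test_suite, reference, buggy_versions):
    """Return the fraction of buggy versions that the test suite exposes.

    `test_suite` is a callable that takes an implementation and returns
    True when all of the student's tests pass against it (assumed shape).
    """
    if not test_suite(reference):
        raise ValueError("tests reject the correct reference implementation")
    caught = sum(1 for buggy in buggy_versions if not test_suite(buggy))
    return caught / len(buggy_versions)

# Hypothetical instructor material for an "absolute value" exercise.
def reference_abs(x):
    return x if x >= 0 else -x

def buggy_identity(x):      # forgets to negate negative inputs
    return x

def buggy_zero(x):          # wrong only when x == 0
    return x if x > 0 else (-x if x < 0 else 1)

def student_tests(impl):    # stands in for a student's submitted tests
    return impl(3) == 3 and impl(-3) == 3
```

Here the student's tests catch `buggy_identity` but miss `buggy_zero` because zero is never tested, so the suite would score 0.5.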


Page 14: Survey of Automated Assessment Approaches for Programming Assignments


Efficiency (Dynamic Assessment)

A simple efficiency measurement is the running time of the program, measured either as wall-clock time or as CPU time used [Assyst, Online Judge].
Efficiency measurements can be distorted by different implementations of input/output actions.
A simple solution is to offer a common input/output module for use in assignments [Hansen and Ruuska].
Efficiency can also be assessed by studying the execution behavior of different structures inside the program.
This is done by counting how many times certain blocks and statements are executed and comparing the results with the values obtained from the model solution [Assyst, CourseMarker].
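Counting executed statements and comparing against a model solution could look roughly like the sketch below. It uses Python's `sys.settrace` hook to count line events inside one function; the helper name `count_line_events` and the two example functions are assumptions for illustration, and the cited tools instrument programs in their own ways.

```python
import sys

def count_line_events(func, *args):
    """Run func(*args) and count the line events executed inside func itself."""
    counter = {"lines": 0}
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None           # do not trace any other function
        if event == "line":
            counter["lines"] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)    # always restore the earlier trace hook
    return result, counter["lines"]

def model_sum(n):          # closed form, stands in for the model solution
    return n * (n + 1) // 2

def student_sum(n):        # loop-based submission
    total = 0
    for i in range(1, n + 1):
        total += i
    return total
```

Both functions return the same value, but the loop-based version executes far more line events as `n` grows; a grader could flag submissions whose count exceeds the model's by some chosen factor.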


Page 16: Survey of Automated Assessment Approaches for Programming Assignments


Language Specific Features (Dynamic Assessment)

Language-specific implementation issues can be difficult to learn and assess.
Students often misuse memory management, for example by not deallocating all the reserved memory blocks.
[Tutnew] A C++ library that overrides the normal memory management methods and can thus provide runtime assessment of a program's memory usage.
The test cases affect the coverage of this assessment, since they define the execution paths for the program.

Page 17: Survey of Automated Assessment Approaches for Programming Assignments


What Features of a Program Can Be Automatically Assessed?

Static Assessment:
Coding Style
Programming Errors
Software Metrics
Design
Language-Specific Features

Page 18: Survey of Automated Assessment Approaches for Programming Assignments


Coding Style (Static Assessment)

Programming style, or coding style, and its connections to readability, maintainability, etc.
Typographical: e.g. indentation, placement of parentheses, maximum line length.
Syntax: every switch statement should have a default branch, and each case branch should end with a break statement.
Semantic: class names begin with a capital letter, and each declared variable should be used in the program.
Logical: issues related to the logical structure of the program, e.g. loops should not be nested too deeply, methods should not have a huge number of parameters, and global variables should not be used as method parameters.

Making use of compilers and their warning capabilities:
The GNU Compiler Collection (GCC) can provide feedback on unused variables, implicit type conversions, and language features that do not follow the language standards, among other things.
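Two of the style rules above, one typographical (maximum line length) and one logical (loop nesting depth), can be checked with a few lines of code. The sketch below analyzes Python source rather than the C or Java programs the slide has in mind, and the limits `MAX_LINE_LENGTH` and `MAX_LOOP_DEPTH` are assumed values chosen for the example.

```python
import ast

MAX_LINE_LENGTH = 79   # assumed typographical limit
MAX_LOOP_DEPTH = 2     # assumed logical limit on nested loops

def loop_depth(node, depth=0):
    """Return the deepest loop nesting found at or below node."""
    if isinstance(node, (ast.For, ast.While)):
        depth += 1
    deepest = depth
    for child in ast.iter_child_nodes(node):
        deepest = max(deepest, loop_depth(child, depth))
    return deepest

def check_style(source):
    """Return a list of style messages for the given Python source."""
    issues = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            issues.append(f"line {number}: longer than {MAX_LINE_LENGTH} characters")
    depth = loop_depth(ast.parse(source))
    if depth > MAX_LOOP_DEPTH:
        issues.append(f"loops nested {depth} deep (limit {MAX_LOOP_DEPTH})")
    return issues
```

Real style checkers apply many more rules; the point here is only that both line-based and tree-based rules are straightforward to automate.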

Page 19: Survey of Automated Assessment Approaches for Programming Assignments


Coding Style (continued)

Checkstyle is open source software for checking Java programs and can be integrated into several programming environments. It checks:
Comments for classes, attributes, and methods
Naming conventions of variables and methods
Number of parameters passed to a function
Duplicated code sections
Good practices of class construction
Complexity measurements of expressions

Style++ is another tool, developed for assessing quality factors of C++ programs.

An automatic system, PASS, has been implemented to assess these issues in programs written in Ada, C, and Java.

http://en.wikipedia.org/wiki/Checkstyle


Page 21: Survey of Automated Assessment Approaches for Programming Assignments


Programming Errors (Static Assessment)

Some errors and suspicious code fragments can be recognized statically.
Static checks can recognize several typical error types made by students, e.g. mistakes in updating a loop control variable or inconsistencies between a parameter's type and its usage.
Xie and Engler (2002) used code redundancies to detect errors. By implementing a tool to detect idempotent operations, redundant assignments, dead code, and redundant conditionals, they were able to find several errors in the well-known Linux source code.
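To make the redundancy idea concrete, here is a small sketch that flags two of the patterns named above in Python source: a variable assigned to itself and statements that follow a `return` in the same block. It is an illustration written for this transcript, not Xie and Engler's tool, which analyzed C code with far more sophisticated checks; the function name `find_redundancies` is an assumption.

```python
import ast

def find_redundancies(source):
    """Return sorted (line, message) pairs for two simple redundancy patterns."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Redundant assignment: a name assigned to itself, e.g. "x = x".
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.Name)
                and node.targets[0].id == node.value.id):
            findings.append((node.lineno, "self-assignment"))
        # Dead code: a statement that follows a return in the same block.
        # Only "body" blocks are inspected; else-branches are ignored here.
        body = getattr(node, "body", None)
        if isinstance(body, list):
            for stmt, following in zip(body, body[1:]):
                if isinstance(stmt, ast.Return):
                    findings.append((following.lineno, "unreachable after return"))
                    break
    return sorted(findings)
```

Findings like these are only hints: a self-assignment is harmless by itself, but it often marks a spot where the student meant to write something else.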


Page 23: Survey of Automated Assessment Approaches for Programming Assignments


Software Metrics (Static Assessment)

Software metrics are general measures that characterize a program.
Hung, Kwok and Chan (1993) studied different metrics on programming assignments and concluded that the number of code lines was a good measure of students' programming skill.
Other metrics count different attributes, such as the number of operators and operands in a program, or its control structures.
Metrics have been regarded as clear indicators of student performance and also as possible indicators of needs for instructional development.
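A minimal sketch of attribute counting, using Python's standard `tokenize` module on Python source. The classification is a deliberate simplification made for this example: every operator or punctuation token counts as an operator, and every name (keywords included), number, or string counts as an operand.

```python
import io
import tokenize

def basic_metrics(source):
    """Count code lines, operator tokens, and operand tokens in Python source."""
    operators = operands = 0
    code_lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            operators += 1
        elif tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.STRING):
            operands += 1
        else:
            continue  # comments, newlines, and indentation are not counted
        code_lines.add(tok.start[0])  # line on which a counted token starts
    return {"code_lines": len(code_lines),
            "operators": operators,
            "operands": operands}
```

For the two-statement program `total = a + b  # sum` followed by a blank line and `print(total)`, this reports 2 code lines, 4 operators, and 5 operands; the comment and the blank line are ignored.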


Page 25: Survey of Automated Assessment Approaches for Programming Assignments


Design (Static Assessment)

Teachers often need to assess whether submitted programs conform to given interface or structural requirements.
Thorburn and Rowe (1997) implemented a system that automatically recognizes the functional structure of a C program. They call it the "solution plan" of the program and compare it to the solution plan of the model program, or to a set of possible plans.
Truong, Roe and Bancroft (2004) implemented a structural similarity analysis that transforms a student's program into an XML representation and compares it to the set of model solutions.
MacNish (2000) used Java reflection to analyze whether class interfaces and method signatures in students' Java programs met the given requirements.
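The reflection-based interface check can be illustrated with Python's `inspect` module, used here as an analogue of Java reflection. This is a sketch under assumptions, not MacNish's implementation: the `required` mapping format, the helper name `check_interface`, and the `Stack` class standing in for a student submission are all invented for the example.

```python
import inspect

def check_interface(cls, required):
    """Compare a class against required method names and parameter lists.

    `required` maps each method name to the expected parameter names,
    excluding `self`. Returns a list of human-readable problems.
    """
    problems = []
    for name, expected in required.items():
        method = getattr(cls, name, None)
        if not callable(method):
            problems.append(f"missing method: {name}")
            continue
        actual = list(inspect.signature(method).parameters)[1:]  # drop self
        if actual != list(expected):
            problems.append(
                f"{name}: expected parameters {list(expected)}, found {actual}")
    return problems

class Stack:  # stands in for a student submission
    def push(self, item): ...
    def pop(self, index): ...
```

Checking `Stack` against `{"push": ["item"], "pop": [], "peek": []}` reports that `pop` has an unexpected parameter and that `peek` is missing, without running any of the student's code.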


Page 27: Survey of Automated Assessment Approaches for Programming Assignments


Language Specific Features (Static Assessment)

Search for specific keywords based on the teaching goal.
In Scheme, for example, whether the program structure is purely functional can be assessed by searching for the primitives set!, set-car!, and set-cdr!
A more flexible approach has been implemented in Ceilidh (Foxley, 1999) by defining regular expressions to be searched for in the student's program code.
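The keyword search for Scheme's mutating primitives can be expressed as a regular expression, in the spirit of the regex-based approach mentioned above. The sketch below is an assumption-laden illustration: the pattern only matches the primitives in operator position, and comment stripping is naive (a `;` inside a string literal would be misread as a comment).

```python
import re

# Matches the mutating primitives named on the slide when they appear in
# operator position, e.g. "(set! x 1)".
MUTATORS = re.compile(r"\(\s*(set!|set-car!|set-cdr!)(?=[\s()])")

def find_mutations(scheme_source):
    """Return (line number, primitive) for each mutating call found."""
    hits = []
    for number, line in enumerate(scheme_source.splitlines(), start=1):
        code = line.split(";", 1)[0]  # naive: drop everything after ';'
        hits.extend((number, m.group(1)) for m in MUTATORS.finditer(code))
    return hits
```

An empty result suggests, but does not prove, that the program is purely functional with respect to these three primitives.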

Page 28: Survey of Automated Assessment Approaches for Programming Assignments


Essential Features of an Automatic Assessment Tool for a Programming Course

Page 29: Survey of Automated Assessment Approaches for Programming Assignments


Automated Administration

AA is a means for administering submissions, grading, and general information delivery.

Benefits of automated administration:
It is an efficient way to track student progress and to recognize needs for improvement in the course.
Peer reviewing becomes feasible: students comment on each other's programs.

Page 30: Survey of Automated Assessment Approaches for Programming Assignments


Plagiarism Detection

Computer programs are text files that are easy to copy.

Detection from the structural information of the program:
MOSS is based on document fingerprinting.
JPlag uses string tokenization with substring pattern matching.

Attribute counting mechanisms:
Verco and Wise (1996) compared automated tools based on attribute counting mechanisms.
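The general idea behind structure-based detection can be sketched with hashed token k-grams: normalize identifiers so that renaming does not help, hash every run of k tokens, and compare the resulting fingerprint sets. This is a simplified illustration only. It is not the algorithm used by MOSS or JPlag, and the tokenizer, the small keyword set, the value of k, and the Jaccard similarity measure are all assumptions.

```python
import hashlib
import re

def fingerprints(source, k=5):
    """Hash every run of k consecutive tokens; identifiers are normalized."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    keywords = {"def", "return", "for", "in", "if", "else", "while"}
    # Replace every non-keyword identifier with the placeholder "ID".
    normalized = ["ID" if re.match(r"[A-Za-z_]", t) and t not in keywords else t
                  for t in tokens]
    grams = (" ".join(normalized[i:i + k])
             for i in range(len(normalized) - k + 1))
    return {hashlib.sha1(g.encode()).hexdigest()[:8] for g in grams}

def similarity(a, b, k=5):
    """Jaccard similarity of the two fingerprint sets (0.0 to 1.0)."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    return len(fa & fb) / len(fa | fb) if fa | fb else 0.0
```

Two submissions that differ only in variable and function names produce identical fingerprint sets, so their similarity is 1.0, while unrelated programs share few or no fingerprints.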

Page 31: Survey of Automated Assessment Approaches for Programming Assignments


Resubmission Policies

Resubmissions are required for improving the answers.
The resubmission policy should prevent the trial-and-error strategy used by some students:
Limit the number of submissions
Limit the amount of feedback
Compulsory time penalty
Make each exercise slightly different
Programming contest approach [Mooshak]
Combination of limited and unlimited submissions based on test cases
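Two of these policies, a cap on the number of submissions and a compulsory waiting time, can be combined in a few lines. The sketch below interprets "time penalty" as a cooldown between attempts, which is an assumption; the class name, field names, and default values are likewise hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ResubmissionPolicy:
    max_attempts: int = 5          # assumed cap on submissions per exercise
    cooldown_seconds: float = 600  # assumed compulsory wait between attempts

    def may_submit(self, earlier_times, now):
        """Return (allowed, reason) given timestamps of earlier submissions."""
        if len(earlier_times) >= self.max_attempts:
            return False, "submission limit reached"
        if earlier_times and now - max(earlier_times) < self.cooldown_seconds:
            return False, "wait before resubmitting"
        return True, "ok"
```

With `ResubmissionPolicy(max_attempts=2, cooldown_seconds=60)`, a second attempt 30 seconds after the first is refused, one after 90 seconds is accepted, and any third attempt is refused regardless of timing.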

Page 32: Survey of Automated Assessment Approaches for Programming Assignments


Sandboxing

Programming assignments are graded by running the code on a server, so it is important to protect the server from malicious code and from unintended bugs and flaws.
Use existing approaches such as the Linux security model, chroot, or the Java security policy to run code securely.
Use static analysis to filter out malicious code.
Grade on the client side.
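One small building block of safe execution is running each submission in a separate process with a time limit, sketched below. This is not a sandbox by itself: the child process still has the grader's file and network access, so a real deployment would add OS-level isolation such as the mechanisms listed on the slide. The function name `run_limited` and the result format are assumptions.

```python
import subprocess
import sys

def run_limited(code, timeout=2.0):
    """Run Python code in a child interpreter with a wall-clock limit."""
    try:
        done = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return {"status": "timeout", "stdout": ""}
    status = "ok" if done.returncode == 0 else "error"
    return {"status": status, "stdout": done.stdout}
```

An infinite loop such as `while True: pass` is cut off when the timeout expires instead of tying up the grading server.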

Page 33: Survey of Automated Assessment Approaches for Programming Assignments


Open Sourcing

A survey by Pears et al. in 2007 reported that tools were the single largest group among the papers; the other categories were curricula, pedagogy, and programming languages.
Many of these systems share common features, and there exist systems that fulfill most assessment needs.
New automatic assessment systems are being created every year.

Page 34: Survey of Automated Assessment Approaches for Programming Assignments


Semi-automatic vs. Automatic Assessment

The quality of automatic feedback may not be as high as that of feedback given by an instructor.
Not all issues related to good programming can be automatically assessed.
A hybrid approach uses automation for small assignments and combines manual and automatic assessment for larger assignments.
[Advantages] It gives teachers more time to concentrate on the demanding assessment tasks and also makes it possible to double-check the results of the automatic assessment.

Page 35: Survey of Automated Assessment Approaches for Programming Assignments


Formative vs. Summative Assessment

Formative assessment: allows resubmission to help students improve their answers based on feedback. [A complete program must be submitted in the first attempt, except in Web-CAT.]
Summative assessment: [BOSS] can be used in homework assignments and online examinations.

Page 36: Survey of Automated Assessment Approaches for Programming Assignments


Conclusion

Benefits are numerous!
Immediate feedback to students
24-hour availability
Objectivity and consistency of the evaluation
More practice for students

Some features of a program can only be assessed with automatic assessment, and some cannot be assessed automatically. A hybrid approach may be useful.

Tool-specific issues:
Setting up configuration files may be time consuming.
Specifications should be unambiguous.
Effectiveness also depends on the test cases.

If similar tool approaches are used, good assignments and their assessment routines could be stored and reused.

Tools should be made widely available!

Page 37: Survey of Automated Assessment Approaches for Programming Assignments


Thank You