Plagiarism Monitoring and Detection -- Towards an Open Discussion
![Page 1: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/1.jpg)
Plagiarism Monitoring and Detection -- Towards an Open Discussion
Edward L. Jones
Computer Information Sciences
Florida A & M University
Tallahassee, Florida
![Page 2: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/2.jpg)
Outline
What is Plagiarism, and Why Address It
Plagiarism Detection & Countermeasures
A Metrics-Based Detection Approach
Extending the Approach
Conclusions & Future Work
![Page 3: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/3.jpg)
Why Tackle Plagiarism?
Plagiarism undermines educational objectives
Failure to address sends wrong message
A non-contrived ethical issue in computing
Plagiarism is hard to define
Plagiarism is costly to pursue/prosecute
An interesting problem for tinkering
![Page 4: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/4.jpg)
What is Plagiarism?
“use of another’s ideas, writings or inventions as one’s own” (Oxford American Dictionary, 1980)
Shades of Gray
– Theft of work
– Gift of work
– Collusion
– Collaboration
– Coincidence
Intent to Deceive
![Page 5: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/5.jpg)
How is it Detected?
By chance
– Anomalies
– Temporal proximity when grading
Automation methods
– Direct text comparison (Unix diff)
– Lexical pattern recognition
– Structural pattern recognition
– Numeric profiling
![Page 6: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/6.jpg)
Plagiarism Concealment Tactics
None
Change comments
Change formatting
Rename identifiers
Change data types
Reorder blocks
Reorder statements
Reorder expressions
Superfluous code
Alternative control structures
![Page 7: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/7.jpg)
Prosecution -- DA in the House?
Course syllabus broaches the subject
– Concrete definition generally lacking
– Sense of “we’ll know it when we see it”
No Tolerance Policy
Investigation Stage
Prosecution Stage
Missed opportunity to teach?
![Page 8: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/8.jpg)
An Awareness Approach
Monitor closeness of student programs
– Objective measures
– Automated
Post anonymous closeness results in public
– Nonconfrontational awareness
– “A word to the wise …”
Benchmark student behavior
– Establishing thresholds
– Effects of course, language
![Page 9: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/9.jpg)
Closeness Measures -- Physical
Program 1: (lines1, words1, characters1)
Program 2: (lines2, words2, characters2)
Euclidean Distance
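As a rough illustration of the physical measure, the sketch below profiles two program texts by their line, word, and character counts and takes the Euclidean distance between the raw profiles. The function names `physical_profile` and `euclidean_distance` and the two sample programs are hypothetical, chosen only for this example.

```python
import math


def physical_profile(source):
    """Return (lines, words, characters) for a program text, in the spirit of wc."""
    lines = source.count("\n")       # newline count, as wc -l reports
    words = len(source.split())      # whitespace-delimited words
    characters = len(source)         # characters (equals bytes for ASCII text)
    return (lines, words, characters)


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Made-up sample programs; the second differs only by an added comment line.
program1 = "int main() {\n  return 0;\n}\n"
program2 = "int main() {\n  /* renamed */\n  return 0;\n}\n"

profile1 = physical_profile(program1)
profile2 = physical_profile(program2)
distance = euclidean_distance(profile1, profile2)
```

Because comments and spacing change all three counts, even a cosmetic edit moves the physical profile.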
![Page 10: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/10.jpg)
Closeness Measures -- Halstead
Program 1: (length1, vocabulary1, volume1)
Program 2: (length2, vocabulary2, volume2)
Euclidean Distance
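The Halstead profile can be sketched from the standard Halstead definitions: length N = N1 + N2 (total operators plus total operands), vocabulary n = n1 + n2 (distinct operators plus distinct operands), and volume V = N log2 n. The slides do not say how tokens are classified, so this sketch assumes the caller has already split the program into operator and operand lists; `halstead_profile` is a hypothetical name.

```python
import math


def halstead_profile(operators, operands):
    """Return (length, vocabulary, volume) from pre-classified token lists."""
    length = len(operators) + len(operands)                 # N = N1 + N2
    vocabulary = len(set(operators)) + len(set(operands))   # n = n1 + n2
    volume = length * math.log2(vocabulary) if vocabulary > 0 else 0.0
    return (length, vocabulary, volume)


# Hypothetical token classification for the statement: x = x + 1;
operators = ["=", "+", ";"]
operands = ["x", "x", "1"]
profile = halstead_profile(operators, operands)
```

Comments and white space never reach the token lists, so they cannot affect this profile.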
![Page 11: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/11.jpg)
Comparison of Measures
Physical profile ==> weight test
– Simple/cheap to compute (Unix wc command)
– Sensitive to character variations
Halstead profile ==> content test
– More complex/expensive to compute
– Ignores comments and white space
– Sensitive only to changes in program content
Detection effectiveness vs. plagiarism tactic
![Page 12: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/12.jpg)
Closeness Computation
Normalization
– Establish upper bound for comparison (1.414)
– Distance computed on normalized (unit) vectors
Normalization I -- Self normalization
– p = (a, b, c) ==> (a/L, b/L, c/L)
– Largest component dominates
Normalization II -- Global scaling
– p = (a, b, c) ==> q = (a/aMAX, b/bMAX, c/cMAX)
– Self normalization applied to q
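A minimal sketch of the two normalizations follows, assuming that L denotes the Euclidean length of p and that aMAX, bMAX, cMAX are the largest values of each component over all programs being compared. The function names and the sample profiles are hypothetical. For profiles with non-negative components, the unit vectors are at most sqrt(2) apart (about 1.414), which is reached when the profiles are orthogonal.

```python
import math


def self_normalize(p):
    """Normalization I: scale p to unit length (L is assumed to be |p|)."""
    length = math.sqrt(sum(x * x for x in p))
    if length == 0:
        return tuple(0.0 for _ in p)
    return tuple(x / length for x in p)


def global_scale(profiles):
    """Normalization II, first step: divide each component by its maximum over all profiles."""
    maxima = [max(column) or 1 for column in zip(*profiles)]  # avoid dividing by zero
    return [tuple(x / m for x, m in zip(p, maxima)) for p in profiles]


def closeness(p, q):
    """Euclidean distance between the unit vectors of p and q."""
    u, v = self_normalize(p), self_normalize(q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Made-up (lines, words, characters) profiles for three programs.
profiles = [(40, 120, 900), (42, 118, 910), (10, 300, 2500)]
scaled = global_scale(profiles)

d_self = closeness(profiles[0], profiles[1])    # Normalization I
d_global = closeness(scaled[0], scaled[1])      # Normalization II
upper_bound = closeness((1, 0, 0), (0, 1, 0))   # orthogonal profiles give sqrt(2)
```

With self normalization alone, the character count dominates each unit vector. Global scaling first brings the three components into the same range, so each one can influence the distance.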
![Page 13: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/13.jpg)
Distribution Of Closeness Values
[Figure 2. Distribution of Halstead Closeness. x-axis: Student Program Pairs (0 to 500); y-axis: Closeness Measure (0.00000 to 0.04500)]
![Page 14: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/14.jpg)
Comparison of Profiles
![Page 15: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/15.jpg)
Closeness Distribution
Closeness values vary by assignment
Programming language may lead to clustering at the lower end of the spectrum
Reuse of modules leads to clustering at the lower end of the spectrum
No a priori threshold pinpointing plagiarism
All measures exhibit these behaviors
![Page 16: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/16.jpg)
Suspect Identification
Collaboration Suspects (5th Percentile)

| Rank | Closeness | student1 | student2 |
|------|------------|----------|----------|
| 1 | 0.00000000 | alpha | alpha |
| 2 | 0.00000652 | alpha | beta |
| 3 | 0.00026963 | beta | gamma |
| 4 | 0.00026981 | alpha | gamma |
| 5 | 0.00031262 | gamma | epsilon |
| 6 | 0.00048815 | sigma | delta |
| 7 | 0.00049825 | alpha | epsilon |
| 8 | 0.00050169 | beta | epsilon |
| 9 | 0.00066481 | gamma | theta |
| 10 | 0.00073158 | beta | theta |
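
One way such a suspect list could be produced is sketched below: compute the closeness of every pair of students, sort in ascending order, and keep the closest few percent. The cutoff rule (the smallest `ceil(percentile% of pairs)` entries), the function names, and the sample profiles are all assumptions made for illustration; the slides do not specify how the percentile is taken.

```python
import math
from itertools import combinations


def closeness(p, q):
    """Euclidean distance between unit-normalized profiles (assumed measure)."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]
    return math.dist(unit(p), unit(q))


def collaboration_suspects(profiles, percentile=5.0):
    """Rank all distinct student pairs by closeness; keep the closest `percentile` percent."""
    ranked = sorted(
        (closeness(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    )
    keep = max(1, math.ceil(len(ranked) * percentile / 100.0))  # assumed cutoff rule
    return [(rank, c, a, b) for rank, (c, a, b) in enumerate(ranked[:keep], start=1)]


# Made-up (length, vocabulary, volume) profiles for illustration only.
profiles = {
    "alpha": (200, 40, 1064.4),
    "beta": (201, 40, 1069.7),
    "gamma": (150, 55, 867.2),
    "delta": (320, 35, 1641.4),
}
suspects = collaboration_suspects(profiles, percentile=20.0)
```

Each returned row has the same shape as the table: rank, closeness, student1, student2.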
![Page 17: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/17.jpg)
Independence Index
Student Independence Indices

| Index | Student |
|-------|---------|
| 1 | alpha |
| 2 | beta |
| 3 | gamma |
| 5 | epsilon |
| 6 | sigma |
| 6 | delta |
| 9 | theta |

Index = position at which student debuts on Closeness List
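
The definition above can be sketched as a single pass over the ranked closeness list, recording the first rank at which each student appears. The function name `independence_indices` is hypothetical; the rows are the ten entries of the Collaboration Suspects list from the earlier slide, and the list is assumed to be sorted by rank.

```python
def independence_indices(closeness_list):
    """Map each student to the rank at which they first appear on the closeness list."""
    indices = {}
    for rank, _closeness, student1, student2 in closeness_list:
        for student in (student1, student2):
            indices.setdefault(student, rank)  # keep only the first (lowest) rank
    return indices


# Rows are (rank, closeness, student1, student2), taken from the suspect list.
closeness_list = [
    (1, 0.00000000, "alpha", "alpha"),
    (2, 0.00000652, "alpha", "beta"),
    (3, 0.00026963, "beta", "gamma"),
    (4, 0.00026981, "alpha", "gamma"),
    (5, 0.00031262, "gamma", "epsilon"),
    (6, 0.00048815, "sigma", "delta"),
    (7, 0.00049825, "alpha", "epsilon"),
    (8, 0.00050169, "beta", "epsilon"),
    (9, 0.00066481, "gamma", "theta"),
    (10, 0.00073158, "beta", "theta"),
]
indices = independence_indices(closeness_list)
```

A higher index means the student first shows up further down the list, that is, their program is less close to any other student's.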
![Page 18: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/18.jpg)
Preponderance of Evidence
Historical Record of Student Behavior
– Collaboration/partnering
– Independence indices
Profile and analyze other artifacts
– Compilation logs
– Execution logs
![Page 19: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/19.jpg)
Another Approach
Make student demonstrate familiarity with submitted program
– Seed errors into program
– Time limit for removing error and resubmitting
Holistic approach
– Intentional, not accidental
![Page 20: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/20.jpg)
Conclusions
We can do something about plagiarism -- the first step is to develop eyes and ears
Simple metrics appear to be adequate
Tools are essential
Sophistication is not as necessary as automation
Students are curious to know how they compare with other students
![Page 21: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/21.jpg)
On-Going & Future Work
Complete the toolset
– Student Independence Index
Incorporate other Artifacts
– Compilation logs
– Execution logs
Integrate into Automated Grading
Disseminate Results
– Package tool as shareware
![Page 22: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/22.jpg)
Questions?
![Page 23: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/23.jpg)
Thank You
![Page 24: Plagiarism Monitoring and Detection -- Towards an Open Discussion](https://reader034.vdocument.in/reader034/viewer/2022051316/56814f3a550346895dbcdd83/html5/thumbnails/24.jpg)
Flow Chart
Student Programs ==> Profile ==> Compute Closeness ==> Suspicious Programs