DBPD: A Dynamic Birthmark-based Software
Plagiarism Detection Tool
Zhenzhou Tian
MOE Key Lab for Intelligent Networks and Network Security
Xi’an Jiaotong University, China
23/4/20
Introduction
Software plagiarism has been a serious threat to the healthy development of the software industry
• Licenses are violated for commercial interests, or unwittingly
• Weak code protection awareness
• Powerful automated code obfuscation tools
• Software is distributed in binary form
Introduction
Many software birthmark based techniques have been proposed
• Static birthmarks: CVFV, SMC, IS, UC…
• Dynamic birthmarks: WPP, SCSSB, SCDG, DKISB…
Few tools are publicly available
Dynamic birthmarks are believed to perform better than static birthmarks

Tool          Static/Dynamic   Language
Sandmark      Static           Java bytecode
Stigmata      Static           Java bytecode
Birthmarking  Dynamic          Java bytecode
JPlag         Static           Source code
Framework of DBPD
Software Birthmark: a set of characteristics extracted from a program that reflects intrinsic properties of the program, and which can be used to identify the program uniquely.
Design Overview
[Figure: design overview. Components: Input (Plaintiff Binary, Defendant Binary); Dynamic Analysis Module; Birthmark Generator (DKISB Generator, SODB Generator, SCSSB Generator); Similarity Calculator & Decision Maker]
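The slides do not say how the Similarity Calculator & Decision Maker scores two birthmarks. As a minimal sketch only, assume each birthmark is a set of features, similarity is the Jaccard index, and the decision uses two thresholds. The function names and the threshold values 0.7 and 0.3 are hypothetical and are not taken from DBPD.

```python
def jaccard_similarity(plaintiff_birthmark, defendant_birthmark):
    """Jaccard index of two birthmarks treated as feature sets (an assumption)."""
    a, b = set(plaintiff_birthmark), set(defendant_birthmark)
    if not a and not b:
        return 1.0  # two empty birthmarks are treated as identical
    return len(a & b) / len(a | b)


def decide(similarity, upper=0.7, lower=0.3):
    """Three-way decision; upper and lower are hypothetical thresholds."""
    if similarity >= upper:
        return "plagiarism suspected"
    if similarity <= lower:
        return "independent"
    return "inconclusive"


if __name__ == "__main__":
    plaintiff = {("open", "read"), ("read", "write"), ("write", "close")}
    defendant = {("open", "read"), ("read", "write"), ("write", "exit")}
    score = jaccard_similarity(plaintiff, defendant)
    print(score, decide(score))
```

The middle "inconclusive" band keeps borderline scores from being forced into either verdict; how wide that band should be would have to be tuned on real program pairs.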
Three Dynamic Birthmarks
Three birthmark approaches are implemented:
• DKISB: Dynamic Key Instruction Sequence Birthmark
  Generated using the k-gram algorithm from dynamic key instructions (instructions that are both value updating and input correlated).
• SCSSB: System Call Short Sequence Birthmark
  Extracted by splitting the system call sequence into short sub-sequences.
• SODB: Stack Operation Dynamic Birthmark
  Generated by analyzing the behavior of stack operations, using the regularities of push and pop operations on the call stack to uniquely identify a program.
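Both DKISB and SCSSB turn a recorded execution sequence into short sub-sequences. A minimal sketch of that step, assuming a sliding window of length k and a birthmark that keeps each distinct k-gram once (the slides do not say whether frequencies are kept as well). The function name `k_grams` and the sample system call trace are illustrative only.

```python
def k_grams(sequence, k):
    """Return the set of all length-k windows over the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    items = list(sequence)
    # A sequence shorter than k yields no windows, so the result is empty.
    return {tuple(items[i:i + k]) for i in range(len(items) - k + 1)}


if __name__ == "__main__":
    # Hypothetical system call trace captured during one run.
    trace = ["open", "read", "write", "read", "write", "close"]
    for gram in sorted(k_grams(trace, 3)):
        print(gram)
```

A smaller k makes the birthmark more tolerant of reordering but less distinctive; a larger k does the opposite.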
Demonstration
Independently implemented software with similar functionalities
Demonstration
Plagiarism Using Different Compilers and Optimization Levels
Demonstration
Plagiarism Using Specific Obfuscation Tools
Demonstration
Cross-Platform Plagiarism Scenario
Some Definitions