pattern-based features features.pdf · • test prioritization / test suite minimization •...
TRANSCRIPT
Pattern-based FeaturesA Data Transformation Pattern
http://research.microsoft.com/en-us/projects/tark/
Venkatesh-Prasad Ranganath
Jithin ThomasMicrosoft Research, India
http://research.microsoft.com/ICSE2013
Context / Scenarios
• Compatibility Testing
• Test Prioritization / Test Suite Minimization
• Representative Identification
• Similar Case Recommendation
• Anomaly Detection
Constraints
Input Characteristics– Data is sequential
– Data is structured
– Fields may be irrelevant
– Values may be irrelevant
– Value flow may be relevant
Output Constraints– Usable with existing DM/
ML algorithms
– Amenable to simple reasoning
– Accessible
– Possess explanatory power
A Data Transformation Pattern
• Use off-the-shelf techniques to mine patterns– Item-set mining
– Temporal pattern mining
– Association rule mining*
– Graph mining*
• Use patterns as features– Binary/Categorical features: Presence of patterns
– Numeric features: Properties of patterns
* We have not tried these pattern mining techniques.
Example
Win8Win8
USB 3.0 Driver Stack
USB 3.0 Driver Stack
XHCI
Driver1
Win7Win7
USB 2.0 Driver Stack
Driver1
EHCI OHCI UHCI
Driver2
USB 2.0 device
USB 2.0 device
When a USB 2.0 device is plugged into a USB 3.0 port on Win8, will the USB 3.0 driver in Win8 exhibit the same behavior as the USB 2.0 driver?
Example
USB2Log
Tark
USB2 Patterns
USB3Log
Tark
USB3 PatternsStructural and Temporal
Pattern Diffing
USB2Patterns
USB3Patterns
DispatchIrp forward alternates with IrpCompletion && PreIoCompleteRequest when IOCTLType=IRP_MJ_PNP(0x1B),IRP_MN_START_DEVICE(0x00), irpID=SAME, and IrpSubmitDetails.irp.ioStackLocation.control=SAME
DispatchIrp forward alternates with IrpCompletion && PreIoCompleteRequest when IOCTLType=IRP_MJ_PNP(0x1B),IRP_MN_START_DEVICE(0x00), irpID=SAME, and IrpSubmitDetails.irp.ioStackLocation.control=SAME
IOCTLType=URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER(0x09)&& IoCallDriverReturn && IoCallDriverReturn.irql=2&& IoCallDriverReturn.status=0xC000000E
IOCTLType=URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER(0x09)&& IoCallDriverReturn && IoCallDriverReturn.irql=2&& IoCallDriverReturn.status=0xC000000E
Pattern-based Features
Input Characteristics– Data is sequential
– Data is structured
– Fields may be irrelevant
– Values may be irrelevant
– Value flow may be relevant
Output Constraints– Usable with existing DM/ML
algorithms
– Amenable to simple reasoning
– Accessible
– Possess explanatory power
Pattern– Use off-the-shelf
techniques to mine patterns• Item-set mining
• Temporal pattern mining
– Use patterns as features• Binary/Categorical
features
• Numeric features
http://research.microsoft.com/en-us/projects/tark/