Property Based Testing : Shrinking Risk In Your Code
Amanda Laucher @pandamonial
Pariveda Solutions Consultant
Mined Minds Co-Founder
What is Property Based Testing?
Patterns for specifying properties
Generating Inputs to the properties
Shrinking Failures
Configurations
Surprise
Bank OCR Kata : Story 2
Account number: 3 4 5 8 8 2 8 6 5
Position names: d9 d8 d7 d6 d5 d4 d3 d2 d1
Checksum calculation: (d1 + 2*d2 + 3*d3 + 4*d4 + 5*d5 + 6*d6 + … + 9*d9) % 11 = 0
describe "#check?" do
  context "when the account number is good" do
    # good account numbers were taken from the user story specs
    Then { checker.check?("000000000").should be_true }
    Then { checker.check?("000000051").should be_true }
    Then { checker.check?("123456789").should be_true }
    Then { checker.check?("200800000").should be_true }
    Then { checker.check?("333393333").should be_true }
    Then { checker.check?("490867715").should be_true }
    Then { checker.check?("664371485").should be_true }
    Then { checker.check?("711111111").should be_true }
    Then { checker.check?("777777177").should be_true }
  end
end
What about these?
2625145141 7634763476
23623762 349745845
2376438X9734 sjdgdfkghfkgsd
bchd “ “
36372927365
property check (variable) do
  context "when the account number is good" do
    Then { checker.check?(variable).should be_true }
  end
end
generate_10000000_strings.map(&:check)
Account number: 3 4 5 8 8 2 8 6 5
Position names: d9 d8 d7 d6 d5 d4 d3 d2 d1
Checksum calculation: (d1 + 2*d2 + 3*d3 + 4*d4 + 5*d5 + 6*d6 + … + 9*d9) % 11 = 0
Valid Accounts
"000000000" "000000051" "123456789" "200800000" "333393333" "490867715" "664371485" "711111111" "777777177"
All are 9 chars - anything not 9 chars is invalid
All are digits
If we run the check digit multiple times it should return the same value
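The three properties above can be sketched in plain Python. This is a minimal illustration using only the standard library; `checksum_ok`, `random_candidate` and `check_properties` are hypothetical names, not part of the kata or of any testing library.

```python
import random

DIGITS = "0123456789"

def checksum_ok(account):
    """True when the account is 9 digits and passes the mod-11 checksum."""
    if len(account) != 9 or any(ch not in DIGITS for ch in account):
        return False
    # d1 is the rightmost digit, d9 the leftmost.
    digits = [int(ch) for ch in reversed(account)]
    total = sum(position * digit for position, digit in enumerate(digits, start=1))
    return total % 11 == 0

def random_candidate(rng):
    """Generate strings that are mostly digits and mostly around 9 chars long."""
    length = rng.choice([8, 9, 9, 9, 10, rng.randint(0, 14)])
    alphabet = DIGITS * 3 + "X abc"
    return "".join(rng.choice(alphabet) for _ in range(length))

def check_properties(trials=10_000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        candidate = random_candidate(rng)
        result = checksum_ok(candidate)
        # Property: anything that is not exactly 9 chars is invalid.
        if len(candidate) != 9:
            assert result is False, candidate
        # Property: anything containing a non-digit is invalid.
        if any(ch not in DIGITS for ch in candidate):
            assert result is False, candidate
        # Property: running the check again returns the same value.
        assert checksum_ok(candidate) == result, candidate
    return True
```

None of these properties says which 9-digit strings are valid; the example accounts from the user story still cover that part.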
Types
Unit Tests
Property Tests
“Testing shows the presence, not the absence of bugs” Edsger W. Dijkstra
Create a model that specifies properties
from functional requirements
Generate tests to falsify the properties
Determine minimal failure case
Simple Enough!
“However, learning to use QuickCheck can be a challenge. It requires the developer to take a step back and really try to understand what the code should be doing.”
Property Based Testing With QuickCheck in Erlang
property("max") = forAll { (x: Int, y: Int) =>
  val z = max(x, y)
  (z == x || z == y) && (z >= x && z >= y)
}
Shift in thinking!
>OK, passed 100 tests
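To show what a forAll does under the hood, here is a minimal sketch in plain Python. `for_all`, `two_ints` and the two properties are hypothetical names for illustration; real libraries add shrinking, better generators and reporting.

```python
import random

def for_all(gen, prop, trials=100, seed=0):
    """Run prop against `trials` generated inputs and stop at the first failure."""
    rng = random.Random(seed)
    for passed in range(trials):
        args = gen(rng)
        if not prop(*args):
            return f"Falsified after {passed} passed tests: {args}"
    return f"OK, passed {trials} tests"

def two_ints(rng):
    """Generator: a pair of random integers."""
    return rng.randint(-1000, 1000), rng.randint(-1000, 1000)

def prop_max(x, y):
    z = max(x, y)
    return (z == x or z == y) and (z >= x and z >= y)

def prop_broken_max(x, y):
    z = min(x, y)  # deliberately wrong implementation of "max"
    return (z == x or z == y) and (z >= x and z >= y)
```

Calling `for_all(two_ints, prop_max)` reports success, while `for_all(two_ints, prop_broken_max)` returns a falsifying pair.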
forAll: Claire
var commutative = forAll( _.Int, _.Int )
  .satisfy( function(a, b) { return a + b == b + a })
  .asTest()
forAll: jsverify
var boolFnAppliedThrice =
  jsv.forall("bool -> bool", "bool", function (f, b) {
    return f(f(f(b))) === f(b);
  });
forAll: Erlang QuickCheck
prop_json_encode_decode() ->
  ?FORALL(JSON_TERM, json_term(),
    begin
      JSON_TERM =:= decode(encode(JSON_TERM))
    end).
val propContains = exists { c: Conference => (c.name == "YOW!") && c.keywords.contains("Awesome") }
+ OK, proved property.
propNewImplIsFaster = forAll{ commands: genListOfCommands =>
  time1 = time(FS.run(commands))
  time2 = time(CS.run(commands))
  time1 < time2
}
Pattern: Relations
propOptionsCost = forAll{ option1: genOption, option2: genOption =>
  score1 = rank(option1)
  score2 = rank(option2)
  if (option1.cost <= option2.cost)
    score1 > score2
  else
    score1 < score2
}
Pattern: Relations
forAll { list: List[String] => bubbleSort(list) == quickSort(list) }
Pattern: Reference Implementation
Pattern: Reference Implementation
propNewImplSameResult = forAll{ cmds: genListOfCommands =>
  state1 = FS.run_and_return_state(cmds)
  state2 = CS.run_and_return_state(cmds)
  assert.equal(state1,state2)
}
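The reference implementation pattern can be sketched in plain Python by comparing a hand-written sort against the built-in `sorted`. The helper names are hypothetical, and integers are used instead of strings to keep the generator short.

```python
import random

def bubble_sort(items):
    """Implementation under test: a straightforward bubble sort."""
    out = list(items)
    n = len(out)
    for i in range(n):
        for j in range(n - 1 - i):
            if out[j] > out[j + 1]:
                out[j], out[j + 1] = out[j + 1], out[j]
    return out

def random_list(rng):
    """Generator: a list of 0 to 20 small integers."""
    return [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]

def find_counterexample(impl, reference, trials=500, seed=0):
    """Return an input where impl and reference disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = random_list(rng)
        if impl(xs) != reference(xs):
            return xs
    return None
```

`find_counterexample(bubble_sort, sorted)` returns `None` when the two agree on every generated list. The property says nothing about what "sorted" means; it only trusts the reference.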
Pattern: Round Trip
let revRevTree (xs:list<Tree>) =
  List.rev(List.rev xs) = xs
Check.Quick revRevTree
>OK, passed 100 tests
propInsert = forAll{ person: genPerson =>
  p1 = svc.put(person)
  p2 = svc.get(person)
  assert.equal(p1,p2)
}
Pattern: Round Trip
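A round trip property in plain Python, mirroring the JSON encode/decode example shown earlier. This is a sketch: `json_term` is a hypothetical hand-rolled generator that only produces values JSON can represent exactly (string keys, no tuples, no floats).

```python
import json
import random
import string

def json_term(rng, depth=0):
    """Generate a random JSON-compatible value, nesting at most 3 levels."""
    kinds = ["int", "str", "bool", "none"]
    if depth < 3:
        kinds += ["list", "dict"]
    kind = rng.choice(kinds)
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        return "".join(rng.choice(string.ascii_letters) for _ in range(rng.randint(0, 8)))
    if kind == "bool":
        return rng.choice([True, False])
    if kind == "none":
        return None
    if kind == "list":
        return [json_term(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    return {f"k{i}": json_term(rng, depth + 1) for i in range(rng.randint(0, 4))}

def prop_round_trip(term):
    """Decoding the encoded term must give back the original term."""
    return json.loads(json.dumps(term)) == term
```

The property is only as strong as the generator: a tuple such as `(1, 2)` or a dict with integer keys does not survive the trip, and this generator never produces either.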
Pattern: Idempotent
propDBUpsertIsIdempotent = forAll{ person: genPerson =>
  p1 = db.upsert(person)
  p2 = db.upsert(person)
  p3 = db.upsert(person)
  count = db.get(person).count()
  assert.equal(1,count)
}
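The idempotence check can be sketched against an in-memory store in plain Python. `ListStore` is a hypothetical stand-in for the database; its `buggy` flag turns upsert into a plain insert so the property has something to catch.

```python
import random

class ListStore:
    """Toy store; with buggy=True, upsert appends without replacing."""
    def __init__(self, buggy=False):
        self.rows = []
        self.buggy = buggy

    def upsert(self, person):
        if not self.buggy:
            # Remove any existing row with the same id before inserting.
            self.rows = [row for row in self.rows if row["id"] != person["id"]]
        self.rows.append(dict(person))

    def count(self, person_id):
        return sum(1 for row in self.rows if row["id"] == person_id)

def random_person(rng):
    """Generator: a person record with a small id and a random name."""
    name = "".join(rng.choice("abcdefgh") for _ in range(rng.randint(1, 6)))
    return {"id": rng.randint(1, 5), "name": name}

def prop_upsert_idempotent(make_store, person):
    """Upserting the same person three times must leave exactly one row."""
    store = make_store()
    for _ in range(3):
        store.upsert(person)
    return store.count(person["id"]) == 1
```

Each run builds a fresh store through `make_store`, so earlier generated inputs cannot leak into later ones.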
Preconditions
notZeroProperty = forAll { num: Number =>
  (num != 0) ==> {
    …do something that divides by num
  }
}
Preconditions
fakeRedHeadsProperty = forAll { p: Person =>
  (p.hair == "Red" && p.age > 49) ==> {
    …test something for people with red hair aged 50+
  }
}
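A precondition discards generated inputs that do not qualify, which is why a very narrow precondition can starve a test run. A minimal sketch in plain Python; `for_all_where` and its parameters are hypothetical names, loosely modelled on the "discarded versus successful" ratio that QuickCheck-style tools expose.

```python
import random

def for_all_where(gen, precondition, prop,
                  min_successful=100, max_discard_ratio=5, seed=0):
    """Check prop on inputs that satisfy precondition; give up on too many discards."""
    rng = random.Random(seed)
    passed = discarded = 0
    while passed < min_successful:
        value = gen(rng)
        if not precondition(value):
            discarded += 1
            if discarded > max_discard_ratio * min_successful:
                return ("gave up", passed, discarded)
            continue
        if not prop(value):
            return ("falsified", value, discarded)
        passed += 1
    return ("ok", passed, discarded)

def small_int(rng):
    return rng.randint(-5, 5)

def prop_divide_back(num):
    # Only meaningful when num != 0, hence the precondition.
    return (7 * num) // num == 7
```

`for_all_where(small_int, lambda n: n != 0, prop_divide_back)` passes after discarding the occasional zero. A precondition like the red-haired, aged 50+ example is usually better served by a generator that builds such people directly.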
Re-order already generated inputs & rerun function for commutative tests
Nesting forall for relations
trait Color
case object Red extends Color
case object Black extends Color

trait Vehicle { def color: Color }
case class Mercedes(val color: Color) extends Vehicle
case class BMW(val color: Color) extends Vehicle
case class Tesla(val color: Color) extends Vehicle

val genColor = Gen.oneOf(Red, Black)

val genMerc = for {color <- genColor} yield Mercedes(color)
val genBMW = for {color <- genColor} yield BMW(color)
val genTesla = for {color <- genColor} yield Tesla(color)

val genNewCar = Gen.oneOf(genMerc, genBMW, genTesla)
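The same idea of composing small generators into bigger ones, sketched in plain Python. Here a generator is just a function from a random source to a value; `one_of`, `constant` and `gen_make` are hypothetical helpers, not a library API.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Vehicle:
    make: str
    color: str

def constant(value):
    """Generator that always yields the same value."""
    return lambda rng: value

def one_of(*gens):
    """Generator that picks one of the given generators at random and runs it."""
    return lambda rng: rng.choice(gens)(rng)

gen_color = one_of(constant("Red"), constant("Black"))

def gen_make(make):
    """Generator for a vehicle of a fixed make with a random color."""
    return lambda rng: Vehicle(make, gen_color(rng))

gen_new_car = one_of(gen_make("Mercedes"), gen_make("BMW"), gen_make("Tesla"))
```

Calling `gen_new_car(random.Random(0))` yields one `Vehicle`; repeated calls on the same random source yield a stream of them.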
let associativity (x:int) (f:int->float,g:float->char,h:char->int) =
  ((f >> g) >> h) x = (f >> (g >> h)) x
Check.Quick associativity
import org.scalacheck.Prop.{forAll, classify}

val p = forAll { n:Int =>
  classify(n < 1000, "small", "large") {
    classify(n < 0, "negative", "positive") {
      n+n == 2*n
    }
  }
}
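Classifying inputs is about seeing what the generator actually produced. A minimal sketch in plain Python that counts labels while checking the property; `check_with_labels` and `classify_int` are hypothetical names.

```python
import random
from collections import Counter

def check_with_labels(gen, classify, prop, trials=100, seed=0):
    """Check prop and tally a label for every generated input that passed."""
    rng = random.Random(seed)
    labels = Counter()
    for _ in range(trials):
        value = gen(rng)
        if not prop(value):
            return False, labels
        labels[classify(value)] += 1
    return True, labels

def classify_int(n):
    size = "small" if n < 1000 else "large"
    sign = "negative" if n < 0 else "positive"
    return f"{size}, {sign}"

def gen_int(rng):
    return rng.randint(-5000, 5000)

def prop_double(n):
    return n + n == 2 * n
```

The returned counter shows how the 100 inputs split across the labels. A label that never appears, or a distribution that is heavily skewed, is a hint that the generator needs tuning.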
Lambdas - to execute each command
Generate list of commands
Check pre-conditions before executing each
Execute and check state is as expected or that properties of the state machine hold true.
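The steps above can be sketched in plain Python: generate a list of commands, skip any whose precondition does not hold, run each against both the real system and a simple model, and compare. `TwoStackQueue` and the helper names are hypothetical; the model is an ordinary list.

```python
import random

class TwoStackQueue:
    """System under test: a FIFO queue built from two stacks."""
    def __init__(self):
        self._in = []
        self._out = []

    def push(self, value):
        self._in.append(value)

    def pop(self):
        if not self._out:
            while self._in:
                self._out.append(self._in.pop())
        return self._out.pop()

    def size(self):
        return len(self._in) + len(self._out)

def gen_commands(rng, max_len=30):
    """Generate a list of ("push", value) and ("pop", None) commands."""
    return [("push", rng.randint(0, 9)) if rng.random() < 0.6 else ("pop", None)
            for _ in range(rng.randint(0, max_len))]

def run_commands(commands, make_system):
    """Execute commands against system and model; False on any mismatch."""
    system, model = make_system(), []
    for name, arg in commands:
        if name == "pop":
            if not model:  # precondition: cannot pop an empty queue
                continue
            if system.pop() != model.pop(0):
                return False
        else:
            system.push(arg)
            model.append(arg)
        if system.size() != len(model):  # state check after every command
            return False
    return True
```

Because the model is trivially correct, any disagreement points at the system or at a wrong assumption about how it should behave.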
The generator can generate different things every time.
If you find a failure, consider making it a unit test for regression.
You start with a failing example. Apply the shrink function once to the generated failing input.
Then apply the property again. If it still fails, repeat the process.
Otherwise you’re shrunk: the last input that still failed is the minimal failure case.
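The shrinking loop described above, sketched in plain Python for lists of integers. `shrink_int`, `shrink_list` and `shrink_failure` are hypothetical names; real libraries use smarter shrink strategies, but the loop has the same shape.

```python
def shrink_int(x):
    """Yield integers closer to zero than x."""
    if x != 0:
        yield 0
        yield int(x / 2)  # truncates toward zero
        yield x - (1 if x > 0 else -1)

def shrink_list(xs):
    """Yield simpler lists: first drop one element, then shrink one element."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        for smaller in shrink_int(x):
            yield xs[:i] + [smaller] + xs[i + 1:]

def shrink_failure(failing, prop, shrink):
    """Keep moving to a smaller input that still fails until none is left."""
    current = failing
    while True:
        for candidate in shrink(current):
            if not prop(candidate):  # still fails, so keep shrinking from here
                current = candidate
                break
        else:
            return current  # every smaller candidate passes

def prop_all_below_ten(xs):
    return all(x < 10 for x in xs)
```

Starting from the failing input `[3, 27, -4, 15]`, `shrink_failure` with `prop_all_below_ten` and `shrink_list` ends at `[10]`, which is far easier to debug than the original.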
Minimum number of successful tests
Maximum ratio between discarded and successful tests
Minimum and maximum data size
Random number generator
Number of worker threads
Check all of the things
Haskell: QuickCheck
Erlang : Quviq QuickCheck (commercial support) or Triq (Apache)
C : Theft
C++: QuickCheck++
.NET (C#, F#, VB): FsCheck
Ruby: Rantly
Scala: ScalaCheck
Clojure: ClojureCheck -- requires clojure.test
Java: JavaQuickCheck -- requires JUnit or some other testing framework
Groovy: Gruesome -- a quick and dirty implementation for Groovy
JavaScript: QC.js, jsverify, claire
Python:
Factcheck
Hypothesis
pytest-quickcheck - requires pytest, I found it hard to extend, and so wrote Factcheck
anonymize "[email protected]" == "p____@f______.s_"
Problem 1:
type IntervalSet = [(Int,Int)], where for example the set {1, 2, 3, 7, 8, 9, 10} would be represented by the list
[(1,3),(7,10)]
Problem 2:
SEND + MORE = MONEY
Map letters to digits, such that the calculation makes sense.
(In the example M=1, O=0, S=9, R=8, E=5, N=6, Y=2, D=7)
Problem 3:
…We observed that all the QuickCheck test suites (when they were written) were more effective at detecting errors than any of the HUnit test suites. … Finally, we observed that QuickCheck users are less likely to write test code than HUnit users—even in a study of automated testing—suggesting perhaps that HUnit is easier to use.
char *caesar(int shift, char *input) {
  char *output = malloc(strlen(input));
  memset(output, '\0', strlen(input));

  for (int x = 0; x < strlen(input); x++) {
    if (isalpha(input[x])) {
      int c = toupper(input[x]);
      c = (((c - 65) + shift) % 26) + 65;
      output[x] = c;
    } else {
      output[x] = input[x];
    }
  }

  return output;
}
caesar :: Int -> String -> String
caesar k = map f
  where
    f c
      | inRange ('A','Z') c = chr $ ord 'A' + (ord c - ord 'A' + k) `mod` 26
      | otherwise = c
foreign import ccall "ceasar.h caesar" c_caesar :: CInt -> CString -> CString
native_caesar :: Int -> String -> IO String
native_caesar shift input =
  withCString input $ \c_str ->
    peekCString (c_caesar (fromIntegral shift) c_str)
$ ghci caesar.hs caesar.so
*Main> caesar 2 "ATTACKATDAWN"
"CVVCEMCVFCYP"
*Main> native_caesar 2 "ATTACKATDAWN"
"CVVCEMCVFCYP"
safeEquivalenceProperty =
  forAll genSafeString $ \str ->
    unsafeEq (native_caesar 2 str) (caesar 2 str)
deepCheck p = quickCheckWith stdArgs{ maxSuccess = 10000 } p
$ ghci caesar.hs caesar.so
*Main> deepCheck safeEquivalenceProperty
*** Failed! Falsifiable (after 57 tests):
"PMGDOSBUFYLIAITYVAPKZGTTWSCKMTXHJOKMYIEQFARLJGHJDPXSSXTP"
*Main> caesar 2 "PMGDOSBUFYLIAITYVAPKZGTTWSCKMTXHJOKMYIEQFARLJGHJDPXSSXTP"
"ROIFQUDWHANKCKVAXCRMBIVVYUEMOVZJLQMOAKGSHCTNLIJLFRZUUZVR"
*Main> native_caesar 2 "PMGDOSBUFYLIAITYVAPKZGTTWSCKMTXHJOKMYIEQFARLJGHJDPXSSXTP"
"ROIFQUDWHANKCKVAXCRMBIVVYUEMOVZJLQMOAKGSHCTNLIJLFRZUUZVR\SOH"
Improper Bound
char *caesar(int shift, char *input) {
  /* +1 leaves room for the terminating '\0' that the loop below copies */
  char *output = malloc(strlen(input) + 1);
  memset(output, '\0', strlen(input) + 1);

  /* <= so that the '\0' at input[strlen(input)] is copied too */
  for (int x = 0; x <= strlen(input); x++) {
    if (isalpha(input[x])) {
      int c = toupper(input[x]);
      c = (((c - 65) + shift) % 26) + 65;
      output[x] = c;
    } else {
      output[x] = input[x];
    }
  }

  return output;
}
Final Thoughts
TDD?
Property tests live longer than unit tests
Find different bugs
Less code so more maintainable
More helper functions
More complex code
Use in conjunction with other tests