The SAT 2005 Competition

The SAT 2005 Competition, Fourth Edition. Daniel Le Berre and Laurent Simon. Eighth International Conference on Theory and Applications of Satisfiability Testing, SAT’05.

Post on 16-Feb-2022

The SAT 2005 Competition

What’s new this year
The benchmarks
First stage results
  All categories
  Random category
  Crafted category
  Industrial category
Second stage results
  Random category
  Crafted category
  Industrial category
Certified UNSAT special track
Non clausal special track
Next contest?
Pseudo Boolean evaluation

The SAT 2005 Competition
Fourth Edition

Daniel Le Berre and Laurent Simon

Eighth International Conference on Theory and Applications of Satisfiability Testing, SAT’05

Agenda

What’s new this year

The benchmarks

First stage results
  All categories
  Random category
  Crafted category
  Industrial category

Second stage results
  Random category
  Crafted category
  Industrial category

Certified UNSAT Special track

Non clausal special track

Next contest?

Pseudo Boolean evaluation

They support us

Thank you!

The new judges

Armin Biere: specialist in industrial benchmarks and solvers.

Oliver Kullmann: specialist in k-SAT. Generated all the benchmarks for the random category.

Allen van Gelder: familiar with the CASC competition. Proposed the new scoring scheme. Managed the certified UNSAT special track.

All the decisions were taken in agreement with the judges

The special tracks

Certified UNSAT: a specific category in which the solvers must output a certificate of unsatisfiability. The proof format and a proof checker were provided by Allen van Gelder. Only two participants: zchaff and ttsp-3.0.

Pseudo Boolean evaluation: dedicated to solvers handling pseudo-Boolean constraints and optimization functions. Managed by Vasco Manquinho and Olivier Roussel.
http://www.cril.univ-artois.fr/PB05/
8 solvers (17 variants) from 8 submitters.

Non clausal evaluation: dedicated to solvers able to take gates as input. The input format was provided by Fahiem Bacchus and Toby Walsh. No solver submission. One benchmark submission.

What’s new in the rules

- Competition and Demonstration divisions.

Competition: the source code of the solver must be available after the competition.

Demonstration: a binary version of the solver must be available for research purposes.

- Participation in the competition must benefit the community:

  - by providing source code, binaries or benchmarks;
  - by supporting the conference and the competition.

The new scoring scheme

Benchmark purse: divided equally among the solvers able to solve the benchmark.

Speed purse: divided unequally among the solvers able to solve a given benchmark.

Series: an extra credit is given for each series solved.

Solver: its score is the sum of the credits obtained for the benchmarks it solved.
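The purse idea can be illustrated with a small sketch. The slides do not give purse sizes, the speed weighting, or what counts as solving a series, so the constants, the `speed_factor` function and the series rule below are assumptions chosen only to make the example concrete.

```python
from collections import defaultdict

# Assumed values: the slides do not state the purse sizes.
BENCHMARK_PURSE = 1000.0
SPEED_PURSE = 1000.0
SERIES_PURSE = 3000.0
TIMEOUT = 1200.0  # 20-minute first-stage timeout, in seconds


def speed_factor(cpu_time):
    """Assumed weighting: a faster run earns a larger share of the speed purse."""
    return TIMEOUT / (1.0 + cpu_time)


def purse_scores(results, series):
    """results: {benchmark: {solver: cpu_time}}, solved runs only.
    series: {series_name: [benchmark, ...]}."""
    scores = defaultdict(float)
    for solved in results.values():
        if not solved:
            continue  # nobody solved it, the purses are not distributed
        total_speed = sum(speed_factor(t) for t in solved.values())
        for solver, t in solved.items():
            # Benchmark purse: equal shares among the solvers that solved it.
            scores[solver] += BENCHMARK_PURSE / len(solved)
            # Speed purse: unequal shares, proportional to the speed factor.
            scores[solver] += SPEED_PURSE * speed_factor(t) / total_speed
    for benches in series.values():
        # Assumption: a series counts as solved by a solver that solved
        # at least one of its benchmarks; the purse is shared equally.
        winners = {s for b in benches for s in results.get(b, {})}
        for solver in winners:
            scores[solver] += SERIES_PURSE / len(winners)
    return dict(scores)


results = {"b1": {"A": 10.0, "B": 100.0}, "b2": {"A": 5.0}}
series = {"s1": ["b1", "b2"]}
scores = purse_scores(results, series)
print(scores)
```

With this kind of scheme a benchmark solved by few solvers is worth more to each of them than one solved by everybody, which is what rewards solving hard instances.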

The new award scheme

- Three categories: industrial, crafted and random

- Three specialties: SAT, UNSAT and SAT+UNSAT

- Three medals: gold, silver and bronze

So we have a total of 3 × 3 × 3 = 27 awards this year!
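The count can be checked by enumerating every (category, specialty, medal) combination; the variable names below are only illustrative.

```python
from itertools import product

categories = ["industrial", "crafted", "random"]
specialties = ["SAT", "UNSAT", "SAT+UNSAT"]
medals = ["gold", "silver", "bronze"]

# One award per (category, specialty, medal) triple.
awards = list(product(categories, specialties, medals))
print(len(awards))
```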

Invariants

- Only 3 solvers per submitter can enter the first stage, competition division.

- Only 1 solver per submitter can enter the second stage, competition division.

Random category

- 3-SAT, 5-SAT, 7-SAT

- From 400 to 10000 variables.

- 285 SAT and 105 UNSAT benchmarks

- Answers known in advance
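As an illustration of what a uniform random k-SAT instance looks like, here is a minimal generator sketch. It is not the generator used for the competition, which the slides do not describe; the function names, the seed and the clause count in the usage line are assumptions.

```python
import random


def random_ksat(num_vars, num_clauses, k, seed=None):
    """Uniform random k-SAT: each clause has k distinct variables,
    each negated with probability 1/2."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


def to_dimacs(num_vars, clauses):
    """Serialize the clauses in DIMACS CNF format."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    lines += [" ".join(map(str, c)) + " 0" for c in clauses]
    return "\n".join(lines)


# 400 variables is the smallest size mentioned above; 1700 clauses is arbitrary.
instance = random_ksat(400, 1700, 3, seed=0)
print(to_dimacs(400, instance).splitlines()[0])
```

This sketch does not control whether the instance is satisfiable, so it says nothing about how the answers were known in advance.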

Industrial category

Zarpas: new formal verification benchmarks from IBM (FV 2004)

Velev: known VLIW-SAT (2.0 and 4.0), VLIW-UNSAT 2.0 and Liveness UNSAT 2.0

Grieu: VMPC inversion, an open cryptographic problem

Narain: VPN models generated from Alloy

Maris: planning benchmarks

Wider range of problems than in the previous edition.

Crafted category

SAT’04: hard, unsolved benchmarks from the previous year

Biere: LinvRinv benchmarks (proposed by Cook last year)

Sabharwal: counting, ordering and pebbling problems

Jarvisalo: based on 3-regular graphs

Lynce: Social Golfer problem (a golf problem in St Andrews?)

Sorge: algebraic benchmarks

Markstrom: problems generating long learned clauses

Roussel: PHNF form of the previous year’s medium benchmarks

Wider range of problems than in the previous edition.

Environment

The hardware:

LRI: 16 Athlon 1800+ with 1 GB RAM

UC: 8 Athlon 1800+ with 2 GB RAM; 32 Pentium III 450 with 1 GB RAM

- Running GNU/Linux (RH flavor).

- Solvers compiled with GCC 3.3.5.

- Java solver using the Java 1.5.0_02 JVM.

Provided by:

- LINC Lab, Department of ECECS, University of Cincinnati

- LRI, Université de Paris-Sud

The first stage

- Aim: to detect the most promising solvers for a given (category, specialty) pair

- 20-minute timeout (longer than in previous years)

- Solvers answering incorrectly move to the demonstration division

Overview on all the benchmarks (see posters)

[Figure: All solvers on All benchmarks. x-axis: #Solved (0 to 1000); y-axis: CPU time needed (s) (0 to 1400).]

Number of benchmarks solved per solver, as listed in the figure legend:
tts-3-0 (153), lsatv1.1 (217), (SatELite_release) (294), kcnfs-2004 (305), adaptnovelty (315), saps (333), (rpaws5) (337), (kcnfs) (377), wllsatv1 (386), (rpaws40) (394), hsatrr (402), rpaws10 (402), rrsaps (432), vw (435), Dew_Satz_1c (441), Dew_Satz_1b (445), g2wsat (445), ranov (457), (Dew_Satz_1e) (467), (Dew_Satz_1d) (476), Dew_Satz_1a (488), compsat (587), hsat.5 (590), hsat.1 (591), sat4j.jar (596), (CirCUsB) (606), (CirCUsA) (621), march_dl (621), zchaff (630), HaifaSat2 (637), Jerusat1.31_A (641), Jerusat1.31_B (641), (CirCUsD) (643), (midisat_static) (644), HaifaSat (653), zchaff_rand (655), vallst.sh (667), csat (694), (eureka_B) (696), (eureka_C) (697), (eureka_A) (710), minisat_static (780), SatELiteGTI (818)
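Plots with "#Solved" on one axis and "CPU time needed" on the other are commonly built by sorting each solver's runtimes on the benchmarks it solved. The sketch below assumes that construction; the slides do not describe how the curves were produced, and `cactus_points` is a hypothetical helper name.

```python
def cactus_points(cpu_times, timeout=1200.0):
    """cpu_times: CPU seconds per benchmark for one solver, None if unsolved.
    Returns (number solved, CPU time of the slowest of them) pairs."""
    solved = sorted(t for t in cpu_times if t is not None and t <= timeout)
    return [(i + 1, t) for i, t in enumerate(solved)]


# Runs above the timeout and unsolved runs do not contribute a point.
points = cactus_points([30.0, None, 2.5, 1500.0, 700.0])
print(points)
```

Read this way, a curve that extends further along the #Solved axis belongs to a solver that solved more benchmarks within the time limit.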

Clustering on all the benchmarks

[Figure: SAT 2005 clustering of all solvers on all benchmarks. Axes: Solvers versus Distance (#Benchs over 1657), with distance ticks from 0 to 415.]

Overview of the solvers on random benchmarks

[Figure: All solvers on Random benchmarks. x-axis: #Solved (0 to 180); y-axis: CPU time needed (s) (0 to 1400).]

Number of benchmarks solved per solver, as listed in the figure legend:
(CirCUsA) (3), compsat (3), HaifaSat (3), lsatv1.1 (3), (CirCUsD) (4), (CirCUsB) (5), (eureka_B) (5), HaifaSat2 (5), hsatrr (5), Jerusat1.31_A (7), (eureka_A) (9), (eureka_C) (9), Jerusat1.31_B (11), (SatELite_release) (12), zchaff_rand (12), zchaff (16), hsat.1 (17), vallst.sh (17), hsat.5 (19), csat (27), (midisat_static) (46), sat4j.jar (50), SatELiteGTI (54), minisat_static (56), wllsatv1 (67), (Dew_Satz_1d) (73), march_dl (74), Dew_Satz_1b (84), Dew_Satz_1c (85), Dew_Satz_1a (87), saps (101), adaptnovelty (107), (Dew_Satz_1e) (107), (rpaws40) (112), rrsaps (116), (rpaws5) (139), (kcnfs) (140), kcnfs-2004 (140), vw (148), rpaws10 (151), g2wsat (158), ranov (178)

Overview of the size of crafted benchmarks

[Figure: Distribution of the sizes of crafted benchmarks. One row per benchmark series (submitters biere05, jarvisalo05, lynce05, markstrom05, roussel05, sabharwal05, sat04, sorge05); x-axis: # of literals, logarithmic scale from 10^3 to 10^6.]

Overview of the solvers on crafted benchmarks

[Figure: All solvers on Crafted benchmarks. x-axis: #Solved (0 to 400); y-axis: CPU time needed (s) (0 to 1400).]

Number of benchmarks solved per solver, as listed in the figure legend:
kcnfs-2004 (84), (rpaws5) (97), tts-3-0 (98), lsatv1.1 (103), adaptnovelty (108), (kcnfs) (120), saps (121), rpaws10 (126), (SatELite_release) (140), (rpaws40) (148), g2wsat (156), vw (156), ranov (163), wllsatv1 (166), rrsaps (176), (Dew_Satz_1e) (192), hsatrr (206), Dew_Satz_1c (227), Dew_Satz_1b (232), Dew_Satz_1a (233), (Dew_Satz_1d) (234), sat4j.jar (256), compsat (269), HaifaSat2 (292), HaifaSat (294), hsat.5 (306), hsat.1 (307), (CirCUsB) (309), Jerusat1.31_B (312), zchaff (312), (eureka_C) (316), (CirCUsA) (318), (CirCUsD) (320), zchaff_rand (320), Jerusat1.31_A (321), csat (324), (midisat_static) (325), (eureka_B) (337), march_dl (337), (eureka_A) (342), minisat_static (348), SatELiteGTI (362), vallst.sh (368)

Overview of the size of industrial benchmarks

[Figure: Distribution of the sizes of industrial benchmarks. x-axis: # of literals, logarithmic scale from 10^3 to 10^7. One row per benchmark series:]

grieu05/vmpc
maris05/Depots
maris05/DriverLog
maris05/Ferry
maris05/Rovers
maris05/Satellite
narain05/vpn
v05/liveness-unsat-2-0 (label truncated in the source)
velev05/vliw-sat-2-0
velev05/vliw-sat-4-0
velev05/vliw-unsat-2-0
zarpas05/01
zarpas05/07
zarpas05/18
zarpas05/1_11
zarpas05/20
zarpas05/23
zarpas05/26
zarpas05/29
zarpas05/2_14

Overview of the solvers on industrial benchmarks

[Figure: All solvers on Industrial benchmarks. x-axis: #Solved (0 to 500); y-axis: CPU time needed (s) (0 to 1400).]

Number of benchmarks solved per solver, as listed in the figure legend:
tts-3-0 (55), kcnfs-2004 (81), adaptnovelty (100), (rpaws5) (101), lsatv1.1 (111), saps (111), ranov (116), (kcnfs) (117), rpaws10 (125), Dew_Satz_1b (129), Dew_Satz_1c (129), g2wsat (131), vw (131), (rpaws40) (134), rrsaps (140), (SatELite_release) (142), wllsatv1 (153), Dew_Satz_1a (168), (Dew_Satz_1e) (168), (Dew_Satz_1d) (169), hsatrr (191), march_dl (210), hsat.5 (265), hsat.1 (267), (midisat_static) (273), vallst.sh (282), sat4j.jar (290), (CirCUsB) (292), (CirCUsA) (300), zchaff (302), Jerusat1.31_A (313), compsat (315), Jerusat1.31_B (318), (CirCUsD) (319), zchaff_rand (323), HaifaSat2 (340), csat (343), (eureka_B) (354), HaifaSat (356), (eureka_A) (359), (eureka_C) (372), minisat_static (376), SatELiteGTI (402)

The second stage

And Now....

The final results

Overview of the solvers on random benchmarks

[Figure: Second Stage: All solvers on Random benchmarks. x-axis: #Solved (0 to 250); y-axis: CPU time needed (s) (0 to 7000).]

Number of benchmarks solved per solver, as listed in the figure legend:
minisat_static (78), SatELiteGTI (79), march_dl (99), saps (104), wllsatv1 (104), Dew_Satz_1a (118), adaptnovelty (119), kcnfs-2004 (167), vw (170), g2wsat (178), ranov (209)

Random SAT specialty, the winners

1. ranov

2. g2wsat

3. vw

Random SAT specialty, the winners

Solver          Score    SAT answers   UNSAT answers
ranov           163903   209           0
g2wsat          101286   178           0
vw               76002   170           0
adaptnovelty     21748   119           0
saps             15603   104           0
kcnfs-2004       14604    92           0
dewSatz-1a        8943    68           0
march-dl          7444    56           0
wllsatv1          7202    59           0
satELiteGTI       5198    46           0
minisat           5147    45           0

Random UNSAT specialty, the winners

1. kcnfs-2004

2. march-dl

3. dewSatz-1a

UNSAT specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
kcnfs-2004      97930    0             75
march-dl        25228    0             43
dewSatz-1a      19456    0             50
wllsatv1        12902    0             45
minisat          7369    0             33
satELiteGTI      7335    0             33

Random SAT+UNSAT specialty, the winners

1. kcnfs-2004

2. march-dl

3. dewSatz-1a

Random SAT+UNSAT specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
kcnfs-2004      95075    92            75
march-dl        27141    56            43
dewSatz-1a      22940    68            50
wllsatv1        16145    59            45
satELiteGTI     10074    46            33
minisat         10058    45            33

Overview of the solvers on crafted benchmarks, based on a selection of the first stage benchmarks

[Figure: Second Stage: All solvers on Crafted benchmarks. x-axis: #Solved (0 to 400); y-axis: CPU time needed (s) (0 to 7000).]

Number of benchmarks solved per solver, as listed in the figure legend:
tts-3-0 (102), hsat.1 (324), Jerusat1.31_A (347), zchaff (356), march_dl (357), zchaff_rand (358), csat (371), minisat_static (379), vallst.sh (387), SatELiteGTI (400)

Crafted SAT specialty, the winners

1. vallst

2. march-dl

3. hsat-1

Crafted SAT specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
vallst          31258    138           0
march-dl        27656    138           0
hsat-1          20156    130           0
satELiteGTI     17418    122           0
minisat         17210    122           0
csat            13791    113           0
zchaff          13692    112           0
zchaff-rand     11431    107           0
jerusat-A       10702    104           0
tts               475      5           0

Crafted UNSAT specialty, the winners

1. satEliteGTI

2. minisat

3. vallst and march-dl

Crafted UNSAT specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
satELiteGTI     35639    0             126
minisat         26159    0             121
vallst          25532    0             100
march-dl        25371    0              99
csat            23878    0             112
tts-3-0         20765    0              54
hsat-1          19936    0              90
zchaff          14359    0              89
zchaff-rand     12419    0              78
jerusat-A        9275    0              77

Crafted SAT+UNSAT specialty, the winners

1. vallst

2. satEliteGTI

3. march-dl

Crafted SAT+UNSAT specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
vallst          56445    138           100
satELiteGTI     53128    122           126
march-dl        52432    138            99
minisat         43691    122           121
hsat-1          39497    130            90
csat            38324    113           112
zchaff          27455    112            89
zchaff-rand     24171    107            78
tts-3-0         21298      5            54
jerusat-A       19632    104            77

Overview of the solvers on original industrial benchmarks

[Figure: Second Stage: All solvers on renamed Industrial benchmarks. x-axis: #Solved (0 to 300); y-axis: CPU time needed (s) (0 to 14000).]

Number of benchmarks solved per solver, as listed in the figure legend:
wllsatv1 (92), hsat.5 (153), vallst.sh (154), sat4j.jar (180), compsat (189), zchaff (197), zchaff_rand (226), csat (231), HaifaSat (242), Jerusat1.31_B (243), minisat_static (250), SatELiteGTI (267)

Industrial SAT specialty, the winners

1. satEliteGTI

2. minisat

3. jerusat-B and haifaSat

Industrial SAT Specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
satELiteGTI     73506    180           0
minisat         50985    166           0
jerusat-B       38625    163           0
haifaSat        28428    151           0
zchaff-rand     24885    132           0
csat            21997    140           0
zchaff          19236    121           0
compsat         16715    114           0
sat4j           12898    110           0
wllsatv1        11390     86           0
hsat-5          11046     99           0
vallst           7757     85           0

Industrial UNSAT specialty, the winners

1. satEliteGTI

2. zchaff-rand

3. haifaSat

Industrial UNSAT specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
satELiteGTI     27518    0             87
zchaff-rand     26792    0             94
haifaSat        23666    0             91
minisat         19863    0             84
csat            15892    0             91
zchaff          13829    0             76
jerusat-B       10225    0             80
hsat-5          10029    0             54
vallst           9192    0             69
compsat          9097    0             75
sat4j            8654    0             70
wllsatv1         1053    0              6

Industrial SAT+UNSAT specialty, the winners

1. satEliteGTI

2. minisat

3. zchaff-rand and haifaSat

SAT+UNSAT specialty, the complete ranking

Solver          Score    SAT answers   UNSAT answers
satELiteGTI     99662    180           87
minisat         69485    166           84
haifaSat        50931    151           91
zchaff-rand     50515    132           94
jerusat-B       47487    163           80
csat            36526    140           91
compsat         25399    114           75
zchaff          31702    121           76
sat4j           21097    110           70
hsat-5          20995     99           54
vallst          16874     85           69
wllsatv1        12467     86            6

The proof format

All of this work was done by Allen van Gelder.

- For each benchmark bench found UNSAT, a proof file bench.proof must be given.

- There are two possibilities for the proof format:

Resolution format: every resolution step is provided, and the resolvents are given explicitly.

Trace format: only the resolution steps are provided, not the resolvents. A more compact format.

- A checker can check a proof using the resolution format.

- The trace format can be converted into the resolution format.

Examples of the proof formats

Example (Original input file)

p cnf 2 3
1 -2 0
1 2 0
-1 0

Example (possible proof, resolution format)

4 2 1 2 2 1 1 2
5 1 3 4 0 0

Example (possible proof, trace format)

4 2 1 2
5 1 3 4
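To see why the input file above is unsatisfiable, the two resolution steps can be replayed by hand. The sketch below parses the three-clause DIMACS input and recomputes the resolvents from a list of steps. It does not read the competition's proof files: the step tuples `(new_id, antecedent_a, antecedent_b)` are a simplified, hypothetical representation, and all function names are assumptions.

```python
def parse_dimacs(text):
    """Return the clauses of a DIMACS CNF file as frozensets of literals."""
    clauses = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line[0] in "cp":  # skip comments and the problem line
            continue
        lits = [int(tok) for tok in line.split()]
        clauses.append(frozenset(lits[:-1]))  # drop the trailing 0
    return clauses


def resolve(c1, c2):
    """Resolvent of two clauses that clash on exactly one literal."""
    pivots = [lit for lit in c1 if -lit in c2]
    if len(pivots) != 1:
        raise ValueError("clauses must clash on exactly one literal")
    p = pivots[0]
    return (c1 - {p}) | (c2 - {-p})


def replay(clauses, steps):
    """Recompute every resolvent; input clauses are numbered from 1."""
    derived = {i + 1: c for i, c in enumerate(clauses)}
    for new_id, a, b in steps:
        derived[new_id] = resolve(derived[a], derived[b])
    return derived


CNF = """p cnf 2 3
1 -2 0
1 2 0
-1 0
"""

clauses = parse_dimacs(CNF)
# Clause 4 from clauses 1 and 2, then clause 5 from clauses 3 and 4.
derived = replay(clauses, [(4, 1, 2), (5, 3, 4)])
print(sorted(derived[4]), sorted(derived[5]))
```

Clause 4 is the unit clause (1) and clause 5 is the empty clause, which is what a resolution refutation has to reach. Recomputing resolvents from the antecedents alone is also the idea behind converting a compact trace into a fully explicit proof.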

The contestants

zchaff: uses the resolution format

ttsp: uses the trace format

- The solvers behave quite differently: only 21 benchmarks are found unsat by both solvers.

- They do not use the same output format.

Zchaff vs TTS: running time

            TTS                                 ZCHAFF
CPU Time    CERT      ORIG    Cost      Size    CERT    ORIG    Cost    Size
sat05-561   0,01      0,02    -50,00    5       0       0       -25,01  7
sat05-562   6,83      4,38    55,94     415     0,09    0,01    633,33  982
sat05-1273  (471,36)  286,59  64,47     -       12,55   8,42    49,13   54066
sat05-1171  1,4       1,36    2,94      107     6,42    5,06    26,94   19907
sat05-1172  7,89      7,06    11,76     594     41,11   34,     19,84   91341
sat05-1185  2,15      1,87    14,97     245     6,28    5       25,66   15153
sat05-1186  12,26     9,71    26,26     1377    65,28   56,11   16,35   132093
sat05-1213  90,97     3,83    2275,20   9254    12,18   9,57    27,20   28493
sat05-1214  (403,76)  21,41   1785,85   -       45,39   40,07   13,29   84726
sat05-1227  (523,35)  6,69    7722,87   -       8,08    6,33    27,53   24433
sat05-1228  (600,91)  30,27   1885,17   -       60,79   50,88   19,50   102763
sat05-2308  143,54    7,76    1749,74   41533   30,23   26,4    14,51   43531
sat05-2309  (153,31)  7,56    1927,91   -       16,03   13,37   19,91   22741
sat05-2323  (162,45)  12,75   1174,12   -       85,7    69,31   23,65   53554
sat05-2325  (124,59)  12,78   874,88    -       232,9   195,33  19,23   91270
sat05-2594  7,93      0,96    726,04    5153    34,92   27,41   27,37   50009
sat05-2595  45,5      1,12    3962,50   16756   5,75    3,97    44,74   18110
sat05-2596  161,96    1,23    13067,48  49384   25,8    19,66   31,23   39092
sat05-2610  221,81    2,08    10563,94  61406   71,13   58,1    22,42   65084
sat05-2625  50,79     2,6     1853,46   27962   40,02   32,01   25,03   41484
sat05-2654  (196,8)   7,39    2563,06   -       205,39  174,87  17,45   89099

Size in KB.
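
The Cost column appears consistent with a relative overhead in percent, (CERT - ORIG) / ORIG * 100, assuming CERT and ORIG are the CPU times with and without certificate generation. That reading is inferred from the figures, not stated on the slide, and the function name `certificate_overhead` below is hypothetical. A minimal sketch that recomputes two TTS entries (decimal commas written as points):

```python
def certificate_overhead(cert_seconds: float, orig_seconds: float) -> float:
    """Extra CPU time, in percent, spent when a certificate is produced.

    Assumed formula, inferred from the table: (CERT - ORIG) / ORIG * 100.
    """
    return (cert_seconds - orig_seconds) / orig_seconds * 100.0


# TTS on sat05-562: CERT 6,83 and ORIG 4,38 in the table.
print(round(certificate_overhead(6.83, 4.38), 2))
# TTS on sat05-1213: CERT 90,97 and ORIG 3,83 in the table.
print(round(certificate_overhead(90.97, 3.83), 2))
```

Some entries, such as zchaff on sat05-562 (633,33), do not match a recomputation from the two-decimal times shown, which suggests the published costs were computed from unrounded timings.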

Size of the Problems

sat05-561   24   61
sat05-562   90   262
sat05-1273  116  1362
sat05-1171  80   370
sat05-1172  120  672
sat05-1185  90   415
sat05-1186  132  738
sat05-1213  80   650
sat05-1214  120  1212
sat05-1227  90   775
sat05-1228  132  1398
sat05-2308  120  480
sat05-2309  120  480
sat05-2323  140  560
sat05-2325  140  560
sat05-2594  90   240
sat05-2595  90   240
sat05-2596  90   240
sat05-2610  105  280
sat05-2625  120  320
sat05-2654  150  400

Zchaff certificates on easy UNSAT IBM benchmarks

Bench             Cert      Orig  Cost (%)  Size (KB)
01 SAT dat.k10    0,9       0,23  284,19    12656
07 SAT dat.k30    4,21      3,41  23,41     11353
07 SAT dat.k35    5,11      4,29  19,02     11836
18 SAT dat.k10    35,93     0,32  10957,58  (711381)
18 SAT dat.k15    (103,42)  3,64  2743,20   -
1 11 SAT dat.k10  3,59      0,64  462,38    54786
1 11 SAT dat.k15  30,08     3,65  724,87    (525236)
20 SAT dat.k10    3,02      0,29  931,40    54839
23 SAT dat.k10    0,78      0,23  236,80    10251
23 SAT dat.k15    (98,71)   1,23  7913,41   -
26 SAT dat.k10    12,34     7,95  55,27     148
2 14 SAT dat.k10  15,09     0,48  3017,36   (286783)
2 14 SAT dat.k15  (102,2)   1,49  6778,59   -

Size of the "easy" IBM benchmarks

Benchmark               # var  # clauses  Size
01 SAT dat.k10.cnf      9275   38802      12656
07 SAT dat.k30.cnf      11081  31034      11353
07 SAT dat.k35.cnf      12116  33469      11836
1 11 SAT dat.k10.cnf    28280  111519     54786
(1 11 SAT dat.k15.cnf)  44993  178110     (525236)
(18 SAT dat.k10.cnf)    17141  69989      (711381)
18 SAT dat.k15.cnf      25915  106325     -
20 SAT dat.k10.cnf      17567  72087      54839
(2 14 SAT dat.k10.cnf)  12859  49351      (286783)
2 14 SAT dat.k15.cnf    20302  78395      -
23 SAT dat.k10.cnf      18612  76086      10251
23 SAT dat.k15.cnf      29106  119635     -
26 SAT dat.k10.cnf      55591  277611     148

The reasons for failure

Input format debatable: several standard formats already exist for gate descriptions (netlists, trace format, ...).

No benchmarks to play with: the only benchmarks currently available in the edimacs format are the ones submitted for the special track.

No solver to play with: there is currently no solver able to read the edimacs input format.

Next SAT contest?

It will be in 2007... book your T-shirts now!

The first pseudo boolean solver evaluation

To be presented by Olivier Roussel and Vasco Manquinho
