Engineering Static Analyzers with Soufflé Bernhard Scholz The University of Sydney, Australia Herbert Jordan University of Innsbruck, Austria Pavle Subotic University College London, UK Alexander Jordan Oracle Labs, Brisbane


Page 1: Engineering Static Analyzers with Soufflé · Table: Data structure evaluation tools for random-graph connectivity problem. B.Scholz, H.Jordan, P. Subotic, T.Westmann (Oracle) July

Engineering Static Analyzers with Soufflé

Bernhard Scholz, The University of Sydney, Australia

Herbert Jordan, University of Innsbruck, Austria

Pavle Subotic, University College London, UK

Alexander Jordan, Oracle Labs, Brisbane

Page 2

Agenda

• Soufflé Overview as a Tool
• Brief Introduction to Datalog
• Soufflé as a Language
• Use-Cases: Static Program Analysis

Page 3

Soufflé: Overview

Bernhard Scholz, The University of Sydney

Page 4

Static Program Analysis Tools

• Lots of applications for static program analysis
  • Bug finding, compiler optimisation, program comprehension
• Fully functional analysis for real languages is expensive!
  • State-of-the-art tools
    • Developed in languages like C++ / MLOCs
    • Testing
    • Fine-tuning: scalability vs. precision vs. effort to develop
• Need to rapidly develop static program analysis tools
  • Deep design space explorations

Page 5

Datalog as DSL for Static Program Analysis

• Datalog in static program analysis
  • Reps '94, Engler '96, …
• Datalog is restricted Horn logic
  • Declarative programming for recursive relations
  • Finite constant set
  • No backtracking for evaluation / fast
  • Extensional / intensional database
• Extractor
  • Syntactic translation to logical relations
• Datalog engine
  • Extensional database / facts: input relations
  • Intensional database / rules: program analysis specification

[Figure: Input Program → Extractor → Input Relations → Datalog Engine → Result, with the Program Analysis as the engine's second input]

Page 6

Security Analysis in Datalog

• Vulnerable statement must be protected in the code
• Safe and unsafe regions in the CFG

Example: Security Analysis in Java

  void m(int i, int j) {
    s:  while (i < j) {
    l1:   protect();
    l2:   i++;
        }
    l3: vulnerable();
  }

[Figure: control-flow graph over the nodes s, l1, l2, l3]

  // I/O relations
  E(s:Node, d:Node) input
  P(node:Node) input
  I(node:Node) output

  // Security Analysis
  I("s").
  I(y) :- I(x), E(x,y), !P(y).

Security Analysis

Security-sensitive methods, e.g., vulnerable()

Permission checks prior to a security-sensitive method, e.g., protect()

Check that protect() is invoked on all paths

Datalog specification: concise security check with rules & facts

B.Scholz, H.Jordan, P. Subotic, T.Westmann (Oracle) July 7, 2016 6 / 1

[Figure panels: Source, CFG (nodes s, l1, l2, l3), Datalog Program]

Datalog Program:

  Unsafe("s").
  Unsafe(y) :- Unsafe(x), Edge(x, y), !Protect(y).
  Violation(x) :- Vulnerable(x), Unsafe(x).
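The Unsafe/Violation rules can be traced by hand with a small fixpoint loop. The sketch below is only an illustration in Python, not how Soufflé evaluates the program. The edge list is an assumption read off the Java method (loop head s, body l1 and l2, exit l3), and the function names are made up for this example.

```python
# Hypothetical fact encoding of the CFG of m(); the edges are an assumption
# derived from the while loop: s -> l1 -> l2 -> s, and s -> l3 on loop exit.
EDGE = {("s", "l1"), ("l1", "l2"), ("l2", "s"), ("s", "l3")}
PROTECT = {"l1"}      # nodes that call protect()
VULNERABLE = {"l3"}   # nodes that call vulnerable()


def unsafe_nodes(edges, protect, entry="s"):
    """Unsafe(entry).  Unsafe(y) :- Unsafe(x), Edge(x, y), !Protect(y)."""
    unsafe = {entry}
    changed = True
    while changed:  # repeat until no rule application adds a new fact
        changed = False
        for x, y in edges:
            if x in unsafe and y not in protect and y not in unsafe:
                unsafe.add(y)
                changed = True
    return unsafe


def violations(edges, protect, vulnerable, entry="s"):
    """Violation(x) :- Vulnerable(x), Unsafe(x)."""
    return vulnerable & unsafe_nodes(edges, protect, entry)


print(sorted(unsafe_nodes(EDGE, PROTECT)))
print(sorted(violations(EDGE, PROTECT, VULNERABLE)))
```

Under these assumed edges, l3 is reachable from s without passing through protect() (the loop may run zero times), so it is reported as a violation.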

Page 7

Why is Datalog not everywhere?

Performance Gap: C++ vs. Datalog

On a graph with 100 vertices connected by 100K random edges

C++: 2 sec, 34 MB

  using Tuple = std::array<int,2>;
  using Relation = std::set<Tuple>;
  Relation edge, tc;
  edge = someSource();
  tc = edge;
  auto delta = tc;
  while (!delta.empty()) {
    Relation nDelta;
    for (const auto& t1 : delta) {
      auto a = edge.lower_bound({t1[1], 0});
      auto b = edge.upper_bound({t1[1]+1, 0});
      for (auto it = a; it != b; ++it) {
        auto& t2 = *it;
        Tuple tr({t1[0], t2[1]});
        if (!contains(tc, tr))
          nDelta.insert(tr);
      }
    }
    tc.insert(nDelta.begin(), nDelta.end());
    delta.swap(nDelta);
  }

µZ Datalog: 340 sec, 1667 MB

  path(X,Y) :- edge(X,Y).
  path(X,Z) :- path(X,Y), edge(Y,Z).
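The hand-written C++ loop follows the semi-naive pattern: only the tuples discovered in the previous round (delta) are joined with edge. A minimal Python sketch of the same idea follows; the function name and the dictionary index are choices made for this illustration, not part of the slides.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Semi-naive evaluation of
         path(X,Y) :- edge(X,Y).
         path(X,Z) :- path(X,Y), edge(Y,Z).
    """
    # Index edge by its first column, standing in for the ordered-set
    # range lookup used in the C++ version.
    succ = defaultdict(set)
    for x, y in edges:
        succ[x].add(y)

    tc = set(edges)
    delta = set(edges)
    while delta:
        new_delta = set()
        for x, y in delta:          # only join the newly found tuples
            for z in succ[y]:
                if (x, z) not in tc:
                    new_delta.add((x, z))
        tc |= new_delta
        delta = new_delta
    return tc


print(sorted(transitive_closure({(1, 2), (2, 3)})))
```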


Why the Gap?

- General evaluation algorithms
- Existing engines focus on DB apps

What can we do?

Page 8

Soufflé: A New Datalog Synthesis Tool

• New paradigm for evaluating Datalog programs
  • To achieve similar performance to hand-written C++ code
• Assumptions
  • Rules do not change in static program analysis tools
  • Facts (= input program representation) may change
  • Executed on large multi-core shared-memory machines
• Solution:
  • Synthesis with Futamura projections
  • Apply partial specialization techniques
  • Synthesis in stages
    • Each stage opens up new opportunities for optimisations

Page 9

How does Soufflé work?

[Figure: a Fact Extractor turns Inputs into Facts. SOUFFLÉ translates the Spec via AST and RAM into C++, producing an ANALYZER that maps Facts to the Result.]

Standalone tool as binary or library

Page 10

Futamura Projections

• Specialization
• Specialization hierarchy

[Figure: (a) Interpretation: an Interpreter takes a Program and Input and produces Output. (b) Futamura Projection: Mix specializes the Interpreter with respect to the Program, giving a Specialized Program that maps Input to Output; this acts as a Compiler from Source Language to Target Program.]

Specializations: Declarative to Imperative

Level 1: Semi-naïve evaluation + Datalog rules ⇒ RAM

Level 2: RAM evaluator + RAM operations ⇒ templatized C++

Level 3: C++ compiler + template code ⇒ program

[Photo: Professor Yoshihiko Futamura]

Stages: Datalog Specialization, RAM Specialization, C++ Specialization
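The first Futamura projection can be illustrated on a toy scale: specializing an interpreter with respect to a fixed program yields a residual program that no longer inspects the source. The Python sketch below uses closures as a stand-in for a real partial evaluator ("mix"). The two-instruction language and all names are invented for this example and have nothing to do with Soufflé's RAM.

```python
def interpret(program, x):
    """Interpreter: decodes every instruction on every run."""
    acc = x
    for op, arg in program:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        else:
            raise ValueError(f"unknown op: {op}")
    return acc


def specialize(program):
    """Toy 'mix': decode the program once and return a residual function
    that only performs the arithmetic steps."""
    steps = []
    for op, arg in program:
        if op == "add":
            steps.append(lambda acc, a=arg: acc + a)
        elif op == "mul":
            steps.append(lambda acc, a=arg: acc * a)
        else:
            raise ValueError(f"unknown op: {op}")

    def residual(x):
        acc = x
        for step in steps:
            acc = step(acc)
        return acc

    return residual


PROGRAM = [("add", 3), ("mul", 2)]   # computes (x + 3) * 2
compiled = specialize(PROGRAM)
print(interpret(PROGRAM, 4), compiled(4))
```

The residual function plays the role of the "Specialized Program" in the figure: the program is fixed once, and only the input varies between runs, which matches the assumption that rules are stable while facts change.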

Page 11

Third Stage: RAM to C++

Meta-Programming in C++

Translates RAM to C++ with templates

Relation type determines the underlying data structure for indexes
  Use Tries for relations with 1-2 attributes
  Use B-trees for relations with 2 or more attributes

Avoid virtual dispatch for basic data structures
  Templates are used for "static" dispatch

Highly optimized for parallel execution (optimistic locking, ...)

  Tool                            Time [s]   Memory [MB]
  Soufflé / B-tree (sequential)   1.26       25.6
  Soufflé / B-tree (parallel)     0.42       26.3
  Soufflé / Trie (sequential)     0.38       3.5
  Soufflé / Trie (parallel)       0.12       4.5

Table: Data structure evaluation tools for random-graph connectivity problem.

Soufflé's Performance

• Example
• Performance numbers
• Vs. hand-crafted: 2 s / 34 MB

Page 12

USE CASE A: Security in OpenJDK7

OpenJDK7: 7 MLOC, 1.4M variables, 350K heap objects, 160K methods, 590K invocations, 1G tuples

[Figure: Soufflé -c compiles Points-to.dl and SecurityAnalysisX.dl into Analyzer X; Analyzer X takes JDK X build Y and produces Results]

Page 13

USE CASE B: AWS VPC Networks

~10-100K instances, 90% static rules, 10% dynamic rules

[Figure: an External Tool passes Analysis + Data through a Java Interface; the Soufflé compiler handles the 90% share, the Soufflé interpreter the 10% share; Results are returned]

Page 14

USE CASE C: Static Parallel C/C++ Code Analysis Framework

[Figure: Insieme Compiler Framework. Parallel C++ sources → Frontend → IR → Optimizer (Analysis, Transformation) → Backend → exe; Soufflé is attached to the Analysis component.]

[Figure: Insieme's Datalog-based Analysis Module. IR → Fact Extractor → Input Facts → Insieme's Datalog Analysis Framework (pointsto, escape, …) → Output Facts → Result Extractor → Analysis Results]

Page 15

USE CASE D: Security in Smart Contracts

Page 16

USE CASE E: Points-To Analysis

• Java points-to analysis in Datalog
  • Builds memory abstractions for Java programs
  • Feature-rich: various types of points-to contexts
  • Efficient implementation
• Pipeline
  • SOOT for extracting semantic relations
  • Parameterizable analysis
• Recently ported to Soufflé
  • SOAP'17

[Figure: points-to graph relating Variables and Heap Objects; node and edge labels x, y, z, a, b, f, vP, hP]

Page 17

A Brief Introduction to Datalog

Adapted from Kifer/Bernstein/Lewis/Roehm

Page 18

Datalog

• Datalog is a logic-based query language
  • Easy-to-use relational programming language
  • Recursive queries
  • Adapts Prolog-style syntax
• Based on first-order logic
  • Decidable fragment of logic
  • Finite universe
  • No functors

Page 19

Renewed Interest in Datalog

• Original research is from the 1980s and '90s
  • cf. systems Coral, LDL++
• Datalog for declarative querying of recursive structures
  • E.g. graphs or networks
• New applications can benefit from this
  • Data integration
  • Program analysis
  • Declarative networking
  • Security
  • Graph databases
  • …
• Cf. Huang, Green, Loo: "Datalog and Emerging Applications: An Interactive Tutorial", SIGMOD 2011.

Page 20

Basic Syntax of Datalog

• Three types of Horn clauses
  • Fact: fact.
  • Rule: head :- body.
  • Query: ?- body.
• Building blocks for clauses are predicates
  • A Boolean function taking a fixed number of args
  • A predicate followed by its arguments is called an atom

Page 21

'Happy Drinker' Example: Facts

• Predicates example:
    frequents(Drinker, Bar)
    likes(Drinker, Beer)
    sells(Bar, Beer, Price)
• Facts example:
    frequents("jon", "the_rose").
    likes("jon", "vb").
    sells("the_rose", "vb", 5).
• Query example:
    ?- sells(Bar, Beer, Price).

Page 22

Basic Rule Structure of Datalog

• Define a relation person that lists all names of persons from the likes relation
    person(P) :- likes(P, _).
• Define a relation cheapBars containing the names of all bars selling cheap "loewenbrau" beer (costing less than $4).
    cheapBars(B, P) :- sells(B, "loewenbrau", P), P < 4.
• Retrieve the cost of "loewenbrau" at the "ueberbar"
    ?- cheapBars("ueberbar", P).
• To find the bar name and beer prices of all bars in cheapBars that sell "loewenbrau" for less than $4
    ?- cheapBars(B, P), P < 4.

Page 23

Anatomy of a Rule

  happy(D) :- frequents(D,Bar), likes(D,Beer), sells(Bar,Beer,P).

Head: happy(D)
Read the symbol ":-" as "if"
Body: frequents(D,Bar), likes(D,Beer), sells(Bar,Beer,P) (read ',' as AND)

• The rule is a query asking for "happy" drinkers: those that frequent a bar that serves a beer that they like.
• Variable bindings act as equivalence constraints
• Variables must be grounded, i.e., show up in atoms in the body.

Page 24

Atoms in Body

• An atom is a predicate, or relation name, with variables or constants as arguments.
• The head of a rule is an atom; the body is the conjunction of one or more atoms
• Example: sells(Bar, Beer, P)
  • The predicate (sells) is the name of a relation
  • The arguments (Bar, Beer, P) are variables

Page 25

Rule Interpretation

  happy(D) :- frequents(D,Bar) AND likes(D,Beer) AND sells(Bar,Beer,P)

Distinguished variable: D
Nondistinguished variables: Bar, Beer, P

Interpretation: drinker D is happy if there exist a Bar, a Beer, and a price P such that D frequents the Bar, likes the Beer, and the bar sells the beer at price P.
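Read operationally, the rule is a three-way join followed by a projection onto the distinguished variable D. The following Python sketch spells that out with set comprehensions. The facts about "amy" are hypothetical additions so that the join has something to filter out; the "jon" facts come from the earlier slide.

```python
# Facts for "jon" are from the slides; the "amy" facts are hypothetical.
FREQUENTS = {("jon", "the_rose"), ("amy", "the_rose")}
LIKES = {("jon", "vb"), ("amy", "loewenbrau")}
SELLS = {("the_rose", "vb", 5)}


def happy(frequents, likes, sells):
    """happy(D) :- frequents(D,Bar), likes(D,Beer), sells(Bar,Beer,P)."""
    return {
        d
        for d, bar in frequents
        for d2, beer in likes
        if d2 == d                          # shared variable D
        for bar2, beer2, _price in sells
        if bar2 == bar and beer2 == beer    # shared variables Bar and Beer
    }


print(happy(FREQUENTS, LIKES, SELLS))
```

Repeated variables in the rule body turn into equality tests in the join, which is what "variable bindings as equivalence constraints" refers to.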

Page 26

Negated Atoms

• We may put NOT (written "!") in front of an atom to negate its meaning.
  • Example: expensiveBars(B,P) :- sells(B,_,P), !cheapBars(B,P).
• Circular negated definitions (direct/indirect) are not permitted.
  • Example: A(X) :- !A(X).   <= no good.
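Negation is well defined when the negated relation can be fully computed before the rule that negates it, which is why circular definitions are rejected. A small Python sketch of this ordering for the expensiveBars example follows; the sells facts and prices are made up for illustration.

```python
# Hypothetical sells(Bar, Beer, Price) facts.
SELLS = {
    ("the_rose", "loewenbrau", 3),
    ("the_rose", "vb", 5),
    ("ueberbar", "loewenbrau", 6),
}


def cheap_bars(sells):
    """cheapBars(B,P) :- sells(B,"loewenbrau",P), P < 4."""
    return {(b, p) for b, beer, p in sells if beer == "loewenbrau" and p < 4}


def expensive_bars(sells):
    """expensiveBars(B,P) :- sells(B,_,P), !cheapBars(B,P)."""
    cheap = cheap_bars(sells)  # the negated relation is complete before use
    return {(b, p) for b, _beer, p in sells if (b, p) not in cheap}


print(sorted(expensive_bars(SELLS)))
```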

Page 27

Datalog Program

• A Datalog program is a collection of facts, rules, and a query.
• In a program, predicates can be either
  • EDB = Extensional Database = set of ground facts, e.g., predicates whose relations are stored in a database.
  • IDB = Intensional Database = relations defined & computed by rules.
• Two major types of evaluation strategies
  • Top-down, i.e., from the goal to the facts.
  • Bottom-up, i.e., from the facts to the goal.

Page 28

References

• Abiteboul/Hull/Vianu: Foundations of Databases (ebook)
  • Chapters 12-15
• Kifer/Bernstein/Lewis: Database Systems: An Application-Oriented Approach, Introductory Version (2nd edition)
  • Chapter 13.6
• Ramakrishnan/Gehrke: Database Management Systems (3rd edition, the 'Cow' book)
  • Chapter 24
• Garcia-Molina/Ullman/Widom: Database Systems: The Complete Book (1st edition)
  • Chapter 10

Page 29

Soufflé: The Language

Bernhard Scholz, The University of Sydney

Page 30

Soufflé: Extensions

• Datalog
  • Lack of a standard
  • Every implementation has its own language
• Soufflé
  • Syntax inspired by bddbddb and muZ/Z3
  • For multi-core servers with large memory
    • Large-scale computing in mind
• Soufflé language
  • Makes Datalog Turing-equivalent (arithmetic functors)
  • Software engineering features for large-scale logic-oriented programming
    • Performance
    • Rule and relation management via components

Page 31

Agenda

1. First example
2. Relation declaration
3. Type system for attributes
4. Arithmetic expressions
5. Aggregation
6. Records
7. Components
8. Performance / profiling facilities

Page 32

Installation

• Supported systems
  • UNIX: Debian, FreeBSD, Mac OS X, Win10 subsystem, etc.
• Releases are issued regularly
  • http://github.com/souffle-lang/souffle/releases
• Current release V1.1
  • As a Debian package
  • As a Mac OS X package
• From source code
  • http://github.com/souffle-lang/souffle/

Page 33

Invocation of Soufflé

• Invocation of Soufflé: souffle <flags> <program>.dl
  • Evaluates the input program <program>.dl
• Set the input fact directory with flag -F <dir>
  • Specifies the input directory for relations (default: current)
• Set the output directory with flag -D <dir>
  • Specifies the output directory for relations (default: current)
  • If <dir> is "-", output is written to stdout.

Page 34

Transitive Closure Example

• Type the following in file reachable.dl

    .decl edge(n:symbol, m:symbol)
    edge("a","b"). /* facts of edge */
    edge("b","c").
    edge("c","b").
    edge("c","d").
    .decl reachable(n:symbol, m:symbol)
    .output reachable // output relation reachable
    reachable(x,y) :- edge(x,y). // base rule
    reachable(x,z) :- edge(x,y), reachable(y,z). // inductive rule

• Evaluate: souffle -D- reachable.dl

Page 35

Exercise

• Extend the code from the previous slide
• Add a new relation SCC(x,y)
• Rules for SCC
  • If node x reaches node y and node y reaches node x, then (x,y) is in SCC
• Check whether a node is cyclic
• Check whether the graph is acyclic
• Omit the flag "-D-"
  • Where is the output?
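To check an answer to the exercise by hand, the expected tuples can be computed outside Soufflé. The Python sketch below is one possible reading of the exercise, using the four edge facts from the transitive closure slide; the function names are invented here, and "cyclic" is taken to mean that a node reaches itself.

```python
# Edge facts from reachable.dl on the previous slide.
EDGE = {("a", "b"), ("b", "c"), ("c", "b"), ("c", "d")}


def reachable(edges):
    """reachable(x,y) :- edge(x,y).
       reachable(x,z) :- edge(x,y), reachable(y,z)."""
    reach = set(edges)
    while True:
        new = {(x, z) for x, y in edges for y2, z in reach if y2 == y} - reach
        if not new:
            return reach
        reach |= new


def scc(edges):
    """SCC(x,y) :- reachable(x,y), reachable(y,x)."""
    reach = reachable(edges)
    return {(x, y) for x, y in reach if (y, x) in reach}


def cyclic_nodes(edges):
    """A node is cyclic if it reaches itself, i.e. SCC(x,x) holds."""
    return {x for x, y in scc(edges) if x == y}


def is_acyclic(edges):
    return not cyclic_nodes(edges)


print(sorted(scc(EDGE)))
print(sorted(cyclic_nodes(EDGE)), is_acyclic(EDGE))
```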

Page 36

Same Generation Example

• Given a tree, find who belongs to the same generation

    .decl Parent(n:symbol, m:symbol)
    Parent("d","b"). Parent("e","b"). Parent("f","c").
    Parent("g","c"). Parent("b","a"). Parent("c","a").
    .decl Person(n:symbol)
    Person(x) :- Parent(x,_).
    Person(x) :- Parent(_,x).
    .decl SameGeneration(n:symbol, m:symbol)
    SameGeneration(x,x) :- Person(x).
    SameGeneration(x,y) :- Parent(x,p), SameGeneration(p,q), Parent(y,q).
    .output SameGeneration

[Figure: tree with root a; a has children b and c; b has children d and e; c has children f and g]
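The recursive SameGeneration rule can be followed step by step with a naive fixpoint. This Python sketch reuses the Parent facts from the slide (first column child, second column parent) and is meant only as a way to check the expected output, not as a model of Soufflé's evaluation.

```python
# Parent(child, parent) facts from the slide.
PARENT = {("d", "b"), ("e", "b"), ("f", "c"), ("g", "c"), ("b", "a"), ("c", "a")}


def same_generation(parent):
    # Person(x) :- Parent(x,_).   Person(x) :- Parent(_,x).
    persons = {x for pair in parent for x in pair}
    # SameGeneration(x,x) :- Person(x).
    sg = {(x, x) for x in persons}
    while True:
        # SameGeneration(x,y) :- Parent(x,p), SameGeneration(p,q), Parent(y,q).
        new = {
            (x, y)
            for x, p in parent
            for y, q in parent
            if (p, q) in sg
        } - sg
        if not new:
            return sg
        sg |= new


result = same_generation(PARENT)
print(len(result), ("d", "g") in result, ("b", "d") in result)
```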

Page 37

Data-Flow Analysis Example

• DFA determines static properties of programs
• DFA is a unified theory; provides information for global analysis
• DFA is based on the control flow graph and node properties
• Example: Reaching Definition
  • Assignment of a variable can directly affect the value at another point
  • Unambiguous definition d of variable v
      d: v = <expression>;
  • Definition d reaches a statement u if there is a path from d to u that does not contain any other unambiguous definition of v
  • Note that functions can have side effects on variables

Page 38

Example: Reaching Definition

• Unambiguous definitions d1 and d2 of variable v
• Might d1 reach node B3?
• Might d2 reach node B3?
• Paths and effects of basic blocks influence the solution
• Forward problem

[Figure: control-flow graph with nodes start, B1, B2, B3, B4, end; annotations "d1: v = …", "d2: v = …", and a use "… v …"]

Page 39

Example: Reaching Definition

    .decl Edge(n:symbol, m:symbol)
    Edge("start","b1").
    Edge("b1","b2").
    Edge("b1","b3").
    Edge("b2","b4").
    Edge("b3","b4").
    Edge("b4","b1").
    Edge("b4","end").
    .decl GenDef(n:symbol, d:symbol)
    GenDef("b2","d1").
    GenDef("b4","d2").
    .decl KillDef(n:symbol, d:symbol)
    KillDef("b4","d1").
    KillDef("b2","d2").
    .decl Reachable(n:symbol, d:symbol)
    Reachable(u,d) :- GenDef(u,d).
    Reachable(v,d) :- Edge(u,v), Reachable(u,d), !KillDef(u,d).
    .output Reachable

[Figure: control-flow graph with nodes start, B1, B2, B3, B4, end; annotations "d1: v = …", "d2: v = …", and a use "… v …"]
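The two questions from the previous slide (might d1 or d2 reach B3?) can be answered by evaluating the rules over the listed facts. The Python sketch below mirrors the rules as written on this slide, including the negated KillDef atom; it is a hand check under that reading rather than a general data-flow framework.

```python
# Facts copied from the slide.
EDGE = {
    ("start", "b1"), ("b1", "b2"), ("b1", "b3"), ("b2", "b4"),
    ("b3", "b4"), ("b4", "b1"), ("b4", "end"),
}
GEN_DEF = {("b2", "d1"), ("b4", "d2")}
KILL_DEF = {("b4", "d1"), ("b2", "d2")}


def reaching(edges, gen_def, kill_def):
    # Reachable(u,d) :- GenDef(u,d).
    reach = set(gen_def)
    while True:
        # Reachable(v,d) :- Edge(u,v), Reachable(u,d), !KillDef(u,d).
        new = {
            (v, d)
            for u, v in edges
            for u2, d in reach
            if u2 == u and (u, d) not in kill_def
        } - reach
        if not new:
            return reach
        reach |= new


result = reaching(EDGE, GEN_DEF, KILL_DEF)
print(sorted(result))
print(("b3", "d1") in result, ("b3", "d2") in result)
```

Under these facts d2 is propagated to b3, while d1 is not, because every path from b2 to b3 passes through b4, where d1 is killed.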

Page 40

Soufflé's Input: Remarks & C-Preprocessor

• Soufflé uses two types of comments (like in C++)
  • Example:
      // this is a remark
      /* this is a remark as well */
• The C preprocessor processes Soufflé's input
  • Includes, macro definitions, conditional blocks
  • Example:
      #include "myprog.dl"
      #define MYPLUS(a,b) (a+b)

Page 41

Declarations of Relations

• Relations must be declared before being used:

    .decl edge(a:symbol, b:symbol)
    .decl reachable(a:symbol, b:symbol)
    .output reachable
    edge("a","b"). edge("b","c"). edge("b","c"). edge("c","d").
    reachable(a,b) :- edge(a,b).
    reachable(a,c) :- reachable(a,b), edge(b,c).

[Annotation: the identifier after each colon, here symbol, is the attribute's type]

Page 42

I/O Directives

• Input directive
  • Reads from a tab-separated file <relation-name>.facts
  • May still have rules/facts in the source code
  • Example: .input <relation-name>
• Output directive
  • Facts are written to file <relation-name>.csv (or stdout)
  • Example: .output <relation-name>
• Print size of a relation
  • Example: .printsize <relation-name>

Page 43

Exercise: Relation Qualifier

    .decl A(n:symbol)
    .input A

    .decl B(n:symbol)
    B(n) :- A(n).

    .decl C(n:symbol)
    .output C
    C(n) :- B(n).

    .decl D(n:symbol)
    .printsize D
    D(n) :- C(n).

• Read facts from file A.facts
• Copy facts from A to B
• Copy facts from B to C and output them to file C.csv
• Copy facts from C to D and output the number of facts on stdout

Page 44

No Goals in Soufflé

• Soufflé has no traditional Datalog goals
• Goals are simulated by output directives
• Advantage
  • Several independent goals by one evaluation
• Soufflé's language was designed for tool integration
  • Many design decisions taken from bddbddb/Z3
• Current state:
  • Interactive processing via sqlite3/db only
• Future:
  • Provenance and query processor for computed IDBs (coming soon)

Page 45

More Info about I/O Directives

• Relations can be loaded from / stored to
  • Arbitrary CSV files (change delimiters/columns/filenames/etc.)
  • Compressed text files
  • SQLite3 databases
• The features are controlled via a list of parameters
  • Example:
      .decl A(a:number, b:number)
      .output A(IO=sqlite, dbname="path/to/sqlite3db")
• Documentation: http://souffle-lang.org/docs/io/

Page 46

Rules with Multiple Heads

• Rules with multiple heads permitted
• Syntactic sugar to minimize coding effort
• Example:

    .decl A(x:number)
    A(1). A(2). A(3).
    .decl B(x:number)
    .decl C(x:number)
    B(x), C(x) :- A(x).
    .output B,C

Equivalent:

    .decl A(x:number)
    A(1). A(2). A(3).
    .decl B(x:number)
    B(x) :- A(x).
    .decl C(x:number)
    C(x) :- A(x).
    .output B,C

Page 47

Disjunctions in Rule Bodies

• Disjunction in bodies permitted
• Syntactic sugar to shorten code
• Example:

    .decl edge(x:number, y:number)
    edge(1,2). edge(2,3).
    .decl path(x:number, y:number)
    path(x,y) :- edge(x,y); edge(x,q), path(q,y).
    .output path

Equivalent:

    .decl edge(x:number, y:number)
    edge(1,2). edge(2,3).
    .decl path(x:number, y:number)
    path(x,y) :- edge(x,y).
    path(x,y) :- edge(x,q), path(q,y).
    .output path

Page 48

Type System

• Soufflé's type system is static
  • Defines the attributes of a relation
  • Types are enforced at compile time
  • Supports programmers to use relations correctly
  • No dynamic checks at runtime
    • Evaluation speed is paramount
• Type system relies on the set idea
  • A type refers to either a subset of a universe or the universe itself
  • Elements of subsets are not defined explicitly
  • Subsets can be composed out of other subsets

Page 49

Primitive Types

• Soufflé has two primitive types
  • Symbol type: symbol
  • Number type: number
• Symbol type
  • Universe of all strings
  • Internally represented by an ordinal number
    • E.g., ord("hello") represents the ordinal number
  • Symbol table used to translate between symbols and number ids
• Number type
  • Universe of all numbers
  • Simple signed numbers: set to 32 bit

Page 50

Example: Primitive Types

    .decl Name(n:symbol)
    Name("Hans").
    Name("Gretl").

    .decl Translate(n:symbol, o:number)
    Translate(x, ord(x)) :- Name(x).
    .output Translate

• Note that ord(x) converts a symbol to its ordinal number
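The idea of a symbol table can be sketched in a few lines: each distinct string is assigned a number on first use, and the table translates in both directions. The class below is a hypothetical illustration in Python. Soufflé's actual table and the concrete ordinals it assigns are not described on the slides, so numbering from 0 in insertion order is an assumption.

```python
class SymbolTable:
    """Hypothetical symbol table: interns strings as small integers."""

    def __init__(self):
        self._ids = {}       # symbol -> ordinal
        self._symbols = []   # ordinal -> symbol

    def ord(self, symbol):
        """Return the ordinal of a symbol, assigning one on first use."""
        if symbol not in self._ids:
            self._ids[symbol] = len(self._symbols)
            self._symbols.append(symbol)
        return self._ids[symbol]

    def resolve(self, ordinal):
        """Translate an ordinal back to its symbol."""
        return self._symbols[ordinal]


table = SymbolTable()
# Translate(x, ord(x)) :- Name(x).
translate = {(name, table.ord(name)) for name in ["Hans", "Gretl"]}
print(sorted(translate))
```

With such a table, relations over symbols can be stored and compared as tuples of integers, and strings only need to be touched when facts are read or written.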

Page 51

Base & Union Types

• Primitive types
  • Large projects require a rich type system
  • Large projects: several hundred relations (e.g. DOOP, Security Analysis)
  • How to ensure that programmers don't bind wrong attribute types?
• Partition the number/symbol universe
• Form sub-set lattices over base subsets

Page 52

Base Type

• Symbol types for attributes are defined by the .symbol_type directive

    .symbol_type City
    .symbol_type Town
    .symbol_type Village

• These define (assumedly) distinct sets of symbols in a symbol universe

[Figure: Symbol Universe containing the sets City, Town, Village]

Page 53

Union Type

• Union type is a compositional type
• Unifies a fixed number of symbol set types (base/union types)
• Syntax
    .type <ident> = <ident1> | <ident2> | … | <identk>
• Example
    .type Place = City | Town | Village

[Figure: Symbol Universe; Place is the union of City, Town, and Village]

Page 54

Exercise: Type System

    .symbol_type City
    .symbol_type Town
    .symbol_type Village
    .type Place = City | Town | Village

    .decl Data(c:City, t:Town, v:Village)
    Data("Sydney", "Ballina", "Glenrowan").

    .decl Location(p:Place) output
    Location(p) :- Data(p,_,_); Data(_,p,_); Data(_,_,p).

• Set Location receives values from cells of type City, Town, and Village.
• Note that ; denotes a disjunction (i.e., or)

Limitations of a Static Type System

• The disjoint-set property is not enforced at runtime
• Example:
    .symbol_type City
    .symbol_type Town
    .symbol_type Village
    .type Place = City | Town | Village
    .decl Data(c:City, t:Town, v:Village)
    Data("Sydney", "Sydney", "Sydney").
• The element "Sydney" is a member of the types City, Town, and Village.
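The limitation can be made concrete with a small Python sketch (not Soufflé code). It models each base type as the set of symbols that occur in the corresponding column of Data; that modelling is an assumption made purely for illustration, since the type check described on the slide is static only.

```python
# Facts of Data(c:City, t:Town, v:Village), taken from the two slides above.
data = {
    ("Sydney", "Ballina", "Glenrowan"),
    ("Sydney", "Sydney", "Sydney"),
}

# Illustrative assumption: a base type is modelled as the set of symbols
# that occur in the column declared with that type.
city = {c for (c, _, _) in data}
town = {t for (_, t, _) in data}
village = {v for (_, _, v) in data}

# Place = City | Town | Village is the union of the three sets.
place = city | town | village

# Nothing forces the base sets to be disjoint at runtime.
overlap = city & town & village

print(sorted(place))
print(sorted(overlap))
```

The non-empty overlap is exactly the situation the slide warns about: the same symbol inhabits all three base types.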

Base/Union Types for Numbers

• Number subsets cannot be mixed with symbol subsets
• A base type is defined by .number_type <name>
• Example:
    .number_type Even
    .number_type Odd
    .type All = Even | Odd

[Figure: within the number universe, All is the union of Odd and Even]

Exercise: Base/Union Types for Numbers

    .number_type Even
    .number_type Odd
    .type All = Even | Odd

    .decl myEven(e:Even)
    myEven(2).
    .decl myOdd(o:Odd)
    myOdd(1).
    .decl myAll(a:All)
    .output myAll
    myAll(x) :- myOdd(x); myEven(x).

Arithmetic Expression

• Arithmetic functors are permitted
  • Goes beyond pure Datalog semantics
• Variables in functors must be grounded
• Termination might become a problem
• Example:
    .decl A(n:number)
    .output A
    A(1).
    A(x+1) :- A(x), x < 9.
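To see why the guard x < 9 matters for termination, the example can be mirrored by a naive bottom-up loop in Python. This is only a sketch of the fixed-point semantics; the function name eval_a and the limit parameter are introduced here for illustration.

```python
def eval_a(limit=9):
    """Naive bottom-up evaluation of: A(1).  A(x+1) :- A(x), x < limit."""
    a = {1}                      # the fact A(1).
    while True:
        # Apply the rule to every known tuple; the guard x < limit bounds growth.
        derived = {x + 1 for x in a if x < limit}
        if derived <= a:         # fixed point: nothing new was derived
            return a
        a |= derived


print(sorted(eval_a()))
```

Without the guard, every round would derive a new number and the loop would never reach a fixed point.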

Exercise: Fibonacci Numbers

• Create the first 10 numbers of the Fibonacci series
• The first two numbers are 1
• Every number after the first two is the sum of the two preceding ones
• Example: 1, 1, 2, 3, 5, 8, …
• Solution:
    .decl Fib(i:number, a:number)
    .output Fib
    Fib(1,1).
    Fib(2,1).
    Fib(i+1, a+b) :- Fib(i,a), Fib(i-1,b), i < 10.
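The solution can be checked by hand with a Python sketch that evaluates the same rule naively until no new tuples appear. The name eval_fib and its limit parameter are hypothetical helpers, not part of Soufflé.

```python
def eval_fib(limit=10):
    """Naive evaluation of:
       Fib(1,1).  Fib(2,1).
       Fib(i+1, a+b) :- Fib(i,a), Fib(i-1,b), i < limit.
    """
    fib = {(1, 1), (2, 1)}
    while True:
        derived = {
            (i + 1, a + b)
            for (i, a) in fib
            for (j, b) in fib
            if j == i - 1 and i < limit
        }
        if derived <= fib:       # fixed point reached
            return fib
        fib |= derived


print(sorted(eval_fib()))
```

The index column i is what keeps the relation finite: the rule joins position i with position i-1, and the guard i < 10 stops derivation after the tenth element.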

Arithmetic Functors and Constraints

• Arithmetic functors
  • Addition: x + y
  • Subtraction: x - y
  • Division: x / y
  • Multiplication: x * y
  • Modulo: a % b
  • Power: a ^ b
  • Counter: $
  • Bit operations: x band y, x bor y, x bxor y, and bnot x
  • Logical operations: x land y, x lor y, and lnot x
• Arithmetic constraints
  • Less than: a < b
  • Less than or equal to: a <= b
  • Equal to: a = b
  • Not equal to: a != b
  • Greater than or equal to: a >= b
  • Greater than: a > b

Numbers in Soufflé

• Numbers in the decimal, binary, and hexadecimal systems
• Example:
    .decl A(x:number)
    A(4711).
    A(0b101).
    A(0xaffe).
• Decimal, hexadecimal, and binary numbers are accepted in the source code
• Restriction: in fact files, decimal numbers only!

Logical Operation: Number Encoding

• Numbers as logical values, like in C
  • 0 represents false
  • Any non-zero value represents true
• Used for logical operations
  • x land y, x lor y, and lnot x
• Example:
    .decl A(x:number)
    .output A
    A(0 lor 1).

Ticket Machine: Counters

• Functor $
  • Issues a new number every time the functor is evaluated
• Limitation
  • Not permitted in recursive relations
• Create unique numbers for symbols
    .decl A(x:symbol)
    A("a"). A("b"). A("c"). A("d").

    .decl B(x:symbol, y:number)
    .output B
    B(x, $) :- A(x).

Exercise: Create Successor Relation for a Set

• Given a set A(x:symbol)
• Create a successor relation Succ(x:symbol, y:symbol)
• Example:
    A = {"a", "b", "c", "d"}
    Succ = {("a","b"), ("b","c"), ("c","d")}
• Assume a total order given by the ordinal number of symbols
  • The ordinal number of a symbol is obtained by the ord functor
  • Example: ord("hello") gives the ordinal number of the string "hello"

Solution: Create a Successor Relation

    .decl A(x:symbol) input

    .decl Less(x:symbol, y:symbol)
    Less(x,y) :- A(x), A(y), ord(x) < ord(y).

    .decl Transitive(x:symbol, y:symbol)
    Transitive(x,z) :- Less(x,y), Less(y,z).

    .decl Succ(x:symbol, y:symbol)
    Succ(x,y) :- Less(x,y), !Transitive(x,y).

    .output Less, Transitive, Succ

Extension: Compute First/Last of Successors

Compute the first and the last element of the successor relation

    .decl First(x:symbol) output
    First(x) :- A(x), !Succ(_,x).

    .decl Last(x:symbol) output
    Last(x) :- A(x), !Succ(x,_).
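The successor construction and its First/Last extension can be traced with a Python sketch of the same set operations. One assumption is made loudly here: ord(x) is modelled as the position at which a symbol was first seen, which stands in for whatever ordinal number Soufflé assigns internally. The function name successor is hypothetical.

```python
def successor(symbols):
    """Mirror the Less / Transitive / Succ / First / Last rules on a list of symbols."""
    # Assumption of this sketch: ord(x) is the position at which x was first seen.
    ord_of = {}
    for s in symbols:
        ord_of.setdefault(s, len(ord_of))
    a = set(ord_of)

    # Less(x,y) :- A(x), A(y), ord(x) < ord(y).
    less = {(x, y) for x in a for y in a if ord_of[x] < ord_of[y]}
    # Transitive(x,z) :- Less(x,y), Less(y,z).
    transitive = {(x, z) for (x, y) in less for (y2, z) in less if y == y2}
    # Succ(x,y) :- Less(x,y), !Transitive(x,y).
    succ = less - transitive
    # First(x) :- A(x), !Succ(_,x).
    first = {x for x in a if all(y != x for (_, y) in succ)}
    # Last(x) :- A(x), !Succ(x,_).
    last = {x for x in a if all(y != x for (y, _) in succ)}
    return succ, first, last


succ, first, last = successor(["a", "b", "c", "d"])
print(sorted(succ), first, last)
```

Removing every pair that has an intermediate element leaves only the immediate neighbours, which is why negation over Transitive yields the successor relation.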

String Functors and Constraints

• String functors
  • Concatenation: cat(x,y)
  • String length: strlen(x)
  • Sub-string: substr(x,idx,len), where idx is the start position (counting from 0) and len is the length of the sub-string of x
  • Retrieve ordinal number: ord(x)
• String constraints
  • Substring check: contains(sub,str)
  • Matching: match(regexpr,str)

Example: String Functors & Constraints

    .decl S(s:symbol)
    S("hello"). S("world"). S("souffle").

    .decl A(s:symbol)
    A(cat(x, cat(" ", y))) :- S(x), S(y).   // stitch two symbols together with a blank

    .decl B(s:symbol)
    B(x) :- A(x), contains("hello", x).

    .decl C(s:symbol)
    C(x) :- A(x), match("world.*", x).

    .output A, B, C   // output directive
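A Python sketch of the same three relations shows what each constraint selects. It assumes that match must cover the whole string, which is modelled with re.fullmatch; treat that as an assumption of the sketch rather than a statement about every Soufflé version.

```python
import re

s = {"hello", "world", "souffle"}

# A(cat(x, cat(" ", y))) :- S(x), S(y).
a = {x + " " + y for x in s for y in s}

# B(x) :- A(x), contains("hello", x).   (substring check)
b = {x for x in a if "hello" in x}

# C(x) :- A(x), match("world.*", x).
# Assumption of this sketch: the pattern must match the whole string.
c = {x for x in a if re.fullmatch(r"world.*", x)}

print(len(a), len(b), len(c))
```

Under that assumption, B keeps every pair that mentions "hello" in either position, while C keeps only the pairs whose first word is "world".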

Another String Example

• Generate all suffixes of a string
    .decl A(x:symbol)
    A("hello").   // initial string

    A(substr(x, 1, strlen(x)-1)) :-   // inductive rule
        A(x), strlen(x) > 1.

    .output A
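The inductive rule can be followed step by step in a short Python sketch, where slicing x[1:] plays the role of substr(x, 1, strlen(x)-1). The helper name suffixes is introduced here for illustration only.

```python
def suffixes(word):
    """Naive evaluation of: A(word).  A(substr(x,1,strlen(x)-1)) :- A(x), strlen(x) > 1."""
    a = {word}
    while True:
        # substr(x, 1, strlen(x)-1) drops the first character of x.
        derived = {x[1:] for x in a if len(x) > 1}
        if derived <= a:         # fixed point reached
            return a
        a |= derived


print(sorted(suffixes("hello"), key=len, reverse=True))
```

The guard strlen(x) > 1 stops the recursion at the one-character suffix, so the empty string is never derived.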

Aggregation

• Summarizes information of queries
• Aggregates on stable relations only (cf. negation in Datalog)
  • The aggregation result cannot be used, directly or indirectly, in the sub-term of the aggregate.
• Aggregation is a functor
• Various types of aggregates
  • Counting
  • Minimum
  • Maximum
  • Sum

Aggregation: Counting

• Counts the set size of its sub-goal
• Syntax: count : { <sub-goal> }
• No information flow from the sub-goal to the outer scope
• Example:
    .decl Car(name:symbol, colour:symbol)
    Car("Audi", "blue").
    Car("VW", "red").
    Car("BMW", "blue").

    .decl BlueCarCount(x:number)
    BlueCarCount(c) :- c = count : { Car(_, "blue") }.
    .output BlueCarCount

Aggregation: Maximum

• Finds the maximum of a set
• No information flow from the sub-goal to the outer scope, i.e., no witness
• Syntax: max <var> : { <sub-goal(<var>)> }
• Example:
    .decl A(n:number)
    A(1). A(10). A(100).
    .decl MaxA(x:number)
    MaxA(y) :- y = max x : { A(x) }.
    .output MaxA

Aggregation: Minimum & Sum

• Finds the minimum/sum of a sub-goal
• No information flow from the sub-goal to the outer scope
  • no witness
• Min syntax: min <var> : { <sub-goal(<var>)> }
• Sum syntax: sum <var> : { <sub-goal(<var>)> }
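The aggregates from the counting, maximum, and minimum/sum slides can be mirrored with Python built-ins over the same facts. This is a sketch of the intended results only; it says nothing about how Soufflé evaluates aggregates internally.

```python
car = {("Audi", "blue"), ("VW", "red"), ("BMW", "blue")}
a = {1, 10, 100}

# BlueCarCount(c) :- c = count : { Car(_, "blue") }.
blue_car_count = sum(1 for (_, colour) in car if colour == "blue")

# MaxA(y) :- y = max x : { A(x) }.
max_a = max(a)

# The min and sum aggregates follow the same pattern.
min_a = min(a)
sum_a = sum(a)

print(blue_car_count, max_a, min_a, sum_a)
```

In each case only the aggregated number leaves the braces; the variables bound inside the sub-goal stay local to it.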

Aggregation: Witnesses not permitted!

• Witness: a tuple that produces the minimum/maximum of a sub-goal
• Example:
    .decl A(n:number, w:symbol)
    A(1,"a"). A(10,"b"). A(100,"c").
    .decl MaxA(x:number, w:symbol)
    MaxA(y,w) :- y = max x : { A(x,w) }.   <= not permitted!!
• The witness is bound in the max sub-goal and used in the outer scope
• Causes semantic/performance issues
  • Memorizing a set; what does it mean for count/sum?
• Forbidden by the type-checker

Records

• Relations are two-dimensional structures in Datalog
  • Large-scale problems may require more complex structure
• Records break out of the flat world of Datalog
  • At the price of performance (i.e. extra table lookup)
• Record semantics similar to Pascal/C
  • No polymorphic types at the moment
• Record type definition:
    .type <name> = [ <name1>: <type1>, …, <namek>: <typek> ]
• Note: no output facility at the moment

Example: Records

    // Pair of numbers
    .type Pair = [a:number, b:number]

    .decl A(p:Pair)   // declare a set of pairs
    A([1,2]).
    A([3,4]).
    A([4,5]).

    .decl Flatten(a:number, b:number) output
    Flatten(a,b) :- A([a,b]).

Records: How does it work?

• Each record type has a hidden type relation
  • Translates the elements of a record to a number
• While evaluating, if a record does not exist, it is created on the fly.
• Example:
    .type Pair = [a:number, b:number]
    .decl A(p:Pair)
    A([1,2]). A([3,4]). A([4,5]).

Hidden type relation Pair:
    Ref | a | b
    1   | 1 | 2
    2   | 3 | 4
    3   | 4 | 5

Relation A (column p holds references into Pair):
    p
    1
    2
    3

Recursive Records

• Recursively defined records are permitted
• Termination of recursion via the nil record
• Example:
    .type IntList = [next: IntList, x: number]
    .decl L(l: IntList)
    L([nil,10]).
    L([r1,x+10]) :- L(r1), r1=[r2,x], x < 30.
    .decl Flatten(x:number)
    Flatten(x) :- L([_,x]).
    .output Flatten

Hidden type relation IntList (the reference 0 stands for nil):
    Ref | next | x
    1   | 0    | 10
    2   | 1    | 20
    3   | 2    | 30

Relation L (column l holds references into IntList):
    l
    1
    2
    3
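The reference tables above can be reproduced with a small Python sketch of a record table that hands out a number for each distinct record. The class RecordTable, its methods pack and unpack, and the choice of 0 for nil are assumptions of this sketch, chosen to match the tables on the slides rather than Soufflé's actual implementation.

```python
class RecordTable:
    """Hidden type relation: maps each distinct record to a reference number."""
    NIL = 0                       # reference reserved for nil in this sketch

    def __init__(self):
        self.ref_of = {}          # record tuple -> reference
        self.rec_of = {}          # reference -> record tuple

    def pack(self, *fields):
        """Return the reference of a record, creating it on the fly if needed."""
        if fields not in self.ref_of:
            ref = len(self.ref_of) + 1
            self.ref_of[fields] = ref
            self.rec_of[ref] = fields
        return self.ref_of[fields]

    def unpack(self, ref):
        """Return the fields stored under a reference."""
        return self.rec_of[ref]


int_list = RecordTable()
# L([nil,10]).
l_refs = {int_list.pack(RecordTable.NIL, 10)}
# L([r1, x+10]) :- L(r1), r1 = [r2, x], x < 30.
while True:
    derived = set()
    for r1 in l_refs:
        _r2, x = int_list.unpack(r1)
        if x < 30:
            derived.add(int_list.pack(r1, x + 10))
    if derived <= l_refs:         # fixed point reached
        break
    l_refs |= derived

# Flatten(x) :- L([_, x]).
flatten = {int_list.unpack(r)[1] for r in l_refs}
print(sorted(l_refs), sorted(flatten))
```

Because pack returns the existing reference for a record it has already seen, two structurally equal records share one identity, and the relation L only ever stores plain numbers.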

Recursive Records

• Semantics is tricky
• Relations/sets of recursive elements (i.e. sets of references)
  • Monotonically grow
• Structural equivalence by identity
• New records are created on the fly
  • seamlessly for the programmer
• Closer to a functional programming semantics
• Future:
  • Polymorphism might be possible at the expense of speed/space

Components in Soufflé

• Logic programs have no structure
  • Amorphous mass of rules & relation declarations
• Creates serious software engineering challenges
  • Encapsulation: separation of concerns
  • Replication of code fragments
  • Adaptation of code fragments, etc.
• Solution: Soufflé's component model
  • Meta semantics for Datalog
  • Generator for Datalog code; dissolved at evaluation time
  • Similar to C++ templates

Components (cont'd)

• Definition
  • Defines a new component either from scratch or by inheritance
  • Permitted: component definitions inside component definitions
  • Syntax:
      .comp <name> [< params, … >]
          [: <super-name1> [< params, … >], …, <super-namek> [< params, … >]]
      { <code> }
• Instantiation
  • Each instantiation has its own name for creating a name space
  • Type and relation definitions inside a component inherit the name space
  • Syntax: .init <name> = <name> [< params, … >]

Example: Component & Name Scoping

    .comp myComp {
        .decl A(x:number)
        .output A
        A(1).
        A(2).
    }
    .init c1 = myComp
    .init c2 = myComp

Expansion after instantiation:

    .decl c1.A(x:number)
    .output c1.A
    c1.A(1).
    c1.A(2).

    .decl c2.A(x:number)
    .output c2.A
    c2.A(1).
    c2.A(2).

• Instantiation creates its own name space for relation declarations and types

Example: Component Inheritance

    .symbol_type s
    .decl A(x:s, y:s)
    .input A

    .comp myC {
        .decl B(x:s, y:s)
        .output B
        B(x,y) :- A(x,y).
    }
    .comp myCC : myC {
        B(x,z) :- A(x,y), B(y,z).
    }
    .init c = myCC

Expansion after instantiation:

    // outer scope: no name space
    .decl A(x:s, y:s)
    .input A

    // name scoping
    // B is declared inside myC/myCC
    .decl c.B(x:s, y:s)
    .output c.B
    c.B(x,y) :- A(x,y).
    c.B(x,z) :- A(x,y), c.B(y,z).

• Component myCC inherits from component myC

Overriding Rules of Super Components

• Example:
    .comp myC {
        .decl A(x:number) overrideable
        .output A
        A(1).
        A(x+1) :- A(x), x < 5.
    }
    .comp myCC : myC {
        .override A
        A(5).
        A(x+1) :- A(x), x < 10.
    }
    .init c = myCC
• Instantiation result:
    .decl c.A(x:number) output
    c.A(5).
    c.A(x+1) :- c.A(x), x < 10.
• Rules/facts of the derived component override the rules of the super component
• The relation must be defined with the qualifier overrideable in the super component
• A component that overrides rules requires: .override <rel-name>

Component Parameters

• Example:
    .decl A(x:number)
    .output A
    .comp case<option> {
        .comp one { A(1). }
        .comp two { A(2). }
        .init c1 = option
    }
    .init c2 = case<one>
• Components one and two reside in component case with parameter option
• Depending on the value of option
  • Component one or two is expanded
• Conditional expansion of macros
• Parametrization of components

Example: Components

• Develop a library of components for graphs
• The library should contain various functionality
  • A component for a directed graph
  • A component for an undirected graph
  • A component that checks for a cycle in a graph
  • A component that checks whether a graph is acyclic
• The component library should be extendable, reuse other components, etc.

Summary: Components

• Encapsulation of specifications
  • Name spaces provided for types/relations
  • Instantiation produces a scoping name of a component
• Repeating code fragments
  • Write once / instantiate multiple times
• Components
  • Inheritance of several super-components, i.e., multiple inheritance
  • Hierarchies of functionalities
• Parameters
  • Adapt components / specialize
• Future: refinement of the component model

Soufflé's Performance Aspect

• How to gain faster Datalog programs?
  • Compile to achieve peak performance
  • Scheduling of queries
    • User annotations or automated
  • Find faster queries
  • Find faster data models
• Profiling is paramount
  • Textual and graphical user interface for profiling programs
• Practical observation
  • Only a handful of rules will dominate the execution time of a program

Performance: Soufflé's Compilation Flags

• Compile and execute immediately
  • Option -c
  • Example: souffle -c test.dl
• Generate stand-alone executable
  • Option -o <executable>
  • Example: souffle -o test test.dl
• Feedback-directed compilation
  • Option: --auto-schedule
  • Example: souffle --auto-schedule test.dl

Performance Tuning

• Soufflé computes optimal data representations for relations
• Query scheduling can be made automatic
  • Soufflé flag: --auto-schedule
  • Sub-optimal due to unrefined metrics for Selinger's algorithm
• For high performance:
  • The programmer re-orders the atoms in the body of a rule
  • Disable the auto-scheduler for a rule with the strict qualifier
    • Syntax: <rule>. .strict
  • Provide your own query schedule
    • Syntax: <rule>. .plan { <#version> : (idx1, …, idxk) }

Performance Example

    .decl Edge(x:number, y:number)
    Edge(1,2).
    Edge(500,1).
    Edge(i+1,i+2) :- Edge(i,i+1), i < 499.

    .decl Path(x:number, y:number)
    .printsize Path
    Path(x,y) :- Edge(x,y).
    // Path(x,z) :- Path(x,y), Path(y,z). .strict
    // Path(x,z) :- Path(x,y), Edge(y,z). .strict
    // Path(x,z) :- Edge(x,y), Path(y,z). .strict
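For orientation, the Edge facts above form a single cycle over 500 nodes, so every node reaches every node. The following Python sketch evaluates the second rule variant, Path(x,z) :- Path(x,y), Edge(y,z), in a semi-naive style and reports the size that .printsize would be expected to show. The function path_size and its parameter n (generalising the slide's 500) are hypothetical, and the sketch makes no claim about the relative cost of the three variants inside Soufflé.

```python
def path_size(n=500):
    """Semi-naive evaluation of Path(x,z) :- Path(x,y), Edge(y,z) on a cycle of n nodes."""
    # Edge(1,2). Edge(n,1). Edge(i+1,i+2) :- Edge(i,i+1), i < n-1.
    edge = {(i, i + 1) for i in range(1, n)} | {(n, 1)}
    succ = {}
    for (x, y) in edge:
        succ.setdefault(x, set()).add(y)

    path = set(edge)              # Path(x,y) :- Edge(x,y).
    delta = set(edge)
    while delta:
        # Join only the tuples that were new in the previous round with Edge.
        new = {(x, z) for (x, y) in delta for z in succ.get(y, ())} - path
        path |= new
        delta = new
    return len(path)


print(path_size(30))
```

All three rule variants compute the same relation; they differ only in how many intermediate tuples the join has to touch, which is what the .strict experiments on the slide are meant to expose.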

Profiling

• Profiling flag for Soufflé: -p <profile>
• Produces a profile log after execution
• Use souffle-profile to provide profile information
    souffle-profile <profile>
• Simple text interface and HTML output with JavaScript
• Commands
  • Help: help
  • Rule: rul [<id>]
  • Relations: rel [<id>]
  • Graph plots for fixed-point: graph <id> <type>

Profiling (cont'd)

• Option -j produces an HTML file; graphical representation of performance

[Figure: HTML profiler output with callouts labelled "Additional features"]

Program Analysis Examples

Bernhard Scholz
The University of Sydney

Points-To Analysis

• Flow-insensitive, inclusion-based, context-insensitive (cf. Whaley'04)
• Abstract domain
  • Variables
    • Local, actual/formal parameters, return values, bases, this-variables
  • Heap-allocated objects
    • Creation site as an abstraction for dynamically created objects
    • Heap-allocated objects have fields
• Relations for computing points-to analysis
  • vP(v,h): variable v may point to heap object h
  • hP(h1,f,h2): field f of h1 may point to h2

Points-To Analysis

Statement        | Java Code      | Datalog Encoding
Allocations      | h: v = new C() | vP(v,h) :- "h: v = new C()".
Store            | v1.f = v2      | hP(h1,f,h2) :- "v1.f = v2", vP(v1,h1), vP(v2,h2).
Load             | v2 = v1.f      | vP(v2,h2) :- "v2 = v1.f", hP(h1,f,h2), vP(v1,h1).
Moves, Arguments | v2 = v1        | vP(v2,h) :- "v2 = v1", vP(v1,h).

Points-To Example

    a: x = new Foo()
       y = x;
       if (cond) {
           z = y;
       } else {
    b:     z = new G();
           z.f = y;
       }

[Figure: points-to graph over the variables x, y, z and the heap objects a, b, with vP edges from variables to heap objects and hP edges labelled with the field f]

Example: Points-To in Soufflé

• Simple input language with 4 statement types
    Allocation: <var> = new()
    Assignment: <var1> = <var2>
    Store: <var1>.<field> = <var2>
    Load: <var1> = <var2>.<field>
• Example:
    v1 = new()
    v2 = new()
    v1.f = v2
    v3 = v1.f

Example: Extractor

• Execute the extractor
    al-extractor <input-program>
• Expects a directory facts in the current directory
• Example program
    v1 = new()
    v2 = v1
    v1.f = v2
    v3 = v1.f
• Produces files
  • facts/new.facts, facts/assign.facts, facts/load.facts, facts/store.facts

Example: Fact Files Encoding

• Variable mapping to numbers
  • v1→1, v2→2, v3→3
  • First allocation: 0
• Example program
    v1 = new()
    v2 = v1
    v1.f = v2
    v3 = v1.f
• Generated relations:

    new.facts
        Var | Object
        1   | 0

    assign.facts
        Var | Var
        2   | 1

    load.facts
        Var | Var | Field
        3   | 1   | f

    store.facts
        Var | Field | Var
        1   | f     | 2

Example: Points-To EDB

• Soufflé's relation declarations for fact files
    // v1 = new()
    .decl new(var: Var, o: Obj)
    // v1 = v2
    .decl assign(trg: Var, src: Var)
    // v1 = v2.f
    .decl load(trg: Var, src: Var, field: Field)
    // v1.f = v2
    .decl store(trg: Var, field: Field, src: Var)
    .input new, assign, load, store

Example: Points-To Rules

    // v = new()
    pointsTo(v,o) :- new(v,o).

    // v = x
    pointsTo(v,o) :- assign(v,x), pointsTo(x,o).

    // v = x.f
    pointsTo(v,o) :- load(v,x,f), pointsTo(x,t), heap(t,f,o).

    // v.f = x
    heap(o,f,t) :- store(v,f,x), pointsTo(v,o), pointsTo(x,t).
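To trace how the four rules interact, the Python sketch below runs them to a fixed point on the facts from the fact-file encoding slide (v1 = new(); v2 = v1; v1.f = v2; v3 = v1.f). It is a naive re-evaluation loop written for readability; the function name points_to is hypothetical, and Soufflé's own evaluation strategy is not modelled.

```python
def points_to(new, assign, load, store):
    """Naive fixed-point evaluation of the four pointsTo/heap rules."""
    pts = set()                   # pointsTo(v, o)
    heap = set()                  # heap(o, f, t)
    while True:
        # pointsTo(v,o) :- new(v,o).
        new_pts = set(new)
        # pointsTo(v,o) :- assign(v,x), pointsTo(x,o).
        new_pts |= {(v, o) for (v, x) in assign for (x2, o) in pts if x == x2}
        # pointsTo(v,o) :- load(v,x,f), pointsTo(x,t), heap(t,f,o).
        new_pts |= {
            (v, o)
            for (v, x, f) in load
            for (x2, t) in pts if x == x2
            for (t2, f2, o) in heap if (t, f) == (t2, f2)
        }
        # heap(o,f,t) :- store(v,f,x), pointsTo(v,o), pointsTo(x,t).
        new_heap = {
            (o, f, t)
            for (v, f, x) in store
            for (v2, o) in pts if v == v2
            for (x2, t) in pts if x == x2
        }
        if new_pts <= pts and new_heap <= heap:
            return pts, heap      # fixed point: no rule derives anything new
        pts |= new_pts
        heap |= new_heap


pts, heap = points_to(
    new={(1, 0)},
    assign={(2, 1)},
    load={(3, 1, "f")},
    store={(1, "f", 2)},
)
print(sorted(pts), sorted(heap))
```

The load for v3 only produces a result after the store has populated heap, which in turn needs the assignment to have propagated object 0 to v2; this mutual dependency is why pointsTo and heap have to be computed together.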

Lambda Calculus Case Study

• Step 1: Input language
  • Untyped lambda calculus
  • Core grammar:
      term ::= <var>           // a variable, e.g. x
             | \<var>.term     // an abstraction (= function), e.g. "\x.x"
             | term term       // an application (= call), e.g. "a b"
  • Extended with:
    • let bindings and parenthesis support (for usability)
• IR: AST of lambda calculus expression

Constructing the Analysis

• Step 2: Encoding into relations
  • Encoding of the AST in relational format via lc-extractor
• Step 3: Specify analysis
  • One generic DataFlow analysis component handling parameter passing
    • Utilizing a specialization of itself for determining control flow information
  • Instantiated for:
    • Control flow information: which function is called where (dynamic binding)
    • Boolean value analysis: analysis result requested by the user
    • Arithmetic value analysis: analysis result requested by the user
• Step 4: Interpret result
  • The resulting tables contain (over-approximated) Boolean and arithmetic values of the root term

Example 1

• Expressions are labeled and encoded in abs and vars, bool_lit, num_lit, root, etc.
• Applications are encoded in apps, e.g., 2 3 5 means 2 is an application where we apply 5 to 3
• Want to compute the value of root (0)

    ((((\x . x) (\y . y)) (\z . z)) true)

    Labels: 0, 1, 2 (the three nested applications), 3 4 (\x . x and its body), 5 6 (\y . y), 7 8 (\z . z), 9 (true)

The Analysis

• We use a component parametrized by value
    .comp DataFlow<Value> { … }
• It has three rules
    // the value of a variable is the value bound to it in an application
    _var(n,v) :- app(_,f,a), ctrl.term(f,l), abs(l,n,_), term(a,v).

    // the value of a variable is the value assigned to it
    term(i,v) :- var(i,n), _var(n,v).

    // the value of an app is the value of the body of the targeted abs
    term(i,v) :- app(i,f,_), ctrl.term(f,l), abs(l,_,b), term(b,v).
• Instantiate for each value analysis
    .init bool = DataFlow<Bool>
    bool.term(i,v) :- bool_lit(i,v).
• Start the analysis
    bool_res(v) :- root(i), bool.term(i,v).

Example 2

• Notice the imprecision
• Why?
• Homework for the keen:
  • Change the analysis to be precise for example 2

    let fst = \x.\y.x in
    let id = \x.x in
    fst (id 4) (id 5)

Current Work on Soufflé

• Soufflé in the cloud
  • How to distribute Datalog programs in the cloud
• Provenance
  • Explaining results; online debugging
• Query scheduling
  • Improving scheduling performance
• Benchmark suite
  • For Soufflé, Z3, bddbddb, LogicBlox

Future Work

• More data types
  • Boolean data type
  • Floats/doubles missing
  • Integers of various lengths
• Function predicates
• Built-in assertions
• A long wish-list…
  • Refactoring
  • Documentation

Join the Community

• New research projects
  • plenty of ideas, not enough person power
• Feature extensions
• Refactoring
• Bug fixing
• Documentation
• Soufflé on GitHub
  • http://github.com/souffle-lang/souffle

Appendix

C++ Interface / Integration into other Tools

• Soufflé produces a C++ class from a Datalog program
• The C++ class is a program in its own right
• Can be integrated into your own projects seamlessly
• Interfaces for
  • Populating EDB relations
  • Running the evaluation
  • Querying the output tables
• Use of iterators for accessing tuples
• Examples: souffle/tests/interfaces/ of the repo

Example: C++ Interface

• Example
    …
    if (SouffleProgram *prog = ProgramFactory::newInstance("mytest")) {
        prog->loadAll("fact-dir");   // or insert via iterator
        prog->run();
        prog->printAll();            // or print via iterator
        delete prog;
    }
    …

C++ Interface: Input Relations

• Insert method for populating data
    if (Relation *rel = prog->getRelation("myRel")) {
        for (auto input : myData) {
            tuple t(rel);
            t << input[0] << input[1];
            rel->insert(t);
        }
    }

C++ Interface: Output Relations

• Access output relation via iterator
    if (Relation *rel = prog->getRelation("myOutRel")) {
        for (auto &output : *rel) {
            output >> cell1 >> cell2;
            std::cout << cell1 << "-" << cell2 << "\n";
        }
    }

JNI Interface

• Recently designed/implemented by P. Subotic (UCL)
• Create a Datalog program via AST objects
  • No parsing of source code
• Applications
  • implement a DSL in Scala
  • use Datalog as a backend
• Example:
  • See souffle/interfaces/examples/Main.scala