Reverse Engineering (Source Code Analysis) © SERG
Source Code Analysis
Using BAT
What is Static Analysis?
• Mining source code for information.
• Using that information to present abstractions of, and answer questions about, software structure.
What can we get from source code analysis?
• Type of information is model dependent
– In almost any language, we can find out information about variable usage: Who? Where? etc.
– In an OO environment, we can find out which classes use other classes, which are a base of an inheritance structure, etc.
– We can also find potential blocks of code that can never be executed in running the program (dead code).
BAT
• Is a tool that lets us perform static analysis on Java programs (class files).
– Builds an XML database of entities and relationships in a system.
– Can use several tools for querying and visualizing the data.
Entities
• ‘Entities’ are individuals that live in the system, together with the attributes associated with them. Some examples:
– Classes, along with information about their superclass, their scope, and ‘where’ in the code they exist.
– Methods/functions and what their return type or parameter list is, etc.
– Variables and what their types are, and whether or not they are static, etc.
Relationships
• ‘Relationships’ are interactions between the entities in the system. Relationships include:
– Classes inheriting from one another.
– Methods in one class calling the methods of another class, and methods within the same class calling one another.
– One variable referencing another variable.
Creating BAT Databases
• BAT is really a library that can process JAR files
• BATAnalyzer is a small app wrapped around BAT to return a full XML database from BAT for later processing
– Found at: BATROOT/analyzer/src
• To run:
export PATH=/usr/remote/serg/jdk1.5.0_11/bin/:$PATH
java -Xmx2G -cp /usr/remote/serg/bin/bat2toxml.jar:/usr/remote/serg/bin/batanalyzer.jar batanalyzer.Main <JAR> <OUTPUT>
– -Xmx2G: Java needs a lot of memory to process large projects
– -cp: the BAT API and analyzer JARs
– batanalyzer.Main: the call to the analyzer
– <JAR>: the project to analyze
– <OUTPUT>: the XML output file
Provided Tools to deal with BAT
• bdef – A BASH wrapper around XSLT queries to get entity information
• bref – A BASH wrapper around XSLT queries to get relationship information
• dot – A visualization tool. Takes information from query and displays it as a graph.
• On TUX to get the scripts do: export PATH=$PATH:/usr/remote/serg/bin/
bdef Syntax
• bdef takes information from the entities database based on a query, and returns the results in an ascii-table.
bdef xml_file entity_kind entity_name [attr=val]
– xml_file is the xml file containing the extracted database
– entity_kind is the ‘type’ of entity to retrieve.
– entity_name is a pattern to match for names of entities.
– attr=val are bindings to match for attributes of the entity
Entity Kinds
• BAT recognizes several types of entity ‘kinds’ for use in the bdef/bref commands.
• m is for Method
• c is for Class
• f is for Field
• - is a match for any entity_kind
Entity Names
• An entity name is a regex pattern, which can take several forms:
– Explicit name (e.g., ‘myTempStringVar’)
– Wild-card Pattern (e.g., ‘myTemp.*’)
– A complete wild-card, denoted with ‘.*’
Attribute=Value
• Attribute=Value settings are used to further restrict a query based on some condition specified as regEX.
• Any field is searchable
• The most common restriction is to restrict to a specific file, or to filter out a file. E.g.,
bdef file.xml - - filename=FileIDoLike.java
bdef file.xml - - filename=[^(FileIDoNOTLike.java)]
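Whether the second query above excludes exactly one file depends on how bdef evaluates the pattern, which these slides do not show; in standard regular-expression syntax a bracket expression such as [^...] excludes a set of characters rather than a whole string. As a tool-independent alternative, the sketch below post-filters colon-separated records on their filename field. The three-field sample records and the Sim.java and Car.java file names are assumptions made for illustration, not real bdef output.

```shell
# Assumed sample records: name, class, filename (hypothetical, three fields only).
records='getWidth:World:World.java
run:Sim:Sim.java
move:Car:Car.java'

# Keep every record whose filename field (field 3) is not World.java.
kept=$(printf '%s\n' "$records" | awk -F: '$3 != "World.java"')
printf '%s\n' "$kept"
```

Filtering after the query trades a little efficiency for a comparison whose meaning does not depend on the query tool's regex dialect.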
Fields
• Class
– name, filename, scope, deprecated, final, abstract
• Method
– name, class, filename, scope, static, deprecated, final, abstract, varargs, bridge, native, synchronized, return, parameters
• Field
– name, class, filename, type, scope, static, deprecated, final, transient, volatile, enum
Example Query
• Assume that we want to find all the methods in a specific file (in this case, World.java) that start with ‘get’. Our query would look like the following:
bdef sim.xml m "get.*" filename="World\.java"
• World.java is a part of a Discrete Event Simulator that contains information about the simulation environment
Example Results (bdef)
bdef sim.xml m "get.*" filename="World\.java"
getWorldArray:World:World.java:public:false:false:false:false:false:false:false:false:
getWorldString:World:World.java:public:false:false:false:false:false:false:false:false:
getWorldString:World:World.java:public:false:false:false:false:false:false:false:false:
getWorldMaskString:World:World.java:public:false:false:false:false:false:false:false:false:
getEmpty:World:World.java:public:false:false:false:false:false:false:false:false:
getWidth:World:World.java:public:false:false:false:false:false:false:false:false:
getHeight:World:World.java:public:false:false:false:false:false:false:false:false:
Results Explained
• The bdef query returns one colon-separated record per method. The columns, in order, mean the following:
– name is the name of the method
– class is the class the method belongs to
– filename is the file containing this method
– scope is the scope of the method
– static if the method is static
– deprecated if the method is deprecated
– final if the method is final
– abstract if the method is abstract
– varargs if the method uses variable arguments
– bridge if the method is a bridge
– native if the method is native
– synchronized if the method is synchronized
– return is the method’s return type
– parameters is the types of parameters accepted
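As a rough illustration of the column order, the sketch below splits one record into labeled fields with awk. The sample record is an assumption: it is stitched together from the getWidth line in the example results and the return type and parameter list that appear in the exercise output, because the example results are cut off after the twelfth field.

```shell
# Sample record (assumed; reconstructed from the slides, not captured bdef output).
record='getWidth:World:World.java:public:false:false:false:false:false:false:false:false:int:()'

# Field names in the order listed above.
names='name class filename scope static deprecated final abstract varargs bridge native synchronized return parameters'

# Print each field next to its name, one "name=value" pair per line.
labeled=$(printf '%s\n' "$record" | awk -F: -v names="$names" '
    BEGIN { n = split(names, label, " ") }
    { for (i = 1; i <= n; i++) printf "%s=%s\n", label[i], $i }')
printf '%s\n' "$labeled"
```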
Exercise
• This exercise uses some Unix utilities along with our use of bdef. The exercise involves two things:
– Counting the number of methods of class World (in World.java).
– Printing out a list of methods in the form of their name, return type, and parameter list.
Using Unix (Part One)
• In order to count the number of lines of a document, one can use the command line tool wc.
– The -l option makes it count lines.
– Piping to it makes it count the lines of output from a program.
{bdef query} | wc -l
counts the number of lines in the output of a bdef query.
The solution is …
• The solution to the first problem is:
bdef sim.xml m ".*" filename="World\.java" | wc -l
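The counting step can be tried without a BAT database. In this minimal sketch, sample_bdef is a hypothetical stand-in that prints three of the records from the example results; it is not part of BAT.

```shell
# Stand-in for a bdef query (assumption: prints three records copied from the
# example results, since no sim.xml database is available here).
sample_bdef() {
    printf '%s\n' \
        'getEmpty:World:World.java:public:false:false:false:false:false:false:false:false:' \
        'getWidth:World:World.java:public:false:false:false:false:false:false:false:false:' \
        'getHeight:World:World.java:public:false:false:false:false:false:false:false:false:'
}

# One record per line, so the line count is the number of matching methods.
# tr strips the padding that some wc implementations add around the number.
count=$(sample_bdef | wc -l | tr -d ' ')
echo "$count"
```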
Using Unix (Part Two)
• For the second question, we will again use the unformatted output of bdef.
– This time, we’ll take note of the format of the unformatted output! We’ll keep this limited to the case of unformatted output for methods.
– Each field of the unformatted output is delimited by a colon. The fields we care about are the name, return-type, and parameter-list fields. These are fields 1, 13, and 14, respectively.
Using Unix (Part Two)
• The final piece in the puzzle of displaying the specific fields is getting the fields themselves out of the output.
– The cut utility will do nicely. We can give it a delimiter and a list of field numbers, and it will return those fields for each line of its input.
– The delimiter flag for cut is -d. The field-list flag is -f, followed by a series of comma-separated numbers.
The solution is …
• Our target query is thus:
bdef sim.xml m ".*" class="World" | cut -d ":" -f 1,13,14
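The field selection can be sketched on its own. The two full records below are assumptions: their first twelve fields come from the example results and their last two from the exercise output in these slides.

```shell
# Two full method records (assumed; pieced together from the slides).
records='getWidth:World:World.java:public:false:false:false:false:false:false:false:false:int:()
getHeight:World:World.java:public:false:false:false:false:false:false:false:false:int:()'

# Keep the name (field 1), return type (field 13), and parameter list (field 14).
summary=$(printf '%s\n' "$records" | cut -d ":" -f 1,13,14)
printf '%s\n' "$summary"
# getWidth:int:()
# getHeight:int:()
```

Because cut splits on every colon, this relies on no field value containing a colon itself.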
Output for Exercise
• Question One: 13
• Question Two:
<init>::(int,int,)
removeEntity::(Location,)
moveEntity::(Location,Location,)
addEntity::(Location,)
checkBounds:boolean:(Location,)
checkLocation:boolean:(Location,)
getWorldArray:char[][]:()
getWorldString:java.lang.String:(char[][],)
getWorldString:java.lang.String:()
getWorldMaskString:java.lang.String:(java.util.Vector,java.util.Vector,)
setBox::(char[][],int,int,int,int,char,)
getEmpty:char:()
getWidth:int:()
getHeight:int:()
<clinit>::()
• Not very pretty, but useful (we hope…).
bref
• bref is a tool that displays relationship information by linking one entity to another
bref Syntax
bref xml kind1 name1 kind2 name2
– kind1 and kind2 are entity kinds
– name1 and name2 are entity names
– xml is the XML file containing the database
Example Query
• Here’s a query to find all class-class relationships in the database.
bref sim.xml c ".*" c ".*"
Example Results (bref)
• bref sim.xml c ".*" c ".*"
"AutoCar" -> "Car"
"AutoControl" -> "java.lang.Object"
"Car" -> "Entity"
"CarControlException" -> "java.lang.Exception"
"CarCrashException" -> "java.lang.Exception"
"CarMoveController" -> "Entity"
"CarOutOfBounds" -> "java.lang.Exception"
"CarParkTrafficGenerator" -> "Entity"
………………………
Results Explained
• bref returned a list of class pairs.
• Each line represents a relationship between two entities
• The entity on the left is the first entity asked for
• The entity on the right is the second entity asked for
Exercise – bref
• In these exercises, we’ll examine various relations between the entities of a system.
• We’ll go over:
– Inheritance relationships.
– Method-Method relationships.
– How to write a shell script using BAT tools
Exercise #1
• We’ve already seen how to find the entire inheritance tree from our example, so this exercise should be easy:
– Find all the classes that Entity inherits from, and all the classes that subclass it.
Inheritance Relation
• The relation between classes that we are interested in is subclassing.
• But which entity in the relation subclasses the other?
– The answer is that the first entity subclasses the second.
Inheritance Relation (Cont’d)
– The answer to the question “which class is Entity a subclass of” is:
bref sim.xml c "Entity" c ".*"
– We can analogously find which classes subclass Entity:
bref sim.xml c ".*" c "Entity"
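To see which lines each kind of query selects, the sketch below filters the sample class-class results with awk instead of calling bref. It assumes the output always has the form "subclass" -> "superclass", one pair per line, as in the example results. Car is used for the superclass lookup because the visible results do not include a line for Entity's own superclass.

```shell
# Sample class-class output copied from the example results.
edges='"AutoCar" -> "Car"
"AutoControl" -> "java.lang.Object"
"Car" -> "Entity"
"CarControlException" -> "java.lang.Exception"
"CarCrashException" -> "java.lang.Exception"
"CarMoveController" -> "Entity"
"CarOutOfBounds" -> "java.lang.Exception"
"CarParkTrafficGenerator" -> "Entity"'

# Superclass of Car: lines whose first (left) entity is Car.
parent_of_car=$(printf '%s\n' "$edges" | awk '$1 == "\"Car\"" { print $3 }')

# Subclasses of Entity: lines whose second (right) entity is Entity.
children_of_entity=$(printf '%s\n' "$edges" | awk '$3 == "\"Entity\"" { print $1 }')

echo "$parent_of_car"
echo "$children_of_entity"
```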
Exercise #2
• This exercise concentrates on method-to-method relations.
• Our task is to find what the fan-in and fan-out of a function are.
• We’ll use the World.addEntity method as the example
Definition: Fan-In/Fan-Out
• Fan-In
– The fan-in of a function/method is the number of functions/methods that invoke that method.
• Fan-Out
– The fan-out of a function/method is the number of functions/methods that it invokes.
Finding Fan-In, Fan-Out
• The fan-in of a method can be calculated thus:
bref sim.xml m ".*" m "World.addEntity" | wc -l
• The fan-out of a method can be calculated analogously:
bref sim.xml m "World.addEntity" m ".*" | wc -l
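The two counts can be illustrated on a small edge list. Everything in this sketch is hypothetical: the slides do not show method-method output, so it assumes the same "caller" -> "callee" arrow format as the class-class results, and the call edges themselves are made up.

```shell
# Hypothetical call edges (caller -> callee); invented for illustration only.
calls='"Sim.setup" -> "World.addEntity"
"Car.spawn" -> "World.addEntity"
"World.addEntity" -> "World.checkBounds"
"World.addEntity" -> "World.checkLocation"
"World.addEntity" -> "World.setBox"'

target='"World.addEntity"'

# Fan-in: edges where the target is the callee (right-hand side).
fan_in=$(printf '%s\n' "$calls" | awk -v t="$target" '$3 == t' | wc -l | tr -d ' ')

# Fan-out: edges where the target is the caller (left-hand side).
fan_out=$(printf '%s\n' "$calls" | awk -v t="$target" '$1 == t' | wc -l | tr -d ' ')

echo "fan-in=$fan_in fan-out=$fan_out"
```

If the real output can list the same pair more than once, piping through sort -u before wc -l would count distinct methods rather than edges.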
Exercise #3
• In this Exercise, we’ll write a shell script to determine if one class is an ancestor or a descendent of another.
Descendent Relation
• A class X is a descendent of class Y if X subclasses Y, or X’s superclass is a descendent of Y.
• This sets up a nice recursion, which will make our job easy.
Shell Scripting
• Our first step is to come up with an exact specification of what we want:
– Given two classes, D and A, our script should report a 1 if D is a descendent of A, and 0 otherwise.
Shell Scripting…
• Our first coding step is to determine what shell to use. For this exercise, we’ll be using the C shell.
• This makes our shebang line like:
#!/bin/csh
Shell Scripting
• To make this a little nicer to look at, we’ll make a few small helper-scripts…
– One to return whether one class subclasses another.
– One to return the ‘name’ field from unformatted BAT output.
– One to return the names of all the classes that inherit from a given class.
Helper Script (does_subclass)
• Our first script is pretty simple:
#!/bin/csh
@ z = `bref $1 c $2 c $3 | wc -l` != 0
echo ${z}
Helper Script (get_name)
• Our get_name script only has to return the value of one field. We’ll just make a small script to do it.
cut -d " " -f1
Helper Script (subclasses)
• A script to get all the subclasses is also relatively trivial:
bref $1 c ".*" c $2 |get_name
The Actual Script (ancestor)
• Since our relation is a recursive one, we have to start our code by taking care of the base case, which is that D is a direct subclass of A (the parent-child relationship).
#!/bin/csh
if (`bref $1 c $2 c $3 | wc -l ` != 0) then
echo 1
exit
endif
The Rest of the Script
• The rest of the script deals with the recursion. We have to check every subclass of A to see if it is an ancestor of the target class D. If none is, we report 0, as the specification requires.
foreach child (`subclasses $1 $3`)
  if (`ancestor $1 $2 $child`) then
    echo 1
    exit
  endif
end
echo 0
However…
• There’s a better way to do this, which would be to traverse up from the descendent.
– There can be multiple subclasses to any class.
– In Java, there is only one superclass to a class.
• We’ll call this the ancestor relation, defined as:
– X is an ancestor of Y if X is Y’s superclass,
– or X is an ancestor of Y’s superclass.
• We’ll write two little helper scripts to do the rewrite.
Helper Scripts, II (other_name)
• A script to get the name of the second entity of a relation could be useful.
cut -d " " -f3
Helper Scripts, II (parent)
• A second script, to return the parent of a class, if it exists, would be:
#!/bin/csh
bref $1 c $2 c ".*" | other_name
Making the Finished Product
• First take care of the base case of the recursion:
#!/bin/csh
if (`does_subclass $1 $2 $3`) then
  echo 1
  exit
endif
Last Bit o’ Code
• The rest of the code deals with recursing up the inheritance tree…
if (`parent $1 $2 | wc -l ` != 0) then
ancestor $1 `parent $1 $2` $3
else
echo 0
endif
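The upward walk can be tried without BAT installed. The sketch below is a Bourne-shell rendering of the same idea, not the csh scripts from the slides: it drops the XML argument, and its parent function is a hypothetical stand-in that looks classes up in a small edge list copied from the example results instead of calling bref.

```shell
# Stand-in data for the relationship database (assumption: three lines taken
# from the example bref results replace the real XML file).
edges='"AutoCar" -> "Car"
"Car" -> "Entity"
"CarCrashException" -> "java.lang.Exception"'

# parent CLASS: print the superclass of CLASS, or nothing if none is listed.
parent() {
    printf '%s\n' "$edges" | awk -v c="\"$1\"" '$1 == c { print $3 }' | tr -d '"'
}

# ancestor CLASS TARGET: print 1 if TARGET is an ancestor of CLASS, else 0.
ancestor() {
    p=$(parent "$1")
    if [ -z "$p" ]; then
        echo 0                 # ran out of superclasses
    elif [ "$p" = "$2" ]; then
        echo 1                 # base case: TARGET is the direct superclass
    else
        ancestor "$p" "$2"     # recurse one level up the inheritance tree
    fi
}

r1=$(ancestor AutoCar Entity)
r2=$(ancestor AutoCar java.lang.Exception)
echo "$r1 $r2"
```

Since each class has at most one superclass, the recursion follows a single chain and stops as soon as a class has no listed parent.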
Visualizing Relationships
• We will be using DOT and Graphviz to visualize BAT relationships
– dot: Used to draw a ‘directed graph.’
– Graphviz: Visualizes DOT format
Graphs (Definition)
• A graph G(V, E) is a set of vertices, V, and a set of edges, E.
• For each edge e in E, there are two vertices, x and y, in V such that e is an edge between x and y.
Graph Details
• Edge Crossings
• Directed Graphs
• Parallel Edges
Graph Examples
• A road map of a large area is a graph. Cities are vertices, and roads are edges.
• An inheritance tree is a directed graph.
• A call tree is a graph.
DOT Format
digraph mdg {
"First" -> "java.lang.Object"
"First" -> "Second"
"Second" -> "java.lang.Object"
"Second" -> "java.lang.System"
"Second" -> "java.io.PrintStream"
}
Relationship to DOT
• The relationship queries already return in DOT format, minus the header.
• All we need to do is prepend the following line:
– digraph mdg {
• And append the following line:
– }
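A minimal sketch of the wrapping step, using three relationship lines copied from the example bref results in place of a live query:

```shell
# Sample relationship lines copied from the example bref results.
edges='"AutoCar" -> "Car"
"Car" -> "Entity"
"CarMoveController" -> "Entity"'

# Wrap the edge lines in the digraph header and the closing brace.
dot_text=$(
    echo 'digraph mdg {'
    printf '%s\n' "$edges"
    echo '}'
)
printf '%s\n' "$dot_text"
```

Saving that text to a file (sim.dot is a hypothetical name) and running dot -Tpng sim.dot -o sim.png should render the graph, assuming Graphviz is installed.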
XSLT
• Both bdef and bref are wrappers around XSLT queries
• XSLT/XPATH
– Used to query the database.
– Firefox can render XSLT stylesheets over XML datasets
XSLT/XPATH Tutorials/Tools
• References
– http://www.w3schools.com/
– http://www.zvon.org/xxl/XSLTreference/Output/index.html
– http://www.xml.com/pub/a/2000/08/holman/index.html
• Tools
– xsltproc on *nix systems
– Windows: http://www.microsoft.com/downloads/details.aspx?familyid=2fb55371-c94e-4373-b0e9-db4816552e41&displaylang=en
– Firefox can apply XSLT stylesheets
Source Code Analysis
Using Chava
What is Static Analysis?
• Mining source code for information.
• Using that information to present abstractions of, and answer questions about, software structure.
What can we get from source code analysis?
• Type of information is model dependent
– In almost any language, we can find out information about variable usage: Who? Where? etc.
– In an OO environment, we can find out which classes use other classes, which are a base of an inheritance structure, etc.
– We can also find potential blocks of code that can never be executed in running the program (dead code).
Chava
• Is a tool that lets us perform static analysis on Java programs (source or class files).
– Builds a database of entities in a system.
– Builds a database of relationships in a system.
– Includes several tools for querying the databases for data, and some tools for visualizing results.
Entities
• ‘Entities’ are individuals that live in the system, together with the attributes associated with them. Some examples:
– Classes, along with information about their superclass, their scope, and ‘where’ in the code they exist.
– Methods/functions and what their return type or parameter list is, etc.
– Variables and what their types are, and whether or not they are static, etc.
Relationships
• ‘Relationships’ are interactions between the entities in the system. Relationships include:
– Classes inheriting from one another.
– Methods in one class calling the methods of another class, and methods within the same class calling one another.
– One variable referencing another variable.
Creating Chava Databases
• Chava takes java/class files, and turns them into data files (.A ext) that can be integrated into a database
– Create a .A file for a given Java file:
chava -c filename.java
– Create .A files for all Java class files in directory:
chava -c *.class
![Page 66: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/66.jpg)
Pulling it all together…
• Chava then takes .A files and creates the databases.
– Create databases out of two .A files:
chava -l f1.A f2.A
– Create databases for all .A files in the directory:
chava -l *.A
![Page 67: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/67.jpg)
Chava Tools
• cdef/vdef – Used to query the entities database.
• cref/vref – Used to query the relationship database.
• dagger/dot – Visualization tools. They take information from chava databases and display it as a graph.
![Page 68: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/68.jpg)
cdef/vdef
• cdef takes information from the entities database based on a query, and returns the results in an ASCII table.
• vdef actually shows the code of the entities from a query.
![Page 69: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/69.jpg)
Syntax
• cdef and vdef share the same syntax:
{vdef|cdef} entity_kind entity_name [attr=val]..
– entity_kind is the ‘type’ of entity to retrieve.
– entity_name is a pattern to match for names of entities.
– attr=val are bindings to match for attributes of the entity
![Page 70: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/70.jpg)
Entity Kinds
• Chava recognizes several types of entity ‘kinds’ for use in the cdef/vdef/cref/vref commands.
• p is for Package
• f is for File
• m is for Method
• c is for Class
• l is for Field
• s is for String
• i is for Interface
• - is a match for any entity_kind
![Page 71: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/71.jpg)
Entity Names
• An entity name can assume many forms
– Explicit name (e.g., ‘myTempStringVar’)
– Wild-card Pattern (e.g., ‘myTemp*’)
– A complete wild-card, denoted with ‘-’
![Page 72: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/72.jpg)
Attribute=Value
• Attribute=Value settings are used to further restrict a query based on some condition.
• The most common restriction is to restrict to a specific file, or to filter out a file. E.g.,
cdef - - file=FileIDoLike.java
cdef - - file!=FileIDoNOTLike.java
![Page 73: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/73.jpg)
Example Query
• Assume that we want to find all the methods in a specific file (in this case, ANSIDisplay.java) that start with ‘get’. Our query would look like the following:
cdef m 'get*' file=./ANSIDisplay.java
• Or, to see the code:
vdef m 'get*' file=./ANSIDisplay.java
![Page 74: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/74.jpg)
Example Results (cdef)
• cdef m 'get*' file=ANSIDisplay.java
name                   scope   file              bline eline
====================== ======= ================= ===== =====
String getEscapeSequen public  ANSIDisplay.java  76    82
String getEscapeSequen public  ANSIDisplay.java  38    42
String getEscapeSequen public  ANSIDisplay.java  93    108
String getEscapeSequen public  ANSIDisplay.java  118   128
String getEscapeSequen public  ANSIDisplay.java  139   153
![Page 75: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/75.jpg)
Results Explained
• The cdef query resulted in a table with several columns. The data in the columns mean the following:
– name: The name of the entity.
– scope: The scope of the entity within its ‘parent’ entity (the entity it resides in).
– file: The name of the file that the entity is in.
– bline: The line that the entity begins on.
– eline: The line that the entity ends on.
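As a small illustration of how the bline and eline columns can be used, the sketch below computes the length of each method from the first rows of the formatted table shown earlier. It is written for a POSIX shell with awk, it reads a copy of those rows from a here-document instead of running cdef, and the function names cdef_table and method_lengths are made up for this example.

```shell
#!/bin/sh
# Copy of the first rows of the formatted cdef table (chava is not run here).
cdef_table() {
cat <<'EOF'
name scope file bline eline
====================== ======= ================= ===== =====
String getEscapeSequen public ANSIDisplay.java 76 82
String getEscapeSequen public ANSIDisplay.java 38 42
String getEscapeSequen public ANSIDisplay.java 93 108
EOF
}

# Skip the two header lines, then report eline - bline + 1 for each row.
# bline and eline are taken as the last two whitespace-separated fields.
method_lengths() {
    awk 'NR > 2 { printf "%s-%s: %d lines\n", $(NF-1), $NF, $NF - $(NF-1) + 1 }'
}

cdef_table | method_lengths
```

Counting from the end of each row avoids having to know how many words the truncated name column contains.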
![Page 76: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/76.jpg)
Example Results (vdef)
• vdef m 'get*' file=ANSIDisplay.java (partial results)
public static String getEscapeSequence(int colour, boolean foreground)
{
    colour = setColour(colour, foreground);
    return (ESCAPE + Integer.toString(colour) + "m");
}
public static String getEscapeSequence(int value)
{
    if (!ANSIDisplaySwitchCheck.validSwitch(value))
        throw new IllegalArgumentException("Bad Switch");
    return (ESCAPE + Integer.toString(value) + "m");
}
![Page 77: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/77.jpg)
Results Explained
• vdef printed out the entities we asked about, exactly how they appear in the source code.
![Page 78: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/78.jpg)
Finding all File Names
• Knowing all the file names could be important, so let’s see how to do that with chava.
– We want to use cdef for this, and just have chava output a list of file names.
– We also want to restrict the entity_kind to that of file. If you remember, ‘f’ is the type for file.
– We also want any file in the database to be listed, so we want to match against any entity_name. ‘-’ will do.
cdef f -
![Page 79: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/79.jpg)
Exercise
• It would be nice to know how a class interacts with its superclass.
• We’ll take a peek at this with the classes ANSIColourPrinter and ANSIPrinter.
![Page 80: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/80.jpg)
Class-Superclass
• This problem is a bit more than just one cdef/vdef command. First step…
– We need to see how ANSIColourPrinter calls its super-constructor.
– We want to see the calls, so we’ll use vdef.
– Constructors are methods in chava.
![Page 81: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/81.jpg)
Class-Superclass (Step One)
• The query we need to see the constructors of ANSIColourPrinter is:
vdef m ANSIColourPrinter
This results in…
![Page 82: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/82.jpg)
Class-Superclass Interaction (Step One – Results)
public ANSIColourPrinter(OutputStream out)
{
    this(out, m_defaultColour);
}
public ANSIColourPrinter(OutputStream out, boolean doReset)
{
    this(out, m_defaultColour, doReset);
}
public ANSIColourPrinter(OutputStream out, ANSICharacterColour colour)
{
    this(out, colour, m_defaultReset);
}
public ANSIColourPrinter(OutputStream out, ANSICharacterColour colour, boolean doReset)
{
    super(colour, out, doReset);
}
![Page 83: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/83.jpg)
Class-Superclass Interaction (Step One – Analysis)
• We now know that the ANSIColourPrinter constructors accept:
– An OutputStream,
– An ANSICharacterColour,
– A boolean.
• When not supplied with either of the last two parameters, the constructor uses some defaults.
![Page 84: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/84.jpg)
Class-Superclass Interaction (Step Two)
• The next step is to examine what ANSIPrinter does in its constructor.
• This is basically the same thing as peeking at the ANSIColourPrinter constructors.
vdef m ANSIPrinter
![Page 85: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/85.jpg)
Class-Superclass Interaction (Step Two – Results)
public ANSIPrinter(OutputStream out, ANSIEscapeSequenceType sequence)
{
    this(out, sequence, m_defaultReset);
}
public ANSIPrinter(OutputStream out, ANSIEscapeSequenceType sequence, boolean doReset)
{
    this(sequence, out, doReset, true);
}
public ANSIPrinter(OutputStream out, ANSIEscapeSequenceType sequence, boolean doReset, boolean resetOnLeave)
{
    super(out);
    m_escape = sequence;
    m_reset = doReset;
    m_resetOnFinalize = resetOnLeave;
    m_showEscape = false;
}
![Page 86: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/86.jpg)
Class-Superclass Interaction (Step Two – Analysis)
• Apparently, the constructor for ANSIPrinter accepts values for:
– an OutputStream
– an ANSIEscapeSequenceType
– two booleans.
• From the code, the constructor does nothing more than pass the OutputStream to its own superclass and store the remaining arguments in member variables. Nothing really that special.
![Page 87: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/87.jpg)
Another Exercise
• This exercise uses some Unix utilities along with our use of cdef/vdef. The exercise involves two things:
1. Counting the number of methods of ANSICharacterColour (in ANSICharacterColour.java).
2. Printing out a list of methods in the form of their name, return type, and parameter list.
![Page 88: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/88.jpg)
Using Unix (Part One)
• In order to count the number of lines of a document, one can use the command line tool wc.
– The -l option makes it count lines.
– Piping to it makes it count the lines of output from a program.
{cdef query} | wc -l
counts the number of lines in the output of a cdef query.
![Page 89: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/89.jpg)
Using Unix (Part One)
• Problem with using wc:
– wc counts all lines, including the header lines of our formatted output table.
– Passing the -u option to cdef gives unformatted output, which is very useful for integrating chava with Unix tools. The syntax is:
cdef [-u] kind name [attr=val]
![Page 90: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/90.jpg)
The solution is …
• The solution to the first problem is:
cdef -u m - file=./ANSICharacterColour.java | wc -l
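To see why the header lines matter, the sketch below feeds the formatted table from the earlier cdef example to wc -l and compares that count with the number of method rows. It is written for a POSIX shell, a here-document stands in for real cdef output (chava itself is not invoked), and the name formatted_table is made up for this illustration.

```shell
#!/bin/sh
# Stand-in for `cdef m 'get*' file=ANSIDisplay.java`: the formatted table
# copied from the earlier example slide (no chava tools are run here).
formatted_table() {
cat <<'EOF'
name                   scope   file              bline eline
====================== ======= ================= ===== =====
String getEscapeSequen public  ANSIDisplay.java  76    82
String getEscapeSequen public  ANSIDisplay.java  38    42
String getEscapeSequen public  ANSIDisplay.java  93    108
String getEscapeSequen public  ANSIDisplay.java  118   128
String getEscapeSequen public  ANSIDisplay.java  139   153
EOF
}

# Count every line, header included (what a plain `| wc -l` would see).
# tr strips the padding that some wc implementations add.
all_lines=$(formatted_table | wc -l | tr -d ' ')

# Count only the data rows by skipping the two header lines.
data_rows=$(formatted_table | tail -n +3 | wc -l | tr -d ' ')

echo "lines in formatted table: $all_lines"
echo "method rows only:         $data_rows"
```

With unformatted (-u) output there are no header lines to skip, so wc -l can be applied directly, as in the solution above.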
![Page 91: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/91.jpg)
Using Unix (Part Two)
• For the second question, we will again use the unformatted output of cdef.
– This time, we’ll take note of the format of the unformatted output! We’ll keep this limited to the case of unformatted output for methods.
– Each field of the unformatted output is delimited by a semicolon. The fields we care about are the name, return-type, and parameter-list fields. These are fields 2, 5, and 9, respectively.
![Page 92: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/92.jpg)
Using Unix (Part Two)
• The final piece in the puzzle of displaying the specific fields is getting the fields themselves out of the output.
– The cut utility will do nicely. We can give it a delimiter and a list of field numbers, and it will return those fields for each line of its input.
– The delimiter flag for cut is -d. The field-list flag is -f, followed by a series of comma-separated numbers.
![Page 93: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/93.jpg)
The solution is …
• Our target query is thus:
cdef -u m - file=./ANSICharacterColour.java | cut -d';' -f2,5,9
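The cut step can be tried without chava. The sketch below, for a POSIX shell, runs the same cut invocation over a few hand-made records. These records are an assumption: only fields 2, 5, and 9 (name, return type, parameter list) follow the slides, every other field is an "x" placeholder because the full unformatted layout is not shown, and the function name sample_records is invented.

```shell
#!/bin/sh
# Hypothetical unformatted records: only fields 2 (name), 5 (return type) and
# 9 (parameter list) follow the slides; every "x" field is a placeholder.
sample_records() {
cat <<'EOF'
x;getANSIString;x;x;java.lang.String;x;x;x;()
x;getBackground;x;x;acin.common.ansi.ANSIColour;x;x;x;()
x;create;x;x;acin.common.ansi.ANSICharacterColour;x;x;x;(int,int)
EOF
}

# Keep fields 2, 5 and 9, using ';' as the field delimiter.
sample_records | cut -d';' -f2,5,9
```

cut keeps the delimiter between the selected fields, which is why the result is still semicolon-separated.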
![Page 94: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/94.jpg)
Output for Exercise
• Question One: 13
• Question Two:
ANSICharacterColour;void;(acin.common.ansi.ANSIColour,acin.common.ansi.ANSIColour)
create;acin.common.ansi.ANSICharacterColour;(acin.common.ansi.ANSIColour,acin.common.ansi.ANSIColour)
create;acin.common.ansi.ANSICharacterColour;(acin.common.ansi.ANSIColour,int)
create;acin.common.ansi.ANSICharacterColour;(acin.common.ansi.ANSIColour,java.lang.String)
create;acin.common.ansi.ANSICharacterColour;(int,acin.common.ansi.ANSIColour)
create;acin.common.ansi.ANSICharacterColour;(int,int)
create;acin.common.ansi.ANSICharacterColour;(int,java.lang.String)
create;acin.common.ansi.ANSICharacterColour;(java.lang.String,acin.common.ansi.ANSIColour)
create;acin.common.ansi.ANSICharacterColour;(java.lang.String,int)
create;acin.common.ansi.ANSICharacterColour;(java.lang.String,java.lang.String)
getANSIString;java.lang.String;()
getBackground;acin.common.ansi.ANSIColour;()
getForeground;acin.common.ansi.ANSIColour;()
• Not very pretty, but useful (we hope…).
![Page 95: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/95.jpg)
cref/vref
• cref is a tool that displays information from the Chava relationship database, returning the results in a table.
• vref displays the actual entities involved in a relationship.
![Page 96: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/96.jpg)
Syntax
• cref and vref share the same syntax:
{cref|vref} kind1 name1 kind2 name2 [attr=val]..
– kind1 and kind2 are entity kinds
– name1 and name2 are entity names
– Attributes are a bit different…
![Page 97: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/97.jpg)
cref/vref Attributes
• Attr=val pairs in cref/vref are different because they have to deal with two different entities. This is solved by appending a ‘1’ or a ‘2’ to the attribute name.
E.g.,
– file1=myFile.java
– file2!=yourFile.java
![Page 98: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/98.jpg)
Example Query
• Here’s a query to find all class-class relationships in the database.
cref c - c -
or, to see the code of the entities involved:
vref c - c -
![Page 99: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/99.jpg)
Example Results (cref)
• cref c - c -
kind1 name1    file1            kind2 name2    file2            rk
===== ======== ================ ===== ======== ================ ==
class ANSIChar ANSICharacterCol class ANSIEsca ANSIEscapeSequen su
class ANSIColo ANSIColour.java  class ANSIEsca ANSIEscapeSequen su
class ANSIColo ANSIColourPrinte class ANSIPrin ANSIPrinter.java su
class ANSIColo ANSIColourPrinte class ANSIPrin ANSIPrinterMap.j su
class ANSICurs ANSICursorMove.j class Object                    su
class ANSICurs ANSICursorMoveSe class Object                    su
class ANSIDisp ANSIDisplay.java class Object                    su
class ANSIDisp ANSIDisplaySwitc class Object                    su
class ANSIEsca ANSIEscapeSequen class ANSIEsca ANSIEscapeSequen su
class ANSIEsca ANSIEscapeSequen class Object                    su
class ANSIPrin ANSIPrinter.java class PrintStream               su
class ANSIPrin ANSIPrinterMap.j class Object                    su
class ANSIPrin ANSIPrinterOptio class Object                    su
![Page 100: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/100.jpg)
Results Explained
• cref returned a table. The columns are just like cdef columns, except some have a ‘1’ and some have a ‘2’ appended.
• Columns with a ‘1’ appended refer to the first entity.
• Columns with a ‘2’ refer to the second entity.
![Page 101: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/101.jpg)
That last column…
• The last column, rk, denotes the kind of relationship. Its values can be:
– Reference
– Fieldread
– Fieldwrite
– Implements
– Subclass
![Page 102: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/102.jpg)
Example Results (vref)
• vref c - c - (partial results)
RECORD NUMBER 0
### ANSICharacterColour.java ###
public class ANSICharacterColour extends ANSIEscapeSequenceType
{
    private ANSIColour m_foreground;
    private ANSIColour m_background;

    /**
     * Method to create an <code>ANSICharacterColour</code> from two integers representing the
     * foreground and background colour, as defined in <code>ANSIColourConstants</code>.
     *
     * @param foreground The value representing the colour to be the foreground.
     * @param background The value representing the colour to be the background.
     * @exception java.lang.IllegalArgumentException Thrown if the foreground and background values
     * aren't valid ANSI colours.
     */
    public static ANSICharacterColour create(int foreground, int background)
    {
        return new ANSICharacterColour(new ANSIColour(foreground, true), new ANSIColour(background, false));
    }
![Page 103: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/103.jpg)
Exercise – cref/vref
• In these exercises, we’ll examine various relations between the entities of a system.
• We’ll go over:
– Inheritance relationships.
– Method-Method relationships.
– How to write a shell script using Chava tools.
![Page 104: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/104.jpg)
Exercise #1
• We’ve already seen how to find the entire inheritance tree from our example, so this exercise should be easy:
– Find all the classes that ANSIEscapeSequenceType inherits from, and all the classes that subclass it.
![Page 105: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/105.jpg)
Inheritance Relation
• The relation between classes that we are interested in is subclassing.
• But which entity in the relation subclasses the other?
– The answer is that the first entity subclasses the second.
![Page 106: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/106.jpg)
Inheritance Relation (Cont’d)
– The answer to the question “which class is ANSIEscapeSequenceType a subclass of” is:
cref c ANSIEscapeSequenceType c -
– We can analogously find which classes subclass ANSIEscapeSequenceType:
cref c - c ANSIEscapeSequenceType
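The direction of the relation is easy to mix up, so here is a minimal sketch of the two lookups for a POSIX shell with awk. It does not call cref: a hand-written list of "first;second" pairs, taken from the subclass rows and class declaration shown earlier, stands in for the relationship database, and the names subclass_pairs, superclass_of, and subclasses_of are invented for the illustration.

```shell
#!/bin/sh
# Subclass pairs in "first;second" order: the first class subclasses the second.
# Hand-copied from the earlier slides; not produced by cref here.
subclass_pairs() {
cat <<'EOF'
ANSICharacterColour;ANSIEscapeSequenceType
ANSIColourPrinter;ANSIPrinter
ANSIPrinter;PrintStream
EOF
}

# Like `cref c NAME c -`: fix the first entity, print the second (superclass).
superclass_of() {
    subclass_pairs | awk -F';' -v n="$1" '$1 == n { print $2 }'
}

# Like `cref c - c NAME`: fix the second entity, print the first (subclasses).
subclasses_of() {
    subclass_pairs | awk -F';' -v n="$1" '$2 == n { print $1 }'
}

superclass_of ANSIColourPrinter
subclasses_of ANSIEscapeSequenceType
```

Fixing the first name and leaving the second as a wildcard walks up the tree; the reverse walks down.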
![Page 107: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/107.jpg)
Exercise #2
• This exercise concentrates on method-to-method relations.
• Our task is to find what the fan-in and fan-out of a function are.
![Page 108: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/108.jpg)
Definition: Fan-In/Fan-Out
• Fan-In
– The fan-in of a function/method is the number of functions/methods that invoke that method.
• Fan-Out
– The fan-out of a function/method is the number of functions/methods that it invokes.
![Page 109: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/109.jpg)
Finding Fan-In, Fan-Out
• A key piece of information to know here is that the -u option from cdef also works in cref.
• The fan-in of a method can be calculated thus:
cref -u m - m my_method | wc -l
• The fan-out of a method can be calculated analogously:
cref -u m my_method m - | wc -l
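The same counting idea can be sketched without chava. Below, a POSIX shell script counts fan-in and fan-out over a small list of caller;callee pairs. The pairs and method names are invented, the two-field layout is an assumption (it is not the real cref -u record format), and call_edges, fan_in, and fan_out are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical caller;callee pairs standing in for method-to-method records.
# The method names are invented for the example.
call_edges() {
cat <<'EOF'
main;parse
main;render
parse;readToken
render;readToken
render;flush
EOF
}

# fan_in METHOD: number of records whose callee (field 2) is METHOD.
fan_in() {
    call_edges | awk -F';' -v m="$1" '$2 == m' | wc -l | tr -d ' '
}

# fan_out METHOD: number of records whose caller (field 1) is METHOD.
fan_out() {
    call_edges | awk -F';' -v m="$1" '$1 == m' | wc -l | tr -d ' '
}

echo "fan-in  of readToken: $(fan_in readToken)"
echo "fan-out of render:    $(fan_out render)"
```

As in the cref version, each count is just the number of matching records, so the result depends on one record being reported per caller-callee pair.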
![Page 110: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/110.jpg)
Exercise #3
• In this Exercise, we’ll write a shell script to determine if one class is an ancestor or a descendent of another.
![Page 111: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/111.jpg)
Descendent Relation
• A class X is a descendent of class Y if X subclasses Y, or X’s superclass is a descendent of Y.
• This sets up a nice recursion, which will make our job easy.
![Page 112: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/112.jpg)
Shell Scripting
• Our first step is to come up with an exact specification of what we want:
– Given two classes, D and A, our script should report a 1 if D is a descendent of A, and 0 otherwise.
![Page 113: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/113.jpg)
Shell Scripting…
• Our first coding step is to determine what shell to use. For this exercise, we’ll be using the C shell.
• This makes our shebang line:
#!/bin/csh
![Page 114: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/114.jpg)
Shell Scripting
• To make this a little nicer to look at, we’ll make a few small helper-scripts…
– One to return whether one class subclasses another.
– One to return the ‘name’ field from unformatted chava output.
– One to return the names of all the classes that inherit from a given class.
![Page 115: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/115.jpg)
Helper Script (does_subclass)
• Our first script is pretty simple:
#!/bin/csh
@ z = `cref -u c $1 c $2 | wc -l` != 0
echo ${z}
![Page 116: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/116.jpg)
Helper Script (get_name)
• Our get_name script only has to return the value of one field. We’ll just make a small awk script to do it.
awk -F ';' '{print $3}'
![Page 117: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/117.jpg)
Helper Script (subclasses)
• A script to get all the subclasses is also relatively trivial:
cref -u c - c $1 | get_name
![Page 118: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/118.jpg)
The Actual Script (ancestor)
• Since our relation is a recursive one, we have to start our code by taking care of the base case, which is that D is a direct subclass of A (a parent-child relationship).
if (`cref -u c $1 c $2 | wc -l` != 0) then
    echo 1
    exit
endif
![Page 119: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/119.jpg)
The Rest of the Script
• The rest of the script deals with the recursion. We have to check every subclass of A to see whether D is a descendent of it. If none matches, the script reports 0.
foreach child (`subclasses $2`)
    if (`ancestor $1 $child`) then
        echo 1
        exit
    endif
end
echo 0
![Page 120: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/120.jpg)
However…
• There’s a better way to do this, which would be to traverse up from the descendent.
– There can be multiple subclasses to any class.
– In Java, there is only one superclass to a class.
• We’ll call this the ancestor relation, defined as:
– X is an ancestor of Y if X is Y’s superclass,
– or X is an ancestor of Y’s superclass.
• We’ll write two little helper scripts to do the rewrite.
![Page 121: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/121.jpg)
Helper Scripts, II (other_name)
• A script to get the name of the second entity of a relation could be useful.
awk -F ';' '{print $17}'
![Page 122: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/122.jpg)
Helper Scripts, II (parent)
• A second script, to return the parent of a class, if it exists, would be:
#!/bin/csh
cref -u c $1 c - | other_name
![Page 123: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/123.jpg)
Making the Finished Product
• First take care of the base case of the recursion:
if (`does_subclass $1 $2`) then
echo 1
exit
endif
![Page 124: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/124.jpg)
Last Bit o’ Code
• The rest of the code deals with recursing up the inheritance tree…
if (`parent $1 | wc -l` != 0) then
    ancestor `parent $1` $2
else
    echo 0
endif
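For readers without a chava installation, the upward walk can be tried with the sketch below. It is written for a POSIX shell rather than csh, it replaces the recursive call with a loop, and its parent function is a stand-in that looks names up in a fixed child;parent table (copied from the subclass relationships shown earlier) instead of querying cref.

```shell
#!/bin/sh
# Stand-in for the `parent` helper: a fixed child;parent table taken from the
# subclass relationships shown earlier, instead of a real `cref -u` query.
parent() {
    awk -F';' -v c="$1" '$1 == c { print $2 }' <<'EOF'
ANSIColourPrinter;ANSIPrinter
ANSIPrinter;PrintStream
ANSICharacterColour;ANSIEscapeSequenceType
EOF
}

# ancestor D A: print 1 if A is reached by walking up from D, else 0.
ancestor() {
    current=$(parent "$1")
    while [ -n "$current" ]; do
        if [ "$current" = "$2" ]; then
            echo 1
            return
        fi
        current=$(parent "$current")
    done
    echo 0
}

ancestor ANSIColourPrinter PrintStream   # two steps up the chain
ancestor ANSIPrinter ANSIColourPrinter   # wrong direction
```

Because each class has at most one superclass, the loop visits a single chain and stops as soon as the table has no parent for the current class.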
![Page 125: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/125.jpg)
Visualizing Chava
• There are two tools we’ll be using to visualize chava queries.
– dagger: Lets us use a cref-esque query to create a ‘directed graph.’
– dot: Used to draw a ‘directed graph.’
![Page 126: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/126.jpg)
Graphs (Definition)
• A graph G(V, E) is a set of vertices, V, and a set of edges, E.
• For each edge e in E, there are two vertices, x and y, in V such that e is an edge between x and y.
![Page 127: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/127.jpg)
Graph Details
• Edge Crossings
• Directed Graphs
• Parallel Edges
![Page 128: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/128.jpg)
Graph Examples
• A road map of a large area is a graph. Cities are vertices, and roads are edges.
• An inheritance tree is a directed graph.
• A call tree is a graph.
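To make the inheritance-tree example concrete, the sketch below turns a list of "child parent" pairs into a directed graph written in the DOT language that the dot tool reads. The function name `edges_to_dot` and the class names are hypothetical and are not part of chava. The vertices are the class names, and each pair becomes one directed edge from subclass to superclass.

```shell
#!/bin/sh
# edges_to_dot: read "child parent" pairs on stdin, write a DOT digraph.
# Hypothetical helper for illustration only.
edges_to_dot() {
    echo 'digraph inheritance {'
    while read -r child parent; do
        # Skip blank lines; each remaining pair becomes one directed edge.
        if [ -n "$child" ]; then
            printf '    "%s" -> "%s";\n' "$child" "$parent"
        fi
    done
    echo '}'
}
```

If Graphviz is installed, something like `printf 'Square Rectangle\nRectangle Shape\n' | edges_to_dot | dot -Tps > tree.ps` should draw the two-edge tree.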
![Page 129: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/129.jpg)
The dagger Tool
• The dagger tool takes a cref-style query, and returns the results as a graph of the relationships.
![Page 130: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/130.jpg)
Syntax
• dagger syntax is exactly like cref syntax (except for lack of options).
dagger kind1 name1 kind2 name2
![Page 131: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/131.jpg)
dagger to dot
• dagger only creates a representation describing a graph.
• dot takes that representation and outputs something that can be visualized.
– Can make dotty files.
– Can also make PostScript files.
dagger kind1 name1 kind2 name2 | dot -Tps
![Page 132: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/132.jpg)
Example Query
• A sample query shows what the output of the dagger-to-dot pipeline looks like.
• A good thing to check is the class inheritance hierarchy.
– We already know the cref query for this.
– The dagger query is
dagger c - c - | dot -Tps > classes.ps
![Page 133: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/133.jpg)
Viewing PostScript
• A good PostScript viewer is ghostview.
– The command to use ghostview is
ggv <file>
• Use ghostview to look at the class hierarchy graph that you just created.
![Page 134: Source Code Analysis Using BAT](https://reader036.vdocument.in/reader036/viewer/2022062409/56814f45550346895dbce5d3/html5/thumbnails/134.jpg)
Does Chava have siblings?
• Chava is really a tool that uses the CIA system, from AT&T Labs - Research.
• The CIA system can be extended to any type of structured language.
• Other implementations exist for:
– C/C++, HTML, ksh, etc.