Graph-based Relational Data Visualization
pdf at: www.icmc.usp.br/pessoas/junio
Daniel Mário de Lima, José Fernando Rodrigues Jr., Agma Juci Machado Traina
<danielm@icmc.usp.br> <[email protected]> <[email protected]>
Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo
17th International Conference Information Visualization
15, 16, 17 and 18 July 2013, SOAS, University of London ● London ● UK
pdf at http://www.icmc.usp.br/~junio/PublishedPapers/Lima-et_al_IV-2013.pdf
Outline
1. Introduction
2. Method
3. Experiments
4. Conclusions
1. Introduction
Introduction
• Large datasets are common
  • unstructured: text
  • semi-structured: XML, RDF, sensor data
  • structured: relational (DBMS), network (graph-like)
• Analysis Process (iterate)
  • Data Representation / Transformation
  • Storage / Retrieval
  • Statistics
  • Visualization
  • Analysis
Introduction
• How to spot interesting facts in the relationships of large relational databases?
  • How are the entities in the database related to each other?
  • How are the entities distributed over the relations of the database?
  • How do the several attributes of the database influence the relationships of the entities?
  • How do we quickly and intuitively browse the relational database, considering its complex structure?
Our approach
• Use graph representation
• Graph-partitioning techniques
• Graph-processing
• Interactive Visualization
Database → Graph → Partitioning → Visualization → Analysis
Relationships as Graphs
DB m×m → Graph → GraphTree Partitioning → Visualization → Analysis
Author: Alice (A), Bob (B), Charles (C), …
Publish: (A, 1), (B, 2), (C, 3), (A, 2), …
Work: 1 Optic Fiber, 2 Networks, 3 Cryptography, …
[Figure: graph with author nodes A, B, C and work nodes 1, 2, 3]
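The mapping from an m×m relation to a graph can be sketched as follows. This is a minimal illustration using the Author / Publish / Work rows from the slide; the function name `relation_to_graph` and the `(entity, key)` node-id convention are assumptions made for the example, not the authors' implementation.

```python
# Toy rows copied from the Author / Publish / Work example.
authors = {"A": "Alice", "B": "Bob", "C": "Charles"}
works = {1: "Optic Fiber", 2: "Networks", 3: "Cryptography"}
publish = [("A", 1), ("B", 2), ("C", 3), ("A", 2)]


def relation_to_graph(left, right, links, left_name="Author", right_name="Work"):
    """Build an undirected bipartite adjacency map from an m×m relation.

    Node ids are (entity_name, key) tuples so keys of the two entities
    cannot collide (a hypothetical convention for this sketch).
    """
    adjacency = {(left_name, key): set() for key in left}
    adjacency.update({(right_name, key): set() for key in right})
    for left_key, right_key in links:
        u, v = (left_name, left_key), (right_name, right_key)
        adjacency[u].add(v)  # each relationship row becomes one edge
        adjacency[v].add(u)
    return adjacency


graph = relation_to_graph(authors, works, publish)
for node in sorted(graph, key=str):
    print(node, "->", sorted(graph[node], key=str))
```

Each tuple of the relationship table becomes one edge, so the number of edges equals the number of rows in Publish.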
Graph Partitioning
Hierarchical Partitioning
[Figure: cut 0 splits the graph into subgraph 1 and subgraph 2; cut 1 splits subgraph 1 into subgraphs 1-1 and 1-2; cut 2 splits subgraph 2 into subgraphs 2-1 and 2-2]
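A recursive bisection of this kind can be sketched as below. The slides do not say which partitioning algorithm is used, so the splitting rule here (halving a breadth-first ordering) is only a stand-in chosen for brevity; `bfs_order`, `partition_tree` and the dictionary layout are hypothetical names for this example.

```python
from collections import deque


def bfs_order(adj, nodes):
    """Return `nodes` in breadth-first order, restarting per component."""
    allowed, seen, order = set(nodes), set(), []
    for start in sorted(allowed):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in sorted(adj[u]):
                if v in allowed and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return order


def partition_tree(adj, nodes, depth, label=""):
    """Recursively bisect `nodes`, recording the cut edges at each level."""
    if depth == 0 or len(nodes) < 2:
        return {"label": label, "nodes": sorted(nodes)}
    order = bfs_order(adj, nodes)
    half = len(order) // 2
    left, right = set(order[:half]), set(order[half:])
    # Edges with one endpoint on each side form the cut of this level.
    cut = sorted((u, v) for u in left for v in adj[u] if v in right)
    children = []
    for i, part in enumerate((left, right), start=1):
        child_label = f"{label}-{i}" if label else str(i)
        children.append(partition_tree(adj, part, depth - 1, child_label))
    return {"label": label, "cut": cut, "children": children}


# Toy graph: a path 0-1-2-...-7.
adj = {i: set() for i in range(8)}
for i in range(7):
    adj[i].add(i + 1)
    adj[i + 1].add(i)

tree = partition_tree(adj, set(range(8)), depth=2)
print(tree["cut"])                 # edges removed by the top-level cut
print(tree["children"][0]["cut"])  # edges removed inside subgraph 1
```

The child labels follow the slide's naming (1, 2, 1-1, 1-2, 2-1, 2-2), and the recorded cuts are what the next step turns into SuperEdges.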
SuperGraph
[Figure: each subgraph becomes a SuperNode and each cut becomes a SuperEdge. Subgraphs 1-1 and 1-2 become SuperNode 1-1 and SuperNode 1-2, joined by SuperEdge 1 (from cut 1); subgraphs 2-1 and 2-2 become SuperNode 2-1 and SuperNode 2-2, joined by SuperEdge 2 (from cut 2); subgraphs 1 and 2 become SuperNode 1 and SuperNode 2, joined by SuperEdge 0 (from cut 0)]
• Further details in the paper
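As a rough sketch of the idea, the snippet below collapses a partitioned graph into SuperNodes (sets of nodes) and SuperEdges (the original edges that cross between two partitions). It works on a single flat level for simplicity, whereas the slides build a hierarchy; `build_supergraph` and the `group_of` mapping are hypothetical names for this example.

```python
from collections import defaultdict


def build_supergraph(edges, group_of):
    """Collapse a graph into SuperNodes and SuperEdges.

    edges    : iterable of (u, v) pairs of the original graph
    group_of : dict mapping each node to the id of its partition
    """
    supernodes = defaultdict(set)
    for node, group in group_of.items():
        supernodes[group].add(node)

    superedges = defaultdict(list)
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        if gu != gv:  # only edges crossing two partitions are kept
            superedges[tuple(sorted((gu, gv)))].append((u, v))
    return dict(supernodes), dict(superedges)


# Toy graph: a path 0-1-...-7 split into four leaf partitions.
edges = [(i, i + 1) for i in range(7)]
group_of = {0: "1-1", 1: "1-1", 2: "1-2", 3: "1-2",
            4: "2-1", 5: "2-1", 6: "2-2", 7: "2-2"}

supernodes, superedges = build_supergraph(edges, group_of)
for pair, crossing in sorted(superedges.items()):
    print(pair, crossing)
```

Keeping the crossing edges inside each SuperEdge means the connectivity between two partitions can be inspected without loading the partitions themselves.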
Attribute-based Partitioning
Left relation: Paper = {idPaper, country, year, title}
Right relation: Author = {idAuthor, age, dept, authorName}
[Figure: partition tree for the Paper-Author relationship. Paper (P) is split by local into US and BR, then by year into ranges such as ’00-’06, ’06-’11, ’95+, ’02+, ’06+ and *. Author (A) is split by age into <40 and >40, then by dept into IME, EESC, ICMC, FFLCH and *. Connectivity SuperEdges link partitions of the two sides.]
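The following sketch shows one way attribute values and ranges could assign rows to partitions, and how the connectivity between a Paper partition and an Author partition could be counted. The rows, the `wrote` link table, the range boundaries and the handling of age 40 are all invented for illustration; only the attribute names come from the slide.

```python
from collections import Counter

# Hypothetical sample rows using the attribute names from the slide.
papers = [
    {"idPaper": 1, "country": "US", "year": 2003},
    {"idPaper": 2, "country": "US", "year": 2009},
    {"idPaper": 3, "country": "BR", "year": 1998},
    {"idPaper": 4, "country": "BR", "year": 2007},
]
authors = [
    {"idAuthor": 10, "age": 35, "dept": "ICMC"},
    {"idAuthor": 11, "age": 52, "dept": "IME"},
    {"idAuthor": 12, "age": 44, "dept": "EESC"},
]
wrote = [(1, 10), (2, 10), (2, 11), (3, 12), (4, 11)]  # (idPaper, idAuthor)


def paper_partition(paper):
    """Partition by country first, then by a year range (assumed bounds)."""
    period = "'00-'06" if 2000 <= paper["year"] <= 2006 else "other"
    return (paper["country"], period)


def author_partition(author):
    """Partition by age; placing age 40 in the upper group is an assumption."""
    return "<40" if author["age"] < 40 else ">=40"


paper_of = {p["idPaper"]: paper_partition(p) for p in papers}
author_of = {a["idAuthor"]: author_partition(a) for a in authors}

# Connectivity: number of relationship rows between each pair of partitions.
connectivity = Counter((paper_of[p], author_of[a]) for p, a in wrote)
for pair, count in sorted(connectivity.items()):
    print(pair, count)
```

The order in which attributes are applied (country before year here) determines the shape of the partition tree, which is one of the initial parameters an analyst has to choose.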
R-Mine Prototype
• Based on the GMine System
• Test platform with minimalistic design
• SuperNode tree:
  • node-link, radial layout, partial focus
• SuperEdge graphs:
  • node-link, bipartite layout, edge filtering
• Leaf SuperNode graphs: typical node-link
3. Experiments
Tycho USP database
• Data from several USP systems
  • Personnel, Supervisions, Publications, Events…
• Using 5 entities and 5 relationships
  • 350k events
  • 380k examinations
  • 691k publications
  • 50k people
  • 26k supervisions
• 1.5 million nodes total
• 1.8 million edges (relationships)
Q1: active authors
• Which group of People (by age) has the largest number of recent publications?
• SQL:
    SELECT a.age, count(*) num
    FROM PersonPublication x
    JOIN Publication p ON p.id = x.publication AND p.year >= 2008
    JOIN Person a ON a.id = x.author
    GROUP BY a.age
    ORDER BY num DESC
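To see what Q1 returns, the query can be run on a tiny in-memory database. This sketch uses SQLite through Python's standard `sqlite3` module rather than PostgreSQL, and the table definitions and rows are invented: only the table and column names that appear in the slide's query are taken from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; column names follow the slide's queries.
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE Publication (id INTEGER PRIMARY KEY, title TEXT,
                          year INTEGER, country TEXT);
CREATE TABLE PersonPublication (author INTEGER, publication INTEGER);
INSERT INTO Person VALUES (1, 'Ana', 58), (2, 'Beto', 58), (3, 'Caio', 34);
INSERT INTO Publication VALUES
  (1, 'P1', 2009, 'Brasil'), (2, 'P2', 2010, 'Brasil'),
  (3, 'P3', 2005, 'Brasil'), (4, 'P4', 2012, 'Brasil');
INSERT INTO PersonPublication VALUES (1, 1), (2, 2), (1, 4), (3, 3), (3, 4);
""")

q1 = """
SELECT a.age, count(*) num
FROM PersonPublication x
JOIN Publication p ON p.id = x.publication AND p.year >= 2008
JOIN Person a ON a.id = x.author
GROUP BY a.age
ORDER BY num DESC
"""
rows = conn.execute(q1).fetchall()
for age, num in rows:
    print(age, num)  # age group and its number of recent authorships
```

The first row is the age with the most authorships since 2008; publication P3 (2005) is excluded by the year filter in the join.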
Q1.b: active authors
• Who are they?
• SQL:
    SELECT a.name, p.title
    FROM PersonPublication x
    JOIN Publication p ON p.id = x.publication AND p.year >= 2008
    JOIN Person a ON a.id = x.author
    WHERE a.age IN (
      SELECT age FROM (
        SELECT a.age age, count(*) num
        FROM PersonPublication x
        JOIN Publication p ON p.id = x.publication AND p.year >= 2008
        JOIN Person a ON a.id = x.author
        GROUP BY a.age
        ORDER BY num DESC) T)
Q2: favorite countries
• Which country receives the largest number of recent publications from this group of people?
• SQL:
    SELECT a.name, p.title
    FROM PersonPublication x
    JOIN Publication p ON p.id = x.publication AND p.year >= 2008
    JOIN Person a ON a.id = x.author AND a.age BETWEEN 56 AND 63
    WHERE p.country IN (
      SELECT country FROM (
        SELECT p.country country, count(*) num
        FROM PersonPublication x
        JOIN Publication p ON p.id = x.publication AND p.year >= 2008
        JOIN Person a ON a.id = x.author AND a.age BETWEEN 56 AND 63
        GROUP BY p.country
        ORDER BY num DESC) T)
Q3: active authors per country
• Now in one specific country, which group of People is the most active recently?
• SQL:
    SELECT a.name, p.title
    FROM PersonPublication x
    JOIN Publication p ON p.id = x.publication AND p.year >= 2008 AND p.country = 'Estados Unidos'
    JOIN Person a ON a.id = x.author
    WHERE a.age IN (
      SELECT age FROM (
        SELECT a.age age, count(*) num
        FROM PersonPublication x
        JOIN Publication p ON p.id = x.publication AND p.year >= 2008 AND p.country = 'Estados Unidos'
        JOIN Person a ON a.id = x.author
        GROUP BY a.age
        ORDER BY num DESC) T)
Performance: individual queries
150 analytical questions: PostgreSQL × R-Mine
Performance: accumulated time
150 analytical questions: PostgreSQL × R-Mine
Performance: loading time
SuperNode           Load (s)   Connectivity to all siblings (seconds)   SQL (seconds)
(initial loading)   6.032      -                                        -
Person              0.057      5.847                                    7.349
Event               0.271      5.276                                    26.716
Publication         0.160      4.484                                    27.677
Total               6.520      15.607                                   61.742
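The totals in the table can be checked with a few lines of arithmetic. Comparing the sum of the Load and Connectivity totals against the SQL total is one possible reading of the table and is an assumption of this sketch, not a figure stated on the slide.

```python
# Timings in seconds, copied from the loading-time table.
load = {"(initial loading)": 6.032, "Person": 0.057,
        "Event": 0.271, "Publication": 0.160}
connectivity = {"Person": 5.847, "Event": 5.276, "Publication": 4.484}
sql = {"Person": 7.349, "Event": 26.716, "Publication": 27.677}

total_load = round(sum(load.values()), 3)
total_conn = round(sum(connectivity.values()), 3)
total_sql = round(sum(sql.values()), 3)

# Assumed comparison: all loading plus all connectivity work vs. SQL.
combined = round(total_load + total_conn, 3)
ratio = total_sql / combined

print(total_load, total_conn, total_sql)
print(f"{combined} s vs {total_sql} s, ratio {ratio:.2f}")
```

Under that reading, the three column totals match the table's Total row, and the SQL total is a little under three times the combined loading and connectivity time.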
4. Conclusions
Our approach
• Can use the Relational information
  • To guide the partitioning
  • To give an initial context to the analyst
• Faster than running SQL queries
• Makes neighborhood exploration easy
• Interactive Visualization environment
Considerations
• Initial parameters
  • Which entities, relationships and attributes?
  • In which order?
  • How to define partitions? Ranges?
  • How many partitions?
• Different interaction tasks
• Ongoing usability evaluation