
Post on 01-Apr-2015


Seeing Things in the Clouds
Browsing semi-structured data over concept lattices with tag clouds

Bernd Fischer
Computer Science, Stellenbosch

Tags: object, attribute, context table, relation, Galois connection, knowledge discovery, mining, software repositories, join, meet, focus, visualization, information retrieval, navigation

How do you find stuff on the Internet?

[Screenshot labels: query, concept-based browsing, lattice]

Yikes! 3 370 000 results!

How do you find stuff you didn’t look for?

Retrieval: extract objects that satisfy a pre-defined criterion
• query describes criterion
• main operation is matching: check satisfaction against query
• main goal is precision: show only relevant objects

Browsing: spontaneously explore a collection
• focus describes current position and selection
• main operation is navigation: change the focus
• main goal is recall: show all relevant objects
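The two goals can be made concrete with a small sketch. Precision is the fraction of shown objects that are relevant, recall the fraction of relevant objects that are shown. The object identifiers and the choice of "relevant" set below are purely illustrative, and treating an empty set as a perfect score is a convention assumed here.

```python
def precision_recall(shown, relevant):
    """Return (precision, recall) for a set of shown objects."""
    shown, relevant = set(shown), set(relevant)
    hits = shown & relevant
    # Convention (an assumption): an empty denominator counts as a perfect score.
    precision = len(hits) / len(shown) if shown else 1.0
    recall = len(hits) / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical situation: objects 08 and 42 are the relevant ones.
relevant = {"08", "42"}

# A narrow, retrieval-style answer: everything shown is relevant, but 42 is missed.
narrow = precision_recall({"08"}, relevant)

# A broad, browsing-style answer: nothing relevant is missed, but 15 is noise.
broad = precision_recall({"08", "15", "42"}, relevant)

print("narrow:", narrow)
print("broad: ", broad)
```

The narrow answer trades recall for precision, the broad one does the opposite, which is the contrast between retrieval and browsing drawn above.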

How do you browse?

[Figure labels: (hierarchical) navigation structure, focus, selection]

How do you browse semi-structured data?

What is structured data?

Structured data has...
• ... a very high degree of regularity
• ... an explicit, tight format (schema)

Typical examples:
• spreadsheets
• relational databases (SQL: Structured Query Language)

What is semi-structured data?

Semi-structured data ...
• ... contains both free-text and formatted fields
• ... has large structural variance
• ... is implicitly formatted

Typical examples:
• product reviews
• newspaper articles + meta-data
• revision control logs

Approach:
• find a suitable abstract data representation
  – bag-of-words, graphs, binary relations, RDF triples, XML, ...
• find a suitable hierarchy
  – metric spaces, graphs, concept lattices, ...
• find a suitable visual representation
  – lists, graphs, tag clouds, city scapes, ...
• find a navigation algorithm

How do you represent data?

Structured data is represented by n-ary relations or tables:
• each object becomes a row
• each column represents an attribute type
• text remains unstructured
• set-valued attributes require normalization

author    title                            year  venue
Fischer   Specification-based browsing...  2000  J. ASE
van Zijl  Supernondeterministic finite...  2001  CIAA
Greene    ConceptCloud: A Tag-cloud...     2014  FSE
Fischer   ConceptCloud: A Tag-cloud...     2014  FSE

After normalization of the set-valued author attribute:

id  title                            year  venue
08  Specification-based browsing...  2000  J. ASE
15  Supernondeterministic finite...  2001  CIAA
42  ConceptCloud: A Tag-cloud...     2014  FSE

id  author
08  Fischer
15  van Zijl
42  Greene
42  Fischer

Semi-structured data can be represented by binary relations:
• text is split into words
• each occurring value and word becomes an attribute
• build context table: add cross if attribute applies to object
  – word appears in document, meta-data, references ...

Context table:

    Fischer  Greene  van Zijl  browsing  tag  2000  2001  2014
08  ×                          ×              ×
15                   ×                              ×
42  ×        ×                 ×         ×                ×
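The construction of the context table can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout, the `text` field (a hypothetical stand-in for the words found in each document), and the `KEEP_WORDS` filter are not taken from the slides; only the resulting crosses mirror the table above.

```python
import re

# Illustrative records. "text" is a hypothetical stand-in for the words
# that occur in each document; it is not the real document content.
PAPERS = [
    {"id": "08", "authors": ["Fischer"], "year": 2000,
     "text": "specification-based browsing"},
    {"id": "15", "authors": ["van Zijl"], "year": 2001,
     "text": "supernondeterministic finite"},
    {"id": "42", "authors": ["Greene", "Fischer"], "year": 2014,
     "text": "tag cloud browsing"},
]

# Assumption: only these words are kept as attributes, as in the example table.
KEEP_WORDS = {"browsing", "tag"}


def build_context(papers, keep_words=KEEP_WORDS):
    """Return the binary relation as a dict: object id -> set of attributes."""
    context = {}
    for paper in papers:
        attrs = set(paper["authors"])            # every author is an attribute
        attrs.add(str(paper["year"]))            # every occurring value as well
        words = re.findall(r"[a-z]+", paper["text"].lower())
        attrs.update(w for w in words if w in keep_words)
        context[paper["id"]] = attrs
    return context


def render(context, columns):
    """Render the context as text, one row per object, 'x' for a cross."""
    lines = []
    for obj in sorted(context):
        marks = ["x" if col in context[obj] else "." for col in columns]
        lines.append(obj + " " + " ".join(marks))
    return "\n".join(lines)


COLUMNS = ["Fischer", "Greene", "van Zijl", "browsing", "tag", "2000", "2001", "2014"]
context = build_context(PAPERS)
print(render(context, COLUMNS))
```

A dictionary from object to attribute set is the simplest encoding of a binary relation; a real tool would likely use bit sets or an inverted index for larger contexts.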

How do you find hierarchy in relations?

Formal concept analysis:
• formal context: (O, A, ~ₓ)
• common attributes: α(O) = { a ∈ A | ∀o ∈ O : o ~ₓ a }
• common objects: ω(A) = { o ∈ O | ∀a ∈ A : o ~ₓ a }
• concept: (O, A) s.t. α(O) = A ∧ ω(A) = O, with extent O and intent A
• sub-concept ordering: (O₁, A₁) ≤ (O₂, A₂) iff O₁ ⊆ O₂ iff A₁ ⊇ A₂
• concept lattice: concepts of a context form a complete lattice

Example:
α({08, 42}) = {Fischer, browsing}
ω({Fischer, browsing}) = {08, 42}

Examples (extent, intent), with F = Fischer and G = Greene:
{08}      {F, browsing, ’00}
{42}      {F, G, browsing, tag, ’14}
{08, 42}  {F, browsing}
{42}      {tag}
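The definitions above translate almost literally into code. The sketch below hard-codes the example context and enumerates all concepts by brute force (closing every subset of objects), which is exponential and only meant for toy data; the function names are illustrative and this is not the algorithm used by any particular tool.

```python
from itertools import combinations

# The example context: object id -> set of attributes.
CONTEXT = {
    "08": {"Fischer", "browsing", "2000"},
    "15": {"van Zijl", "2001"},
    "42": {"Fischer", "Greene", "browsing", "tag", "2014"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def alpha(objects):
    """Common attributes: all attributes shared by every given object."""
    common = set(ALL_ATTRIBUTES)
    for obj in objects:
        common &= CONTEXT[obj]
    return common


def omega(attributes):
    """Common objects: all objects that carry every given attribute."""
    return {obj for obj, attrs in CONTEXT.items() if set(attributes) <= attrs}


def concepts():
    """Enumerate all concepts as (extent, intent) pairs of frozensets."""
    found = set()
    objects = sorted(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = alpha(subset)        # attributes common to the subset
            extent = omega(intent)        # closure: all objects sharing them
            found.add((frozenset(extent), frozenset(intent)))
    return found


def is_subconcept(c1, c2):
    """Sub-concept ordering: compare extents (equivalently, reversed intents)."""
    return c1[0] <= c2[0]


for extent, intent in sorted(concepts(), key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

For the three-object example this yields six concepts, including the top concept (all objects, no common attributes) and the bottom concept (no objects, all attributes).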

Are we there yet?

Nope.

Concept lattices induce
• enough structure for navigation...
• ... but too much to show directly!

How do you visualize concept lattices?

Approach:
• don’t show the lattice
• use concepts as focus
• visualize only focus concept
  – but in relation to lattice
• use extent to derive tag cloud

How do you build tag clouds for concepts?

What is a tag cloud?
• visual representation of text data
  – summarize large data set
  – emphasize important tags
• single words or short phrases
• importance reflected as size
  – frequency in document
  – number of tagged items
  – number of page hits
• different layout methods

Tag clouds for concepts:
• intent looks like tag cloud...
• ... but is common to all objects ⇒ all tags same size
• instead: collect all attributes from all objects in extent
  – can be expressed in concept lattice:
  – also add extent via object identifiers
• intent shown as largest tags
  – smaller tags are related information

Example: focus concept ({08, 42}, {Fischer, browsing})

       Fischer  Greene  van Zijl  browsing  tag  2000  2001  2014
08     ×                          ×              ×
15                      ×                              ×
42     ×        ×                 ×         ×                ×
count  2        1       -         2         1    1     -     1

Resulting tags: 08 42 2000 2014 browsing Fischer Greene tag
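The counting step behind the example can be sketched as follows. The weights are simply the number of objects in the extent that carry each tag; the mapping from counts to font sizes (`largest=30`, linear scaling) is an arbitrary choice made for illustration and not taken from the slides.

```python
from collections import Counter

# The example context: object id -> set of attributes.
CONTEXT = {
    "08": {"Fischer", "browsing", "2000"},
    "15": {"van Zijl", "2001"},
    "42": {"Fischer", "Greene", "browsing", "tag", "2014"},
}


def tag_weights(extent):
    """Count, for every tag, how many objects in the extent carry it."""
    counts = Counter()
    for obj in extent:
        counts.update(CONTEXT[obj])   # attributes of the object
        counts[obj] += 1              # the object identifier is a tag as well
    return counts


def font_sizes(counts, largest=30):
    """Scale counts linearly so the most frequent tags get the largest size."""
    top = max(counts.values())
    return {tag: round(largest * n / top) for tag, n in counts.items()}


weights = tag_weights({"08", "42"})
sizes = font_sizes(weights)
for tag in sorted(sizes):
    print(tag, weights[tag], sizes[tag])
```

Tags in the intent occur in every object of the extent, so they always reach the maximal count and come out largest; tags such as "van Zijl" or "2001" never occur in the extent and do not appear in the cloud at all.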

The ConceptCloud Browser

by: Gillian Greene, US

[Screenshot labels: file, message, date, author, controls; callout: most prolific contributor]

How do you navigate with tag clouds?

Navigation modes:
• refinement: narrow the selection
  – select a new tag
• widening: extend the selection
  – remove a selected tag

How do you navigate with concept lattices?

Navigation modes:
• refinement: narrow the selection
  – select a new tag: f’ = f ∧ δ(t)
• widening: extend the selection
  – remove a selected tag: f’ = ⋀ { δ(i) | i ∈ π(f) \ {t} }
  – join-based widening, f’ = f ∨ δ(t), can be useful as well

Tag concept:
δ(t) = (ω({t}), α(ω({t})))   if t ∈ A
δ(t) = (ω(α({t})), α({t}))   if t ∈ O

[Lattice diagram labels: focus concept, tag concept]
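The navigation operations can be sketched over the same toy context. Concepts are (extent, intent) pairs; the meet intersects extents and the join intersects intents. The sketch assumes that π(f) denotes the set of currently selected tags and that the empty selection corresponds to the top concept; both are assumptions, as are all function names.

```python
from functools import reduce

# The example context: object id -> set of attributes.
CONTEXT = {
    "08": {"Fischer", "browsing", "2000"},
    "15": {"van Zijl", "2001"},
    "42": {"Fischer", "Greene", "browsing", "tag", "2014"},
}
OBJECTS = set(CONTEXT)
ATTRIBUTES = set().union(*CONTEXT.values())


def alpha(objects):
    """Common attributes of a set of objects."""
    common = set(ATTRIBUTES)
    for obj in objects:
        common &= CONTEXT[obj]
    return common


def omega(attributes):
    """Common objects of a set of attributes."""
    return {obj for obj, attrs in CONTEXT.items() if set(attributes) <= attrs}


def delta(tag):
    """Tag concept: attribute concept for attributes, object concept for objects."""
    if tag in ATTRIBUTES:
        extent = omega({tag})
        return frozenset(extent), frozenset(alpha(extent))
    intent = alpha({tag})
    return frozenset(omega(intent)), frozenset(intent)


def meet(c1, c2):
    """Greatest common sub-concept: intersect the extents."""
    extent = c1[0] & c2[0]
    return frozenset(extent), frozenset(alpha(extent))


def join(c1, c2):
    """Least common super-concept: intersect the intents."""
    intent = c1[1] & c2[1]
    return frozenset(omega(intent)), frozenset(intent)


TOP = (frozenset(OBJECTS), frozenset(alpha(OBJECTS)))


def focus_of(selected):
    """Meet of the tag concepts of all selected tags (TOP if none is selected)."""
    return reduce(meet, (delta(t) for t in selected), TOP)


def refine(focus, tag):
    """Select a new tag: f' = f meet delta(t)."""
    return meet(focus, delta(tag))


def widen(selected, tag):
    """Remove a selected tag: meet over the remaining selected tags."""
    return focus_of(set(selected) - {tag})


focus = focus_of({"Fischer"})                 # ({08, 42}, {Fischer, browsing})
refined = refine(focus, "tag")                # narrows the extent to {42}
widened = widen({"Fischer", "tag"}, "tag")    # back to the Fischer concept
joined = join(delta("08"), delta("2014"))     # join-based widening
print(sorted(focus[0]), sorted(refined[0]), sorted(widened[0]), sorted(joined[0]))
```

Recomputing the focus from the remaining tags keeps removal independent of the order in which tags were selected, whereas the join moves to the smallest concept that covers both the focus and the tag concept.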


Navigation in the ConceptCloud Browser


The Percept Browser

by: Carl Kritzinger, Fireworks

Conclusions & Future Work

• Semi-structured data is common but hard to analyze
• Tag clouds are a good visualization approach...
• ... and the combination with concept lattices makes it easy to navigate and find related information
• Flexible approach, generic tool
  – different data sets
  – different types of contexts (⇒ different types of analysis)
• Scalability
  – DBLP, IMDb, Wikipedia?
• Customizability
  – context extraction
  – tool scripting