Transcript
Page 1: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Danica Damljanovic, Milan Stankovic, Philippe Laublet

Page 2: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Innovation

Page 3: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Innovation Platforms

Challenge: Promote innovation problems to an audience of solvers who can propose relevant innovative solutions

Page 4: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Finding Meaningful Connections

Clay mining …

Kaolinite extraction from rocks …

Different communities use different terms and concepts to speak about semantically related things. Such “language” defines communities and separates them. Being able to find meaningful connections between concepts would enable us to build bridges between people and content.

http://bit.ly/hyProximity

Page 5: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Concept recommendation

• Concepts you might not know but might want to use: to annotate your content, to search for content, to search for people…
• Help problem promoters discover relevant concepts (problem promoters are sometimes not field experts)
• Discovery = relevance + unexpectedness

http://bit.ly/hyProximity

Page 6: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Discovering Direct and Lateral Concepts

• hyProximity, a structure-based similarity
• Structure-based Statistical Semantics Similarity: Random Indexing, a well-known statistical semantics method from Information Retrieval, applied to RDF

Page 7: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Linked Data-based Concept Recommendation

Zemanta Textual Input

DBpedia Concepts found in the text

DBpedia Exploration suggestions

http://bit.ly/hyProximity

Page 8: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

hyProximity

• We start from several seed concepts found directly in the text, and search the DBpedia graph
• The concepts found in the proximity of several seed concepts are considered more “in context” for the given input
• Concepts found at a shorter distance from the seed concepts have higher hyProximity
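
A minimal sketch of the idea in the bullets above, not the authors' actual formula: concepts reachable from several seeds, and at a short distance, accumulate a higher score. The toy graph, the `1 / distance` weighting, the depth limit and the names `bfs_distances` and `proximity_scores` are all assumptions made for illustration.

```python
from collections import deque


def bfs_distances(graph, start, max_depth):
    """Return {node: hop count} for every node within max_depth hops of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def proximity_scores(graph, seeds, max_depth=2):
    """Score non-seed concepts: closer to more seeds means a higher score."""
    scores = {}
    for seed in seeds:
        for node, d in bfs_distances(graph, seed, max_depth).items():
            if node in seeds:
                continue
            # Assumed weighting: each seed contributes 1 / distance.
            scores[node] = scores.get(node, 0.0) + 1.0 / d
    return scores


# Hypothetical undirected toy graph given as adjacency lists.
GRAPH = {
    "Paris": ["Cities in France", "Seine"],
    "Seine": ["Paris", "Rivers in France"],
    "Marne": ["Rivers in France"],
    "Rivers in France": ["Seine", "Marne", "Things in France"],
    "Cities in France": ["Paris", "Things in France"],
    "Things in France": ["Rivers in France", "Cities in France"],
}

scores = proximity_scores(GRAPH, seeds={"Paris", "Seine"})
ranked = sorted(scores, key=lambda c: (-scores[c], c))
print(ranked)
```

Here "Cities in France" and "Rivers in France" rank first because each is one hop from one seed and two hops from the other, while "Marne" is reachable from only one seed.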

Page 9: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Different Distance Functions

• Hierarchical: exploring skos:broader relations
• Transversal: exploring transversal links
• Mixed: a linear combination of hierarchical and transversal

[Figure: example graph with the nodes Paris, Seine, Marne, Chanel, BMW, Peugeot, Rivers in France, Cities in France, Things in France, Products of France and Car Industry. Legend: skos:broader, other property. Annotations: 2, 2, 2, 2+1.]

research.hypios.com/hyproximity

Page 10: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Different Distance Functions

• Hierarchical: exploring skos:broader relations
• Transversal: exploring transversal links
• Mixed: a linear combination of hierarchical and transversal

[Figure: the same example graph (Paris, Seine, Marne, Chanel, BMW, Peugeot, Rivers in France, Cities in France, Things in France, Products of France, Car Industry) with transversal links labelled "flows through", "competitor" and "famous for", plus the label “fashion”. Legend: skos:broader, other property. Annotations: 1, 1, 1.]

research.hypios.com/hyproximity
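
The "mixed" function can be sketched as a weighted sum of the two other scores. This is only an illustration: the slides do not give the weights, so the `weight` parameter, the example scores and the names `mixed_score` and `rank_mixed` are assumptions.

```python
def mixed_score(hierarchical, transversal, weight=0.5):
    """Linear combination; `weight` is the share given to the hierarchical score."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * hierarchical + (1.0 - weight) * transversal


def rank_mixed(hier_scores, trans_scores, weight=0.5):
    """Combine two {concept: score} maps and sort by descending mixed score."""
    concepts = set(hier_scores) | set(trans_scores)
    combined = {
        c: mixed_score(hier_scores.get(c, 0.0), trans_scores.get(c, 0.0), weight)
        for c in concepts
    }
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores for a seed such as "Paris".
hier = {"Marne": 0.5, "Peugeot": 0.25}
trans = {"Peugeot": 0.5, "Chanel": 1.0}

ranking = rank_mixed(hier, trans, weight=0.5)
print(ranking)
```

With equal weights, a concept found only through a transversal link (here "Chanel") can outrank one found only through the category hierarchy; shifting `weight` towards 1 favours hierarchical neighbours instead.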

Page 11: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Random Indexing

• Words which appear in similar contexts (with the same set of other words) are contextually related, e.g. synonyms.
• Synonyms tend not to co-occur with one another directly, so indirect inference is required to draw associations between words used to express the same idea.

Page 12: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Two steps to Random Indexing

• Indexing
  o For an RDF graph, generate virtual documents
  o Prepare the corpus (pre-processing)
  o Generate semantic index
• Search: given a term X, calculate the cosine similarity between the vector of that term and the other vectors in the semantic space
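
The search step can be sketched as follows, assuming the semantic space is already available as a mapping from term to vector. The vectors and the names `cosine` and `most_similar` are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(term, vectors, top_n=3):
    """Rank all other terms by cosine similarity to `term`."""
    query = vectors[term]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != term
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical semantic space with 4-dimensional term vectors.
SPACE = {
    "clay": [2, 0, -1, 1],
    "kaolinite": [4, 0, -2, 2],
    "river": [0, 3, 1, 0],
}

results = most_similar("clay", SPACE)
print(results)
```

Because cosine similarity ignores vector length, "kaolinite" (a scaled copy of the "clay" vector) comes out as the closest term even though its counts are larger.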

Page 13: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

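Building context vectors

Term-document matrix M (rows: terms t1 … tq; columns: documents d1 … dp):

      d1  d2  ..  dp
t1     1   2  ..   0
t2     3   0  ..   0
..    ..  ..  ..  ..
tq     0   1  ..  10

Document index vectors D (one row per document; figure labels: Dimensionality = n, Seed length):

d1     0   0  -1   1  -1   1
d2    -1   1   0   0   1  -1
…
dp     0   1   0  -1  -1   1

M × D = T, where the rows of T are the context vectors of the terms t1, t2, …, tq.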
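
A minimal sketch of how such context vectors can be accumulated, not the authors' implementation. It assumes that each document gets a sparse ternary index vector, that "seed length" means the number of non-zero entries (half +1, half -1), and that a term's context vector is the sum of the index vectors of the documents it occurs in, weighted by its count. The counts and the names `make_index_vector` and `build_term_vectors` are hypothetical.

```python
import random


def make_index_vector(dimension, seed_length, rng):
    """Sparse ternary vector with `seed_length` non-zero entries (assumed: half +1, half -1)."""
    vector = [0] * dimension
    positions = rng.sample(range(dimension), seed_length)
    for i, pos in enumerate(positions):
        vector[pos] = 1 if i % 2 == 0 else -1
    return vector


def build_term_vectors(term_doc_counts, doc_ids, dimension=6, seed_length=4, seed=0):
    """Return (document index vectors, term context vectors)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    index = {doc: make_index_vector(dimension, seed_length, rng) for doc in doc_ids}
    term_vectors = {}
    for term, counts in term_doc_counts.items():
        vec = [0] * dimension
        for doc, count in counts.items():
            # Add the document's index vector once per occurrence of the term.
            for k, value in enumerate(index[doc]):
                vec[k] += count * value
        term_vectors[term] = vec
    return index, term_vectors


# Hypothetical counts: t1 occurs once in d1 and twice in d2, t2 three times in d1.
COUNTS = {"t1": {"d1": 1, "d2": 2}, "t2": {"d1": 3}}
index, term_vectors = build_term_vectors(COUNTS, ["d1", "d2"])
print(index)
print(term_vectors)
```

The design point is that the term vectors have the fixed dimensionality n rather than one dimension per document, so the semantic space stays small as the corpus grows.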

Page 14: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Indexing: virtual documents

[Figure: the representative subgraph for URI=S, with elements labelled S, O1, O2, P1 to P10 and L1 to L8, is lexicalised into the virtual document for URI=S.]
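
One way to picture the lexicalisation step is sketched below. The slides do not spell out which neighbours are included or how literals are handled, so the rule used here (one line per triple in which the URI is subject or object, with identifiers replaced by labels where known) is an assumption, as are the toy triples, the labels and the name `virtual_document`.

```python
def virtual_document(uri, triples, labels):
    """Turn the triples around `uri` into plain text, one line per triple."""

    def text(node):
        # Fall back to the raw identifier when no label is known.
        return labels.get(node, node)

    lines = []
    for s, p, o in triples:
        if uri in (s, o):
            lines.append(" ".join(text(x) for x in (s, p, o)))
    return "\n".join(lines)


# Hypothetical triples reusing the S / P / O / L naming of the figure.
TRIPLES = [
    ("S", "P1", "L1"),
    ("S", "P4", "O1"),
    ("O2", "P7", "S"),
    ("O1", "P5", "L4"),  # does not involve S, so it is skipped
]
LABELS = {
    "S": "Paris",
    "P1": "label",
    "L1": "Paris",
    "P4": "river",
    "O1": "Seine",
    "O2": "France",
    "P7": "capital",
}

doc = virtual_document("S", TRIPLES, LABELS)
print(doc)
```

The resulting text can then be pre-processed and indexed like any other document in the corpus.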

Page 15: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Experiments

• 26 real innovation problems from Hypios
• Measure of success: the suggested concepts appear in the actual solutions (precision, recall, F-measure)
  (+) reasonable list of concepts from real scenarios
  (-) not complete:
  o User study: measure discovery = relevance + unexpectedness
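
The precision, recall and F-measure named above can be computed from the set of suggested concepts and the set of concepts found in the actual solutions. The sketch below uses the standard definitions with the balanced F-measure (F1); the slides do not say which variant was used, and the example sets are made up.

```python
def precision_recall_f1(suggested, gold):
    """Standard set-based precision, recall and balanced F-measure."""
    suggested, gold = set(suggested), set(gold)
    hits = len(suggested & gold)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(gold) if gold else 0.0
    if hits == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 4 suggestions, 3 gold concepts, 2 in common.
p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
print(p, r, f)
```

In this example precision is 2/4, recall is 2/3 and F1 is 4/7.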

Page 16: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

DBpedia Dataset

• Select a number of properties relevant to the Open Innovation-related scenario
• dbo:product, dbp:products, dbo:industry, dbo:service, dbo:genre, and properties serving to establish a hierarchical categorization of concepts, namely dc:subject and skos:broader
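
Restricting the graph to the selected properties amounts to a simple filter over triples. The sketch below works on prefixed names as plain strings; the example triples and the name `filter_triples` are hypothetical, and a real pipeline would more likely apply the same restriction in a query against the dataset.

```python
SELECTED_PROPERTIES = {
    "dbo:product",
    "dbp:products",
    "dbo:industry",
    "dbo:service",
    "dbo:genre",
    "dc:subject",
    "skos:broader",
}


def filter_triples(triples, allowed=SELECTED_PROPERTIES):
    """Keep only (subject, property, object) triples whose property is allowed."""
    return [t for t in triples if t[1] in allowed]


# Hypothetical triples written with prefixed names.
TRIPLES = [
    ("dbr:Peugeot", "dbo:industry", "dbr:Automotive_industry"),
    ("dbr:Peugeot", "dbo:abstract", "some text"),
    ("dbr:Seine", "dc:subject", "dbc:Rivers_of_France"),
]

kept = filter_triples(TRIPLES)
print(kept)
```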

Page 17: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Evaluation

• “Gold standard”
  o Extract problem URIs
  o Extract solution URIs
• Baseline:
  o Google Adwords Keyword Tool: finds similar topics based on their distribution in textual corpora and the corpora of search queries.
  o Suggesting up to 600 concepts which are then used for Web crawling for finding experts.

Page 18: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Evaluation: Results

Page 19: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

User Study

• Suggestions being both relevant and unexpected
  o the most valuable discoveries for the user
• 12 users
• 34 problem evaluations
  o 3060 suggested concepts/keywords
• For the chosen innovation problem, the evaluators were presented with the lists of 30 top-ranked suggestions generated by adWords, hyProximity (mixed approach) and Random Indexing.

Page 20: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Example

Page 21: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

User Study: Results

Page 22: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Conclusion

• Linked Data is a valuable source of knowledge for concept recommendation
• Our two methods are complementary
  o hyProximity better for precision
  o Random Indexing better for recall
• User study: unexpectedness higher with our methods than with baseline
• Subjective user comments:
  o Random Indexing: generic
  o hyProximity: granular
  o adWords: redundant

Page 23: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario

Thank You!

• Find out more: http://research.hypios.com/?page_id=165

Contact us:
• Danica Damljanovic: @dancheeee
• Milan Stankovic: @milstan

