content discovery through entity driven search

Post on 21-Jan-2017


ECIR 2014 Industry Day

Content Discovery Through Entity Driven Search

Alessandro Benedetti
http://uk.linkedin.com/in/alexbenedetti

Antonio David Pérez Morales
http://es.linkedin.com/in/adperezmorales

16th April 2014

• Experienced at building and delivering a wide range of enterprise solutions across the whole information life cycle

• Alfresco & Ephesoft certified Platinum Partner

• Red Hat Enterprise Linux Ready Partner

• Crafter & Varnish Gold Partners

• Search Solutions Consultant

• Alfresco Partner of the Year 2012 and 2013

Working effectively together

Who We Are


Antonio David Pérez Morales

- R&D Senior Engineer
- Master's in Software Engineering and Technology
- Digital Identity and Security expert
- Enterprise Search background
- Semantic, NLP, ML technologies and Information Retrieval lover
- Apache Stanbol committer
- Apache contributor

@adperezmorales
http://es.linkedin.com/in/adperezmorales/

Alessandro Benedetti

- R&D Senior Engineer
- Master's in Computer Science
- Information Retrieval background
- Enterprise Search specialist
- Semantic, NLP, ML technologies and Information Retrieval lover

@AlexBenedetti
http://uk.linkedin.com/in/alexbenedetti


Agenda


• Context

• Problem

• Solution

• Demo

• Future Work


Zaizi R&D Department


• Giving meaning to content

• Enriching it semantically

• Adding value to ECM/CMS

• More structured content that is easier to manage, link and search

• Improving search

• Across different domains, data sources, user experience

• Applied Machine Learning research

• Content organization – recommendation systems


Enterprise Search Problems


Challenge: Search within Big and Heterogeneous Repositories

• Heterogeneous Data Sources

• Filesystem, DB, ECM/CMS, Email, …

• Unstructured Content

• PDFs, plain text, Word, …

• Documents not linked to each other

• Federated Search needed

• Search across data sources

• Different permissions

• Centralized endpoint
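
The federated search requirement above can be illustrated with a minimal sketch. It assumes three in-memory sources and a group-based permission model; the source names, the record fields (`text`, `allowed`) and the function names are all hypothetical and are not taken from Sensefy.

```python
# Hypothetical in-memory data sources; each record lists the groups allowed to read it.
SOURCES = {
    "filesystem": [
        {"id": "fs-1", "text": "Annual budget spreadsheet", "allowed": {"finance"}},
        {"id": "fs-2", "text": "Office map for visitors", "allowed": {"everyone"}},
    ],
    "ecm": [
        {"id": "ecm-1", "text": "Budget approval workflow", "allowed": {"finance", "managers"}},
    ],
    "email": [
        {"id": "mail-1", "text": "Reminder: budget meeting", "allowed": {"managers"}},
    ],
}


def search_source(records, term, user_groups):
    """Keyword match within one source, keeping only records the user may read."""
    term = term.lower()
    return [
        record["id"]
        for record in records
        if term in record["text"].lower() and record["allowed"] & user_groups
    ]


def federated_search(term, user_groups):
    """Single entry point: query every source and merge the permitted hits."""
    results = []
    for source_name, records in SOURCES.items():
        for doc_id in search_source(records, term, user_groups):
            results.append((source_name, doc_id))
    return results
```

Calling `federated_search("budget", {"managers"})` returns hits from the ECM and email sources only, because the filesystem record is restricted to the finance group.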


Current Enterprise Search Weaknesses


• Keyword based

• Low precision

• Ambiguous terms interpreted without context

• Inaccurate weighting when keywords are combined in a query


Entity Driven Search


• Moves from keywords to Entities

• More understandable to a human

• Process the unstructured text

• Enrich it

• Build specific indexes

• Use entities and concepts in searches
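
The steps above (process text, enrich it, build specific indexes, search by entities) can be sketched end to end. This is a toy illustration under stated assumptions: the knowledge base is a hard-coded dictionary, matching is exact on single tokens, and the `kb:` identifiers and function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: surface form -> (entity id, entity type).
KNOWLEDGE_BASE = {
    "london": ("kb:London", "City"),
    "seville": ("kb:Seville", "City"),
    "alfresco": ("kb:Alfresco", "Organization"),
}


def extract_entities(text):
    """Return the (entity id, type) pairs whose surface form occurs in the text."""
    tokens = {tok.strip(".,;:!?").lower() for tok in text.split()}
    return {KNOWLEDGE_BASE[tok] for tok in tokens if tok in KNOWLEDGE_BASE}


def build_entity_index(docs):
    """Build two inverted indexes: entity id -> doc ids and entity type -> doc ids."""
    by_entity = defaultdict(set)
    by_type = defaultdict(set)
    for doc_id, text in docs.items():
        for entity_id, entity_type in extract_entities(text):
            by_entity[entity_id].add(doc_id)
            by_type[entity_type].add(doc_id)
    return by_entity, by_type


docs = {
    "d1": "Zaizi has an office in London.",
    "d2": "The team in Seville works with Alfresco.",
    "d3": "Quarterly budget report.",
}
by_entity, by_type = build_entity_index(docs)
```

With these indexes, a query for the entity `kb:London` or for the type `City` is a dictionary lookup rather than a keyword match.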


Sensefy


• Semantic Enterprise Search Engine

• Federated Search

• Evolved User Experience

• Based on cutting-edge Open Source Frameworks


Architecture


RedLink


• Semantic Cloud platform

• Providing Software as a Service

• Manage unstructured data

• Extract knowledge and intelligence

• Make sense of information

• Feed into business processes

• Open-Source based components

• Entity Linking using Knowledge Bases


NLP & Semantic Enrichment


• From unstructured to structured

• NLP analysis: POS tagging

• Named Entity Recognition

• Linked Data

• Entity Linking using Knowledge Bases

• Disambiguation

• Indexing in Solr
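
The linking and disambiguation steps above can be sketched with a simple context-overlap heuristic: each candidate sense carries a set of typical context words, and the sense sharing the most words with the document wins. This is a swapped-in toy technique, not the method used by the platform described in these slides. The candidate data and the output field names (`entity_ids`, `entity_types`) are assumptions, not a real Solr schema.

```python
# Hypothetical candidate senses per surface form, each with typical context words.
CANDIDATES = {
    "jaguar": [
        {"id": "kb:Jaguar_Cars", "type": "Organization",
         "context": {"car", "engine", "vehicle", "drive"}},
        {"id": "kb:Jaguar_animal", "type": "Animal",
         "context": {"cat", "jungle", "prey", "species"}},
    ],
}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    return [tok.strip(".,;:!?").lower() for tok in text.split()]


def link_entities(text):
    """Link each known surface form to the sense with the largest context overlap."""
    tokens = tokenize(text)
    context = set(tokens)
    linked = []
    for tok in tokens:
        senses = CANDIDATES.get(tok)
        if senses:
            best = max(senses, key=lambda sense: len(sense["context"] & context))
            linked.append((best["id"], best["type"]))
    return linked


def to_index_document(doc_id, text):
    """Build a flat document with entity fields, ready to send to a search index."""
    linked = link_entities(text)
    return {
        "id": doc_id,
        "text": text,
        "entity_ids": sorted({entity_id for entity_id, _ in linked}),
        "entity_types": sorted({entity_type for _, entity_type in linked}),
    }
```

A sentence about a big cat hunting in the jungle links "jaguar" to the animal sense, while a sentence mentioning an engine links it to the car maker.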


Smart Autocomplete


• Multi Phase suggestions

• Closer to natural language query formulation

• Named Entities infix

• Entity types infix

• Multi Language entity type support

• Properties driven query approach
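
A minimal sketch of infix suggestions over named entities and entity types, including per-language type labels. The entity list, the type labels and the `suggest` function are hypothetical, and a linear scan stands in for whatever index a production autocomplete would use.

```python
# Hypothetical suggestion sources: entity names and per-language entity type labels.
ENTITIES = ["Alessandro Benedetti", "Antonio David Pérez Morales", "London", "New London"]
TYPE_LABELS = {
    "Person": {"en": "person", "es": "persona"},
    "City": {"en": "city", "es": "ciudad"},
    "Organization": {"en": "organization", "es": "organización"},
}


def suggest(typed, lang="en", limit=5):
    """Return (kind, value) suggestions whose label contains the typed text anywhere."""
    needle = typed.casefold()
    if not needle:
        return []
    suggestions = []
    # Phase 1: named entities matched by infix.
    for name in ENTITIES:
        if needle in name.casefold():
            suggestions.append(("entity", name))
    # Phase 2: entity types matched by infix on the label for the requested language.
    for type_id, labels in TYPE_LABELS.items():
        label = labels.get(lang, labels["en"])
        if needle in label.casefold():
            suggestions.append(("type", type_id))
    return suggestions[:limit]
```

Typing "ondo" suggests both "London" and "New London", and typing "ciu" with `lang="es"` suggests the `City` type through its Spanish label.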


Smart Autocomplete Configuration


• Entity type properties

• Interesting to our use case and scenario

• Properties inheritance through type hierarchy

• Enhance type information from external resources

• Freebase, DBpedia, custom data set
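
Property inheritance through the type hierarchy can be sketched as a walk from a type up to the root, collecting the configured properties along the way. The hierarchy and property names below are invented for illustration and do not come from Freebase, DBpedia or the Sensefy configuration.

```python
# Hypothetical type hierarchy (child -> parent) and the properties configured per type.
TYPE_PARENT = {
    "Thing": None,
    "Agent": "Thing",
    "Person": "Agent",
    "Organization": "Agent",
}
TYPE_PROPERTIES = {
    "Thing": ["name"],
    "Agent": ["location"],
    "Person": ["birth date", "employer"],
    "Organization": ["founder"],
}


def properties_for(type_id):
    """Collect the properties of a type plus those inherited from all its ancestors."""
    chain = []
    current = type_id
    while current is not None:
        chain.append(current)
        current = TYPE_PARENT[current]
    properties = []
    # Walk from the root down so inherited properties come first.
    for ancestor in reversed(chain):
        for prop in TYPE_PROPERTIES.get(ancestor, []):
            if prop not in properties:
                properties.append(prop)
    return properties
```

Under this configuration a `Person` exposes `name` and `location` in addition to its own properties, so the autocomplete can offer them without repeating them for every subtype.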


Semantic Search


• Search by Named Entity

• Search by Entity Type

• Search by Entity Type properties

• Grouping Results by Sense

• Contextualize Results Using Semantic Information
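
Searching by entity type and grouping results by sense can be sketched over documents that already carry entity annotations. The annotated documents, labels and function names are hypothetical; a real deployment would run these as queries against the search index rather than as Python loops.

```python
from collections import defaultdict

# Hypothetical display labels for entity ids; two senses share the label "Jaguar".
ENTITY_LABELS = {
    "kb:Jaguar_Cars": "Jaguar",
    "kb:Jaguar_animal": "Jaguar",
    "kb:London": "London",
}

# Hypothetical documents already annotated with (entity id, entity type) pairs.
ANNOTATED_DOCS = [
    {"id": "d1", "entities": [("kb:Jaguar_Cars", "Organization")]},
    {"id": "d2", "entities": [("kb:Jaguar_animal", "Animal")]},
    {"id": "d3", "entities": [("kb:Jaguar_Cars", "Organization"), ("kb:London", "City")]},
]


def search_grouped_by_sense(label):
    """Find documents mentioning any entity with this label, grouped by entity id."""
    wanted = label.casefold()
    groups = defaultdict(list)
    for doc in ANNOTATED_DOCS:
        for entity_id, _entity_type in doc["entities"]:
            if ENTITY_LABELS[entity_id].casefold() == wanted:
                groups[entity_id].append(doc["id"])
    return dict(groups)


def search_by_type(entity_type):
    """Find documents mentioning at least one entity of the given type."""
    return [
        doc["id"]
        for doc in ANNOTATED_DOCS
        if any(found_type == entity_type for _, found_type in doc["entities"])
    ]
```

A search for "Jaguar" then returns one group per sense (car maker and animal) instead of a single mixed result list.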


Semantic More Like This


• Search for Similar Documents based on Entities and Entities’ categories

• Similarity Function based on Documents’ Sense

• Not based on text tokens

• Entity Frequency / Inverse Document Frequency

• Entity Type Frequency / Inverse Document Frequency
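
A minimal sketch of an entity-based similarity, by analogy with TF-IDF: each document becomes a vector of entity frequency times inverse document frequency, and documents are compared by cosine similarity. The smoothed IDF formula, the cosine measure and the sample data are assumptions chosen for illustration; the slides do not specify the exact function. The same code applies to entity types if the lists hold type names instead of entity ids.

```python
import math
from collections import Counter

# Hypothetical entity annotations per document (repeats count as frequency).
DOC_ENTITIES = {
    "d1": ["kb:London", "kb:Alfresco", "kb:London"],
    "d2": ["kb:London", "kb:Alfresco"],
    "d3": ["kb:Seville", "kb:Solr"],
    "d4": ["kb:Solr"],
}


def ef_idf_vectors(doc_entities):
    """Weight each entity by frequency in the document times a smoothed IDF."""
    n_docs = len(doc_entities)
    doc_freq = Counter(e for entities in doc_entities.values() for e in set(entities))
    vectors = {}
    for doc_id, entities in doc_entities.items():
        vectors[doc_id] = {
            entity: count * (math.log((1 + n_docs) / (1 + doc_freq[entity])) + 1)
            for entity, count in Counter(entities).items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(entity, 0.0) for entity, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def more_like_this(doc_id, doc_entities, top_n=3):
    """Rank the other documents by entity-vector similarity, dropping zero scores."""
    vectors = ef_idf_vectors(doc_entities)
    target = vectors[doc_id]
    scored = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != doc_id]
    scored = [(other, score) for other, score in scored if score > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Because the vectors contain entities rather than text tokens, two documents can be judged similar even when they share few words, as long as they mention the same entities.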


Future Work


• Semantic More Like This new approach (Graph relations)

• Machine Learning components: Classification, Topic annotation, Clustering

• Semantic facets

• Secured Entity Search

• Image and Media searches


Conclusions


• Better user experience

• More precision in search results

• Closer to human language

Zaizi Headquarters
Brook House, 4th Floor, North Wing
229-243 Shepherd’s Bush Road
London W6 7AN
United Kingdom
T: (+44) 20 3582 8330

Zaizi Iberia
Calle Gremios 13-15, Edificio Diseño
Planta 1, Oficina 5
41927 Mairena del Aljarafe, Sevilla
Spain
T: (+34) 666 42 43 64

Zaizi Asia
50 Flower Road
Colombo 07
Sri Lanka
T: (+94) 112 301 461

Zaizi Singapore
14 Robinson Road #13-00
Far East Finance Building
Singapore 048545
T: (+65) 3158 5886
F: (+65) 6323 1839

VAT Registration No GB 932 8855 89
Registered in England and Wales with registration number 6440931

www.zaizi.com

Thanks!
