UpdataCapital The Context The Application Latest technical developments

Upload: augusta-griffith

Post on 18-Dec-2015


UpdataCapital

The Context

The Application

Latest technical developments

The Human Touch

Searching for documents or knowledge?

– Knowledge is embedded in people

– Collexis behaves like a (human) expert

– Collexis finds information, experts and organizations

– Collexis enhances Knowledge sharing and Collaboration

Human Communication

• Humans communicate in explicit language

• including many variations and ambiguities

• final aim of communication is sharing “concepts”

• Concepts are “real life entities”

• constituting the reference framework of human knowledge

Text search versus Concept search

Text search                    Concept search
Word driven                    Subject driven (concepts)
Only exact words               Also synonyms and variants (normalization)
No relative weight of words    Relative weight of subjects (conceptual fingerprint)
Low accuracy                   High accuracy
Moderate performance           High performance
Limited size of search text    No limitation in search text size

Basic Issues

Coordination and facilitation of Information Sharing

• People/Experts

• Agencies/Departments/Organisations

• Document/Projects

• Interrelationship

Basic elements of Collexis’ Technology

• Collexis Conceptual FingerPrinting

• Thesauri

• Data and metadata validation

• Flexible organisation/communication

The power of Fingerprints

• Collexis is based on the principle of Fingerprinting

• Fingerprint: a profile of a piece of information

• A Fingerprint contains a list of weighted concepts

• Concepts are derived from a Thesaurus

• Fingerprint characteristics: unique and small
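
As an illustration of "a list of weighted concepts", a fingerprint can be modelled as a small mapping from concept ID to weight. This is only a sketch: the `Fingerprint` class and its `top` method are hypothetical names, not Collexis APIs, and the IDs and weights are the sample values shown on the deck's fingerprint slide.

```python
from dataclasses import dataclass, field


@dataclass
class Fingerprint:
    """Profile of one piece of information: concept ID -> weight (assumed 0..1)."""
    weights: dict[str, float] = field(default_factory=dict)

    def top(self, n: int) -> list[tuple[str, float]]:
        """Return the n highest-weighted concepts, heaviest first."""
        ranked = sorted(self.weights.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


# Sample concept IDs and weights as shown on the fingerprint slide.
fp = Fingerprint({
    "C19881": 0.99,
    "C92992": 0.67,
    "C02002": 0.66,
    "C99229": 0.44,
    "C00392": 0.33,
    "C93939": 0.21,
})

print(fp.top(3))  # [('C19881', 0.99), ('C92992', 0.67), ('C02002', 0.66)]
```

Because only concept IDs and weights are stored, the profile stays small regardless of how long the source text was, which matches the "unique and small" characteristic above.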

Basic thinking: Basic Functionality

[Diagram: Organisations, People and Activities connected by dynamic links, with output as dynamic combinations. Box labels: IOC, IBC, FTC. Field labels: Acronym, Organisation, contact details, e-mail; Name, Institute, contact details, e-mail; Title/descriptors; text; Accepted concepts; Categories (hidden but searchable).]

Fingerprints (CFP’s)

Fingerprint       Consolidated knowledge
C19881 0.99       C19881 0.99
C92992 0.67       C92992 0.67
C02002 0.66       C02002 0.66
C99229 0.44       C99229 0.44
C00392 0.33       C00392 0.33
C93939 0.21       C93939 0.21

100%  Malaria
 35%  Agencies
 30%  Enthusiastic
 28%  Collaboration
 27%  Funding
 27%  Africa
 25%  Science
 15%  Dedications
 15%  Applaud
 15%  agenda
 14%  Inaccurate
 14%  advocacy
 13%  hope
 13%  research funding
 13%  Fund Raising

The Collexis Fingerprinting concept

Defining a Search

[Diagram labels: What? Why? How? Who?; Word-based searching; indexing; Concept matching]
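
The deck does not say how concept matching is scored. A minimal sketch, assuming fingerprints are compared with cosine similarity over their concept weights (a common choice for weighted profiles, swapped in here for illustration), could look like this. The function names and the document IDs are hypothetical.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse concept-weight vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty fingerprint matches nothing
    return dot / (norm_a * norm_b)


def rank(query: dict[str, float],
         corpus: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Score every indexed fingerprint against the query, best match first."""
    scored = [(doc_id, cosine_similarity(query, fp)) for doc_id, fp in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical query and corpus fingerprints.
query = {"C19881": 1.0, "C92992": 0.5}
corpus = {
    "doc_a": {"C19881": 0.99, "C92992": 0.67},
    "doc_b": {"C00392": 0.33, "C93939": 0.21},
}
ranked = rank(query, corpus)
print(ranked[0][0])  # doc_a
```

Since the query is itself a fingerprint, a whole document can serve as the search input just as well as a short phrase.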

The magic of Fingerprinting

contents fingerprints → add → people fingerprints → add → organization fingerprint

Sources: Jobs, CV's, Skills; Articles, books; Emails, Word, RFP's

C19881 0.99
C92992 0.67
C02002 0.66
C99229 0.44
P00392 n.a.
O93939 n.a.
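
The "add" steps on this slide (content fingerprints combined into people fingerprints, people fingerprints combined into an organization fingerprint) can be sketched as follows. The deck does not give the aggregation rule, so this example assumes weights are summed per concept and then rescaled so the strongest concept is 1.0. The function name and all sample fingerprints are hypothetical.

```python
from collections import defaultdict


def add_fingerprints(fingerprints: list[dict[str, float]]) -> dict[str, float]:
    """Sum weights per concept, then rescale so the largest weight is 1.0."""
    totals = defaultdict(float)
    for fp in fingerprints:
        for concept, weight in fp.items():
            totals[concept] += weight
    peak = max(totals.values(), default=0.0)
    if peak == 0.0:
        return {}
    return {concept: total / peak for concept, total in totals.items()}


# Two documents by the same author (made-up weights).
article_1 = {"C19881": 0.8, "C92992": 0.4}
article_2 = {"C19881": 0.2, "C02002": 0.5}

person = add_fingerprints([article_1, article_2])

# A colleague's profile, then the organization as the sum of its people.
colleague = {"C02002": 1.0}
organization = add_fingerprints([person, colleague])
```

The same function serves both levels because a person fingerprint and an organization fingerprint have the same shape as a content fingerprint.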

The construction of Knowlets®

[Diagram labels: Semantic types; Co-occurrence data; IBC (Name: A, Institute, contact details, e-mail, text); Acronym, Organisation, contact details, e-mail; Title, metadata, text, DOI]

The “knowlet®”

Connecting: content, people, organisations

• Publications

• Molecular Databases

• Image databases

• Patents

• Events

• Calls

Simplified Thesaurus example

[Diagram terms: Means of transport; Aircraft; Airplane; Plane; Train; Motor Vehicle; Automobile; Car; Truck; Lorry]
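
To show how a thesaurus like this supports normalization, here is a small sketch. The transcript preserves only the terms, not the lines between them, so the grouping below is an assumed reading of the diagram: Airplane and Plane as synonyms of Aircraft, Car of Automobile, Lorry of Truck, with Automobile and Truck under Motor Vehicle and everything under Means of transport. The dictionary and function names are hypothetical.

```python
# Synonym -> preferred term (assumed reading of the diagram).
PREFERRED = {
    "airplane": "aircraft",
    "plane": "aircraft",
    "car": "automobile",
    "lorry": "truck",
}

# Preferred term -> broader term (also an assumed reading).
BROADER = {
    "aircraft": "means of transport",
    "train": "means of transport",
    "motor vehicle": "means of transport",
    "automobile": "motor vehicle",
    "truck": "motor vehicle",
}


def normalize(term: str) -> str:
    """Map a synonym or variant onto its preferred term."""
    key = term.strip().lower()
    return PREFERRED.get(key, key)


def ancestors(term: str) -> list[str]:
    """Walk from a term up to the top of the hierarchy, nearest parent first."""
    chain = []
    current = normalize(term)
    while current in BROADER:
        current = BROADER[current]
        chain.append(current)
    return chain


print(normalize("Lorry"))   # truck
print(ancestors("Car"))     # ['motor vehicle', 'means of transport']
```

With this mapping, a search for "Plane" and a document mentioning "Airplane" both resolve to the same concept, which is what distinguishes concept search from exact word matching.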

Search text or Document

Removing stop words

Normalization

Concept lookup

Frequency, Similarity, Specificity

Determination of relevant concepts

Determination of concept weight

Selected concepts

Search / Document Fingerprint
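
The steps on this slide can be sketched end to end as a toy pipeline. This is an illustration under stated assumptions, not the Collexis implementation: the stop-word list, the thesaurus and its concept IDs are made up, normalization is reduced to stripping a plural "s", and weights use frequency only (similarity and specificity are not modelled).

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption).
STOP_WORDS = {"a", "an", "and", "by", "is", "of", "or", "the", "to"}

# Hypothetical thesaurus: surface term -> concept ID.
THESAURUS = {
    "car": "C0001",
    "automobile": "C0001",
    "truck": "C0002",
    "lorry": "C0002",
    "train": "C0003",
}


def fingerprint(text: str) -> dict[str, float]:
    """Turn free text into a concept-weight fingerprint."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Removing stop words
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Normalization: strip a plural "s" when the singular is a known term
    tokens = [t[:-1] if t.endswith("s") and t[:-1] in THESAURUS else t
              for t in tokens]
    # Concept lookup and frequency count; unknown words are dropped
    counts = Counter(THESAURUS[t] for t in tokens if t in THESAURUS)
    if not counts:
        return {}
    # Concept weight: frequency relative to the most frequent concept
    peak = max(counts.values())
    return {concept: n / peak for concept, n in counts.items()}


result = fingerprint("Cars and trucks: the car overtook a lorry by train.")
print(result)  # {'C0001': 1.0, 'C0002': 1.0, 'C0003': 0.5}
```

"Cars" and "car" count toward one concept, as do "trucks" and "lorry", so the fingerprint reflects subjects rather than literal words.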

Source Result

Clustering

Abstraction Component