Visual Text Mining with SWAPit: Detection of semantic relationships among text documents and associated data sources. Andreas Becks, Fraunhofer-Institute of Applied Information Technology, Sankt Augustin & Aachen, Germany. Rome, 24 November 2005.


Page 1: Andreas Becks

Visual Text Mining with SWAPit

Detection of semantic relationships among text documents and associated data sources

Andreas Becks

Fraunhofer-Institute of Applied Information Technology

Sankt Augustin & Aachen, Germany

Rome, 24 November 2005

Page 2: Andreas Becks

© Fraunhofer-FIT 2005

Lost in the Ocean of Text Documents?

Text Mining helps to explore and analyse natural-language texts

uncover relationships, recognize trends

group, condense pieces of knowledge

categorize text information

A huge amount of organisational knowledge is stored in text documents

85 to 90 percent of all corporate data according to Merrill Lynch and Gartner studies

Even when document management systems (DMS) and desktop search are used, finding important information takes a huge amount of time

80% of companies and 40% of public administrations need more than one day [Zylab survey]

Page 3: Andreas Becks

SWAPit Helps You to Navigate Through Your Text Data

The tool visualises semantic relationships among text documents...

X-ray view for document archives

Page 4: Andreas Becks

SWAPit Integrates Text and Data Mining

... and allows users to navigate, search, browse and analyse text documents and associated data and metadata

[Diagram labels: text documents; catalogue of text categories; related structured data; Similarity View; Category View; Fact View; tools for analysis and search; categorization; associations]

Page 5: Andreas Becks

Application Example: Document Management

New text documents

Protocollazione (document registration)

Titolario (classification scheme)

Information about type, AOO/UO, ‘fascicoli’ (case files), etc.

Project selection

Document similarity helps to create ‘fascicoli’ and find misclassified documents

DL-based categorization
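
The document-similarity idea on this slide can be illustrated with a small sketch. The slides do not say how SWAPit computes similarity, so everything here is an assumption made for illustration: plain TF-IDF vectors, cosine similarity, and a nearest-neighbour rule that flags a document when its most similar neighbour carries a different category. The function names and the sample texts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Turn whitespace-tokenised texts into sparse TF-IDF vectors (dicts)."""
    tokenised = [text.lower().split() for text in texts]
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    n_docs = len(texts)
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Smoothed IDF, so that no term weight is ever zero.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suspect_misclassified(texts, labels):
    """Return indices of documents whose nearest neighbour has another label."""
    vectors = tfidf_vectors(texts)
    suspects = []
    for i, vec in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        nearest = max(others, key=lambda j: cosine(vec, vectors[j]))
        if labels[nearest] != labels[i]:
            suspects.append(i)
    return suspects


# Hypothetical sample archive: the last document is filed under the wrong category.
texts = [
    "invoice payment supplier order",
    "invoice payment supplier order reminder",
    "building permit construction site",
    "building permit construction site inspection",
    "invoice payment reminder",
]
labels = ["finance", "finance", "permits", "permits", "permits"]

print(suspect_misclassified(texts, labels))
```

The same pairwise similarities could also be used to propose groups of related documents as candidates for a shared ‘fascicolo’; that step is not shown here.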

Page 6: Andreas Becks

SWAPit as a Single Point of Access

From scattered information: multi-schema databases, distributed & data-centred access

...to integrated information: intuitive, user-centred access

[Diagram labels: operational databases; text documents; DL-based integration; DL-based categorization; Virtual Integrated Database; user-specific schema & integrated access]

Page 7: Andreas Becks

Monitoring Documents with SWAPit and DL

From information overflow...

3 news items in 1 minute

unfiltered and unstructured text documents

DL-based filter

conceptually filtered, relevant text documents

DL-based catalogue builder

intuitively structured text documents

1 document map per day

...to information overview

Page 8: Andreas Becks

Displaying XML Documents in SWAPit

From complex, machine-readable documents...

data with technically rich structural annotation

web ontology

...to a human-oriented presentation

customized, task-oriented view

metadata (selected attributes and elements)

text content from specified attributes and elements

ontology-context of specified elements
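
As a rough illustration of the step from richly annotated XML to a task-oriented view, the sketch below pulls selected attributes and elements out of a document as metadata and collects the text content of specified elements. It uses Python's standard `xml.etree.ElementTree`; the sample record, the `task_view` function and its parameters are hypothetical and are not taken from SWAPit. The ontology context mentioned on the slide is not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical machine-readable record with more structure than a reader needs.
SAMPLE = """<record id="42" lang="en">
  <title>Quarterly claims report</title>
  <body>Claims rose in the third quarter.</body>
  <internal><checksum>ab12</checksum></internal>
</record>"""


def task_view(xml_text, metadata_attrs, metadata_elems, text_elems):
    """Reduce an XML document to selected metadata plus plain text content."""
    root = ET.fromstring(xml_text)

    # Metadata from selected attributes of the root element.
    metadata = {
        name: root.get(name)
        for name in metadata_attrs
        if root.get(name) is not None
    }
    # Metadata from selected direct child elements.
    for tag in metadata_elems:
        node = root.find(tag)
        if node is not None and node.text:
            metadata[tag] = node.text.strip()

    # Text content gathered from the specified elements, anywhere in the tree.
    text = " ".join(
        node.text.strip()
        for tag in text_elems
        for node in root.iter(tag)
        if node.text
    )
    return {"metadata": metadata, "text": text}


view = task_view(SAMPLE, ["id"], ["title"], ["body"])
print(view)
```

Everything not named in the three selection lists, such as the `internal` element in the sample, is simply left out of the view.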

Page 9: Andreas Becks

Conclusion: Visual and Intuitive Text Mining with SWAPit

SWAPit combines views on text documents and associated data sources on a single screen

Overview instead of overflow

Improves quality of text access tasks

Leverages knowledge sources

Flexible architecture

Designed to integrate Semantic Web technology

Derives additional power from integration of DL technologies

Can be integrated easily into existing infrastructures or company portals

Can be tailored to specific needs of different market segments

Long-standing experience in research and practical applications

Document Management, Business Intelligence, Customer Relationship Management, ...

Main sectors: Insurance, Textile, Engineering, Social Science

Technology has been extended in a joint project with Maurizio Lenzerini (SEWASIE)

Page 10: Andreas Becks

Thank you for your attention!