artificial intelligence meets big and open data

32
0 Copyright 2014 FUJITSU Human Centric Innovation Fujitsu Forum 2014 19th – 20th November

Upload: fujitsu-global

Post on 01-Jul-2015

585 views

Category:

Technology


5 download

DESCRIPTION

Big data is bringing new opportunities for innovation. In particular, the use of open data is expected to have a significant impact on the global economy. In 2006, Linked Data was proposed by Tim Berners-Lee. This new web technology based on artificial intelligence has the potential to open up a new world of data analytics. This session will introduce how this technology creates new knowledge or insight from big data and open data. Speakers: Mr. Nobuyuki Igata (Fujitsu Laboratories, Ltd.)

TRANSCRIPT

Page 1: Artificial intelligence meets big and open data

0 Copyright 2014 FUJITSU

Human Centric Innovation

Fujitsu Forum 2014

19th – 20th November

Page 2: Artificial intelligence meets big and open data

1 Copyright 2014 FUJITSU

Artificial intelligence meets Big and Open Data

Nobuyuki Igata Research Manager

Knowledge Platforms Laboratory

Social Innovation Laboratories

Fujitsu Laboratories Ltd. (Japan)

Page 3: Artificial intelligence meets big and open data

2 Copyright 2014 FUJITSU

Contents

Why Artificial Intelligence is now again

New wave in World Wide Web

Web of data: Linked Data

Page 4: Artificial intelligence meets big and open data

3 Copyright 2014 FUJITSU

Why Artificial intelligence is now again

Page 5: Artificial intelligence meets big and open data

4 Copyright 2014 FUJITSU

Overview of AI trend

Expert System

DeepBlue 512Core

DeepQA 2,880 Core

Millions Books of data

2000 2010 1980 1990 2020

1997

2011

Evolution of AI 2012

Deep Learning 16,000 Core

10 Millions images

Performance of ICT

Storage

Network

Exponentially increased

CPU

How many Web site?

18,000

100M

1B

0

200

400

600

800

1000

1200

1995 2006 2014

Millions site

DeepQA 2,880 Core

Millions Books of data

Deep Learning 16,000 Core

10 Millions images

Page 6: Artificial intelligence meets big and open data

5 Copyright 2014 FUJITSU

Deep Learning

Image recognition by Google 2012

16,000 CPU Core

Deep Learning Algorithm

• Based on multi-layer Neural network

10 Millions images from YouTube

Autonomously Learning for 1 week

• (so-called Unsupervised Learning)

Finally the system can distinct face

of Cats and Human

Automatic feature extraction

Distinct the face of cat

training images

Deep Learning

Source: Quoc V. Le, etc. “Building high-level features using large-scale unsupervised learning”, ICML2012

Unsupervised

Page 7: Artificial intelligence meets big and open data

6 Copyright 2014 FUJITSU

Computer Shōgi

Shōgi

Japanese Traditional Board Game

Similar rules with Chess

Difficulties compared with Chess

Chess Shōgi

Board Size 64: 8x8 81: 9x9

Num of Pieces 32 40

Different Pieces 6 8

Legal Positions 1047 1071

Possible Games 10123 10226 Source: http://en.wikipedia.org/wiki/Shogi

Page 8: Artificial intelligence meets big and open data

7 Copyright 2014 FUJITSU

Develop a Strong Computer Shōgi from Data

Past Approach

Machine Learning based Approach

Human programmer Strong Shōgi Player

Past (Professional) Game results Value of Pieces Position of Pieces

Top-level professional

100M features

Automatic Feature Extraction from past data

Amateur-level

500 features

Human extract features from his experience

87 点 569 点

歩 角

Page 9: Artificial intelligence meets big and open data

8 Copyright 2014 FUJITSU

DeepQA: Question Answering

IBM Watson won Quiz Champion on Jeopardy! 2011

2,880 CPU Core

Millions Books of data

Combined several techniques

• NLP, IR, KR, ML, Reasoning

Source: http://en.wikipedia.org/wiki/Watson_(computer)

Page 10: Artificial intelligence meets big and open data

9 Copyright 2014 FUJITSU

Tōdai Robot Project in Japan

"Can a robot pass entrance exam of the University of Tokyo (Todai) ? ”

Grand Challenge by National Institute of Informatics, Japan (2011 – 2021)

An open platform to evaluate various AI technologies in a real-world situation

Collaboration of a variety of AI technologies

Fujitsu Labs joins to this project especially Mathematical exam

Need Reasoning mechanism like Human thoughts

Page 11: Artificial intelligence meets big and open data

10 Copyright 2014 FUJITSU

Current Math Solver System

“Find the radius of a circle with area 4π.”

• Parsing • Anaphora resolution • Discourse analysis

?(r)[ ∀C. circle(C)∧ area(C) = 4π → radius(C) = r ]

Question

Language Understanding

Logical Form

Solver algorithms

Answer

Formula Rewriting

Solver Input ?(r)[∀s. πs2 = 4π ∧ s > 0 → s = r ] • Quantifier Elimination • Groebner Basis • Theorem Prover etc.

r = 2

circle(C) ∧ area(C) = πr2 radius(C) = r

Math Knowledge Base

Math Lexicon

(dictionary)

Page 12: Artificial intelligence meets big and open data

11 Copyright 2014 FUJITSU

Demo of Tōdai Robot (Math)

Demo

Video

Page 13: Artificial intelligence meets big and open data

12 Copyright 2014 FUJITSU

Current Math Solver System

“Find the radius of a circle with area 4π.”

• Parsing • Anaphora resolution • Discourse analysis

?(r)[ ∀C. circle(C)∧ area(C) = 4π → radius(C) = r ]

Question

Language Understanding

Logical Form

Solver algorithms

Answer

Formula Rewriting

Solver Input ?(r)[∀s. πs2 = 4π ∧ s > 0 → s = r ] • Quantifier Elimination • Groebner Basis • Theorem Prover etc.

r = 2

circle(C) ∧ area(C) = πr2 radius(C) = r

Math Knowledge Base

Math Lexicon

(dictionary)

Combinatorial Explosion

How to create Knowledge Base

Page 14: Artificial intelligence meets big and open data

13 Copyright 2014 FUJITSU

New wave in World Wide Web

Web of Data: Linked Data

Page 15: Artificial intelligence meets big and open data

14 Copyright 2014 FUJITSU

Background: Open Data

G8 (Fr, US, UK, DE, JP, IT, CA, RU) Open Data Charter released in June 2013 Identified 14 high-value categories

Commit to releasing data by Dec.2015

Provide data in Machine-readable format

Over 40 countries in the world have websites for data disclosure

Japan cabinet office defined the E-government open data strategy

Global Open Data Index

from http://global.census.okfn.org/ by Open Knowledge Foundation

Page 16: Artificial intelligence meets big and open data

15 Copyright 2014 FUJITSU

What is Machine-readable format?

5 Star Open Data By Tim.B Lee

Available on the Web under an open license e.g., PDF

Available as structured data e.g., Excel

Use non-proprietary formats e.g., CSV

Use URIs so that others can point at your stuff

Link your data to other people’s data to provide context e.g., Linked Data

Source: http://5stardata.info/

Page 17: Artificial intelligence meets big and open data

16 Copyright 2014 FUJITSU

Web of Documents

HTML

HTML

HTML

PDF

Link

25 years ago, World Wide Web appeared in the internet. Web has a link structure called HyperText and provided the mechanism to disclose documents that had been stored in individual computers.

Page 18: Artificial intelligence meets big and open data

17 Copyright 2014 FUJITSU

Web of Documents

From the middle of 90’s, many kind of information had been open to the public on Web. Web brought a dramatic revolution to the method of information accesses

HTML

HTML

HTML

PDF

Link

Human-readable

Page 19: Artificial intelligence meets big and open data

18 Copyright 2014 FUJITSU

HTML

HTML

HTML

PDF

Link

d d

Web of Documents

However traditional web is optimized to “Human read it and understand it”, individual data still remains inside each document.

a b

c Human-readable

Page 20: Artificial intelligence meets big and open data

19 Copyright 2014 FUJITSU

d

Web of Data: Linked Data

And then Linked Data appears to solve this issue. Linked data can release individual data from each document. And data can have a link to other data like web document.

a b

c

Page 21: Artificial intelligence meets big and open data

20 Copyright 2014 FUJITSU

d

Web of Data: Linked Data

In Linked Data, the semantics or the relationship of data are represented by “Link”. Computer can easily understand the semantics of data. This is the reason why Linked Data is Machine-readable.

a b

c Machine-readable

Page 22: Artificial intelligence meets big and open data

21 Copyright 2014 FUJITSU

From 2007, dataset described in Linked Data format has been released on Web. Currently these dataset make a big data network on Web. These dataset and data network are called “Linked Open Data(LOD)”.

Linked Open Data(LOD)

Page 23: Artificial intelligence meets big and open data

22 Copyright 2014 FUJITSU

Activities of Fujitsu Labs

Data Conversion

Federated different datasets

LOD Search Engine

http://lod4all.net/

Retrieve LOD in the world

Statistics

Finan

cial

Sales

Logistics

Page 24: Artificial intelligence meets big and open data

23 Copyright 2014 FUJITSU

Current Status and Issues of LOD

Many web sites have published their own data separately

LOD Site-A

Internet ?

I don’t know which site has the data I want

Problem1 Requires complex processing in the application to do the processing that combines data from multiple sites

Problem2

User

LOD Site-B

data in a site without SPARQL endpoint is not searchable

Problem3

Complex processing

Application

Page 25: Artificial intelligence meets big and open data

24 Copyright 2014 FUJITSU

LOD Search Engine by Fujitsu Labs

Store LOD from around the world (several tens of billion data items), and provide a high-speed search

LOD Site-A LOD Site-B

LOD Search Engine Search UI Standard API

Collect Collect Collect

Use data without being aware of what site it is on

Solution1 Applications need not to implement complex processes!

Solution2

Can even search sites without search functions!

Solution3 Use Use

User Application

Page 26: Artificial intelligence meets big and open data

25 Copyright 2014 FUJITSU

LOD Search Engine by Fujitsu Labs

LOD4ALL Search

or http://lod4all.net

Demo

Video

Page 27: Artificial intelligence meets big and open data

26 Copyright 2014 FUJITSU

Heterogeneous Financial Data

How to establish interoperability between heterogeneous data?

No-links, data silos

Format heterogeneity (XML, schemas, APIs, Web crawling, CSV files, etc.)

Page 28: Artificial intelligence meets big and open data

27 Copyright 2014 FUJITSU

Standard API

LOD Search Engine

Use case: Financial Domain

Financial Reports, etc.

NY Times

LEI: Legal Enterprise Identifier

Analysts

Linked Open Data (LOD) Public

Public Dashboard

Da

ta

Con

vers

ion

Crunchbase

Public

Market Data (Stock Price)

LOD

Cra

wli

ng

DBpedia

Application

Page 29: Artificial intelligence meets big and open data

28 Copyright 2014 FUJITSU

Use case: Financial Domain

Demo

Video

Page 30: Artificial intelligence meets big and open data

29 Copyright 2014 FUJITSU

LOD for all: Realising Data-utilization Services

LOD for all

http://lod4all.net/

Unified access to LOD across the world in a batch

Healthcare

Tourism

Financial

Open Government

Go to Demo#A14!

LOD Search Engine

Page 31: Artificial intelligence meets big and open data

30 Copyright 2014 FUJITSU

Hyper-connected World (IoT) Billions devices connected to Internet

Deep Leaning has the potential to realise Artificial Senses like human

Linked Data/LOD Linked data is not just data format but also

Description Logic based

• Todai Robot solves questions by using family logic (First-order predicate Logic)

LOD has the potential to be used as Web Scale Knowledge Base for Artificial Thought

• From machine-readable to machine-understandable

Need High Performance Computer

Future of Artificial Intelligence with Linked Data

Huge Graph High Complexity

Billions Devices Connected

Huge Graph High Complexity

I can understand First-order predicate logic

Web scale Artificial intelligence

Page 32: Artificial intelligence meets big and open data

31 Copyright 2014 FUJITSU