Heterogeneous Data Integration in the Big Data Era (大資料時代的異質資料整合) - Ying-Jie Chen / SAP Taiwan, July 2012


Page 1: Ying-Jie Chen/ SAP Taiwan July, 2012 · 7/19/2012  · 63% are exploring in-memory databases 50% are exploring columnar databases ... SAP HANA platform SAP Master Data Governance

Heterogeneous Data Integration in the Big Data Era

Ying-Jie Chen/ SAP Taiwan

July, 2012

Page 2:

© 2012 SAP AG. All rights reserved. 2

Legal Disclaimer

The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation and SAP's strategy and possible future developments, products and/or platforms directions and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information on this document is not a commitment, promise or legal obligation to deliver any material, code or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, and shall have no liability for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of this document. This limitation shall not apply in cases of intent or gross negligence.

All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.

Page 3:

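Big Data: how should massive volumes of heterogeneous data be handled? The current state: like the Eight Immortals crossing the sea, everyone is showing off their own particular strengths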

9 out of 10 organizations use relational databases

-- 93% are considering other options

63% are exploring in-memory databases

50% are exploring columnar databases

50% are exploring Hadoop

Source: The Challenge of Big Data: Benchmarking Large-Scale Data Management, Ventana Research, January 2012

Page 4:

Solutions for the "Big Data" challenge - a Warring States era of open-source software?

Present: Big data applications?

Process: Azkaban, Oozie, Pig, Hive, Hadoop MapReduce, S4, Storm

Store: Voldemort, Cassandra, HBase

Ingest: Kafka, Flume, Scribe

Page 5:

SAP Real-time Data Platform - a hybrid architecture that takes the best of both open-source and enterprise software

[Figure: SAP real-time data platform architecture diagram. The labels recoverable from it are listed below.]

Layers: Present, Process, Store, Ingest

MPP scale-out

Open developer APIs and protocols

Common landscape management

Apache Hadoop

3rd Party DB

SAP Solutions for Enterprise Information Management (SAP Solutions for EIM)

SAP Sybase Replication Server

SAP Data Services

SAP HANA platform

SAP Master Data Governance

SAP Master Data Management

SAP Sybase IQ

SAP Sybase ASE

SAP Sybase SQL Anywhere

SAP Sybase Event Stream Processor

Common modeling

SAP Sybase PowerDesigner

3rd Party BI Client

SAP NetWeaver (On Premise / Cloud)

SAP Business Suite

SAP Business Warehouse

SAP Big Data Applications

SAP Analytics

SAP Mobile

Custom Apps

Page 6:

EIM = Air Traffic Control For Data

Page 7:

An all-in-one solution for comprehensive data management needs - SAP BusinessObjects Data Services as a single platform

First and only all-in-one solution for Data Integration, Data Quality, Data Governance and Unstructured Data Processing.

MOVE, IMPROVE, UNLOCK, GOVERN

One Runtime Architecture & Services

Business UI (Information Steward)

Unified Metadata

Technical UI (Data Services)

SAP BusinessObjects Data Services: ETL, Data Quality, Profiling, Text Analytics

One Administration Environment (Scheduling, Security, User Management)

One Set of Source/Target Connectors

Page 8:

Data quality dashboard integrated with ETL - volumes, quality KPIs, and trend analysis for all transformed data

Scorecard to measure DQ from a Data Steward's perspective

Key Quality Dimensions (KPI for data)

Drill into scorecard details

Data quality score metrics

Latest quality score

Quality trend
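
To make the idea of "KPIs for data" concrete, here is a minimal Python sketch of how two common quality dimensions, completeness and validity, could be scored for a batch of records. It is an illustration only, not how SAP Information Steward computes its scorecard; the sample records, field names, and the postcode rule are all assumptions.

```python
import re

# Hypothetical customer records produced by an ETL job (illustrative data only).
RECORDS = [
    {"customer_id": "C001", "postcode": "106", "email": "a@example.com"},
    {"customer_id": "C002", "postcode": "", "email": "b@example.com"},
    {"customer_id": "C003", "postcode": "10A", "email": ""},
    {"customer_id": "C004", "postcode": "220", "email": "not-an-email"},
]


def completeness(rows, field):
    """Percentage of rows in which the field is present and non-empty."""
    filled = sum(1 for row in rows if row.get(field))
    return 100.0 * filled / len(rows)


def validity(rows, field, pattern):
    """Percentage of non-empty values that fully match the given regex."""
    rule = re.compile(pattern)
    values = [row[field] for row in rows if row.get(field)]
    if not values:
        return 0.0
    valid = sum(1 for value in values if rule.fullmatch(value))
    return 100.0 * valid / len(values)


def scorecard(rows):
    """One score per quality dimension; the rules here are assumed, not SAP's."""
    return {
        "postcode completeness": completeness(rows, "postcode"),
        # Assumed rule: a postcode is 3 digits, optionally followed by 2 more.
        "postcode validity": validity(rows, "postcode", r"\d{3}(\d{2})?"),
        "email completeness": completeness(rows, "email"),
    }


if __name__ == "__main__":
    for dimension, score in scorecard(RECORDS).items():
        print(f"{dimension}: {score:.1f}%")
```

Storing one such scorecard per ETL run would give the "latest quality score" and, over time, the "quality trend" named on the slide.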

Page 9:

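End-to-end data governance solution - including data dictionary extraction along the whole path from front-office transaction systems to the reporting layer

[Figure: SAP Data Services data governance platform. The labels recoverable from it are listed below.]

Bank

Insurance

Credit card / Funds

OS390

AS400

ERP, CRM

Text reports / interchange files

ETL & Data Source

ODS

DW

ETL

Sybase PD Modeling

Target DM

BO Universe

BO Deski/Webi reports

SAP BO BI Platform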

Page 10:

Data lineage and impact analysis - fast, multi-dimensional lookup of report relationships
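
Lineage and impact analysis can be thought of as graph traversal: impact analysis walks downstream from a changed object, and lineage walks upstream from a report. The Python sketch below is a generic illustration under that assumption, not the product's implementation; the edges reuse component names from this deck, but their exact wiring is hypothetical.

```python
from collections import deque

# Hypothetical data flow: each key feeds the objects in its list.
LINEAGE = {
    "OS390": ["ODS"],
    "AS400": ["ODS"],
    "ODS": ["DW"],
    "DW": ["Target DM"],
    "Target DM": ["BO Universe"],
    "BO Universe": ["Webi report"],
}


def reachable(graph, start):
    """Breadth-first list of every node reachable from start (start excluded)."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


def reverse(graph):
    """Flip every edge so that traversal runs from targets back to sources."""
    flipped = {}
    for source, targets in graph.items():
        for target in targets:
            flipped.setdefault(target, []).append(source)
    return flipped


def impact_of(graph, changed):
    """Impact analysis: everything downstream of a changed object."""
    return reachable(graph, changed)


def lineage_of(graph, target):
    """Lineage: everything upstream that a target is derived from."""
    return reachable(reverse(graph), target)


if __name__ == "__main__":
    print("Changing ODS affects:", impact_of(LINEAGE, "ODS"))
    print("Webi report derives from:", lineage_of(LINEAGE, "Webi report"))
```

In a real deployment the edges would come from extracted metadata (ETL job definitions, universe mappings, report definitions) rather than a hand-written dictionary.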

Page 11:

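Enterprise business glossary wiki - business definitions and technical specifications for report fields and KPIs, with global linking and search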

Page 12:

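Extracting key information from unstructured text - a built-in text analysis engine for multiple languages

Sample text (translated from Chinese):

On August 2, first-instance verdicts were handed down in 6 criminal cases related to the Taipei City "11.15" especially serious fire accident. The Taipei District Court sentenced 26 defendants, including Gao Weizhong (高偉忠), to penalties ranging from 16 years of fixed-term imprisonment to exemption from criminal punishment. The court held that abuse of power by Gao Weizhong, Yao Yaming (姚亞明) and others was one of the causes of the accident. The court also found that between 2004 and 2010 Gao Weizhong and others used their positions to help others win construction contracts and accepted bribes: Gao Weizhong accepted more than 500,000 yuan and Zhou Jianmin (周建民) accepted 600,000 yuan, while Huang Peixin (黃佩信) and Ma Yibang (馬義鎊), using their official positions in state-owned enterprises to help others win construction contracts, accepted more than 2.5 million yuan and more than 3.6 million yuan respectively. In addition, 5 people including Zhi Shangbang (支上邦) paid bribes to others in order to win construction contracts.

Extracted entities:

Date: August 2 (8月2日)

City name: Taipei (台北)

Noun: major fire accident (重大火災事故), criminal punishment (刑事處罰), cause of the accident (事故原因)

Number: 11•15, 6 cases (6起), one (一), 26 people (26名), 500,000 (50萬), 5 people (5人)

Person name: Gao Weizhong (高偉忠), Yao Yaming (姚亞明), Huang Peixin (黃佩信)

Government body: district court (地方法院), state-owned enterprise (國營企業)

Other: construction project (工程)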
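
As a rough illustration of entity extraction, the Python sketch below pulls dates, counts, and government bodies out of one sentence from the slide's sample text using regular expressions and a tiny word list. This is a deliberately simple, assumed approach; a real text analysis engine relies on linguistic models and dictionaries, and the rule set here is hypothetical.

```python
import re

# One sentence from the slide's sample text (Traditional Chinese).
SAMPLE = "8月2日, 台北地方法院分別判處高偉忠等 26 名被告有期徒刑 16 年至免予刑事處罰。"

# Hypothetical rule set: one regular expression per entity type.
RULES = {
    # Month and day, e.g. 8月2日.
    "date": re.compile(r"\d{1,2}月\d{1,2}日"),
    # A number followed by a counter word for people or cases.
    "count": re.compile(r"\d+\s*(?:名|人|起)"),
    # A two-entry gazetteer standing in for a real dictionary.
    "government_body": re.compile(r"地方法院|國營企業"),
}


def extract_entities(text):
    """Return {entity_type: [matched strings]} for every rule."""
    return {name: rule.findall(text) for name, rule in RULES.items()}


if __name__ == "__main__":
    for entity_type, matches in extract_entities(SAMPLE).items():
        print(entity_type, matches)
```

Person names such as 高偉忠 are not covered here: they cannot be captured reliably with patterns alone, which is one reason a dedicated engine is needed.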

Page 13:

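Street address cleansing and householding - combined with location services and electronic maps to strengthen business operations

Processing steps: Word Break, Normalization, Search & Match

Raw input: 106台北市信易路五段2306號四十五樓102室

Tokenized: 106-台-北-市-信-易-路-五-段-2306-號-四-十-五-樓-102-室

Cleansed: 106-台-北-市-信-義-區-信-義-路-5-段-2306-號-45-樓-102-室

Last line process:

PostCode: 106

Country: 台灣 (Taiwan)

Region: 台灣 (Taiwan)

Reg Desc: 省 (province)

Locality1: 台北 (Taipei)

Loc1 Desc: 市 (city)

Locality2: 信義 (Xinyi)

Loc2 Desc: 區 (district)

Primary address process:

Street Name: 信義路5段 (Xinyi Road, Section 5)

Street Type: 路 (road)

Street Num: 2306

Num Desc: 號 (number)

Building Name: 幸福大厦

Secondary address process:

Floor Num: 45

Floor Desc: 樓 (floor)

Unit Num: 102

Unit Desc: 室 (room)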
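
One part of the normalization shown on this slide, turning Chinese numerals such as 五段 and 四十五樓 into 5段 and 45樓, can be sketched in a few lines of Python. This is an illustrative approximation, not the product's algorithm: the helper names are hypothetical, only numerals below 100 are handled, and only numerals directly before a section, floor, number, or room marker are converted.

```python
import re

# Single-character Chinese digits.
DIGITS = {"零": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}

# A run of Chinese numerals directly before 段 (section), 樓 (floor),
# 號 (number) or 室 (room); numerals inside street names are left alone.
NUMERAL = re.compile(r"[零一二三四五六七八九十]+(?=[段樓號室])")


def chinese_to_int(numeral):
    """Convert a Chinese numeral below 100, e.g. 五 -> 5, 十 -> 10, 四十五 -> 45."""
    if "十" in numeral:
        tens, _, ones = numeral.partition("十")
        return (DIGITS[tens] if tens else 1) * 10 + (DIGITS[ones] if ones else 0)
    # No tens marker: read the characters positionally, digit by digit.
    return int("".join(str(DIGITS[ch]) for ch in numeral))


def normalize_numerals(address):
    """Rewrite Chinese numerals in address components as Arabic digits."""
    return NUMERAL.sub(lambda m: str(chinese_to_int(m.group())), address)


if __name__ == "__main__":
    print(normalize_numerals("106台北市信易路五段2306號四十五樓102室"))
```

Correcting the misspelled street name (信易路 to 信義路) and adding the missing district (信義區) cannot be done with rules like these; that requires matching against reference address data.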

Page 14:

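SAP Data Services Hadoop adapter - links unstructured information with existing enterprise applications to accelerate Big Data value realization

1. Collect and store

2. Analyze and process

3. Relate and consume

On Demand Services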

Page 15:

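Regaining confidence in enterprise reports - Big Data needs a new generation of SAP EIM enterprise data governance platform

Do the calculations and definitions comply with the latest laws, regulator requirements, customer requirements, and company policies?

Are the dimensions used to analyze sales from website visits sufficiently representative?

How timely is the data? Does its quality meet the standard?

Which original transactions does this data come from? Are the relationships between them correct?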

Page 16:

Thank you

Contact information:

Ying-Jie Chen

[email protected]