Steve Hill – OEM Solutions Engineer
Wednesday, June 24, 2009
SAP BusinessObjects Data Services
© SAP 2008 / Page 2
Agenda
• Data Services Overview
• Data Quality
• Data Integration
• Architecture Overview
• OEM Deployment Scenarios
• Object XML Toolkit
Common Customer Challenges …
Challenge #1: Where is the data?
Challenge #2: Why are there errors in the data?
Challenge #3: How current is this information?
Challenge #4: Where did this information come from?
Disparate Data
Disparate Businesses
Disparate Applications
?
Every Successful Business and IT Initiative Needs To …
Integrate Businesses
Integrate Applications
Integrate Data
Introduction to Business Objects Data Services XI
Data Services is both a platform and a product.
It is the platform upon which all new IM functionality will be based.
Existing functionality, such as data federation and profiling, will be applied to it.
Our single platform addresses many ETL/Data Quality capabilities that have historically been “silo-based”:
• Profiling
• Data Integration
• Data Quality
• Scheduling
• Administration
• Development
• Metadata Management
SAP BusinessObjects Data Services
The first single tool for DI and DQ
• One Runtime Architecture
• One Development User Interface
• One Metadata Repository
• One Administration Environment
• One Set of Connectors
[Diagram: Data Services (Improve, Deliver, Transform, Access) alongside Data Integrator XI R2 and Data Quality XI R2, each of which has its own Runtime Architecture, Metadata Repository, Development User Interface, and Administration and Connectors]
Data Quality
Data Quality – What is it?
Data Quality is the fitness for use of the data. Can you use the data or is it junk?
Data Content errors
• Missing data
• Invalid data
• Out-of-date data
Data Inconsistency
• Multiple formats for same data elements
• Different meanings for the same code value
Data Structure problems
• Field overuse: used for unintended purpose
• Data in filler
• Error in migration (ETL)
• Normalization inconsistencies
• Duplicate or lost data
Data Quality Framework
Increase the value of data assets
• Measure and analyze data through data assessment and continuous monitoring
• Cleanse and enhance customer and operational data anywhere across the enterprise
• Match and consolidate data at multiple levels within a single pass for individuals, households, or corporations
• Improve and automate the delivery of direct mail and goods
Data Cleansing -- Customer data
Multi-line input record:
Maggie.kline@future_electronics.com
Margaret Smith-Kline phd
FUTURE Electronics
5/23/03
101 6th ave
Manhattan ny
10012
001124367
Output record (data parsed into individual components):
Salutation: Ms.
First name: Margaret
Last name: Smith-Kline
Postname: Ph. D.
Match standards: Maggie, Peg, Peggy
Gender: Strong Female
Company name: Future Electronics
Address 1: 101 Avenue of the Americas
City: New York
State: NY
ZIP+4: 10013-1933
Email: maggie.kline@future_electronics.com
SSN: 001-12-4367
Date: May 23, 2003
Callouts on the output record: Casing and standardization; Corrections; Enhancements
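A few of the casing and standardization steps in the record above can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the rules are simplified, and nothing here represents how Data Services itself parses records. Address correction and enhancement (for example 6th Ave to Avenue of the Americas, or the ZIP+4) depend on reference directories and are not attempted.

```python
import re
from datetime import datetime

# Month names spelled out explicitly so the output does not depend on locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def format_ssn(raw: str) -> str:
    """Insert dashes into a nine-digit SSN, e.g. 001124367 -> 001-12-4367."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 9:
        raise ValueError(f"expected 9 digits, got {len(digits)}")
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def format_date(raw: str) -> str:
    """Turn a US-style m/d/yy date into a long form, e.g. 5/23/03 -> May 23, 2003."""
    parsed = datetime.strptime(raw, "%m/%d/%y")
    return f"{MONTHS[parsed.month - 1]} {parsed.day}, {parsed.year}"


def standardize_company(raw: str) -> str:
    """Apply simple word-by-word casing to a company name (hypothetical rule)."""
    return " ".join(word.capitalize() for word in raw.split())


def normalize_email(raw: str) -> str:
    """Lower-case an email address and trim surrounding whitespace."""
    return raw.strip().lower()


if __name__ == "__main__":
    print(format_ssn("001124367"))
    print(format_date("5/23/03"))
    print(standardize_company("FUTURE Electronics"))
    print(normalize_email("Maggie.kline@future_electronics.com"))
```

Each function handles one field, which mirrors the slide's idea of parsing a multi-line record into individual components before standardizing them.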
MATCHING AND CONSOLIDATION
Can parse multiple record structures and sources
Then match across those records and consolidate them into one record
Input records
Record 1:
Ms Margaret Smith-Kline Ph.D.
Future Electronics
101 Avenue of the Americas
New York NY 10013-1933
maggie.kline@future_electronics.com
May 23, 2003; E3 Stamping Machine
Record 2:
Maggie Smith
Future Electronics Co. LLC
101 6th Ave.
Manhattan, NY 10012
maggie.kline@future_electronics.com
001-12-4367
30-6-2005; Fabrication Facility class C
Record 3:
Ms. Peg Kline
Future Elect. Co.
101 6th Ave.
New York NY 10013
001-12-4367
(222) 922-9922
10/21/04; Victory Injection Molder
Consolidated Record With Child Purchase Records
Name: Ms. Margaret Smith-Kline Ph.D.
Firm name: Future Electronics Co. LLC
SSN: 001-12-4367
Address: 101 Avenue of the Americas
City, State, ZIP: New York, NY 10013-1933
Latitude: 40.722970
Longitude: -74.005035
Fed code: 36061
Phone: (222) 922-9922
Email: maggie.kline@future_electronics.com
Purchase history:
5/23/03; E3 Stamper, $1,300,000
10/21/04; A1 Injector, $520,000
6/30/05; C2 Fabricator, $23,000,000
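The match-and-consolidate idea can be sketched as follows. This is a deliberately simplified illustration, not the product's matching engine: it links records only when they share an exact email address or SSN, whereas the slide implies richer matching (for example the match standards Maggie, Peg, Peggy). The field names and the `consolidate` function are assumptions made for the example.

```python
import re


def identifiers(record):
    """Return the exact-match keys (SSN digits, lower-cased email) a record carries."""
    keys = set()
    if record.get("ssn"):
        keys.add(("ssn", re.sub(r"\D", "", record["ssn"])))
    if record.get("email"):
        keys.add(("email", record["email"].lower()))
    return keys


def consolidate(records):
    """Group records that share any identifier and merge each group into one parent."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}
    for index, record in enumerate(records):
        for key in identifiers(record):
            if key in seen:
                parent[find(index)] = find(seen[key])
            else:
                seen[key] = index

    merged = {}
    for index, record in enumerate(records):
        target = merged.setdefault(find(index), {"purchases": []})
        for field, value in record.items():
            if field == "purchase":
                target["purchases"].append(value)  # keep every child purchase
            elif value and field not in target:
                target[field] = value              # first non-empty value wins
    return list(merged.values())


if __name__ == "__main__":
    inputs = [
        {"name": "Ms Margaret Smith-Kline Ph.D.",
         "email": "maggie.kline@future_electronics.com",
         "purchase": "May 23, 2003; E3 Stamping Machine"},
        {"name": "Maggie Smith", "ssn": "001-12-4367",
         "email": "maggie.kline@future_electronics.com",
         "purchase": "30-6-2005; Fabrication Facility class C"},
        {"name": "Ms. Peg Kline", "ssn": "001-12-4367",
         "phone": "(222) 922-9922",
         "purchase": "10/21/04; Victory Injection Molder"},
    ]
    for row in consolidate(inputs):
        print(row)
```

Record 1 has no SSN and Record 3 has no email, so they are linked only transitively through Record 2. That is why the sketch groups records with a union-find structure rather than a single key.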
Data Integration
Scalable, enterprise-caliber DI platform that allows organizations to explore, extract, transform and deliver data anywhere so end users have information that is accurate, timely, and trustworthy.
Maximize Developer Productivity
• Single interface to design
• Multi-Developer Collaboration
Deliver ETL Scalability
• Parallelism and intelligent distributed processing
• Services-based architecture enabling right-time data delivery
• Powerful prepackaged transformations
Deliver Trusted Information
• Report to source data lineage
• Validation firewall – profiling, validating, auditing, and cleansing
Comprehensive
• Complete data integration functionality
• End-to-end BI integration
• Broad source and target support
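The validation firewall listed under Deliver Trusted Information can be pictured as a step that applies named rules to each row and routes failures away from the target. The sketch below is a generic illustration; the rule names, row fields, and the `validate` function are hypothetical and do not describe the Data Services validation transform.

```python
def validate(rows, rules):
    """Split rows into those passing every rule and those failing at least one.

    rules maps a rule name to a function that takes a row and returns True or False.
    Failed rows are returned with the names of the rules they broke, for auditing.
    """
    passed, failed = [], []
    for row in rows:
        broken = [name for name, check in rules.items() if not check(row)]
        if broken:
            failed.append({**row, "failed_rules": broken})
        else:
            passed.append(row)
    return passed, failed


# Hypothetical rules for a customer address row.
RULES = {
    "zip_is_five_digits": lambda row: len(row.get("zip", "")) == 5 and row.get("zip", "").isdigit(),
    "state_present": lambda row: bool(row.get("state")),
}

if __name__ == "__main__":
    good, bad = validate(
        [{"city": "New York", "state": "NY", "zip": "10013"},
         {"city": "Manhattan", "state": "", "zip": "1001"}],
        RULES,
    )
    print("loaded:", good)
    print("rejected:", bad)
```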
Enterprise-Wide Data Access
Broad connectivity to databases, applications, legacy, file formats, and unstructured data
Support for structured and unstructured data
Databases: Oracle, DB2, Sybase & IQ, SQL Server, Informix, Teradata, ODBC, MySQL, Netezza
Applications: JD Edwards, Oracle Apps, PeopleSoft, Siebel, Salesforce.com, SAP BI, SAP R/3 (ABAP, BAPI, IDoc)
Files/Transport: Text delimited, Text fixed width, EBCDIC, XML, Cobol, Excel, HTTP, JMS, SOAP (Web Services)
Mainframe (with partner): ADABAS, ISAM, VSAM, Enscribe, IMS/DB, RMS; both direct & change data
Unstructured Data: Any text file type, 32 languages
Data Lineage Helps Users Make Confident Decisions
Where did this number come from?
Data lineage provides information on how a number in your BI report is calculated during the ETL process and its origin.
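One way to picture lineage is as a map from each derived column to the rule that produced it and the columns it was read from, which can then be walked back to the original sources. The sketch below is a toy model under that assumption; the column names, rules, and the `trace_origins` function are invented for illustration and say nothing about how Data Services stores lineage metadata.

```python
# Hypothetical lineage metadata: derived column -> how it was computed and from what.
LINEAGE = {
    "report.total_revenue": {
        "rule": "SUM(amount)",
        "inputs": ["warehouse.sales.amount"],
    },
    "warehouse.sales.amount": {
        "rule": "quantity * unit_price",
        "inputs": ["staging.orders.quantity", "staging.orders.unit_price"],
    },
    "staging.orders.quantity": {
        "rule": "copy",
        "inputs": ["source.erp.order_lines.qty"],
    },
    "staging.orders.unit_price": {
        "rule": "copy",
        "inputs": ["source.erp.order_lines.price"],
    },
}


def trace_origins(column, lineage):
    """Return the source columns that ultimately feed into the given column.

    A column with no lineage entry is treated as an original source.
    Assumes the lineage map contains no cycles.
    """
    entry = lineage.get(column)
    if entry is None:
        return [column]
    origins = []
    for upstream in entry["inputs"]:
        for origin in trace_origins(upstream, lineage):
            if origin not in origins:
                origins.append(origin)
    return origins


if __name__ == "__main__":
    print(trace_origins("report.total_revenue", LINEAGE))
```

Walking the `rule` entries along the same path answers the other half of the question on the slide: how the number was calculated on its way to the report.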
Data Integration Costs: Pay Now or Pay Later
[Figure: two cost-versus-time charts, one for the mostly manual approach and one for the tool-based approach]
Mostly Manual:
• Homegrown ODS
• Hand-coded ETL
• Low-end replication
• SQL Generators
Tool-Based:
• ETL/Data Integration
Startup Costs:
• Software licenses
• Training
• Hardware
• Consulting
Maintenance Costs:
• Changing business requirements
• Growing complexity
• Re-architecting
Approx. 75% of the effort to build and maintain a BI solution is in managing the ETL process. (Gartner)
The failure to manage this aspect of BI deployments is one of the primary reasons for unsuccessful Business Intelligence initiatives. (TDWI)
Architecture
Basic Architecture
At a high level, Data Services comprises the following:
• Job Server
• Access Server
• Repository
• Address Server
• Dictionaries/Directories
• Designer
• Management Console
OEM Deployment
OEM Deployment Scenarios
On Premise Solution
Full Installation
Designer, Job Server, Repository, …
Module Installation
Pick and choose components
No need for Designer
SaaS
Individual Repositories
Updates done when you need them
OEM Integration Points
Web Services
Real Time API
Command Line Interface
Object Creation XML Toolkit
Improved ability to embed into other applications
• Allow creation of jobs/dataflows through an XML-based representation of Data Services objects
• Clean-up from existing XML representation
• New “export as XML” in Designer to generate sample XML
• Web services support for importing, validating, deleting (+ compact repository) and executing objects
Benefits:
• Reduce maintenance costs
• Reduced installation footprint
• Flexibility in deployment scenarios
OPERATIONAL EXCELLENCE
Lowest Total Cost of Ownership
• Complete flexibility to scale from small pilot projects to large enterprise-wide deployments
• Provide dramatic reduction in costs and resources for installing, maintaining, and supporting with just one application (TCO)
• Substantially accelerate product proficiency with one easy-to-use solution (TCO)
• Radically simplify IT infrastructure with one environment for security, admin, development, and execution (TCO)
Conclusion
Performance
• Better data availability timing
• Use different loading mechanisms
• Change data capture
Scalability
• Scale with additional CPUs or additional servers
• De-centralized configuration based on geography
Reusability & Version Control
• Pre-built Transforms & Functions
• Reuse of ETL jobs
• Check in / Check out
• Collaboration
Error Recovery & Notification
• Job recovery and restart-ability
• Send an email if a job fails
One Vendor, One Solution
• Leader in both ETL and BI/Reporting
• Lower Total Cost of Ownership (TCO)
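As a footnote to the change data capture item under Performance: the idea is to load only what changed since the last run instead of reloading everything. The sketch below shows one simple technique, comparing two snapshots keyed by primary key. It is an assumption chosen for illustration; other approaches rely on timestamps or database logs, and the `capture_changes` function is hypothetical rather than a Data Services feature.

```python
def capture_changes(previous, current):
    """Compare two snapshots (primary key -> row) and return the delta to load.

    Returns a dict with the rows to insert, the rows to update (new values),
    and the rows to delete (old values).
    """
    inserts = {key: row for key, row in current.items() if key not in previous}
    deletes = {key: row for key, row in previous.items() if key not in current}
    updates = {key: row for key, row in current.items()
               if key in previous and previous[key] != row}
    return {"insert": inserts, "update": updates, "delete": deletes}


if __name__ == "__main__":
    yesterday = {
        1: {"name": "Future Electronics", "city": "Manhattan"},
        2: {"name": "Acme", "city": "Boston"},
    }
    today = {
        1: {"name": "Future Electronics", "city": "New York"},
        3: {"name": "Globex", "city": "Chicago"},
    }
    print(capture_changes(yesterday, today))
```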