Driving Business Value Through Agile Data Assets
TRANSCRIPT
Agenda
The Third Platform
The Data Imperative
Data In the Enterprise Today
The Data Tsunami
Getting the Data Under Control
Benefits to Having Well-Defined and
Managed Data
Conclusions/Recommendations
© IDC Visit us at IDC.com and follow us on Twitter: @IDC 2
Toward the Third Platform
The First
Platform
Fixed systems, statically defined data
Running on terminal systems, performing
back-office tasks, only accessible internally
The Second
Platform
Distributed systems, accessible to non-technical
users
Data shared across systems, visual GUI access
Systems extended to the Web via static pages,
limited customer access to data and functions
The Third Platform
Bridging internal and external data
Large collections of data ingested
first, defined later.
Social data inclusion, mobile
device interaction.
Cloud services for elasticity.
Value delivered for new classes of
applications and data use (digital
transformation).
Source: IDC
From Static to Dynamic Data
Management
In a dynamic world… Data must change dynamically, or may originate externally,
but still requires definition.
Applications are coded in an event-driven manner, responding to stimuli and “learning” as they go.
Agility, adaptability, elasticity are required.
In a static world… Data is defined to suit application needs.
Applications are coded with fixed, serial processes.
No agility, no adaptability, and change is hard.
Agile, But Managed Data
New applications are emerging.
• Web-based customer-facing applications accessing
databases.
• Applications that interact with, and coordinate app data on mobile devices.
• Applications that respond to sensor and other machine-generated data.
Existing applications need adapting.
• Taking advantage of machine-generated data, social
media data, data from customers and partners.
• Blending analytic and transactional processing on a single database.
Both new and existing applications must be agile, so their data must be agile.
Databases Are Changing
New data technologies for new workloads.
• Hadoop – scalable but unmanaged.
• NoSQL – agile but without definitional formalism.
Existing data technologies are evolving.
• Memory-optimized columnar data stores with SIMD
support for high speed analytics.
• Memory-optimized row or matrix data stores for high
speed transaction support.
• Late-binding schemas and agile schema support for
definition change without database restructuring.
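The late-binding idea in the last bullet can be illustrated with a small sketch. This is not any particular database's mechanism; the record layout and the names `RAW_RECORDS`, `SCHEMA_V1`, `SCHEMA_V2`, and `read_with_schema` are all hypothetical, chosen only to show a definition being applied at read time so that it can change without restructuring what is stored.

```python
import json

# Raw records are stored exactly as they arrived; no structure is imposed on write.
RAW_RECORDS = [
    '{"id": 1, "name": "Acme", "region": "EMEA"}',
    '{"id": 2, "name": "Globex"}',
    '{"id": "3", "name": "Initech", "tier": "gold"}',
]

# Hypothetical read-time schemas: field name -> (type converter, default value).
SCHEMA_V1 = {"id": (int, None), "name": (str, "")}
# A later definition adds a field; the stored records are not touched.
SCHEMA_V2 = {**SCHEMA_V1, "tier": (str, "standard")}


def read_with_schema(raw_records, schema):
    """Bind a schema to the raw records at read time."""
    rows = []
    for raw in raw_records:
        doc = json.loads(raw)
        row = {}
        for field, (convert, default) in schema.items():
            value = doc.get(field)
            # Convert present values; fall back to the default for missing ones.
            row[field] = convert(value) if value is not None else default
        rows.append(row)
    return rows


rows_v2 = read_with_schema(RAW_RECORDS, SCHEMA_V2)
```

Reading the same stored records with `SCHEMA_V1` or `SCHEMA_V2` yields different shapes, which is the point: the definition changed, the data did not move.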
The Data Imperative
Dangers of unmanaged data definitions:
• Poor data quality, leading to exponential
damage to business processes due to high
speed integration.
• Lack of knowledge about sensitive data,
leading to risk of contractual or regulatory
noncompliance.
• Duplicate, errant, or missing data-driven
processes due to poor understanding of the
data.
The process of digital transformation is
data-driven. The data must be well
understood.
Data in the Enterprise Today
Ungoverned
Most enterprises do not have a data governance
initiative.
Security definitions are fragmentary.
A lack of MDM leads to inconsistent and incomplete
views of key enterprise data about customers,
partners, products, etc.
Fragmented
Data is defined on an application-by-application
basis.
Select data is defined in ETL for purposes of data
movement.
Data warehouses have a select subset, the rest is
not managed at an enterprise level.
The Data Tsunami
A huge wave of new data is coming fast.
• It’s not well defined.
• It’s high volume.
• It is critical to managing an agile business.
The formats vary.
• Some is XML. Some is CSV. Some is… who knows?
• Some is managed by web applications in JSON.
It needs to be ordered and interpreted, or
“curated”.
• All too often today, this is done by expensive data
scientists (not their job).
• Needs to be done by someone with an eye toward the
rest of the data in the enterprise.
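As a rough illustration of the first curation step for mixed formats, the sketch below reads CSV, JSON, and XML feeds into one common list of records using only the Python standard library. The feed contents, the field names `sensor` and `reading`, and the `rec` tag are invented for the example; real feeds would need far more validation.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def from_csv(text):
    """Each CSV row becomes a dict keyed by the header line."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_json(text):
    """Accept either a single JSON object or a list of objects."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def from_xml(text, record_tag):
    """Each <record_tag> element becomes a dict of its child tags and text."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in rec} for rec in root.iter(record_tag)]


# Hypothetical sample feeds in three formats.
CSV_FEED = "sensor,reading\nA1,20.5\n"
JSON_FEED = '[{"sensor": "B2", "reading": 19.0}]'
XML_FEED = "<feed><rec><sensor>C3</sensor><reading>21.7</reading></rec></feed>"

records = from_csv(CSV_FEED) + from_json(JSON_FEED) + from_xml(XML_FEED, "rec")
```

Note that the values are not yet consistent: CSV and XML yield strings while JSON yields a number. Reconciling such differences against the rest of the enterprise's data is the curation work the slide describes.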
Getting the Data Under Control
The Old Data Modeling Process
• Waterfall: driven by a well-defined sequential project plan.
• Driven by application specification.
• Slow, formal approach to model recursion.
• Models all too often left on the shelf after initial implementation.
The New Data Modeling Process
• Agile: data is constantly examined and redefined.
• Data comes in, and then is interpreted.
• Data models must be designed to anticipate change.
• Models must also anticipate and support alternative forms of organization such as document (JSON, XML), wide column, etc.
• Target could be RDBMS, but also Hadoop, NoSQL, NewSQL databases, et al.
• Models should anticipate integration, and cross-system collaboration.
• Governance and security must be considerations from the start.
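One way to picture "data comes in, and then is interpreted" is a small routine that derives a provisional model from documents that have already arrived. The following is a minimal sketch under that assumption; `infer_model` and the sample order documents are hypothetical and not taken from any tool mentioned in this deck.

```python
from collections import defaultdict


def infer_model(documents):
    """Return {field: {"types": sorted type names, "required": bool}}."""
    seen_types = defaultdict(set)
    counts = defaultdict(int)
    for doc in documents:
        for field, value in doc.items():
            seen_types[field].add(type(value).__name__)
            counts[field] += 1
    total = len(documents)
    return {
        # A field is treated as required only if every document carried it.
        field: {"types": sorted(types), "required": counts[field] == total}
        for field, types in seen_types.items()
    }


# Hypothetical incoming documents; the second one introduces a new field.
docs = [
    {"order_id": 1, "total": 9.5},
    {"order_id": 2, "total": 12, "coupon": "SPRING"},
]
model = infer_model(docs)
```

Re-running the inference as new documents arrive keeps the model under constant review, and fields that show more than one type (here `total`) flag where a human definition is still needed.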
Diagram labels (old and new modeling processes): Specify, Model, Implement, Deliver, Feedback, Code, Need, Model, Implement, Revise, Review
Benefits of Having Well-Defined and
Managed Data
Agility
Both analytical and transactional systems adapt to changing business conditions and
new data.
Data sharing can be more informal, leading to greater insights through collaboration.
Lower Risk
Well-defined data is easier to secure.
Knowing where the sensitive data resides is key to proper protection from possible
compliance liability.
More Business Opportunity
When data is well understood and leveraged across systems, it can be better
exploited. This is a key to success on the Third Platform.
Adaptability means being able to take advantage of opportunities in the moment. Data
that is both transactional and analytical can enable smart applications.
Conclusions/Recommendations
Conclusions
As businesses evolve toward the Third
Platform, they must be prepared to embrace
Digital Transformation.
This means being able to blend existing data in
new and unpredictable ways, and to leverage
new data on new data management
technologies.
It also means modeling data in ways that
support the above, while ensuring data
security, lowering risk, and enabling
exploitation of opportunities that this new class
of data will deliver.
Recommendations
Take an audit of your existing data assets, and ask the
question, “How well do I know where my data is, and
what it means?”
Seek to define existing data through models, to ensure
its easy integration with other existing data sources,
and in preparation for new data sources.
Look at tools and utilities that will support both the
definition and modeling of existing data sources, and
data in places like Hadoop, NoSQL, NewSQL
databases, and so on.
Consider this an opportunity to leverage data
modeling to drive the enterprise to new levels of agility
and collaboration that will in turn ensure
competitiveness in the world of Digital Transformation.
EMBARCADERO TECHNOLOGIES
Driving Business Value Through
Agile Data Assets
Ron Huizenga
Senior Product Manager – ER/Studio
Agenda
• What’s happening with data?
• The new lifecycle
• Data landscape complexity
• Discovery & identification through models
– Specific capabilities
• What’s happening in reality?
• Concluding remarks
REFERENCES:
http://blog.qmee.com/wp-content/uploads/2013/07/Qmee-Online-In-60-Seconds2.png
http://techcrunch.com/2010/08/04/schmidt-data/
What’s Happening with Data?
Key Skill Sets
• Data Design & Management
• ETL and Software Development
• Data Analysis / Stats
• Business Analysis & Discovery
Value Delivered
• Validation
• Integration
• Enrichment
• Usability
Value and the New Lifecycle
Discover
Document (Model)
Integrate
Data Landscape Complexity
• Comprised of:
– Proliferation of disparate systems
– Mismatched departmental solutions
– Many database platforms
– Big data platforms
– ERP, SAAS
– Obsolete legacy systems
• Compounded by:
– Poor decommissioning strategy
– Point-to-point interfaces
– Data warehouse, data marts, ETL …
Data Archaeologist?
Discovery and Identification Through Models
• Identify candidate data sources
• Reverse engineer data sources into models
• Identify, name and define
• Classify through metadata
• Map “like” items across models
• Data lineage / chain of custody
• Repository
• Collaboration & publishing
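To make "reverse engineer data sources into models" concrete, here is a minimal sketch that reads table and column definitions back out of a database catalog. It uses an in-memory SQLite database purely so the example is self-contained; it is not how ER/Studio performs reverse engineering, and the `customer` table and the `reverse_engineer` helper are hypothetical.

```python
import sqlite3


def reverse_engineer(conn):
    """Return {table: [(column, declared_type, is_primary_key), ...]}."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        model[table] = [(c[1], c[2], bool(c[5])) for c in columns]
    return model


# Hypothetical source system with terse physical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_nm TEXT)")
model = reverse_engineer(conn)
```

The extracted structure is only the starting point: names such as `cust_nm` still have to be identified, defined, and classified by someone who knows the business meaning.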
ER/Studio: Native Big Data Support
• MongoDB
– Diagramming
– Reverse & Forward Engineering (JSON, BSON)
– MongoDB certification for 2.x and 3.0
• Certified for HDP 2.1
– Forward and reverse engineering
– Hive DDL
• Additional MetaWizard capabilities for additional platforms
ER/Studio: Apply Naming Standards
• Can invoke with other wizards
– Generate Physical Model
– Compare & Merge
– XML Schema Generation
– Model Validation
• Can apply to model or sub-model at any time
• Either direction
• Selective review/apply
• Enabled by loose model coupling
• Name lockdown (freeze names)
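The general idea of a naming standard that can be applied in either direction can be sketched as a glossary lookup. This is only an illustration of the concept, not ER/Studio's implementation; the glossary entries and the functions `to_physical` and `to_logical` are hypothetical.

```python
# Hypothetical abbreviation glossary: logical word -> physical abbreviation.
GLOSSARY = {"customer": "cust", "number": "nbr", "account": "acct"}
# The reverse lookup supports translating physical names back to logical ones.
REVERSE = {abbr: word for word, abbr in GLOSSARY.items()}


def to_physical(logical_name):
    """Translate a logical name such as 'Customer Account Number'."""
    words = logical_name.lower().split()
    return "_".join(GLOSSARY.get(w, w) for w in words)


def to_logical(physical_name):
    """Translate a physical name such as 'cust_acct_nbr' back to words."""
    parts = physical_name.lower().split("_")
    return " ".join(REVERSE.get(p, p).capitalize() for p in parts)


physical = to_physical("Customer Account Number")
logical = to_logical("cust_acct_nbr")
```

Words missing from the glossary pass through unchanged, which leaves room for the selective review step before any names are applied.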
ER/Studio: Universal Mappings
• Ability to link “like” or related objects
– Within same model file
– Across separate model files
• Entity/Table level
• Attribute/Column level
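Linking “like” objects within and across model files can be pictured as a registry of two-way links. The sketch below is an assumption-laden illustration, not the product's data structure: the `MappingRegistry` class and the (model, entity, attribute) tuples are hypothetical.

```python
from collections import defaultdict


class MappingRegistry:
    """Records two-way links between objects in the same or different models."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        self._links[a].add(b)
        self._links[b].add(a)

    def related(self, start):
        """Return every object reachable from start through recorded links."""
        seen, stack = {start}, [start]
        while stack:
            for neighbor in self._links.get(stack.pop(), ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        seen.discard(start)
        return seen


# Hypothetical attribute-level objects identified as (model, entity, attribute).
crm = ("crm", "Customer", "id")
billing = ("billing", "ACCT", "cust_id")
warehouse = ("dw", "DimCustomer", "customer_key")

registry = MappingRegistry()
registry.link(crm, billing)
registry.link(billing, warehouse)
matches = registry.related(crm)
```

Following links transitively is a design choice made for this sketch; it lets a question such as "where else does this customer identifier live?" be answered from any one of the linked objects.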
Enterprise Data Trends
• Increasing volumes, velocity, and variety of enterprise data: 30% - 50% year/year growth
• Decreasing % of enterprise data which is effectively utilized: 5% of all enterprise data fully utilized
• Increased risk from data misunderstanding and non-compliance: $600bn annual cost for data clean-up in U.S.
Business Stakeholders’ Data Usage
Suspect that business stakeholders INTERPRET DATA INCORRECTLY?
• Yes, frequently: 14%
• Yes, occasionally: 67%
• No, never: 9%
• I don’t know: 10%
Suspect that business stakeholders make decisions USING THE WRONG DATA?
• Yes, frequently: 11%
• Yes, occasionally: 64%
• No, never: 13%
• I don’t know: 12%
Data Model Usage & Understanding
What is your organization’s approach to data modeling?
• We don’t use data models: 13%
• Other: 3%
• Our data team does most data models but developers also build them as needed: 16%
• Our database administrators own data modeling: 19%
• Developers develop their own data models: 31%
• We have a data modeling team that is responsible for data models: 18%
How well does your organization’s technology leadership team understand the value of using data models?
• Completely understand: 20%
• Understand somewhat: 60%
• Don’t understand: 17%
• I don’t know: 3%
87%
Call to Action
• Audit, map and define existing data assets using models, with the capabilities discussed
• Share, collaborate, govern
• Leverage data modeling to enable business agility
• Adapt to the “new” lifecycle
• Instill a data culture based on a philosophy of continuous improvement
Thank you!
• Learn more about the ER/Studio product family:
http://www.embarcadero.com/data-modeling
• Trial Downloads: http://www.embarcadero.com/downloads
• To arrange a demo, please contact Embarcadero Sales: [email protected], (888) 233-2224