Intro to MarkLogic Data Hub Architecture
Pete Aven, Senior Principal Solutions Engineer
@peteaven
13 June 2019 © MarkLogic Corporation
Data is (still) in silos
MarkLogic Data Hub
ADVANCED SECURITY
SMART CURATION
UNIFIED PLATFORM
LOAD DATA AS IS
SIMPLE DEVELOPMENT
KEY PRINCIPLES
Agility in Action
Data Services First
Business Answer First
Expect and Embrace Change
Governed by Default
Deploy Anywhere
DATA AGILITY
Data Services First
Minimize up-front work by focusing on business value and working back to data
Reduce execution risk with aggressive scoping, frequent iterations, continuous feedback
Increase returns on cumulative data
Direct analogue and enabler for Agile software development
Focus on Ease and Impact (ROI)
[Chart axes: Ease of Execution, Business Impact]
Find Orders
High impact
Predict fraud
LEADING WITH BUSINESS VALUE
Flag Fraud
The Data Hub In Action
ERP
CRM
Taxonomy
Inventory
places
includes
Customer
Order
Product
wheresMyOrder
Flag Fraud
Segregate Cohorts
DATA SERVICES FIRST
MarkLogic Architecture
STORAGE LAYER
Scalability and Elasticity
ACID Transactions
INTERFACE LAYER
Data Services
JSON, XML, RDF, Geo, Text, Binaries
REST API
Graph / SPARQL
QUERY LAYER
JavaScript, XQuery, SPARQL, SQL
INDEXES
Universal Index
Geospatial Index
Triple Index
Automated Failover
Reverse Index
DATA LOGIC
Expect and Embrace Change
Upstream
Quality and Meaning
New Sources
Messy or unexpected data
Ambiguous or conflicting definitions
Downstream
Business Requirements
New opportunities enabled by creative reuse
Get value sooner
Experiment with less cost
Everywhere
Compliance and Governance
New regulations and enforcement
Increased threats
Sharing not hoarding
ERP
/wheresMyOrder
Customer
Collection:/acme/customers
Hierarchical, sparse, high cardinality
Precise structure to free text
Change the data, change the schema
Standard JSON or XML, text, binary
Documents Represent Data More Naturally
DATA MODEL
Document Data Model
Load as is
Universal indexing
- Values
- Full text
- Structure
- Scalar ranges
- Geospatial
Schema on read
Organize by collections, directories
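As a rough illustration of "load as is", collections, and schema on read, the sketch below uses a plain Python dictionary as a stand-in for a document store. It is not the MarkLogic API: the `store`, `load_as_is`, `in_collection`, and `read_customer_name` names, the document URIs, and the sample customer fields are all hypothetical. Only the `/acme/customers` collection name comes from the slides.

```python
import json

# Hypothetical in-memory stand-in for a document store (not the MarkLogic API).
store = {}

def load_as_is(uri, raw_json, collections):
    """Store the parsed document unchanged, tagged with its collections."""
    store[uri] = {"content": json.loads(raw_json), "collections": set(collections)}

def in_collection(name):
    """Return the content of every document in the named collection, ordered by URI."""
    return [store[uri]["content"] for uri in sorted(store)
            if name in store[uri]["collections"]]

def read_customer_name(doc):
    """Schema on read: reconcile differing source shapes only when the data is used."""
    if "fullName" in doc:
        return doc["fullName"]
    return f'{doc["first"]} {doc["last"]}'

# Two sources describe customers with different shapes; both load without change.
load_as_is("/erp/customer-1.json", '{"fullName": "Ada Lovelace"}', ["/acme/customers"])
load_as_is("/crm/customer-2.json", '{"first": "Alan", "last": "Turing"}', ["/acme/customers"])

names = [read_customer_name(doc) for doc in in_collection("/acme/customers")]
print(names)
```

The point of the sketch is that neither source had to be reshaped before loading; the differences are handled at read time, where the business question is known.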
ERP
/wheresMyOrder
CRM
eCommerce
Customer
Customer
prov:derivedFrom
prov:generatedBy
rdf:type
rdf:type
prov:wasRevisionOf
/wheresMyOrder
e7e9879a…
?
Customer
Order
Product
Purchased
Places
Includes
Relationships
GRAPHS
Entities are documents
Relationships are triples
- Entities related to Entities
- Entities related to Facts
- Facts related to Facts
Infer new relationships
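One way to picture "relationships are triples" and "infer new relationships" is the small Python sketch below. It is an assumption-laden illustration, not how MarkLogic evaluates inference: the entity identifiers are made up, and the rule (a customer who places an order that includes a product has purchased that product) is a hypothetical rule built from the `acme:places`, `acme:includes`, and `acme:purchased` labels that appear in the slide diagrams.

```python
# Each triple is (subject, predicate, object). Identifiers are hypothetical.
triples = {
    ("customer-1", "acme:places", "order-7"),
    ("order-7", "acme:includes", "product-3"),
}

def infer_purchased(facts):
    """Derive (customer, acme:purchased, product) from places + includes (assumed rule)."""
    inferred = set()
    for subject, predicate, obj in facts:
        if predicate != "acme:places":
            continue
        # Follow the order to every product it includes.
        for subject2, predicate2, obj2 in facts:
            if predicate2 == "acme:includes" and subject2 == obj:
                inferred.add((subject, "acme:purchased", obj2))
    return inferred

new_facts = infer_purchased(triples)
print(new_facts)
```

The inferred triple is never stored by any source system; it falls out of relationships that were already loaded.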
Derived from
Type
Type
PII
SSN
Semantic Relationships
GRAPHS
Entities are documents
Relationships are triples
- Entities related to Entities
- Entities related to Facts
- Facts related to Facts
73fa4dc0…
Is Concept
Same As
SKOS
Customer
30d623ff…
Order
Product
acme:includes
acme:places
73fa4dc0…
isConcept
rdf:type
rdf:type
acme:purchased
e7e9879a…
acme:powerOfAttorney
0.
1.
rdf:type
prov:generatedBy
Governed by Default
PUTTING THE “MS” BACK IN DBMS
Manage policy along with the data and metadata that it governs
Query that policy just like data to make enforcement model-driven
Automatically enforce policy in the database
Track lineage as data and policy change
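The idea of querying policy like data can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MarkLogic's security model: the `policy` structure, the `redact` function, the role names, and the sample customer are hypothetical. The PII classification of an SSN echoes labels shown in the slide diagrams.

```python
# Policy stored as data: which properties are classified, and which roles may read each class.
policy = {
    "classifications": {"ssn": "PII"},
    "readers": {"PII": {"compliance-officer"}},
}

def redact(document, roles, policy):
    """Return a copy of the document without properties the caller's roles may not read."""
    visible = {}
    for key, value in document.items():
        label = policy["classifications"].get(key)
        # Unclassified properties are always visible; classified ones need a matching role.
        if label is None or roles & policy["readers"].get(label, set()):
            visible[key] = value
    return visible

customer = {"name": "Ada Lovelace", "ssn": "000-00-0000"}
analyst_view = redact(customer, {"analyst"}, policy)
officer_view = redact(customer, {"compliance-officer"}, policy)
print(analyst_view)
print(officer_view)
```

Because enforcement reads the policy at access time, changing the policy data changes behavior without touching the enforcement code.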
Developer
Ummm, can you repeat that, please?
Domain Expert
Domain Expert
MODEL-DRIVEN
Data, Metadata, and Policy
Model important business concepts as needed (and not before)
Manage policy along with the data and the metadata it governs
Drive business processes and configuration from queryable policy definitions
Customer
e7e9879a…
Secure by Design
GOVERNED BY DEFAULT
Confidentiality: Role-based access control and encryption at rest and in motion
Integrity: Transactional consistency and auditable trustworthiness
Availability: Elastic scale out and HA/DR
Deploy Anywhere
AGILE INFRASTRUCTURE
Align infrastructure costs with SLAs using elastic scaling
Avoid lock-in with flexible cloud and on-premises deployment
Reduce risk with automation and componentization
MarkLogic in Any Cloud
CLOUD NEUTRAL
• Proven in the cloud
• Private, hybrid, or public cloud
• AWS, Azure, and Google Cloud (and others)
• Deployment automation
CUSTOMER LDAP
CUSTOMER VPC
VPC PEERING
INGESTION & CURATION ACCESS
SERVICE VPC
(CUSTOMER ISOLATED)
LOAD BALANCER
LOAD BALANCER
Data
D-NODES
Operational
Analytical
Curation
LOAD BALANCER (8010-13-8000)
MarkLogic Tools
DMSDK MLCP REST API
MarkLogic Data Hub Service
Much More Than a Database as a Service
[Comparison chart columns: On-Premises, IaaS, DBaaS, Data Hub Service. Each column lists the same stack:]
DATA CENTERS
NETWORKING
STORAGE
SERVERS
VIRTUALIZATION
OS
DOCUMENT DB
GRAPH DB
RELATIONAL DB
SEARCH
ETL
MDM
SECURITY
APPS
The most comprehensive out-of-the-box cloud service stack
HARMONIZATION
Tool Chain
KEY PRINCIPLES
Agility in Action
Data Services First
Business Answer First
Expect and Embrace Change
Governed by Default
Deploy Anywhere
Thank you