WHITEPAPER
Anzo Smart Data Lake™ Enterprise Graph-Based Data Discovery, Analytics and Governance
2 Anzo Smart Data Lake © Copyright 2015, Cambridge Semantics
Introduction
Cambridge Semantics, The Smart Data Company™, is an industry
leader in semantic standards and graph-based technology solutions.
We have combined scalable graph-based database technology with
our proven Anzo Smart Data Platform™.
The result is the Anzo Smart Data Lake. It goes beyond the rigid
relational data warehouse and the unwieldy Hadoop-only Data
Lake, changing the way IT and business alike manage and analyze
data at enterprise scale, with unprecedented flexibility, insight and
speed.
Read this whitepaper to learn how Cambridge Semantics has
changed the game of data discovery, analytics and governance
for the enterprise, to provide:
An unlimited enterprise graph so all enterprise users can “surf” and query all their data intuitively and without specialized data analytics knowledge
A semantic data model that easily captures and delivers the “meaning” of data with all the inherent relationships and attributes
Ad hoc data discovery and analytic tools so business users in any department can get answers to questions, as well as generate questions they didn’t think to ask before
Democratized Big Data, so essentially everyone can now discover and analyze all the data, subject to governance, security and flexible policies
A rapidly deployable platform to integrate with existing Hadoop or other Data Lake environments or start from the ground up
Linked and contextualized data, so users are now able to help themselves and combine data as needed to support business functions
Current Data Warehouse and Data Lake Approaches to Analytics
Relational data warehouses continue to be the predominant
approach to organizing information for analytics and decision
support. Based on proven technology and governance methodology,
they offer IT a way to deliver solutions with predictable resourcing,
time and cost. Despite their ubiquity and effectiveness, however, as
data volumes and diversity grow, the time and cost of the warehouse
are becoming prohibitive for the majority of analysis and
decision support requests brought to IT from the business. A great
deal of valuable insight is left on the table, and time-to-market
opportunities suffer.
The ever-increasing variety of alternate approaches is evidence of the
urgency of the industry to arrive at a better solution – as well as the
stark reality that we aren’t even close. Alternatives like Hadoop,
NoSQL and other Big Data approaches are, however, beginning to
converge under the banner of the “Data Lake” - generally defined as
a limitless repository for all data resources, with little up-front
preparation and effort by simply storing the data in its original
format.
To understand their limits, it’s worth pointing out that Hadoop, with
origins in Internet search, was designed to solve narrow, pre-defined
problems on homogeneous data at Web-scale. (Take, for example, the
conceptually simple, yet computationally complex problem of
ranking and indexing the world’s Web pages.) It should therefore
not come as a surprise that the technology is not immediately
suitable for the conceptually difficult and broad analytic challenges
enterprises face with their heterogeneous data.
And yet, because of its low cost to scale, Hadoop continues to be the
platform of choice for building a Data Lake - usually taking one of
two forms:
Bespoke search and analytic applications built on Hadoop
Raw data extracts co-located in a Hadoop cluster
The first approach has the potential to provide quality, scalable
solutions to a specific problem, but it demands the same custom
modeling, ETL and development effort as its relational brethren, so
for most requests it remains aspirational. The second is the more
common, and the riskier. Even in its aspirational state,
this Data Lake has lured CIOs and CDOs into the misunderstanding
that having data in one place is a facilitator to broader and more useful
analytics leading to better decision-making and better outcomes. While
the data may be in one place physically (a Hadoop cluster), in essence
all that is created is a collection of data silos, unlinked and not useful
in a broader context, reducing the Data Lake to nothing more than a
collection of disparate data sources (i.e., a “Data Swamp”).
While IT departments no longer have to spend time developing
models and programming ETL with this Data Lake approach,
the burden of organizing and merging data has been shifted
onto the shoulders of those least equipped to deal with the
problem: data scientists, analysts and subject matter experts.
Even with current approaches, these valuable resources are
spending an estimated 50% to 80% of their time preparing and
organizing their data, leaving as little as 20% for analyzing it.
Any solution that increases this burden is not viable.
Where does this leave us? In the “Old Country” is the Data
Warehouse: tried and true, yet with a cost and inflexibility that put the
approach out of reach of many business problems, particularly with
all the new data sources available. In the “Wild West” is the Data
Lake, a disorderly collection of out-of-context data sources loaded into
a mismatched technology, bereft of governance or reusability.
The lack of required data preparation (storing data in its original
format) is the cornerstone of industry analysts’ definition and
benefit of the Data Lake. But this very characteristic, the lack of
preparation, is what makes Data Lakes difficult to use for
deriving insights and, ultimately, value.
Could there be a way to make the Data Lake smarter,
utilizing its cost benefits but eliminating its shortcomings?
Cambridge Semantics has discovered that by taking a less literal
interpretation of “original format”, the Data Lake can indeed be
made smart enough to deliver exceptional value, on demand.
The Semantic Graph Model Approach to Data Discovery, Analytics and Governance
A semantic model, more formally an OWL ontology, is a
conceptual description of data in an RDF graph offering users
at all levels a road-map to navigate the data, pose questions
and execute analytics. Semantic models are flexible and are
designed to be conceived and maintained at all organizational
levels:
Industry standards groups
Corporate governance
Departmental best practices
Individual models
Organizations can start small with their semantic models and evolve
them as business needs change or new data sources are required.
RDF, a graph representation of data comprising a network of nodes,
attributes and relationships, is inherently flexible, allowing new data
sources to be integrated without having to redesign the
representation.
The RDF data standard was designed to capture all relationships
and attributes of diverse data sources, to faithfully and completely
represent data.
Together, the RDF graph and the OWL model offer a natural way to
link information from disparate sources without having to know
what types of questions will be asked.
It may be helpful to use the analogy that RDF is to relational
records as OWL is to relational schema. The semantic approach,
however, offers several key advantages over the relational approach:
Flexibility to evolve the model to accommodate changes or new sources
A conceptual representation for easy consumption by business users
A unified model spanning all layers of the analytics stack
A framework for sharing standards across organizations
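The record side of this analogy can be sketched in a few lines of Python. This is an illustration only, not Anzo code: plain tuples stand in for RDF triples, and every identifier (`company:42`, `sponsoredBy`, the column names) is a made-up example. It shows how a second source with a new attribute is absorbed without redesigning the representation.

```python
# Illustrative only: plain tuples stand in for RDF triples.
# All identifiers and column names below are hypothetical.

def row_to_triples(subject_id, row):
    """Turn one relational record into (subject, predicate, object) triples."""
    return {(subject_id, column, value)
            for column, value in row.items() if value is not None}

# Source 1: a row from a hypothetical "companies" table.
graph = row_to_triples("company:42", {"name": "Acme Bio", "country": "US"})

# Source 2 arrives later with an attribute the first source never had.
# No schema change is needed: the new facts are simply added to the set.
graph |= row_to_triples("company:42", {"ticker": "ACME"})

# A relationship to another node is just one more triple.
graph.add(("trial:7", "sponsoredBy", "company:42"))

def describe(graph, subject):
    """Collect every attribute and outgoing link known for a subject."""
    return {p: o for s, p, o in graph if s == subject}

print(describe(graph, "company:42"))
```

In a real deployment the predicates would be defined by the OWL model rather than taken from column names; the sketch skips that step to keep the record-to-triple idea visible.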
While the Big Data evolution unfolded over the last decade or so,
engineers at Cambridge Semantics have honed the art and science of
applying semantic graph models to real business analytics and data
discovery problems.
Semantic graph models target key challenges of data integration and
analytics:
Flexibly adapting to new data sources and queries requires a data representation and model that can gracefully evolve to accommodate new data, as well as link data from disparate sources
Accurately capturing the full meaning of data with a format that does not lose any of the inherent relationships or attributes of the data
Quickly asking new questions and performing ad hoc analytics without having to engage IT each step of the way
Consider the challenge financial institutions face in tracking down
and investigating potential insider trading activity within the firm.
Looking at the list of an employee’s trades, for example, does not
paint a broad enough picture. Additional data sources such as watch
lists of companies, employee email and IM, research reports, news,
and even location must be combined, analyzed and explored. By
establishing a unified semantic model across these sources, and
bringing the data together in a graph, we can follow the
relationships in any direction without knowing up-front the types of
questions required. We can answer questions such as:
Which employees have made trades in the same location after exchanging emails?
Which securities have been traded within 2 days of a related research report?
Are any deal team members trading off their own watch list?
Now suppose we want to ask the question
Have any traders engaged with industry experts?
We simply extend the semantic model and load relevant data into
our graph.
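To make this concrete, here is a minimal sketch of how such questions reduce to matching patterns over a graph of triples. It is a toy matcher written for this illustration, not the Anzo query engine; the people, teams, securities and predicate names are all invented. Terms beginning with `?` act as variables, in the spirit of a SPARQL basic graph pattern.

```python
# Toy triple-pattern matcher. All data and predicate names are hypothetical.

def match(graph, patterns, binding=None):
    """Yield variable bindings that satisfy every triple pattern in order."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(graph, rest, candidate)

graph = {
    ("alice", "memberOf", "dealTeam1"),
    ("bob", "memberOf", "dealTeam1"),
    ("dealTeam1", "watches", "ACME"),
    ("alice", "traded", "ACME"),
    ("bob", "traded", "GLOBEX"),
}

# "Are any deal team members trading off their own watch list?"
watch_list_trades = [
    ("?person", "memberOf", "?team"),
    ("?team", "watches", "?security"),
    ("?person", "traded", "?security"),
]
hits = sorted({(b["?person"], b["?security"])
               for b in match(graph, watch_list_trades)})
print(hits)

# New question: extend the model with expert data and add it to the graph.
graph |= {("carol", "isA", "IndustryExpert"), ("bob", "contacted", "carol")}
expert_contacts = [
    ("?trader", "traded", "?security"),
    ("?trader", "contacted", "?expert"),
    ("?expert", "isA", "IndustryExpert"),
]
traders = {b["?trader"] for b in match(graph, expert_contacts)}
print(traders)
```

The existing triples and the first query are untouched by the extension; the new question only needed two additional facts and a new pattern.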
On the surface these models bear strong resemblance to entity
relationship diagrams or other modeling techniques used in
conjunction with relational data warehouses. So what’s different?
In the relational world, such a model would require translation to a
relational logical model – schema and tables carefully constructed by
database experts with indexes to optimize sets of known or
anticipated questions. Posing such questions requires translation
into SQL queries with joins and optimization – out of the reach of
business users and even most data scientists.
A semantic graph model, on the other hand, requires no such
translation. The data is stored exactly in the way it is modeled – the
way business users think - allowing questions to be asked and new
hypotheses explored on the fly.
Building the Graph
How does this all really work in practice? The semantic graph model
is, after all, only a data representation. Building and maintaining
graphs can be challenging, particularly when the data sources are
multi-structured and diverse, so making this all actionable requires
a sophisticated semantics-based platform. Consider the following
example from Pharma R&D Intelligence.
A decision maker is trying to track activities of small biotech
companies in his area of expertise. The data sources are a relational
database, a news feed and a CRM system. Creating such graphs
from the source data requires sophisticated technology to build the
model, map to multiple sources and ingest the data.
The Anzo Smart Data Platform, an end-to-end suite for linking and
contextualizing multi-structured data into semantic graphs, is built
following a service-oriented architecture (SOA).
The platform includes tools for:
Modeling and Governance
Managing and versioning models, ontologies
Access control and security
Ingestion
Loading data from disparate sources (ETL)
Linking and transforming content across sources (ELT)
Text analytics
APIs and connection points to integrate with external tools and
systems
Graph-aware Analytics – a new paradigm in data discovery
The incredible benefit of the graph model continues past data
integration and right into the data discovery and the analytics front-
end of the stack – yielding perhaps the greatest differentiator of the
approach.
The BI and analytics landscape is replete with tools, each offering a
different set of capabilities for user empowerment and slick
visualizations. However, all of these tools require significant data
preparation and data movement to work with existing Data Lake
approaches.
Rectangular subsets or data frames must be defined and extracted
before these BI tools can be effective. Building these extracts from a
swamp of disconnected raw data is technical, time-consuming, and
error-prone. If the requirements change or new questions are asked,
further extracts must be prepared involving additional work for the
data scientist and IT, often to the point of impracticality.
To understand how the semantic graph model shatters this glass ceiling,
let’s take a look at an actual model used in R&D Intelligence. A clinical
trial has related concepts including disease, phase of development,
organization and country. With this model, decision makers can ask
questions and create visualizations around the current clinical trial
landscape such as:
What Phase II clinical trials are being run in Japan for Ovarian Cancer?
However, what if we want to explore further and discover related
information?
Who are the key investigators in a particular region of Japan?
What trials are focusing on injections vs. oral medication?
To answer these questions, additional information is required. With
traditional BI tools, work must be done to discover, join and extract the
appropriate data set. With the graph model, all related data is
immediately available for data discovery and analytics. The data
scientist can explore the entire model to include any connected
information in the analysis.
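The difference from a rectangular extract can be sketched as follows. This is a simplified illustration with invented trial data and property names, not the Anzo on the Web implementation: the graph is indexed by outgoing relationships, so a follow-up question is answered by walking one more edge rather than by preparing a new extract.

```python
from collections import defaultdict

# Hypothetical clinical trial facts; names and values are made up.
triples = [
    ("trial:1", "phase", "Phase II"),
    ("trial:1", "country", "Japan"),
    ("trial:1", "disease", "Ovarian Cancer"),
    ("trial:1", "investigator", "person:sato"),
    ("person:sato", "region", "Kansai"),
    ("trial:2", "phase", "Phase II"),
    ("trial:2", "country", "Germany"),
    ("trial:2", "disease", "Ovarian Cancer"),
]

# Index: subject -> predicate -> set of objects.
out_edges = defaultdict(lambda: defaultdict(set))
for s, p, o in triples:
    out_edges[s][p].add(o)

def where(**criteria):
    """Return subjects whose properties contain every requested value."""
    return sorted(
        s for s, props in out_edges.items()
        if all(value in props.get(key, ()) for key, value in criteria.items())
    )

def follow(nodes, *path):
    """Walk a chain of relationships from a set of starting nodes."""
    current = set(nodes)
    for predicate in path:
        current = {o for n in current
                   for o in out_edges.get(n, {}).get(predicate, ())}
    return current

# "What Phase II clinical trials are being run in Japan for Ovarian Cancer?"
trials = where(phase="Phase II", country="Japan", disease="Ovarian Cancer")
print(trials)

# Follow-up: who are the investigators, and in which region do they work?
print(follow(trials, "investigator"))
print(follow(trials, "investigator", "region"))
```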
To deliver this extraordinary potential to end users, Cambridge
Semantics built graph-awareness into Anzo on the Web – the data
discovery and analytics front-end of the Anzo Smart Data Platform.
Instead of relying on rectangular extracts of data, analysts can create
tables, filters, charts and visualizations by intuitively exploring paths
through the full model, applying filters to refine what specific data is
relevant. This approach combines data discovery and analytics with
speed and agility – arriving at answers to new and ad hoc questions
quickly and without requesting support from IT.
Anzo Smart Data Lake Technical Overview
Driven by the success of the Anzo Smart Data Platform, Cambridge
Semantics’ customers are increasing the size and scope of their
sources, for example by bringing together much larger data sets than
can be handled by a single-server architecture. Rising to the challenge,
Cambridge Semantics has married Big Data scale with flexible
graph-based middleware. The result is the Anzo Smart Data Lake (Anzo
SDL) - a flexible and scalable knowledgebase for data discovery,
analytics and governance.
Born from the market’s growing thirst to deploy our proven graph-
based approach at enterprise Data Lake scale - Anzo SDL brings an
authenticity and fresh approach to the theater of Big Data
representation. Anzo SDL stores data with its full original meaning
and context. This requires a bit more preparation on ingest, but
orders of magnitude less effort to derive downstream value.
Anzo SDL introduces three elements of scale to the Anzo Smart Data
Platform (SDP):
Unbounded storage and cataloging of RDF graphs
Parallelizable and rapid ingestion and linking of data sources
An interactive Graph Query Engine
Further, since Anzo SDP is built on a services-oriented architecture,
Anzo SDL enables the unbundling of components for distributed
deployment.
Cambridge Semantics’ customers are deploying Anzo Smart Data
Lake to work with and leverage existing Hadoop Data Lake
environments as well as build new Data Lakes from scratch.
Graph Storage and Cataloging
Anzo SDL uses highly scalable and available file systems such as
HDFS for storing the graph data at rest. Anzo Smart Data Lake
Server has local transactional graph storage containing the catalog of
models, data sets, mappings, analytics and other configuration used
throughout Anzo SDL.
The server provides:
A power-user workbench for configuring models, ingestion and linking
Cataloging and metadata management of Anzo SDL graph data as well as data sources outside Anzo SDL, including Hadoop data sources
A data scientist/analyst entry-point for data discovery and analytics
Provisioning and configuration of all other Anzo servers in Anzo SDL for elastic cloud deployment
Security and access control
High availability and failover
Integration – Ingestion and Linking
Anzo Smart Data Integration is the toolset within the Anzo Smart
Data Platform for mapping and transforming data from all sources
into RDF graphs. Driven by the semantic model, these scalable
servers convert data from all formats, structured and unstructured,
into the RDF graph format. An appropriate number of servers may
be deployed to accommodate the number of sources and total
volume of incoming data, including automatic incremental updates.
Depending on the nature of each of the data sources, one or more of
the following techniques will be applied:
Mapping and transformation of structured or tabular data
Text analytics, converting unstructured data to structured graphs
Custom plugins for data sources with APIs or proprietary formats
High performance mapping and transforming using Apache Spark to bring your existing Hadoop data into Anzo SDL
Maintaining the enterprise semantic graph at scale also presents a
modeling and governance challenge. Anzo SDL must accommodate
all sources, retaining models that are both true to the data, as well as
linked and contextualized to support query and analytics across
sources. Cambridge Semantics has developed methodologies and
tooling for organizing the enterprise graph. One such methodology
is the canonical linking model – graph models that link across
sources and take on configurable characteristics of the sources.
Canonical models also maintain provenance of each source’s
contribution to the canonical representation.
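One way to picture a canonical model with provenance is sketched below. This is an assumption-laden illustration, not a description of how Anzo implements canonical linking: the source names (`crm`, `warehouse`), the per-attribute priority table and the record layout are all hypothetical. Each canonical attribute takes its value from a configurable preferred source and remembers which source contributed it.

```python
# Hypothetical configuration: which source is preferred for each attribute.
SOURCE_PRIORITY = {
    "name": ["crm", "warehouse"],
    "country": ["warehouse", "crm"],
}

def build_canonical(records, priority):
    """Merge records describing one real-world entity into a canonical view.

    records: list of (source_name, attributes) pairs.
    Returns attribute -> {"value": ..., "source": ...} so provenance is kept.
    """
    canonical = {}
    for attribute, preferred_sources in priority.items():
        for source in preferred_sources:
            value = next(
                (attrs[attribute] for src, attrs in records
                 if src == source and attribute in attrs),
                None,
            )
            if value is not None:
                canonical[attribute] = {"value": value, "source": source}
                break  # first preferred source that has a value wins
    return canonical

records = [
    ("warehouse", {"name": "ACME BIO INC", "country": "US"}),
    ("crm", {"name": "Acme Bio"}),
]

canonical = build_canonical(records, SOURCE_PRIORITY)
print(canonical)
```

A second canonical model for a different business application would simply use a different priority table over the same source records.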
The methodology allows:
A scalable Data Lake with thousands of interconnected data sets
Multiple canonical models (“versions of the truth”) for different business applications, democratizing the modeling
Well-described, widely reusable data sets
High performance linking and transformation at scale based on Apache Spark technology
With these approaches, new data can be quickly loaded into the Data
Lake, and links can be created across sources. While IT governance is
a key element of maintaining the enterprise graph in Anzo SDL, the
model-driven tooling enables new classes of users including
business analysts and data scientists to become data stewards –
participating in the process of filling the Data Lake.
Data Discovery and Analytics
The Anzo Discovery and Analytics Servers allow users to perform
data discovery and analytics across the large enterprise graph within
Anzo Smart Data Lake. The earlier mentioned Smart Data Lake
Server allows analysts to discover data sets across the enterprise
graph and combine them for interactive analytics in the Anzo
Discovery and Analytics servers. Analytics servers and cluster nodes
may be spun-up and down based on user demand.
A key module of the Anzo Discovery and Analytics servers is Anzo
on the Web, where users can configure search and visualization dashboards
with valuable views, analytics and insights. These configurations are
maintained in the analytics servers while active, but stored centrally
in the catalog for sharing and collaboration.
The Anzo Graph Query Engine is the key element of scale in the
Anzo Smart Data Lake. Based on elastic clustered, in-memory
computing, this component offers interactive ad hoc query and
analytics on datasets with billions of triples. With this powerful layer
over the RDF storage, end users can effect powerful analytic
workflows in a self-service manner.
Through a browser-like web interface, the Smart Data Lake catalog
not only shows the typical ways different data sets can be linked and
joined or are conceptually connected, it can also recommend other
datasets or dashboards that you haven’t considered.
When data or a dashboard is selected, the in-memory graph
processing engine is loaded, currently reading in up to six million
“triples” or facts per second from the Anzo Smart Data Lake into
vast in-memory graphs that contain billions of facts available to be
simultaneously queried.
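A quick back-of-the-envelope calculation puts that rate in perspective. The sketch below assumes the quoted peak of six million triples per second is sustained for the whole load, which is an optimistic simplification; the data set sizes are illustrative.

```python
def load_seconds(triple_count, triples_per_second=6_000_000):
    """Best-case load time, assuming the peak rate is sustained throughout."""
    return triple_count / triples_per_second

for billions in (1, 5, 10):
    seconds = load_seconds(billions * 1_000_000_000)
    print(f"{billions} billion triples: about {seconds / 60:.1f} minutes")
```

Under that assumption, one billion triples load in roughly three minutes and ten billion in under half an hour.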
Once loaded, the data in the in-memory graph engine can be
interactively analyzed and traversed in any direction because of the
support for blazingly fast pipelines including numerous joins. That
would be near impossible in a relational database without a great
deal of prior schema structuring and query preparation.
This clean process of discovering and combining data analytics is
near instantaneous when compared with other Data Lake
approaches that require tedious mixing and matching of unprepared
and unlinked data sets for use in BI tools.
Anzo Smart Data Lake provides a unique, graph-aware data
discovery and analytics experience, enabling users to quickly drill-
down and analyze large, combined data sets. Results of this analysis
can be visualized and displayed within Anzo on the Web or
exported on-the-fly into external BI and reporting tools using open
protocols including OData and SPARQL.
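As a sketch of what access over an open protocol can look like, the snippet below builds a query request following the SPARQL 1.1 Protocol, in which the query text travels in the `query` parameter of an HTTP GET. The endpoint URL, the `ex:` namespace and the property names are placeholders invented for this example; they are not documented Anzo addresses or model terms, and the request is constructed but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "https://anzo.example.com/sparql"

# Hypothetical model namespace and property names.
QUERY = """
PREFIX ex: <http://example.com/model#>
SELECT ?trial WHERE {
  ?trial ex:phase "Phase II" ;
         ex:country "Japan" .
}
"""

def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL 1.1 Protocol query via HTTP GET."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})

request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would execute it against a live endpoint; any BI or reporting tool that speaks the same protocol could issue the equivalent call.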
Anzo Smart Data Lake - Time to Value
The driving force behind enterprise data analytics is the desire to
obtain valuable insights more quickly from large, diverse data sets.
IT groups are now facing a trade-off. The data warehouse has a
lengthy initial implementation, and its lack of flexibility means new
questions cannot be quickly asked nor new insights quickly
discovered. The conventional Data Lake can be deployed quickly,
but the savings in data preparation and modeling are dearly paid for
later when analysts and data scientists approach the system to ask
questions and analyze data - finding they have significant work to
do. Well-conceived and constructed Hadoop-based point solutions
offer a middle ground, but on the same value curve.
The Anzo Smart Data Lake, by introducing a simple, graph-based
data representation, transcends this trade-off curve. Because RDF is a
“lossless” data representation, full data sets need only be loaded
once, regardless of anticipated (or unanticipated) use. For this one-
time cost to load data into the RDF graph representation, data
scientists enjoy self-service, on-demand, immediate reuse and
combination of data for any set of questions or analysis.
The big question is then, how high is the one-time cost of data
modeling and data ingestion? RDF itself is simple, but building and
maintaining an enterprise-scale RDF graph does take effort.
Fortunately, Cambridge Semantics has hundreds of man-years of
research, engineering and field experience creating and linking RDF
from diverse sources. Not only Cambridge Semantics’ teams, but
also our customers and partners are able to use our tools and
methodologies to quickly load data into Anzo Smart Data Lakes and
reap near immediate value.
Governance
The Anzo Smart Data Lake, a disruptive capability that allows
groups to combine and query data from across the enterprise using
ad hoc models, requires organizations to reconsider governance
from a new perspective. A careful program of flexibility and reuse
balanced with methodology and controls will ensure that access
control, security, full data lineage or provenance and data context
are all preserved.
The tooling and methodologies within the Anzo Smart Data
Integration toolkit were designed for this type of governance -
ensuring that proper modeling and linking practices are preserved
without limiting the expressivity of the models. Mappings to source
systems and linkages between data sets are created with provenance
for trust and traceability.
Anzo Smart Data Lake offers a platform on which organization-specific
policies can be layered with appropriate roles for
stewardship and review. Analysts and data scientists rapidly
uncover insights and decision makers have confidence in those
insights – the crucial last step in the realization of value.
Industry Perspective
Companies in the Pharmaceutical, Life Sciences, Financial Services
and Retail industries, as well as Government Agencies, are seeking ways to
make the full extent of their data more insightful, valuable and
actionable.
The following Pharma and Financial Services examples are related to
two different markets noted for complex data requirements.
Pharma
The data problems in Pharma range from the traditional - sales
forecasting and supply chain management, to the deeply scientific –
genome sequencing and assay result analysis. While practitioners on
the ends of this spectrum will find value with the graph-based
approach of Anzo Smart Data Lake as the technology proliferates
within their respective organizations, they are not the early adopters.
As the focus of big pharma has shifted from laboratory and basic
research to strategic partnerships, clinical development and medical
relationships, it is the knowledge management groups who mix
science with business within Pharma R&D who have been first to
adopt the approach. Tasked with combining and analyzing scientifically
rich data and presenting the results for making critical business
decisions, these non-technical, bench-turned-data scientists have
been winning with Anzo Smart Data Platform from its earliest days.
As the size and complexity of data sources have grown, these
customers represent the key drivers behind the scale of Anzo Smart
Data Lake.
Pharma R&D Intelligence
Competitive intelligence professionals combine internal and external
data of all formats to support strategic decisions around licensing IP,
partnering and running clinical trials. The data sources are large and
diverse and rely on accurate linkages using deep taxonomies.
Canonical linking data sets provide the backbone for scaling the
complex models inherent in these solutions. Approaches that do not
support text analytics to combine structured and unstructured data
are ineffective in this space.
Clinical Data Integration
Clinicians and the data scientists who support them require flexible
access to data sets across clinical trials. These users have found
significant value in graph-aware analytics, allowing them to navigate
the wide and complex clinical data models to answer ad hoc
questions without manual data preparation or IT intervention.
Groups are further integrating real-world patient data to assess the
value and success of clinical trials.
Financial Services
Compliance
In the financial services sector, multiple billions of dollars are at
stake for firms that are unable to manage risk and compliance
effectively and catch wrongdoing early.
Specifically, identifying the potential for misuse of material non-
public information can be extremely difficult. Emails, messages,
trades and the people making them need to be looked at in a holistic
manner. Links and relationships need to be examined in detail, no
matter what the source is. For compliance officers and analysts,
identifying and exploring these relationships is a crucial
component of understanding what, how, why and when information
is shared and whether it is compliant or not.
To magnify the problem, the regulations for compliance are a
moving target making flexibility and ad hoc analytics an essential
feature of any solution.
Using an Anzo Smart Data Lake, Cambridge Semantics and its
partners have developed an investigative approach based on
combining disparate data sources in an interactive model that allows
compliance officers to investigate potential violations. Account
activity, web logs, email, phone archives, IM communications and
other sources can be linked to uncover potential violations of
regulatory requirements as well as of internal policies and
procedures. Should regulations change, compliance workers can
quickly change the point of attack within the data without
rebuilding.
Visit our website to download the IDC buyer case study
“PricewaterhouseCoopers Helps Clients Manage Financial Risk and
Compliance with Cambridge Semantics’ Anzo Smart Data Platform”
Conclusion
With Anzo Smart Data Lake, the game has changed. IT groups no
longer have to compromise between a data warehouse and a data
swamp, and the business is able to arrive at insights faster than
anyone believed possible. High performance graph query
technology has unlocked the Anzo Smart Data Platform’s innate
ability to deliver on this promise.
Using the graph-aware tools in Anzo SDP for analytics, ETL, ELT
and graph modeling, our customers work faster and at lower cost,
with more flexibility and greater accuracy. The Anzo Smart
Data Lake delivers unprecedented data value, turning data assets
into extreme insight and competitive advantage.
To Learn More
Contact Cambridge Semantics:
http://www.cambridgesemantics.com/
About Cambridge Semantics
Cambridge Semantics Inc., The Smart Data Company™, is an enterprise analytics and data management software company. Our software, the Anzo Smart Data Platform™, allows IT departments and their business users to semantically link, analyze and manage diverse data, whether internal or external, structured or unstructured, with speed, at big data scale and at a fraction of the implementation costs of traditional approaches.
The company is based in Boston, Massachusetts.
For more information visit www.cambridgesemantics.com or follow us on Facebook, LinkedIn and Twitter: @CamSemantics
© Copyright 2015, Cambridge Semantics. All rights reserved.