achieving business value with big data - i
TRANSCRIPT
-
7/29/2019 Achieving Business Value With Big Data - I
1/20
A presentation by
W H Inmon
ACHIEVING BUSINESS VALUE
WITH BIG DATA
-
Big Data: it's everywhere
-
What the analyst sees today:
MapReduce, Hive, Cirro, Mongo, Pig
sources of data
analytical processing
budget
compatibility
-
It is one thing to talk about the technology and challenges
of Big Data; it is another to talk about getting
Business Value out of Big Data.
-
Here is what lies ahead in addressing the topic of
achieving business value out of Big Data:
Big Data
Unstructured data
Textual Disambiguation
Business Value
-
Big Data
Here's what everyone is talking about
Textual Disambiguation
Here is the even bigger hurdle that no one is talking about
-
The reality: everything in Big Data is unstructured
[Figure: the word "text" repeated across the slide]
-
You can solve all the technical problems of Big Data, but if you
don't also solve the problems of unstructured data, there is no
Business Value
-
Big Data
Unstructured Data
All Big Data is unstructured;
Some unstructured data is not Big Data
-
And what's so challenging about raw text?
It is dangerous and potentially very misleading to try to use
raw text as a basis for decisions.
Consider the following confusion:
The answer is seven.
Seven what?
Seven days?
Seven dollars?
Seven wonders of the world?
Seven seas?
Seven dwarfs?
-
Or consider this confusion:
"She's hot."
What is being said here?
She is attractive and I want to date her.
It is Houston, Texas, and it is 98 degrees. She is sweating.
I just took her temperature and it was 104 degrees.
Looking at the words "She's hot" tells you nothing.
In order to make sense of the text you MUST supply context,
and that is true for ALL text.
-
But context is not the only problem with text
Log record -
12p98**711>>?mmnaYYt009qqoiy3GGHt
13p99%%899?mmbbaf882qwooi4GGHtf
16p97*&*8772
-
But context and interpretation are
not the only problems with text
May 5, 2013
2013/07/23
01/13/2014
July 20th of 1945
Standardization of certain common values is also needed
-
Textual
disambiguation
In order to achieve Business Value,
the raw text found in Big Data must pass through
a process known as textual disambiguation
-
So how do you do textual disambiguation?
The first step is to contextualize the raw data
documents
taxonomies
ontologies
qualified vocabularies
document metadata
document sensitive inference
textual proximity
acronym resolution
homographic resolution
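Two of the techniques named on this slide, textual proximity and homographic resolution, can be illustrated with a minimal sketch. It assumes a small, hypothetical table of cue words per sense (CUES) and a hypothetical function name (resolve); a real system would draw such cues from taxonomies and qualified vocabularies. The sketch picks a sense for "hot" by counting cue words that appear near it, and reports "unresolved" when no context is present.

```python
import re

# Hypothetical cue words for each sense of "hot"; illustrative only.
CUES = {
    "attractive": {"date", "dating", "cute"},
    "weather": {"houston", "sweating", "sunny", "outside"},
    "fever": {"temperature", "thermometer", "fever"},
}

def resolve(text: str, target: str = "hot", window: int = 12) -> str:
    """Pick a sense for `target` from cue words within `window` words of it."""
    words = re.findall(r"[a-z']+", text.lower())
    if target not in words:
        return "absent"
    pos = words.index(target)
    nearby = set(words[max(0, pos - window): pos + window + 1])
    # Score each sense by how many of its cue words occur nearby.
    scores = {sense: len(cues & nearby) for sense, cues in CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unresolved"

print(resolve("I just took her temperature and it was 104 degrees. She's hot."))
print(resolve("It is Houston, Texas and it is 98 degrees. She is sweating. She's hot."))
print(resolve("She's hot."))
```

With no surrounding words, the last call cannot choose a sense, which is the point the earlier "She's hot" slide makes about raw text.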
-
The problem with contextualization is that
there are many ways to contextualize the text, all depending on the text
-
Interpretation
And what about interpretation?
-
Airline agent message
Flt 462 DNV to ORD lst bgs CN 235-PO-908
Flight 462 Denver to Chicago lost bags claim number 235-PO-908
Flight number, City, City, Activity, Operand, Claim number
Tone: none
Flight type: US
In order to make sense of an airline agent record, it must first be
interpreted, then the interpretation must be contextualized
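The two steps for the airline agent record can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: the abbreviation table holds only the entries visible in the slide's example, the names interpret and contextualize are hypothetical, and the assignment of "lost" to Activity and "bags" to Operand is inferred from the order of the slide's labels.

```python
import re

# Shorthand seen in the slide's example message; a real table would be larger.
ABBREVIATIONS = {
    "Flt": "Flight",
    "DNV": "Denver",
    "ORD": "Chicago",
    "lst": "lost",
    "bgs": "bags",
    "CN": "claim number",
}

def interpret(message: str) -> str:
    """Step 1: expand known abbreviations, leaving other tokens untouched."""
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in message.split())

# Field names follow the slide's labels; the field order is an assumption.
PATTERN = re.compile(
    r"Flight (?P<flight_number>\d+) (?P<city_from>\w+) to (?P<city_to>\w+) "
    r"(?P<activity>\w+) (?P<operand>\w+) claim number (?P<claim_number>[\w-]+)"
)

def contextualize(interpreted: str) -> dict:
    """Step 2: attach a label to each part of the interpreted message."""
    match = PATTERN.fullmatch(interpreted)
    if match is None:
        raise ValueError(f"unrecognized message: {interpreted!r}")
    return match.groupdict()

text = interpret("Flt 462 DNV to ORD lst bgs CN 235-PO-908")
print(text)
print(contextualize(text))
```

Keeping the two steps separate mirrors the slide: the pattern is applied only to the interpreted text, never to the raw shorthand.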
-
Standardization of dates
May 5, 2013: Date - 20130505
2013/07/23: Date - 20130723
01/13/2014: Date - 20140113
July 20th of 1945: Date - 19450720
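A minimal sketch of this date standardization, using Python's standard datetime module. The list of accepted formats and the function name standardize_date are assumptions for illustration; the sketch also assumes English month names (the default C locale) and month-first order for slash dates that end in the year.

```python
import re
from datetime import datetime

# Candidate input formats, tried in order; an assumed, non-exhaustive list.
FORMATS = ["%B %d, %Y", "%Y/%m/%d", "%m/%d/%Y", "%B %d %Y"]

def standardize_date(raw: str) -> str:
    """Return the date as YYYYMMDD, or raise ValueError if no format fits."""
    # Drop ordinal suffixes ("20th" becomes "20") and a connecting "of".
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", raw.strip())
    cleaned = re.sub(r"\s+of\s+", " ", cleaned)
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y%m%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")

for raw in ["May 5, 2013", "2013/07/23", "01/13/2014", "July 20th of 1945"]:
    print(raw, "->", standardize_date(raw))
```

Dates that match none of the listed formats raise an error rather than being guessed, since a wrong standardized value would be as misleading as the raw text.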
-
Inmon/Krishnan Big Data Architecture
raw
Textual disambiguation
disambiguated
Standard dbms