12 vs of big data governance - university of...



Post on 20-May-2020


Page 1: 12 Vs of Big Data Governance - University of Derbycomputing.derby.ac.uk/.../12-Vs-of-Big-Data-Governance.pdf · 2017-02-02 · The 12 Vs of Big Data provide a Governance framework


The 12 Vs of Big Data provide a Governance framework of questions that should be of value in ensuring that our IT, Big Data and Analytics projects deliver value to the various stakeholders.

One of the most important aspects of a Governance framework is to provide questions that allow each project team to consider how to be successful. Typically, an assessment of risks and vulnerabilities will provide the context within which to connect the Vs to the specific situation. From this, it will be possible to develop answers that improve the project's chances of success.

Each of the following slides briefly explores the purpose of one of the Vs.

Page 2:

Whilst Volume is one of the definitional Vs, defining Big Data as that which exceeds the limits of current technologies, it can also be used as a question to help us understand the consequences for our projects in terms of the technology stack that will be needed to ensure a satisfactory solution.

Page 3:

The Velocity of data can pose significant challenges to our ability to provide answers within the required timescales.

Page 4:

Variety refers to the wide range of formats and structures of the data that we want to amalgamate and then process.

We now have structured and unstructured data, each of which poses significant challenges.

It is vital to understand the variety of our data sources in order to ensure that we do not end up with a fruit salad of mismatched data. If we get it wrong, there is the potential to create significant reputational, security and financial vulnerabilities for our organisations and users.

Page 5:

Variability can be a vital aspect of understanding the relevance and value of the patterns detected in the data. It can be a challenge even where there is a highly regular periodicity in the data.

As an example, weather tends to have a seasonal periodicity, such as winter and summer, but can pose challenges because it is also very variable from day to day and even from year to year. This poses problems for retailers, whose footfall can vary according to sun or rain on a particular day. It will also affect attendance at sports events.

However, Variability can also refer to the natural human variability in preferences and likes. The fundamental question will be to assess the stability of personal preferences (and even the honesty of answers to survey questionnaires). Do we really understand the stability of the way that humans answer a personality profile questionnaire over time? Timescales could be days, weeks or months.
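
The distinction between a regular seasonal pattern and the variability left over around it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `seasonal_split` is hypothetical, the temperatures are invented, and averaging each position in the cycle is just one simple way of estimating a seasonal profile.

```python
from statistics import mean, pstdev


def seasonal_split(series, period):
    """Split a series into a repeating seasonal profile and residuals."""
    # Average all observations that share the same position in the cycle.
    profile = [mean(series[i::period]) for i in range(period)]
    # Whatever the profile does not explain is the residual variability.
    residuals = [x - profile[i % period] for i, x in enumerate(series)]
    return profile, residuals


# Invented quarterly temperatures for two years (illustration only).
temps = [4.0, 12.0, 20.0, 10.0, 6.0, 14.0, 18.0, 8.0]
profile, residuals = seasonal_split(temps, period=4)

print("seasonal profile:", profile)
print("residuals:", residuals)
print("residual spread:", pstdev(residuals))
```

A large residual spread relative to the seasonal swing would suggest that the periodic pattern alone is a poor guide to any particular day.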

Page 6:

Value is possibly one of the most difficult questions to answer.

What is the value of the project or application to the wide range of stakeholders in the project? How does each of the stakeholders obtain value? Is it always monetary, or are there other factors, such as intellectual curiosity, time saving, entertainment, customer retention, etc.? The answers are many.

Page 7:

Are the data true? J Easton (IBM 2012) points out that probably 80% of all data are of uncertain veracity.

Examples could be IoT sensor calibration drift and Location Services inaccuracy (ThinkNear data suggests that approximately 14% of all location data used for location-based advertising is inaccurate by more than 60 miles (roughly 100 km)).

Social Media provide another source of probably low-veracity data that needs very careful analysis before any level of trust can be obtained.
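
As one possible illustration of the sensor calibration drift mentioned above, the sketch below estimates how quickly a sensor's error grows when a trusted reference is available. The function name `drift_per_step`, the readings, and the assumption that a reference exists are all hypothetical; the slope is an ordinary least-squares fit of the error against the sample index and needs at least two samples.

```python
def drift_per_step(sensor, reference):
    """Least-squares slope of (sensor - reference) against the sample index."""
    errors = [s - r for s, r in zip(sensor, reference)]
    n = len(errors)
    t_mean = (n - 1) / 2          # mean of the indices 0..n-1
    e_mean = sum(errors) / n
    num = sum((t - t_mean) * (e - e_mean) for t, e in enumerate(errors))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


# Invented readings: the reference stays constant while the sensor creeps upwards.
reference = [20.0, 20.0, 20.0, 20.0, 20.0]
sensor = [20.0, 20.5, 21.0, 21.5, 22.0]

drift = drift_per_step(sensor, reference)
print("estimated drift per sample:", drift)
```

A drift estimate that is clearly non-zero is a prompt to question the veracity of everything that sensor has reported since its last calibration.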

Page 8:

Linked to Veracity is the question of the Validity of the data and the analysis.

A classic example of the lack of validity is the centre picture, where the correlation is totally spurious and completely invalid.
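
A spurious correlation of this kind is easy to reproduce. The sketch below is not the example from the slide: it builds two invented series that have nothing to do with each other except that both grow over time, then compares the correlation of their levels with the correlation of their period-to-period changes. The helper names `pearson` and `diffs` are hypothetical, and looking at changes is only one of several possible sanity checks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def diffs(xs):
    """Period-to-period changes, which removes a shared linear trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Two invented, unrelated series that both happen to trend upwards.
a = [10 + 2 * t + (t % 3) for t in range(20)]
b = [100 + 5 * t + ((2 * t) % 5) for t in range(20)]

level_r = pearson(a, b)
change_r = pearson(diffs(a), diffs(b))
print("correlation of levels: ", round(level_r, 3))
print("correlation of changes:", round(change_r, 3))
```

The levels correlate almost perfectly because of the shared trend, while the changes show only a weak relationship, which is a reminder that a high correlation on its own says little about validity.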

Page 9:

Volatility poses the question of the timescales over which data and the analyses retain validity.

Weather forecasts provide an example, in that the forecasts for a specific place can change over relatively short timescales.
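
One way a project team might act on this question is to attach an explicit shelf life to each piece of data. The sketch below is an assumption about how that could look, not a prescribed method: the `Observation` class, its `valid_for` field and the six-hour shelf life are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Observation:
    value: float
    observed_at: datetime
    valid_for: timedelta  # hypothetical shelf life agreed by the project team

    def is_stale(self, now: datetime) -> bool:
        """True once the observation is older than its shelf life."""
        return now - self.observed_at > self.valid_for


# Invented forecast temperature, assumed to be trustworthy for six hours.
forecast = Observation(
    value=17.5,
    observed_at=datetime(2017, 2, 2, 6, 0),
    valid_for=timedelta(hours=6),
)

print(forecast.is_stale(datetime(2017, 2, 2, 9, 0)))   # three hours old
print(forecast.is_stale(datetime(2017, 2, 2, 15, 0)))  # nine hours old
```

Choosing the shelf life is the governance decision; the code only makes that decision visible and enforceable.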

Page 10:

Verbosity particularly relates to questions about text sources and the problems of machine understanding of the meaning of the text.

Natural language is often verbose, in that it is not truly concise and is often highly predictable. Does this help or hinder the analysis of text?

At the other end of the Verbosity problem are Tweets, with their 140-character limit and the consequences for textual analysis when all the normal rules of syntax and grammar are ignored.

Can machine learning really understand irony or humour?
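
The predictability of verbose text can be given a rough number by seeing how well it compresses. The sketch below uses Python's standard `zlib` module and treats the compression ratio as a crude proxy for redundancy; that proxy, the function name `compression_ratio` and the sample strings are assumptions made for illustration. Very short texts barely compress at all because of fixed overhead, so the comparison is indicative only.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size for a non-empty text (lower = more redundant)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)


# Invented samples: a repetitive, verbose passage and a terse, tweet-like message.
verbose = "The meeting has been moved to next week. " * 20
terse = "mtg moved nxt wk lol #busy"

print("verbose ratio:", round(compression_ratio(verbose), 3))
print("terse ratio:  ", round(compression_ratio(terse), 3))
```

Highly redundant text compresses to a small fraction of its size, whereas a terse message leaves an algorithm very little repeated structure to work with.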

Page 11:

Vulnerability poses a wide range of questions about the system, the data and the stakeholders.

Are the vulnerabilities technical, legal, financial, reputational…?

Page 12:

Asking about Verification helps us to understand how we can demonstrate the Validity and Veracity of the insights gained, based on Verification and Validation of our data cleansing and the analytics.

It helps us to demonstrate that we understand and have addressed the Vulnerabilities.

Page 13:

Visualisation helps us to understand the best (and most ethical) ways of visualising our data.

Are we accidentally or deliberately misleading the users of our visualisations? Are we presenting our analyses in a way that is fair and honest?

Page 14:

This was the original presentation, which was intended to demonstrate the similarity of the levels of investment in the UK Infrastructure. It did this by using a Log Scale. However, as the next slide demonstrates, this was a very severe misrepresentation of the information.

This visualisation worked because most readers do not have much familiarity with Log Scales and assume the more normal linear scales.
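
The compression that a Log Scale introduces can be shown numerically. The sketch below compares how tall each bar would look, relative to the tallest bar, on a linear axis and on a base-10 log axis. The figures are invented and are not the investment data from the slides; the function name `relative_bar_heights` is hypothetical, and the log axis is assumed to start at 1, so all values must be greater than 1.

```python
import math


def relative_bar_heights(values, log_scale=False):
    """Bar heights as a fraction of the tallest bar (log axis assumed to start at 1)."""
    if log_scale:
        scaled = {name: math.log10(v) for name, v in values.items()}
    else:
        scaled = dict(values)
    top = max(scaled.values())
    return {name: height / top for name, height in scaled.items()}


# Invented figures for three categories (illustration only).
spend = {"A": 100_000, "B": 50_000, "C": 100}

linear = relative_bar_heights(spend)
log = relative_bar_heights(spend, log_scale=True)

for name in spend:
    print(f"{name}: linear {linear[name]:.3f}, log {log[name]:.3f}")
```

On the linear axis category C is one thousandth of the height of A, yet on the log axis it appears 40% as tall, which is how a reader who assumes a linear scale can be led to see the categories as broadly similar.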

Page 15:

As this version shows, with a linear scale, only Energy and Transport have significant levels of investment.

Investment in Flood Defences is insignificant compared to the others. This was in the context of the severe flooding that had happened over the previous winter in the UK.
