Transcript
Page 1: Deriving Insights From Big Data - Success•The big data problem Higher volumes means longer analysis time Larger variety of data type increases audit complexity Fast changing record

DERIVING INSIGHTS FROM BIG DATA

Presented by: Solon Angel

Product Manager

CaseWare IDEA Inc.

November 13, 2012

Page 2

• Introduction

• What is BIG DATA?

• Impact on Audit

• Analytics & Collaboration

• Best Practices

• Questions & Answers

Agenda

Page 3

BIG DATA

Megabytes

Gigabytes

Terabytes

Petabytes

Increasing Data Variety & Complexity

Web Logs

Sales transactions

Offer history

Affiliate Networks

Search Marketing

Behavioral Targeting

Sensors /RFID/Devices

Mobile Web

User Click Stream

Sentiment

User Generated Content

Social Interactions & Feeds

Spatial & GPS Coordinates

Business Data Feeds

Speech to Text

Product Service Logs

SMS/MMS

Purchase

Detail Purchase Record

Payment Record

Support contacts

External Demographics

HD Audio, Video, Images

ERP

CRM

WEB

Automated reports

Offer details

Printed reports

AP / AR

What is BIG DATA?

Page 4

Devil in the Data

ATMs

ERPs

Transactional data

CRM, Accounting

databases, new compliance

requirements, new media, etc.

Exabyte(s)

TENFOLD GROWTH OBSERVED IN FIVE YEARS

Page 5

• In 2011, the volume of digital data was 10 times what it was in 2006

• Data sets are beyond the standard ability to process

• 44-fold growth expected in the next ten years

• Data growth cannot be ignored

• Requires a new approach to enable insights and process

optimization

Growing Challenge
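
As a rough worked example, the growth figures on this slide can be turned into implied yearly rates. This is only a sketch: the function name is hypothetical, and it assumes steady compound growth, which the slide does not state.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Implied yearly multiplier, assuming steady compound growth (an assumption)."""
    return total_factor ** (1.0 / years)

# Tenfold growth over the five years from 2006 to 2011
past = annual_growth_factor(10, 5)
# 44-fold growth over ten years
future = annual_growth_factor(44, 10)

print(f"2006-2011: about {(past - 1) * 100:.0f}% per year")
print(f"Next ten years: about {(future - 1) * 100:.0f}% per year")
```

Under that assumption, both figures imply data volumes growing by roughly half or more every year, which is the sense in which data growth "cannot be ignored".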

Page 6

Poll 1

• What is the size of the biggest data file

you’ve worked with?

• 100 MB – 1 GB

• 1 GB – 500 GB

• 500 GB – 1 TB (Terabyte)

• More than 1 TB

Page 7

Impact on Audit

Page 8

Impact on Audit

• The big data problem

Higher volumes mean longer analysis time

Larger variety of data types increases audit complexity

Fast-changing record sets turn audit-focused data into

a moving target

Providing insights becomes difficult on desktops

Page 9

Higher Volume Impact

• The problem with big data:

Higher volumes mean longer analysis time, or no

analysis!

• Example: Medicare

Page 10

Higher Volume Impact

• Medicare

Medicare data spans states, dozens of agencies,

private companies and datacenters

Record sets are extremely fragmented

It is impossible to transfer all the data to one location

for processing

Page 11

Yesterday/Today’s data:

Files

Databases

Tables

Columns

More Variety Impact

Today/Tomorrow’s data:

Large PDFs

Automated feeds

Raw data extracts

Unstructured data

Scanned data

Audio files

Video

Page 12

More Variety Impact

• Cause of complex problems and gaps

Data Sources: Transactional systems, Data warehouses, Online databases, Client files, Printed reports

Analytics: Extract, Aging, Sort, Search, Group, Stratify, Gaps, Duplicates, Sampling, Statistics, Join, Append

Standards

Audit Tests
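
Two of the analytics named on this slide, duplicate detection and gap detection, can be sketched in a few lines. This is a minimal illustration, not how any particular audit tool implements them; the function names and the invoice numbers are made up for the example.

```python
from collections import Counter

def find_duplicates(keys):
    """Return keys that occur more than once, in first-seen order."""
    counts = Counter(keys)
    return [k for k in dict.fromkeys(keys) if counts[k] > 1]

def find_gaps(numbers):
    """Return (start, end) ranges of integers missing between min and max."""
    ordered = sorted(set(numbers))
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps

# Hypothetical invoice numbers, made up for illustration
invoice_numbers = [1001, 1002, 1002, 1004, 1005, 1009, 1005]

duplicates = find_duplicates(invoice_numbers)
gaps = find_gaps(invoice_numbers)
print("Duplicates:", duplicates)
print("Missing ranges:", gaps)
```

Both tests assume a single clean key column. With a wider variety of sources, much of the effort goes into getting the data into that shape before tests like these can run.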

Page 13

Velocity Impact

• The problem with Velocity:

Fast-changing record sets turn audit-focused data into a

moving target

Page 14

Speed Imports Scalable

Velocity

Volume Variety

Value

Impact

Page 15

Poll 2

• What problems does Big Data pose for Audit?

• Higher volumes mean longer analysis time

• Larger variety of data types increases audit

complexity

• Fast-changing record sets turn audit-focused data

into a moving target

Page 16

Analytics & Collaboration

Page 17

1. Import from ERPs, CRM, other data files

2. Prepare the data

3. Analyze

4. Create report as PDF, Word, Excel…

5. Send emails / file sharing

6. Meet to discuss

35% 10% 30% 5% 5% 15%

Typical Day in Audit

Page 18

Consider the Following

• Senior Auditor A spends a lot of time requesting datasets from IT.

• There is a delay of 3 days between the systems and the data he is

given.

• Because the datasets are IT-formatted, he spends a considerable amount of

time cleaning the datasets into a workable database.

• At the same time, Senior Auditor B asks for similar datasets, but the

data was acquired by IT 5 days later. He also needs to spend time

cleaning the datasets.

• Hours are spent duplicating efforts for different results!

Page 19

“Garbage in, garbage out”

Scenario

PROJECT A

PROJECT B

PROJECT C

PROJECT D

PROJECT E

PROJECT F

PROJECT G

PROJECT H

Day 3

Day 5

Page 20

Risks Associated

• Data acquisition is cumbersome

• Risk of inaccurate data sources from IT

• Duplication of effort

• No visibility of the team’s activity

Page 21

Server Scenario

PROJECTS A-H

Auditor B

Auditor C

Auditor D

Network

backup

Auditor A

Page 22

1. Import from ERPs, CRM, other data files

2. Prepare the data

3. Analyze

4. Create report as PDF, Word, Excel…

5. Send emails / file sharing

6. Meet to discuss

35% 10% 30% 5% 5% 15%

Accelerate the audit process by 50%

Impact on Audit

Page 23

1. Data is available from data sources

2. Analyze and share easily

3. Meet to discuss

Streamline the audit process by 50%

Keeping Audit Relevant
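
One way to read the 50% figure is as simple arithmetic on the "Typical Day in Audit" percentages. The sketch below rests on two assumptions that the slides do not confirm: that the six percentages are listed in the same order as the six steps, and that the streamlined process drops the import, preparation and report-creation steps.

```python
# Share of the audit day per step, taken from the "Typical Day in Audit" slide.
# Assumption: the six percentages are listed in the same order as the six steps.
typical_day = {
    "Import from ERPs, CRM, other data files": 35,
    "Prepare the data": 10,
    "Analyze": 30,
    "Create report as PDF, Word, Excel": 5,
    "Send emails / file sharing": 5,
    "Meet to discuss": 15,
}

# Assumption: these are the steps the streamlined process no longer needs.
removed_steps = [
    "Import from ERPs, CRM, other data files",
    "Prepare the data",
    "Create report as PDF, Word, Excel",
]

total = sum(typical_day.values())
saved = sum(typical_day[step] for step in removed_steps)
print(f"Total: {total}%, saved: {saved}%, remaining: {total - saved}%")
```

Under those assumptions, the remaining steps (analyze, share, meet) account for the other half of the day.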

Page 24

Poll 3

• What are advantages of a collaborative

approach to analytics?

• Less duplication of tasks between individuals

• Tackling problems that require group intelligence

• Retaining analytical process of all audits

Page 25

Best Practices

Page 26

• Modern Science

Human DNA code

Protein folding is one of the

hardest computational

problems in biology

In Today’s World

Popular Mechanics 2012

• Traditionally requires:

Mathematicians and developers able to write algorithms

Highly qualified scientists capable of interpreting results

Page 27

• Modern Science - How did they do it?

New approach, new tools:

― Distributed computing grid based on

commodity hardware

― Easy-to-use interface providing a

single view of the problem, without

the need to interpret data

― Enabling collaboration of thousands

of individuals (as a game)

In Today’s World

Page 28

In Today’s World

"You don't find many soloists among the top scorers.”

Global Game Moderator

Popular Mechanics 2012

• Modern Science

Page 29

• Enabling collaboration is key to solving big data problems

• Collaborate between teams

Less duplication of time spent on acquiring data

Easy to repeat success on a larger scale

• Applied knowledge transfer is greater and more effective

Retain analytical process of all audits – keep

expertise

Collaboration vs. Big Data

Page 31

Traditional

Hard To Manage

Costly

Limited

Distributed computing

Self-Managed

Cost Efficient

Scalable

Accelerate Performance
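
The idea behind distributed processing can be sketched as "split, summarize each part, merge". The example below is only an illustration under stated assumptions: worker threads on one machine stand in for separate servers, and the function names and sample records are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def summarize(chunk):
    """Total the amount per account for one partition of the records."""
    totals = Counter()
    for account, amount in chunk:
        totals[account] += amount
    return totals

def distributed_summarize(records, workers=4):
    """Split records into partitions, summarize each, then merge the partial totals."""
    size = max(1, -(-len(records) // workers))  # ceiling division
    chunks = [records[i:i + size] for i in range(0, len(records), size)]
    merged = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(summarize, chunks):
            merged.update(partial)  # adds the partial totals together
    return merged

# Hypothetical (account, amount) records, made up for illustration
records = [("A", 100), ("B", 50), ("A", 25), ("C", 10), ("B", 5)] * 1000
totals = distributed_summarize(records)
print(dict(totals))
```

Because each partition is summarized independently, adding workers scales the approach without changing the test itself, which is the "Scalable" property the slide contrasts with the traditional setup.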

Page 32

Tests                       Desktop (hrs.)   Server (mins.)   Gain
Summarization               2:29:01          0:08:14          1810%
Duplicate Key Detection     1:09:56          0:07:01          3139%
Stratification              5:03:01          0:11:32          2626%
Random Sample 2.5 million   1:12:26          0:08:13          881%

TESTS PERFORMED WITH BANKING DATA

3.2 MILLION RECORDS, 300+ FIELDS, 20 GIGABYTES

Results
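
The slide does not define how Gain is calculated. One reading, which reproduces the Summarization figure, is desktop time divided by server time, expressed as a percentage. The sketch below applies that assumed definition; the function names are hypothetical, and other rows may have been computed or rounded differently.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def gain_percent(desktop: str, server: str) -> float:
    """Desktop time as a percentage of server time (assumed definition of Gain)."""
    return 100.0 * to_seconds(desktop) / to_seconds(server)

# Summarization row from the results table
summarization = gain_percent("2:29:01", "0:08:14")
print(f"Summarization: {summarization:.0f}%")
```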

Page 33

Poll 4

• What are the advantages of server based

processing?

• Enabling efficient collaboration

• Run tasks faster

• More secure data

Page 34

• Turning a foe into friend

Involving IT

Audit team

Datacenter

Data (fraud)

IT

Page 35

IT wants:

Secured data

Control of access

ROI for investments

Disaster recovery

Audit needs:

Consistent data access

Data integrity

Speed

On demand analytics

Give to Get

Page 36

• Remove friction points in audit process

• Look to identify best practices and scale them

• Enable auditors to be in control, help each other

• Leverage the latest technologies in datacenters

• If you know how to use CAATs (IDEA, ACL, etc.) you are

ready!

• Let the data speak: start with a pilot process; management is

quick to approve success and immediate ROI

Recommendations

Page 37

Solon Angel | [email protected]

Questions?

