Data Visualization: Impactful Audit Reporting
The Role of Data in Decision Making
Jennifer Lewis Priestley, Ph.D.
Professor of Statistics and Data Science
No matter how it's measured…
[Chart: Data Science vs. Everything Else, 2008–2016; vertical axis 0% to 500%; callouts "A really big number" and "A really small number"]
Kennesaw State University, Department of Statistics and Analytical Sciences
Finance
Retail
Healthcare
Economics
Manufacturing
Consulting
Political Science
Everyone is chasing the same talent…
Data Scientists…
Data scientist was the highest rated job, with a job score of 4.8 (out of 5), a job satisfaction rating of 4.4 (also out of 5)… – Glassdoor (Jan 2017)
“The demand for data scientists is not expected to decrease anytime soon…data is so cheap to store en masse now, and so many devices, apps, and systems are producing data constantly, there’s more data now than ever before and it’s continuing to increase.” – Burtch Works Salary Study (Mar 2017)
“I keep saying the sexy job in the next ten years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s?” – Hal Varian, Chief Economist, Google
Show me your Big Data…
100B: Number of emails sent every day
500M: Number of tweets per day
85M: Messages on Instagram every day
25GB: Data generated by the average hybrid every hour
300: Orders processed by Amazon every second
This is not your father’s data…
Cross Sectional, Time Series, Streaming, Image/Video, Geo Spatial, Networks, Link, Text
WEST COAST: 23% of population, 43% of data scientists
MOUNTAIN: 12% of population, 10% of data scientists
SOUTH: 27% of population, 8% of data scientists
MIDWEST: 21% of population, 14% of data scientists
NORTHEAST: 17.5% of population, 25% of data scientists
Native and Non Natives…
Non-Natives are a big part of the talent gap…
Big Data Company: Coca-Cola
~50,000 machines around the world
Can dispense about 95 drinks an hour
Can dispense about 125 different drinks
Submits real-time data on:
- Syrup consumed/drink configuration
- Outlet
- Time
Example of Dark Data at work…
Sales
Credit Rating of Owner
Domain Trends
Yelp/Facebook/Twitter Data
Video
Predictive Analytics in Context
[Chart axes: Complexity vs. Business Value]
What happened in the past? Tool: Reporting
What is likely to happen? Tool: Predictive Analytics
What is happening now? Tool: Dashboards
Why did it happen? Tool: Statistical analysis
…which brings us back…
[Chart repeated: Data Science vs. Everything Else, 2008–2016; callouts "A really big number" and "A really small number"]
This shift has huge implications for developing, recruiting and retaining analytical talent…
We can’t teach the way we have always taught
The 1950s called… they want their curriculum back…
Who ARE these People?
“Person who is better at statistics than any software engineer and better at software engineering than any statistician” – Josh Wills, Director of Data Engineering at Slack
“Person who is better at explaining the business implications of the results than any scientist and better at science than any business school graduate” – Jennifer Priestley, Ph.D., Data Nerd
Why Do I Need to Know This?
1. You are Not Alone. Or Unique.
2. Mine the data that you have…
3. Start using/accessing the data that you don’t have…
Partnering with Universities in Data Science
Grow the Talent
Inexpensive Consulting
Screening Talent
How can Internal Audit get started?
Brent Munster, Director of Assurance and Advisory
Rob Thomas, Data Analytics Sr. Manager
Data Visualization | Continuous Auditing
Using data visualization for continuous auditing is not an overnight process; it takes time to develop and mature.
1. Identify Processes
• Not all continuous auditing processes are ideal for visualization (e.g., exception-based testing)
• Start small and easy (e.g., easy to understand, limited data feeds, small sets of data)
2. Develop Tests
• Determine purpose of audit (e.g., fraud identification, control testing)
• Identify normal parameters, brainstorm ideas for testing and leverage resources available (e.g., The IIA, ACL, AICPA)
• Reverse engineer previous issues or scenarios
3. Refine Results and Tests
• Challenge the results and your process
• Remove false positives
• Determine the root cause of the false positive
• Utilize expertise from the business
4. Automate Data Flows
• Determine all data flows (e.g., HR, Financial, external)
• Work with IT to ensure flows and data at rest are secure
5. Automate Testing
• Determine the appropriate cadence (e.g., daily, weekly, monthly)
• Utilize tools (e.g., ACL, SQL, Teradata) to schedule and automate testing
6. Visualize Testing
• Understand tools available for visualization and expertise necessary for each
• Create visualizations that are easy to understand
7. Real-time Monitoring and/or Auditing
• Inquire on items that do not meet the pre-defined criteria or are outside of the norm
• Document results and communicate any findings
• Determine if changes need to be made to either the tests or visualizations
(Slide legend: time-intensive activity)
Data Visualization | Continuous Auditing
Purchasing Card Auditing
Home Depot utilizes company-paid credit cards (P-Cards) to make purchases at infrequent vendors. Cards are paid for directly by the company and do not require automated approval of transactions prior to purchase. Auditing of transactions on these cards has evolved over time.
Evolution of P-Card Testing and Sampling
Random Sampling
• Pulled population of transactions
• Randomly selected transactions
Exception Sampling
• Developed tests based on policy
• Selected a sample of transactions that appeared to not meet policy
Targeted Sampling
• Developed tests based on known fraud or brainstormed scenarios
• Selected a sample of transactions that met criteria
Risk Rated Sampling
• Combination of exception and targeted sampling
• Scored tests based on risk
• Aggregated risk score at card level
• Sampled transactions from high risk cards
Limitations
Delayed Auditing, Data Analytics Understanding, Manual Process, Lots of Results, Small Samples, Clunky Tools, Administrative Nightmare
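The risk rated sampling stage (score tests by risk, aggregate at the card level, sample from high risk cards) might look roughly like the sketch below. This is an illustration only: the test names, weights, field names, and the `sample_high_risk` helper are hypothetical and are not taken from the presentation.

```python
import random
from collections import defaultdict

# Hypothetical tests and weights; the slides do not list the real ones.
TEST_WEIGHTS = {"over_limit": 3, "weekend_purchase": 1, "blocked_category": 5}


def card_risk_scores(transactions):
    """Sum the weights of every failed test per card (card-level aggregation)."""
    scores = defaultdict(int)
    for txn in transactions:
        scores[txn["card"]] += sum(TEST_WEIGHTS[t] for t in txn["failed_tests"])
    return dict(scores)


def sample_high_risk(transactions, top_n_cards, per_card, seed=0):
    """Sample up to `per_card` transactions from each of the riskiest cards."""
    scores = card_risk_scores(transactions)
    riskiest = sorted(scores, key=scores.get, reverse=True)[:top_n_cards]
    rng = random.Random(seed)  # fixed seed so a sample can be reproduced
    sample = []
    for card in riskiest:
        card_txns = [t for t in transactions if t["card"] == card]
        sample.extend(rng.sample(card_txns, min(per_card, len(card_txns))))
    return sample


# Made-up transactions, each already tagged with the tests it failed.
transactions = [
    {"id": 1, "card": "A", "failed_tests": ["over_limit", "blocked_category"]},
    {"id": 2, "card": "A", "failed_tests": []},
    {"id": 3, "card": "B", "failed_tests": ["weekend_purchase"]},
    {"id": 4, "card": "C", "failed_tests": []},
]

if __name__ == "__main__":
    print(card_risk_scores(transactions))
    print(sample_high_risk(transactions, top_n_cards=1, per_card=2))
```

Sampling within the riskiest cards, rather than across the whole population, is what distinguishes this stage from the random sampling stage listed first.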
Data Visualization | Continuous Auditing
The Evolution Continues…
With the limitations of each of the previous auditing techniques, we decided that visualization was missing to help focus on high risk transactions
Transactions are scored based on attributes across 17 individual tests and on combinations of attributes indicative of internal or external fraud
Tests
▪ Amount, location, activity pattern
▪ Vendor, spend category (MCC)
▪ Owner and employment status
Scoring
▪ Individual test weights
▪ Internal fraud scenarios
▪ External fraud scenarios
Pre-scheduled activity runs weekly to load, test and score transactions and to publish results to dashboards
Load Transactions (Last 7 Days) → Test and Score Transactions → Publish Results to Dashboards
[Diagram: HR Records and P-Card Charges → Tests & Risk Scoring → Results Dashboarded]
Source Data: HR Records (Internally Provided) & P-Card Charges (3rd Party Provided)
Tests & Risk Scoring: Developed a variety of tests based on fraud scenarios, spending patterns and standard operating procedures
Tools Utilized: Teradata & Tableau
• Teradata is utilized to automate and schedule the jobs to merge data sets, perform testing and risk rate transactions
• Tableau is utilized to obtain results from Teradata and visualize the results so users can easily identify transactions that appear abnormal
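The scoring idea on this slide (individual test weights plus extra weight for combinations of attributes that suggest a fraud scenario, run weekly over the last 7 days) can be sketched as follows. The slides describe a Teradata and Tableau setup; this Python version is only an illustration, and every test, weight, threshold, and field name in it is a hypothetical placeholder.

```python
from datetime import date, timedelta

# Hypothetical individual tests: name -> (check, weight).
TESTS = {
    "large_amount": (lambda t: t["amount"] > 1000, 2),
    "foreign": (lambda t: t["country"] != "US", 3),
    "inactive_owner": (lambda t: not t["owner_active"], 4),
}

# Hypothetical scenarios: extra points when a set of tests fails together.
SCENARIOS = {
    "external_fraud": ({"large_amount", "foreign"}, 5),
}


def score(txn):
    """Weighted sum of failed tests plus any scenario bonuses."""
    failed = {name for name, (check, _) in TESTS.items() if check(txn)}
    total = sum(TESTS[name][1] for name in failed)
    total += sum(bonus for needed, bonus in SCENARIOS.values() if needed <= failed)
    return total


def weekly_run(transactions, today):
    """Load the last 7 days, score each transaction, return highest risk first."""
    cutoff = today - timedelta(days=7)
    recent = [t for t in transactions if cutoff < t["date"] <= today]
    return sorted(((score(t), t["id"]) for t in recent), reverse=True)


# Made-up transactions for illustration.
txns = [
    {"id": "t1", "date": date(2016, 11, 30), "amount": 1200.0, "country": "US", "owner_active": True},
    {"id": "t2", "date": date(2016, 11, 28), "amount": 1500.0, "country": "HK", "owner_active": True},
    {"id": "t3", "date": date(2016, 10, 7), "amount": 50.0, "country": "US", "owner_active": False},
]

if __name__ == "__main__":
    for points, txn_id in weekly_run(txns, today=date(2016, 12, 1)):
        print(txn_id, points)
```

Keeping scenario bonuses separate from the individual weights lets a combination such as "large and foreign" rank above two unrelated failures with the same individual weights.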
Data Visualization | Continuous Auditing
[Dashboard screenshots]
Risk Score by Card – Cards are sorted by risk rating, and the user can click for additional details
Card Deep Dive – User's view when reviewing a specific card
Callouts: Direct Marketing, Food Purchases; Foreign Transactions; Online, Out-of-State Purchases; Unknown/Out of Country
Sample rows:
11/30/16 RDC 5034… BESTBUYCOM7928280125 RICHFIELD… BEST BUY .COM LLC WWW.BESTBUY.COM EDEN PRAIRIE MN UNITED STATES $3,066.37 77 187
10/27/16 RDC 5034… GUITAR CENTER #856 8 WHITEHALL… GUITAR CENTER MGMT GUITAR CENTER #856 WHITEHALL PA UNITED STATES $1,631.20 3 3
10/07/16 RDC 5034… HTTP://WWW.REPAIRHPP SHEUNG W… CYBER MERCHANT LIMI… CYBER MERCHANT… NT 00 HONG KONG $773.70 70 170
10/2/16 RDC 5034… DIRECTV SERVICE 800-347-3288 DIRECTV INC DIRECTV INC EL SEGUNDO CA UNITED STATES $129.31 80 56
Data Visualization | Operational and Strategic Analytics
Internal Audit can be an important asset in developing and monitoring operational and strategic analytics
Independent
As an independent group, IA can provide a non-biased viewpoint into different areas of the business
Knowledge Base
Given the scope of IA’s work, they have the background and experience to understand the full picture
Network
IA can leverage their expansive network to provide a qualitative aspect to the quantitative analysis
Operations
• Inventory Reviews
• Shrink Deep Dives
• Transaction Testing
Supply Chain
• Receipts Testing
• Inventory Parameter Review
• SER
Merchandising
• Perfect Order Scorecards
• Online Inventory Buffer
Data Visualization | Operational and Strategic Analytics
Data analytics are incorporated into projects primarily in three different ways:
• Develop high-level understanding
• Identify root cause for issues
• Continuous monitoring
Pulling our own data, rather than relying on the business to provide data/reports, allows us to:
• Develop a more tailored view of the business
• Create a broader level of insight into the business
• Obtain results on a more real time basis
Gaining an Understanding of Drivers of Inventory Loss / Shrink
1. Identify Testing Units
• Each SKU (~50,000) at each store (~2,000) for the last three years
• Transfer testing units into data lab for further testing
2. Aggregate Attributes about the Testing Units
• Identify relevant attributes (200+) about the testing units and transfer into lab
• Vendor, DC flow path, sales history/penetration, inventory location, etc.
3. Analyze Results Leveraging Data Visualization Tools (Tableau)
• With 200+ attributes aligned with 300M testing units, the impact of each attribute, in isolation and combined, can easily be seen
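The 300M figure is consistent with the approximate counts quoted on this slide if one testing unit is counted per SKU, per store, per year. That counting rule is an assumption made here for illustration; the slide does not state it. A quick check of the arithmetic:

```python
# Approximate counts quoted on the slide.
skus = 50_000
stores = 2_000
years = 3
attributes = 200  # the slide says "200+", so this is a floor

# Assumption: one testing unit per SKU, per store, per year.
sku_store_pairs = skus * stores
testing_units = sku_store_pairs * years

# Rough size of the attribute grid if every attribute is filled for every unit.
attribute_values = testing_units * attributes

if __name__ == "__main__":
    print(f"SKU-store pairs:  {sku_store_pairs:,}")
    print(f"Testing units:    {testing_units:,}")
    print(f"Attribute values: {attribute_values:,}")
```

At this scale, tables of numbers stop being readable, which is the case the slide makes for analyzing the results in a visualization tool.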
Data Visualization | Operational and Strategic Analytics
Data analytics are incorporated into projects primarily in three different ways:
• Develop high-level understanding
• Identify root cause for issues
• Continuous monitoring
Projects will create massive data sets (250M+ rows) from the most granular level to be able to understand the impact of “minor” attributes
• All sales transactions for the last 3 years
• All cartons flowing through our Supply Chain network
• Daily on-hand quantities for every SKU in every store
• Every price change for every SKU in every store
Data Visualization | Operational and Strategic Analytics
Data analytics are incorporated into projects primarily in three different ways:
• Develop high-level understanding
• Identify root cause for issues
• Continuous monitoring
Identification of high-risk scenarios or defective scenarios can be published and refreshed automatically, allowing for real time monitoring with no manual effort to maintain
[Chart annotations: 1st Fixed, Flagged Again, Flagged Again]
Data Visualization | Interaction with the Business
Depending on the project, our interaction regarding the final deliverable can vary.
Hand Off Work for Business to Own
When the business partner has capable staff, we will turn over the scripts/queries/dashboards to the business so they can take full ownership of the maintenance and review.
• Requires the business to have the right skillset to maintain and digest appropriately
Retain Responsibility for Monitoring and Communicating Results
Certain groups might not have the skillset or expertise to take over the scripts/queries/dashboards, so we will maintain responsibility for their upkeep and for communicating results to the business for further action.
• Level of effort to maintain can vary significantly based primarily on the data sources involved
Reports can provide a high level summary and further details so all information is available in one place
Data Visualization | Interaction with the Business
Depending on the project, our interaction regarding the final deliverable can vary.
Cross-Functional Monitoring with Business
To review the numerous high-risk and defective scenarios identified from previous projects, a cross-functional team was created that meets every month.
• Over 100 dashboards built over several years of projects from multiple groups
• Monthly review performed by Asset Protection and results communicated monthly with several different business customers
1. Refresh dashboards
2. Review dashboards
3. Investigate anomalies
4. Report findings
[Monthly calendar graphic: days 1–30]
Internal Audit Asset Protection
Finance Integration Merch Payables Inventory Accounting
Store Ops Finance
Data Visualization Case Studies
Ethan King
Forensic Technology Services Director
Why Data Visualizations?
Data Visualizations Make Complex Data Accessible and Actionable
THE DATA PROBLEM
• More data leads to more complex and harder decision making
• Research shows that trying to incorporate large amounts of information into a decision can make decision making harder and can lead to worse decision outcomes
• Traditional charts and graphs are typically limited to two or three dimensions at most
DATA VISUALIZATIONS AS THE SOLUTION
• Visualization dashboards can gracefully incorporate more than three analytical dimensions
• Data visualizations are dynamic and react to the needs of the users
• Research shows that people make better decisions using charts and graphs over tabular data formats
Tableau Visualization Platform
CUSTOMIZED AUTOMATION
Grant Thornton LLP (US) uses the Tableau data visualization platform to present its findings. With Tableau Server you can:
• Quickly iterate versions of the dashboards
• Receive automated updates when your data has been updated
• Interact with reports, filtering to drill down into the details
• View reports on a computer or your mobile device
KISS PRINCIPLE
• We use Tableau Server only where it makes sense for our clients
• Verbal reports, Excel, and PowerPoint can all be effective
Logistics Company Vendor Fraud
FACT PATTERN
A whistleblower at a logistics company identified a potentially fraudulent vendor.
ANALYSIS HIGHLIGHTS
• Vendor Fraud
• Risk Scoring
• Benford's Law Analysis
QUICK FACTS
• FTS worked with GT forensic accountants to determine the scope of the fraud
• Company had over 40,000 vendors across the United States
• Time pressure by local authorities made a comprehensive vendor review impossible
• Approximately 400k vendor payment records
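The slide names Benford's Law analysis without describing how it was applied. As a general illustration, a first-digit test compares how often each leading digit 1 through 9 appears in a set of payment amounts against the Benford expectation log10(1 + 1/d). The sketch below is a generic version of that test, not the presenters' method; the function names and the assumption that amounts are currency values with two decimal places are hypothetical.

```python
import math
from collections import Counter


def benford_expected(d):
    """Expected share of amounts whose leading digit is d under Benford's Law."""
    return math.log10(1 + 1 / d)


def first_digit(amount):
    """Leading nonzero digit of a currency amount, or None for zero amounts."""
    cents = round(abs(amount) * 100)  # assumes two-decimal currency values
    return int(str(cents)[0]) if cents else None


def digit_deviations(amounts):
    """Observed minus expected share for each leading digit 1-9."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("no nonzero amounts to analyze")
    counts = Counter(digits)
    n = len(digits)
    return {d: counts[d] / n - benford_expected(d) for d in range(1, 10)}


if __name__ == "__main__":
    # Made-up amounts clustered just under a round approval threshold.
    amounts = [950.00, 975.50, 990.00, 120.00, 18.75, 2300.00]
    for digit, dev in digit_deviations(amounts).items():
        print(digit, f"{dev:+.3f}")
```

A vendor whose payments show a large positive deviation on high leading digits would be a candidate for closer review; the deviation alone does not establish fraud.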
Fraudulent Bank Account Activity
FACT PATTERN
An executive was accused of misusing customer funds.
ANALYSIS HIGHLIGHTS
• Bank Statement Analysis
• Pattern / Trend Analysis
• Text to Data Transformation
QUICK FACTS
• GT received bank transaction and wire information in the archival format the bank used for storage
• GT was asked to help identify potentially inappropriate transactions
• Approximately 1 million bank statement records
Inventory Shrinkage Analysis
FACT PATTERN
A consumer retailer has been experiencing ongoing inventory shortages.
ANALYSIS HIGHLIGHTS
• Supply Chain / Logistics
• Consumer Products
• Inventory Shrinkage
QUICK FACTS
• Company has over 1,000 stores and cross-docks across the United States
• Company has been performing frequent physical inventory counts to adjust inventories in its ERP system (SAP)
• Analysis uses approximately 2 million inventory adjustment records