oracle’s machine learning and adw to detect and …. complaints frequently tips or complaints will...
TRANSCRIPT
1
Oracle’s Machine Learning and ADWC to Detect and Fight Fraud and UK NHS Case Study
Charlie Berger, MS Engineering, MBASr. Director Product Management, Machine Learning, AI and Cognitive [email protected] www.twitter.com/CharlieDataMine
Phil GodfreyData Scientist, Information ServicesUK NHS, Business Services Authority
Move the Algorithms; Not the Data!
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
2
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
People's Attitudes About FraudConsumers
• Nearly one of 10 Americans would commit insurance fraud if they knew they could get away with it.
• Nearly one of four Americans say it’s ok to defraud insurers
–About one in 10 people agree it’s ok to submit claims for items that aren’t lost or damaged, or for personal injuries that didn’t occur.
–Two of five people are “not very likely” or “not likely at all” to report someone who ripped of an insurer. Accenture Ltd.(2003)
• Nearly three of 10 Americans (29 percent) wouldn't report insurance scams committed by someone they know. Progressive Insurance (2001)
– About 35 percent say fraud costs their companies 5-10 percent of claim volume. More than 30 percent say fraud losses cost 10-20 percent of claim volume;
– Detecting fraud before claims are paid, and upgrading analytics, were mentioned most often as the insurers’ main fraud-fighting priorities;
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
American Society of Certified Fraud Examiners20 Ways to Detect Fraud
1. Unusual Behavior The perpetrator will often display unusual behavior, that when taken as a whole is a strong indicator of fraud. The fraudster may not ever take a vacation or call in sick in fear of being caught. He or she may not assign out work even when overloaded. Other symptoms may be changes in behavior such as increased drinking, smoking, defensiveness, and unusual irritability and suspiciousness.
2. ComplaintsFrequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have been known to be some of the best sources of fraud and should be taken seriously. Although all too often, the motives of the complainant may be suspect, the allegations usually have merit that warrant further investigation.
3. Stale Items in ReconciliationsIn bank reconciliations, deposits or checks not included in the reconciliation could be indicative of theft. Missing deposits could mean the perpetrator absconded with the funds; missing checks could indicate one made out to a bogus payee.
4. Excessive VoidsVoided sales slips could mean that the sale was rung up, the payment diverted to the use of the perpetrator, and the sales slip subsequently voided to cover the theft.
5. Missing DocumentsDocuments which are unable to be located can be a red flag for fraud. Although it is expected that some documents will be misplaced, the auditor should look for explanations as to why the documents are missing, and what steps were taken to locate the requested items. All too often, the auditors will select an alternate item or allow the auditee to select an alternate without determining whether or not problem exists.
6. Excessive Credit MemosSimilar to excessive voids, this technique can be used to cover the theft of cash. A credit memo to a phony customer is written out, and the cash is taken to make total cash balance.
http://www.fraudiscovery.com/detect.html http://www.auditnet.org/testing_20_ways.htm
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Automatically sift through large amounts of data to find hidden patterns, discover new insights and make predictions
What is Machine Learning?
• Identify most important factor (Attribute Importance)
• Predict customer behavior (Classification)
• Predict or estimate a value (Regression)
• Find profiles of targeted people or items (Decision Trees)
• Segment a population (Clustering)
• Find fraudulent or “rare events” (Anomaly Detection)
• Determine co-occurring items in a “baskets” (Associations)
A1 A2 A3 A4 A5 A6 A7
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Machine Learning/Data Mining ProvidesBetter Information, Valuable Insights and Predictions
Customer Months
Lease Churners vs. Loyal Customers
Insight & PredictionSegment #1
IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Lease Churner
Confidence = 100%Support = 8/39
Segment #3 IF CUST_MO > 7 AND INCOME < $175K, THENPrediction = Lease Churner,
Confidence = 83%Support = 6/39
Source: Inspired from Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J. A. Berry, Gordon S. Linoff
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Data Mining When Lack Sufficient Past ExamplesBetter Information, Valuable Insights and Predictions
Customer Months
Cell Phone Fraud vs. Loyal Customers
Source: Inspired from Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J. A. Berry, Gordon S. Linoff
?
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
CLASSIFICATION– Naïve Bayes– Logistic Regression (GLM)– Decision Tree– Random Forest– Neural Network– Support Vector Machine– Explicit Semantic Analysis
CLUSTERING– Hierarchical K-Means– Hierarchical O-Cluster– Expectation Maximization (EM)
ANOMALY DETECTION– One-Class SVM
TIME SERIES– State of the art forecasting using
Exponential Smoothing. – Includes all popular models
e.g. Holt-Winters with trends, seasons, irregularity, missing data
REGRESSION– Linear Model– Generalized Linear Model– Support Vector Machine (SVM)– Stepwise Linear regression– Neural Network– LASSO
ATTRIBUTE IMPORTANCE– Minimum Description Length– Principal Comp Analysis (PCA)– Unsupervised Pair-wise KL Div – CUR decomposition for row & AI
ASSOCIATION RULES– A priori/ market basket
PREDICTIVE QUERIES– Predict, cluster, detect, features
SQL ANALYTICS– SQL Windows, SQL Patterns,
SQL Aggregates
A1 A2 A3 A4 A5 A6 A7
• OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined• OAA includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data. etc,
Oracle’s Machine Learning & Adv. Analytics Algorithms
FEATURE EXTRACTION– Principal Comp Analysis (PCA)– Non-negative Matrix Factorization– Singular Value Decomposition (SVD)– Explicit Semantic Analysis (ESA)
TEXT MINING SUPPORT– Algorithms support text type– Tokenization and theme extraction– Explicit Semantic Analysis (ESA) for
document similarity
STATISTICAL FUNCTIONS– Basic statistics: min, max,
median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.
R PACKAGES– CRAN R Algorithm Packages
through Embedded R Execution– Spark MLlib algorithm integration
EXPORTABLE ML MODELS– REST APIs for deployment
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Where’s Waldo?
• Can you find Waldo in 10 seconds?
• Get ready….Go!
9
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Data Lab To Find Savings and Cost Reductions in Health Care Budget
• United Kingdom’s National Health Service
• Identify billing and identity fraud
• Optimize treatment by reducing use of less effective medical procedures
• Deployed Oracle Advanced Analytics, and Oracle Business Intelligence on Oracle Exadata and Oracle Exalytics
“With one vendor providing the whole solution, it’s very easy for us.”
Nina Monckton, NHS BSA
£350M Confirmed savings identified
Confidential – Oracle Internal/Restricted/Highly Restricted 11Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
“It is despicable people would even claim things they are not entitled to.”
– Sue Frith, head of NHS Counter Fraud Authority
12
Dentists inventing patients fuelling
NHS fraud bill of more than £1bn
every year, officials warn
'Despicable' fraud costs NHS in
England £1bn a year
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Remember Big Bird’s Song
One of these things is not like the others,
One of these things doesn't belong,
Can you tell which thing is not like the others
By the time I finish my song?
https://youtu.be/ueZ6tvqhk8U
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Approach to Finding Anomalies
• LOTS of data and few or zero examples
• Many attributes
• For an single attribute, a value may seem ~“OK”
• Taken collectively, many attributes may indicate “anomalous”
• Look for what is “different”
X1
X2
X3
X4
X1
X2
X3
X4
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
A Real Fraud ExampleMy credit card statement—Can you see the fraud?
May 22 1:14 PM FOOD Monaco Café $127.38May 22 7:32 PM WINE Wine Bistro $28.00…June 14 2:05 PM MISC Mobil Mart $75.00June 14 2:06 PM MISC Mobil Mart $75.00June 15 11:48 AM MISC Mobil Mart $75.00June 15 11:49 AM MISC Mobil Mart $75.00May 28 6:31 PM WINE Acton Shop $31.00May 29 8:39 PM FOOD Crossroads $128.14June 16 11:48 AM MISC Mobil Mart $75.00June 16 11:49 AM MISC Mobil Mart $75.00
Monaco?Gas Station?
All same $75 amount?
Pairs of $75?
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Start with a Business Problem Statement
“If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about solutions.”
― Albert Einstein
Clearly Define Problem
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | 17
Manage and Analyze All Your Data
Big Data SQL / R
SQL / R
Object Store
“Engineered Features”– Derived attributes that reflect domain knowledge—key to best models e.g.:• Counts• Totals• Changes
over time
Boil down the Data Lake
Data Scientists, R Users, Citizen Data Scientists
Architecturally, Many Options and Flexibility
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Multiple Data Scientist User Roles SupportedOracle’s Machine Learning/Advanced Analytics
Application DevelopersDBAs
Additional relevant data and “engineered
features”
Sensor data, Text, unstructured data, transactional data, spatial
data, etc.
Historical data Assembled historical data
Historical or Current Data to be “scored” for predictions
Predictions & Insights
Oracle Database 12c
R, Python Users, Data Scientists Data Analysts, Citizen Data ScientistsOML Notebook Users
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Big Data Analytics using w Graph
• Add new engineered features– Percentage time spent in
zones
– Amount time/encounters with persons of interest
• Better predictions using available data– At risk customers
– Government approval processes
– Medical claims
– IoT predictive analytics
Oracle Advanced Analytics/Machine Learning with Enhanced Graph & Spatial Data SourcesTransactional network relationships data
Transactional geo-location data summarized to % time spent in areas or number of “hits” near a location
Better data and “engineered features”; better predictive models and predictive insights
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
OAA Model Build and Real-time SQL Apply Prediction
BEGIN
DBMS_DATA_MINING.CREATE_MODEL(
model_name => 'BUY_INSUR1',
mining_function => dbms_data_mining.classification,
data_table_name => 'CUST_INSUR_LTV',
case_id_column_name => 'CUST_ID',
target_column_name => 'BUY_INSURANCE',
settings_table_name => ' CUST_INSUR_LTV_SET');
END;
/
Simple SQL Syntax
Select prediction_probability(BUY_INSUR1, 'Yes'
USING 3500 as bank_funds, 825 as checking_amount, 400 as credit_balance, 22 as age,
'Married' as marital_status, 93 as MONEY_MONTLY_OVERDRAWN, 1 as house_ownership)
from dual;
ML Model Build (PL/SQL)
Model Apply (SQL query)
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Fraud Prediction Simplified
drop table CLAIMS_SET;exec dbms_data_mining.drop_model('CLAIMSMODEL');create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');insert into CLAIMS_SET values ('PREP_AUTO','ON');commit;
begindbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION','CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');
end;/
-- Top 5 most suspicious fraud policy holder claimsselect * from(select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,
rank() over (order by prob_fraud desc) rnk from(select POLICYNUMBER, prediction_probability(CLAIMSMODEL, '0' using *) prob_fraudfrom CLAIMSwhere PASTNUMBEROFCLAIMS in ('2to4', 'morethan4')))where rnk <= 5order by percent_fraud desc;
Automated In-DB Analytical Methodology
POLICYNUMBER PERCENT_FRAUDRNK------------ ------------- ----------6532 64.78 12749 64.17 23440 63.22 3654 63.1 412650 62.36 5
Automated Monthly “Application” -- Just add:
Create
View CLAIMS2_30
As
Select * from CLAIMS2
Where mydate > SYSDATE – 30
Time measure: set timing on;
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Operationalizing and Embedding Analytics for Action
22
How long does it take to put a defined model into operational use?
?
??Length of time to put a model into production.
Based on 141 respondents who stated they are doing this today
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Introducing:Oracle Machine Learning SQL Notebook
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
ADWC + Oracle Machine Learning Notebooks
Oracle Exadata Cloud Service
Oracle Database Cloud Service
Express Cloud Service
Data Warehouse Services(EDWs, DW, departmental marts and sandboxes)
Autonomous Data Warehouse Cloud
Service Console
Built-in Access Tools
Oracle Machine Learning
Service Management
Autonomous Database Cloud
Oracle SQL Developer
Developer Tools
Oracle Object Storage CloudFlat Files and Staging
Data Integration
Services
Oracle Data Integration Platform Cloud
3rd Party DI on Oracle Cloud Compute
3rd Party DI On-premises
24
Oracle Analytics Cloud
Analytics
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Free Trial & Hands on Workshop:
https://go.oracle.com/adwc
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Oracle Advanced AnalyticsBrief Demo + Additional Reference Slides
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Tax Noncomplaince Audit SelectionTwo Example Approaches - There are many possible more!
• Anomaly Detection
– Build 1-Class Support Vector Machine (SVM) models on “normal or compliant” tax submissions
• Unsupervised machine learning when few know examples on which to train e.g. < 2%
– Build Decision Tree models for classification of Noncompliant tax submissions (yes/no) based on historical 2011 data• Supervised machine learning approach
when many known examples of target classes are available oh which to train
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
SQL Developer/Oracle Data Miner GUIAnomaly Detection—Simple Conceptual Workflow
Train on “normal” recordsApply model and sort on
likelihood to be “different”
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Financial Sector/Accounting/ExpensesAnomaly Detection
Simple Fraud Detection Methodology—1-Class SVM
More Sophisticated Fraud Detection Methodology—Clustering + 1-Class SVM
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Multiple Approaches To Detect Potential Fraud
1. Anomaly Detection (1-Class SVM) • Add feedback loop to purify the input training data over time
and improve model performance
2. Classification• IF you have a lot of examples (25% or more) of fraud on which
to train/learn
3. Clustering• Find records that don’t high very high probability to fit any
particular cluster and/or lie in the outlier/edges of the clusters
4. Hybrid of #3 and then #1• Pre-cluster the records to create “similar” segments and then
apply anomaly detection models for each cluster
5. Panel of Experts • i.e. 3 out of 5 models predict possibly anomalous above 40% or
any 1 out of N models considers this record unusual
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Getting Started—Oracle ML/AA Resources & LinksOracle Advanced Analytics Overview Information
• Oracle's Machine Learning and Advanced Analytics 12.2c and Oracle Data Miner 4.2 New Features preso• Oracle Advanced Analytics Public Customer References• Oracle’s Machine Learning and Advanced Analytics Data Management Platforms white paper on OTN • Oracle INTERNAL ONLY OAA Product Management Wiki and Beehive Workspace (contains latest
presentations, demos, product, etc. information)
YouTube recorded Oracle Advanced Analytics Presentations and Demos, White Papers • Oracle's Machine Learning & Advanced Analytics 12.2 & Oracle Data Miner 4.2 New Features YouTube video• Library of YouTube Movies on Oracle Advanced Analytics, Data Mining, Machine Learning (7+ “live” Demos
e.g. Oracle Data Miner 4.0 New Features, Retail, Fraud, Loyalty, Overview, etc.)• Overview YouTube video of Oracle’s Advanced Analytics and Machine Learning
Getting Started/Training/Tutorials Link to OAA/Oracle Data Miner Workflow GUI Online (free) Tutorial Series on OTN Link to OAA/Oracle R Enterprise (free) Tutorial Series on OTN Link to Try the Oracle Cloud Now! Link to Getting Started w/ ODM blog entry Link to New OAA/Oracle Data Mining 2-Day Instructor Led Oracle University course. Oracle Data Mining Sample Code Examples
Additional Resources, Documentation & OTN Discussion Forums• Oracle Advanced Analytics Option on OTN page• OAA/Oracle Data Mining on OTN page, ODM Documentation & ODM Blog• OAA/Oracle R Enterprise page on OTN page, ORE Documentation & ORE Blog• Oracle SQL based Basic Statistical functions on OTN• Oracle R Advanced Analytics for Hadoop (ORAAH) on OTN
Analytics and Data Summit , All Analytics, All Data, No Nonsense. March 12-14, 2019, Redwood Shores, CA
•
Send email now to [email protected] and you’ll get my “away message” with the links.
Fraud and Error in the NHS
Phil Godfrey – Data Scientist
NHS Business Services Authority
NHSBSA Portfolio
We have data covering:
• More than £35 billion NHS activity
• Billions of transactions
• Millions of NHS staff
• Tens of thousands of NHS
contracts
• Millions of citizens
Key facts data stats (2017/18)
1,000,000,000The number of
prescription items
processed every year
44,360,000FP17 Dental Claim
forms processed
LR 14,700,000 The number of claims
for free prescriptions
and dental procedures
checked by our loss
recovery teams
11,000,000Applications from
patients for help with
their NHS health
costs
3,840,000The number of calls
our Contact Centre
handled plus 785,000e-mails
5,500,000The number of
European Health
Insurance Cards
provided to UK
residents
2,200,000The number of
employee assignments
processed per month
for 1,500,000 NHS
Staff
1,400,000Actively contributing
NHS Pension
members, 550,000deferred members
paying 860,000recipients each month
3,000The number of people
registered and eligible
for support through the
England Infected
Blood Support
Scheme
£5,100,000Paid out in NHS
Student and Social
Work Bursaries to
80,000 Students
In the beginning
• What are the questions we want to answer?
Questions that not only demonstrate business value
but also test the requirements of the volumes of
data.
• What data do we have to answer these questions?
Where is it?
• How can we get the data from a transactional
system to an environment where we can do some
advanced analytics?
• Who is going to own the actions from the insight
created?
• How do we engage/educate/involve the rest of the
organisation?
Absolute must haves of our Data Lab
• Security
• Performance
• Hold large amounts of data and process
efficiently
• Analysis of the data without not knowing the
person
• No issues with the network
• Don’t need huge computing power on your
desk
• No need for specialist skills such as Hadoop /
Spark
• Availability of training – online and classroom
Why Oracle Exadata and Advanced Analytics?
• Engineered in-memory system (algorithms executed within
database)
• Existing Oracle database experience within the
organisation
• Courses and support available
• Affordability
• Speed to deploy
• Integration with R
• Unstructured data
• Intuitive to use
• Oracle understood our business problem
Our Data Lab
Insight
Operational Improvements
Finding Fraud and Error
Data
Lab 0102
03
Examples of how we use our data to find unwarranted variations
• Polypharmacy
• Fraud in the NHS
• Unusual Quantities
• Primary Care
• Homeopathic analysis
• Drug prices
Using Oracle & Advanced Analytics in the NHS
Polypharmacy• Polypharmacy is described as ‘the concurrent use of multiple medication items by one
individual’
– Appropriate polypharmacy – will extend life expectancy and improve quality of life
– Inappropriate polypharmacy – can be harmful and may increase the risk of adverse drug
reactions, impairing quality of life for patients
Discussion Points1. Analysis
2. Findings
3. Recommendations
4. Outcomes
Polypharmacy findings by area
Percentage of patients in each Clinical Commission Group area who were dispensed ten or more items in June
2018
Knowlsey, Liverpool and Bradford City CCGs have the 1st, 2nd and 3rd highest percentage of patients who were dispensed ten or more
medications.
Polypharmacy basket analysis
Specific insights:
• Identification of commonly co-
occurring medicinal items (Market
Basket Analysis).
– In particular, highest co-occurring
items are multiple doses of
Warfarin.
– Food items frequently occur in
multiple varieties together.
• Identification of co-occurrences of
medicines with potentially harmful
interactions.
Potentially harmful NSAID combinations
in March 2015
Monitoring polypharmacy
Dashboard CCG Practice NHSBSA Practice Patient
• The polypharmacy dashboard was first published in July 2017.
• CCGs monitor the dashboard and notify practices of their performance. Practices request
polypharmacy patient details from the NHSBSA, upon which to take action.
• Practices from only 11 CCGs have requested patient details; one of those active CCGs is
North East Hampshire and Farnham. Its performance has decreased significantly
compared to nationally.
National
CCG
NE Hampshire & Farnham
Fraud in the NHS
• It is estimated that £1.25bn of fraud is being committed each year by patients, staff and contractors
(NHSCFA, 2017)
• The sum represents about 1% of the NHS budget
Irregularities in the prescribing and dispensing of dependency forming
drugs
• Drug abuse isn't just about recreational drugs. Besides marijuana, legal medicines
are the most commonly abused drugs in the UK. Some prescribed drugs can be
addictive and dangerous
• Prescribing of addictive medicines has increased 3% over five years
• One patient in eleven (8.9%) who receives a prescription is prescribed one of
these medicines
• Antidepressant prescriptions in England have more than doubled in the past 10
years
• A recent survey also found that 7.6% of adults had taken a prescription-only
painkiller not prescribed to them
• From April 2017 to March 2018 there were approximately 149 million dependency
forming drug items dispensed at a cost of £1.2bn
Unusual Quantities
Purpose
• The purpose of this analysis was to provide insight into outlier analysis of items dispensed at prescription
level, with a focus on products with unusually high quantities.
• Identifying these outliers could help the NHSBSA to realise cost savings, improve our internal processes
and have a positive impact on patient care.
• Research conducted by NHS National Patient Safety Agency (2007) identified over three thousand
Medication incidents in 2007 (5% of all Medication incidents) were categorised as “Wrong Quantity”.
Although the impact of these incidents is not clear, they could have the potential to cause harm to
patients.
Unusual Quantities
How?
• Utilising Oracle Anomaly Detection.
• Descriptive Statistics using the aggregate function to
describe attributes of drugs:
– Minimum, Median, Average, Maximum quantities
• New functionality allows us to include “Partition By”
drug, which makes this analysis possible.
Why this is important?
• Without the Partition By we would combine similar but
very different quantities across all drugs:
– 1,000 tablets
– 1,000 ml
– 1,000 litres
• All of the above would be considered the same!
Unusual Quantities
Findings
• The findings have been shared with a pilot CCG, and the potential benefits are:
• Aim to reduce prescribing error by identifying highly prescribed quantities (which could be made in error).
• Identifying any products which are commonly appearing
• Influence training needs for any CCG’s/practices that appear prominently in the analysis.
Next
• Looking to movie this analysis to be business as usual, investigating using Oracle Integration with R to
allow a script to produce this with no user intervention.
DHSC Historic Drug Prices
Purpose?
• Department of Health and Social Care requested DALL to identify
products which display unusual price behaviour to help inform
investigations.
Why?
• An article published in the Times Newspaper identified 36 drugs of
particular interest which were supplied by the DHSC.
• Understanding attributes of these drugs
and prescribing patterns could provide a
methodology to identify additional drugs
DHSC Historic Drug Prices
Why?
• DALL applied this insight over our entire
prescriptions dataset (December 2013 to
September 2017) with the aim of identifying
additional products to investigate.
How?
• Regression analysis conducted on a drugs
pack price and reimbursement values over
time.
• Only those with a positive regression slope
on pack price and total cost returns products
with an increasing cost to the NHS.
• The top 25% of products are returned to
reduce the dataset.
DHSC Historic Drug Prices
Findings
Branded Products
– 1,120 branded products identified meeting the criteria, with
products being supplied by 111 wholesalers.
Generic Products
– 269 generic products identified meeting the criteria.
• All of the above products, along with a dashboard of data showcasing
the drug pack prices and cost to the NHS over time have been
provided to the DHSC for investigation.
Oracle SQL - Regression
• Regression functionality built-in to Oracle SQL.
• Allows regression to be calculated with single lines of code!
• As well as SLOPE , you’re also able to calculate all other
regression components, Intercept, R-Squared, average X and Y
values.
SELECT job_id, employee id id, salary, REGR_SLOPE(SYSDATE-hire_date, salary) OVER (PARTITION BY job_id) slopeFROM employees
Delivering Analysis through ePACT2
using Oracle Analytics Cloud
ePACT2 Project
• Provides authorised NHS users with free access to a wide range of prescription
data not available in the public domain due to its sensitivity
• Delivered originally using BICS and now Oracle Analytics Cloud
• NHSBSA create a number of dashboards that allow CCG’s to understand the
variation in prescribing across GP practices, within a CCG and nationally.
• Dashboards are created in conjunction with cross-NHS working groups,
Academic Health Science Networks (AHSN’s) to ensure clinical
appropriateness.
ePACT2 – Then and Now
ePACT
ePACT2
Export data to analyse and visualise
Analyse and visualise within ePACT2
ePACT2 Functionality
• ePACT2 delivered through Oracle Analytics Cloud, allows us to create dashboards and produce ad-hoc
analysis.
• To date, ePACT2 has delivered 13 dashboards to reduce costs and improve health of patients
Polypharmacy• To provide insight into prescribing of
multiple medicines for individual
patients in primary care.
• To provide potential metrics to show
the variation of Polypharmacy across
England to help CCGs and others to
target their work plans to address
PolyPharmacy.
• Reducing the inappropriate use of
multiple medicinal items by an
individual will:
– Reduce the risk of harmful interaction of
drugs on individual patients.
– Reduce spend of ineffective or harmful
prescribing.
– Reduce hospital admissions for adverse
reactions to medicines that could be
avoided.
Items which should not be routinely prescribed in Primary
Care• Introduced in November 2017, this
dashboard reports on spend and
volume of items prescribed on a list
of 18 treatments, drawn up with
family doctors and pharmacists,
deemed to be ineffective, over-
priced and of low clinical value.
• Based on our current drug mapping,
the cost to the NHS is £24.4M in the
latest quarter (February to April
2018).
• The data can be broken down to GP
Practice level, making it as relevant
as possible for care commissioners
to use.
Homeopathic Analysis
• Prescription data can be utilised to
explore prescribing for particular
medicines.
• Analysis shows unusual prescribing
pattern in 50-54 year olds.
• Bristol CCG accounts for 77.3% of all
50-54 homeopathic prescribing.
Drug Prices
• Analyses of Atropine 1%
eye drops explores drug
price variation over time.
• The price of this product
has increased over
19,000% since 2004.
• We can use our data to
understand how supplier
behaviour could have
impacted prices.
• This type of analysis can
be produced for any drug,
over a large time period.
In Summary
• Continue working with business areas and using our data to identify areas of fraud and error.
• Deploying some of our data science insights into operational efficiencies and anomaly detection.
• Collaborations with other NHS organisations to expand the availability of NHS data for data science.
Any Questions?