Innovate Analytics with Oracle Data Mining & Oracle R
DESCRIPTION
The “big data” buzz has garnered a lot of interest lately. Many assume that tapping big data and discovering new insights in data already being collected requires investing in, and skilling up on, new tools. Truth be told, you may already have the tools needed to tap into your big data potential. In this webinar, learn how:
- Oracle Data Mining and Oracle R can generate insights without the hype or large investment in big data products
- Oracle Data Mining integrates seamlessly with Oracle Business Intelligence Enterprise Edition (OBIEE)
- Statistical analytics can be achieved by using Oracle R
Come away with insights on real-world scenarios to implement quick wins at your organization.
http://www.capgemini.com/oracle
TRANSCRIPT
Better intelligence, smarter decisions
Christian Screen, Capgemini Oracle Analytics Practice (formerly BI Consulting Group)
Tuesday, December 11, 2012
Innovate Analytics with Oracle Data Mining & R
Christian Screen
© 2012 Capgemini. All rights reserved.
o Solutions Engineer at Capgemini
o 15 Years in Technology
o Co-Author of Oracle BIEE 11g – A Hands-On Tutorial
o Podcast & Blog at ArtofBi.com
o Oracle ACE
o Oracle Deputy CTO
o Oracle Hyperion Certified Consultant
o BI Evangelist
Agenda
o What is Data Mining?
o What is R?
o Oracle BI 11g + Oracle Data Mining!
o Use Cases
o Getting your Organization Started
o Oracle Endeca & Data Mining
o With Oracle BI 11g
o Oracle Predictive Analytics stack - Tie it all together
o Q & A
Getting the Message Today
What I want to leave you with today…
o An understanding of Data Mining and R
o How Predictive Analytics is what your organization needs now
o Motivation for starting a Predictive Analytics project
o How Oracle BI 11g works with Data Mining & R
What is Data Mining?
“The process that attempts to discover
patterns and hidden knowledge in large data sets in order to
aid the decision making process.”
[Word cloud: Discovery, Patterns, Fraud Detection, Algorithms, Clustering, Classification, Association Analysis, Predictive Analytics, Regression, Summarization, Affinities, Probabilities, Statistical Confidence, Unbiased, Large Data Sets, Anomaly Detection]
Data Mining Use Cases
Example #1 – Market Basket Analysis
Determine which combinations of items purchased together generate the highest or lowest margin
Predict new item sales when the item is sold together with a product sold at a discount price
Calculate, based on previous sales of individual products, which store would benefit most from joint product sales
Example: computer + monitor, printer + paper
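As a rough illustration of the market basket idea, the sketch below counts how often item pairs are bought together and derives support, confidence, and lift for each pair. It is written in Python for brevity and is not Oracle Data Mining's association algorithm. The function name pair_stats and the transactions are invented for illustration. Margin is left out because the slide gives no price data.

```python
from collections import Counter
from itertools import combinations


def pair_stats(baskets):
    """Return {(a, b): (support, confidence of a -> b, lift)} for item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    stats = {}
    for (a, b), together in pair_counts.items():
        support = together / n                    # share of baskets with both items
        confidence = together / item_counts[a]    # how often b appears when a does
        lift = confidence / (item_counts[b] / n)  # > 1 means bought together more than chance
        stats[(a, b)] = (support, confidence, lift)
    return stats


# Hypothetical transactions, invented for illustration only.
baskets = [
    ["computer", "monitor"],
    ["computer", "monitor", "printer", "paper"],
    ["printer", "paper"],
    ["computer"],
    ["paper"],
]
stats = pair_stats(baskets)

# Print pairs from strongest to weakest lift.
for pair, (support, confidence, lift) in sorted(
    stats.items(), key=lambda kv: kv[1][2], reverse=True
):
    print(pair, round(support, 2), round(confidence, 2), round(lift, 2))
```

Pairs with lift well above 1, such as computer + monitor in this toy data, are the candidates for bundling or discount experiments.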
Example #2 – Relationship Patterns
Think LinkedIn and Facebook
Based on similar existing data points, locate other accounts and rank them by affinity
Predict how many friends/people may accept an offer based on previous behavior and the behavior of similar users
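One simple way to picture "rank based on affinity" is to score each account by how many data points it shares with a target account. The sketch below uses Jaccard similarity for that score, which is an assumption made for illustration and not a method named in the presentation. The account names and attributes are made up.

```python
def jaccard(a, b):
    """Jaccard similarity: shared attributes divided by all distinct attributes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_affinity(target, accounts):
    """Rank every other account by its similarity to the target account."""
    scores = [
        (name, jaccard(accounts[target], attrs))
        for name, attrs in accounts.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical profile data points, invented for illustration only.
accounts = {
    "ana": {"oracle", "bi", "r", "austin"},
    "ben": {"oracle", "bi", "dallas"},
    "cam": {"golf", "austin"},
    "dee": {"oracle", "bi", "r", "austin", "sql"},
}
ranking = rank_by_affinity("ana", accounts)

for name, score in ranking:
    print(name, round(score, 2))
```

A real system would weight attributes and add behavioral history, but the ranking step has the same shape.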
Oracle Data Mining
Oracle’s DM Solution for Solving Business Problems
Oracle Data Mining is installed automatically when you install Oracle Database Enterprise Edition.
GUI is included with SQL Developer
A mature DM Platform
What is R?
o An open source programming language and software environment
for statistical computing and graphics.
o Standard developed and maintained by the R Foundation
o Part of the Free Software Foundation GNU project
o Watch Movie: Revolution OS (2002)
o Client or Client Server
o Many images available on Amazon EC2
o RStudio seems to be a leading open source IDE for R
What is R?
What Else?
o Really good at statistical and graphical techniques, including linear
and nonlinear modeling, classical statistical tests, time-series
analysis, classification, clustering, and others
o Text Mining and Lexical scoping
o Extends via user-created Libraries/Packages
o Automated report and document generation
o Think associations, algorithms, statistics and
surprisingly good graphics
Data Mining vs. R
R
o Does not require a database
o Complex algorithms can require large amounts of memory and processor
o Visual output built in or by libraries
o User-built libraries make for expansive capabilities and logic to be developed
o Skill-up required for R Script language
o Data does not need to be clean though it helps
o Uses connectors for data sources
Data Mining
o Develop Statistical Models
o Resides in the Database
o No visual output
o SQL Developer plug-in available
o Leverage some existing DB skills
o Likes cleansed data
Oracle BI & Data Mining / R
Getting the Organization Started is Simple
o Start today! Think Competitive Advantage.
o Data Scientists are heroes
o Experimentation is key
o Leverage existing investments
o It’s part of Business Intelligence
o Used by a few, developed by fewer
o Get Endeca Information Discovery
o Get R & RStudio
o Explore Predictive Analytics
Oracle BI & Data Mining / R
Bringing it All Under Business Intelligence makes sense…
o Traditional BI is necessary but can become stagnant
o Predictive Analytics complements BI
o Existing BI investments distribute Predictive Analytics
o All areas of an organization can benefit
Oracle Predictive Analytics Now
BI, Discovery, Data Mining, & R
Predictive Analytics
o Includes traditional BI concepts
o Expands the BI Team
o Expands knowledge of information
o Tactically targets specific scenarios of the business
o Provides non-linear perspective on the business
Traditional BI / Analytics
o Tells us how we’re doing today
o Distributes information
o Tracks Performance Metrics
o Makes sense of ERP data
Oracle Predictive Analytics
Endeca Information Discovery (EID) vs. Oracle Data Mining / R
Endeca is mainly a three-part tool (ETL, Server, Client)
• Data Mining typically depends on a cleansed source, i.e. DW
Endeca includes powerful features, visualizations, and guided search navigation, perfect for self-service discovery
• Data Mining is database driven so no inherent graphics
• R has visuals and client tools for development but is not distributed
Endeca projects perform well in an iterative design-develop-feedback loop
• Data Mining and R are built from models and algorithms, and are typically consumed by other systems
Oracle Predictive Analytics Stack
Oracle Advanced Analytics in Oracle 11g R2 RDBMS
• Data Mining & Enterprise R
o Aims to eliminate need for SAS, SPSS, Matlab, etc.
Enterprise-R
• Oracle’s flavor of R
Exadata Machine
Exalytics (as consumer)
• Endeca
• OBIEE
• Essbase
Engineered to work together…
Start planning now for running your business in the future of analytics.
Upcoming Endeca Webinar
Endeca Information Discovery in the Wild
Predictive Advertising Analytics in Media
Scott Schlesinger Paper on Predictive Analytics in the Movies
(http://www.digitalmarketingsuite.com/url/60534)
Data Mining with Oracle BI 11g
Oracle R Enterprise in Oracle BI Dashboards
Data Mining with Oracle BI 11g
Oracle R with Oracle BI 11g
ORE in Oracle BI 11g dashboard – Flight Delays
Data Mining & R with Oracle BI 11g
Embedded R Script using Parameterization from Oracle BI

    select *
    from table(rqTableEval(
        cursor(select ARRDELAY, DISTANCE, DEPDELAY, YEAR, MONTH, DAYOFMONTH,
                      DEPTIME, ARRTIME, UNIQUECARRIER, FLIGHTNUM, ORIGIN, DEST,
                      ORIGIN||'-'||DEST ROUTE
               from ontime_s
               where year >= valueof(NQ_SESSION.OR_ARG1)
                 and year <= valueof(NQ_SESSION.OR_ARG2)
                 and DEPDELAY is not NULL),
        cursor(select 1 max1, 1 pos1, 'mod' name1,
                      to_number(null) max2, to_number(null) pos2, to_char(null) name2,
                      total, chunk, value
               from ontime_lm),
        'select ARRDELAY, DISTANCE, DEPDELAY, YEAR, MONTH, DAYOFMONTH,
                DEPTIME, ARRTIME, UNIQUECARRIER, FLIGHTNUM, ORIGIN, DEST,
                ORIGIN||''-''||DEST ROUTE, 1 PRED from ontime_s',
        'PredictDelays-score'))
    order by 1, 2, 3
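The R function behind 'PredictDelays-score' is not shown in the slides, so the following is only a guess at the general pattern: filter flights by a year range and a non-null departure delay, then append a PRED column from a previously fitted model. The Python sketch assumes a one-variable linear model of arrival delay on departure delay. The function names fit_line and score, and all of the data, are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def score(rows, model, year_from, year_to):
    """Keep rows in the year range with a known DEPDELAY and add a PRED column."""
    slope, intercept = model
    scored = []
    for row in rows:
        if row["DEPDELAY"] is None or not (year_from <= row["YEAR"] <= year_to):
            continue  # mirrors the WHERE clause of the query above
        scored.append({**row, "PRED": intercept + slope * row["DEPDELAY"]})
    return scored


# Invented history of (DEPDELAY, ARRDELAY) pairs, used only to fit the toy model.
history = [(0, -2), (10, 8), (20, 18), (30, 28)]
model = fit_line([dep for dep, _ in history], [arr for _, arr in history])

# Invented flights to score; the year range stands in for the two BI session variables.
flights = [
    {"YEAR": 2006, "DEPDELAY": 15, "ROUTE": "SFO-JFK"},
    {"YEAR": 2007, "DEPDELAY": None, "ROUTE": "ORD-DFW"},
    {"YEAR": 2009, "DEPDELAY": 5, "ROUTE": "BOS-ATL"},
]
scored = score(flights, model, 2005, 2007)

for row in scored:
    print(row["ROUTE"], row["PRED"])
```

In the dashboard version, the year bounds come from Oracle BI session variables and the scoring runs inside the database instead of in a client script.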
Questions?
www.capgemini.com/bim
The information contained in this presentation is proprietary. ©2012 Capgemini. All rights reserved
Contact Christian [email protected]
Access Oracle BI Training at Capgemini’s BICG University