Predictive Dynamix Software
Predictive Dynamix: Turning Business Experience into Better Decisions
Proprietary & Confidential
Predictive Dynamix – Company Profile
• Founded in 1999
• Based in Houston, TX
• Provides advanced predictive data mining software solutions
• Three primary focus areas:
– Predictive Suite: Software for commercial data mining analysis
– Predictive Engines: Software components for integrating predictive data mining functionality into vertical applications
– Professional services
Predictive Dynamix – About the Founder
Paul Duke has over 15 years of commercial experience in developing advanced pattern recognition applications.
• Predictive Dynamix:
– Chief product visionary and architect
– Leads professional services
• Prior to Predictive Dynamix:
– NeuroCorp: Vice President of Development
  • Headed software development & consulting services
  • Developed methods for geospatial data mining, demographic analysis, sales forecasting, and retail network optimization
– Texaco: Artificial Intelligence Group
  • Headed applied R&D and consulting
  • Pioneered a variety of engineering, geoscience, and marketing applications of neural networks, rule-based systems, & other AI methods
• Instructed “Decision Support & Expert Systems” MBA course at Rice University
Predictive Modeling
Predictive modeling maximizes the value of domain experience and data in order to make better decisions.
[Diagram: Historical / Operational Data → Predictive Model → model outputs]
Model outputs:
• Pattern classification/scoring
• Trend forecasting
• Event detection
• Outlier detection
Contributing Disciplines: Statistics / Artificial Intelligence / Signal & Image Processing / Economics & Finance / Information Theory / Control Theory / Operations Research / Medicine / Social & Behavioral Sciences
Predictive Data Mining Applications
In Virtually Every Industry:
– Agriculture
– Automotive
– Charities
– Electronics
– Energy
– Finance
– Insurance
– Healthcare
– Travel & Leisure
– Pharmaceuticals
– Retail
– Telecommunications
Application areas:
• Customer Relationship Management & Marketing
– Targeted Marketing
– Churn Forecasting
– Consumer Propensity Modeling
– Personalized Content Management
• Site Selection
• Price Elasticity Modeling
• Category Management
• Fraud Detection
• Credit Scoring
• Political Forecasting
• Countless science & engineering applications
Knowing Your Customers
Customer Information draws on three data sources:
• Historical Customer Sales Data
– Previous Purchase Categories
– Size, Recency, & Frequency of Previous Purchases
– Payment History
– Other Historical Sales Factors
• GIS-based Data (joined by Geocode)
– Customer Demographics: Age, Income, Children, Education, Home Ownership, etc.
– Regional Competitive Intensity
– Local Commercial Makeup
– Other Locational Factors
• Web-based Clickstream Data
– “Interest” Categories
– Recency of Website Visits
– Frequency of Website Visits
– Other Web Factors
Resulting models:
• Forecast likelihood of response to a product campaign
• Predict lifetime customer value
• Forecast likelihood of a transaction being fraudulent
• Score for credit worthiness
• Forecast likelihood of a customer switching to a competitor
Knowing Your Stores
Store Information (Client & Competitor) draws on three data sources:
• Detailed Site Data
– Historical Sales
– Brand
– Facility Factors: Capacity, Modernness, Layout, Signage, APC’s
– Operational Factors: Pricing, Staffing, Hours, Service, Upkeep, Promotions, Merchandising
– Site Traffic
• GIS-based Site Data (joined by Geocode)
– Site Demographics: Population, Income, Businesses, Education
– Local Competitive Intensity
– Local Commercial Makeup
– Other Locational Factors
• GIS-based Customer Data (joined by Geocode)
– Site Proximity
– Historical Purchases
– Demographic Profile
– Other Customer Factors
Resulting models and decisions:
• Forecast new site sales potential
• Identify site rehab / upgrade / purchase candidates
• Demand forecasting & inventory management
• Implement dynamic pricing
• Identify category mix
• Target campaigns via profiles, purchase history, & proximity
Value of Predictive Modeling
• Maximize the value of the corporate data infrastructure
• Maximize the value of your business experience
=> Better Decisions
• Predictive data mining provides the means to recognize important patterns and trends that exist across many variables
• Using predictive data mining to model business dynamics, many variables can be effectively assimilated to produce a forecast
Predictive Suite Overview
Comprehensive workbench for analysis and model building
Data & Model Analysis:
• Graphical, Crosstab, and Statistical Analysis
• Data Sampling
• SQL Querying
• Automated Variable Selection
• ROC/Lift Analysis
• Response Surface
• What-if Analysis
Model Types:
• Multiple Regression
• Neural Networks
• Self-Organizing Map
• Dynamic Clustering
• Decision Tree
• Fuzzy Logic Rules
Predictive Engines – Software Components
Non-UI API Components
• Model Engines: Clustering, Neural Networks, Fuzzy Logic, Regression, Decision Tree
• Optimization: Genetic Algorithms & Annealing
User Interface Components
• Model UI: CMDynamix, NNDynamix, FPDynamix, RGDynamix, DTDynamix
• Data Analysis: Interactive Graph, Interactive Crosstab, Interactive Statistics
• Database: Database SQL, Database Browser, Database Sampler
• Model Analysis: Model Response Surface, Model Variable Sensitivities, Interactive Model Execution, Model Lift/ROC
Components integrate into: Vertical Applications
RGDynamix – Multiple Regression
• Regression models use a linear equation to approximate relationships
• General least squares algorithm minimizes error across the training dataset
• Single weight per input factor results in the simplest model type
• Best fit application types:
– Substantial set of example data
– Useful to understand model equation/weights
– Highest performance training and prediction
– Simple representation is sufficient to model application dynamics
Y = Σ (Ii × Wi) + Constant
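The regression equation above can be illustrated with a minimal Python sketch. It assumes ordinary least squares solved through the normal equations, which may differ from what RGDynamix does internally; the function names (`solve_linear_system`, `fit_least_squares`, `predict`) and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(inputs, targets):
    """Return (weights, constant) minimizing squared error on the training data."""
    # Append a column of ones so the constant is fitted like any other weight.
    rows = [list(row) + [1.0] for row in inputs]
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    solution = solve_linear_system(xtx, xty)
    return solution[:-1], solution[-1]


def predict(weights, constant, row):
    """Y = sum(Ii * Wi) + Constant, one weight per input factor."""
    return sum(i * w for i, w in zip(row, weights)) + constant


# Toy training data generated from Y = 2*I1 - 3*I2 + 5 (hypothetical example).
inputs = [(1.0, 2.0), (2.0, 0.0), (3.0, 1.0), (0.0, 4.0), (5.0, 5.0)]
targets = [2 * a - 3 * b + 5 for a, b in inputs]
weights, constant = fit_least_squares(inputs, targets)
```

Because the toy targets are exactly linear in the inputs, the fitted weights and constant recover the generating equation up to floating-point error, which makes the "useful to understand model equation/weights" point concrete.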
Regression Model
[Figure: plot of Y and N example cases]
NNDynamix – Neural Networks
• Multi-layer perceptron models are highly accurate for forecasting & classification
• Back propagation learning algorithm can discover complex, non-linear relationships from data
• Rigorous training methodology for building optimal models
• Best fit application types:
– Substantial set of example data
– Limited knowledge of first principles
– Highest accuracy requirements
– High performance requirements
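As a rough illustration of back propagation on a multi-layer perceptron, the Python sketch below trains a tiny network with one hidden layer on the XOR pattern, a standard example of a non-linear relationship. It is not NNDynamix code: the network size, learning rate, sigmoid activation, and all names are assumptions chosen for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, x):
    """Return (hidden activations, output) for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    out = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, out


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)


def train_epoch(net, data, lr):
    """One full-batch gradient-descent step using back propagation."""
    n_hidden, n_in = len(net["w_hidden"]), len(data[0][0])
    g_wh = [[0.0] * n_in for _ in range(n_hidden)]
    g_bh = [0.0] * n_hidden
    g_wo = [0.0] * n_hidden
    g_bo = 0.0
    for x, target in data:
        hidden, out = forward(net, x)
        # Output delta: gradient of half the squared error through the sigmoid.
        d_out = (out - target) * out * (1.0 - out)
        g_bo += d_out
        for j in range(n_hidden):
            g_wo[j] += d_out * hidden[j]
            # Propagate the error back to hidden unit j.
            d_hid = d_out * net["w_out"][j] * hidden[j] * (1.0 - hidden[j])
            g_bh[j] += d_hid
            for i in range(n_in):
                g_wh[j][i] += d_hid * x[i]
    # Apply the averaged gradients only after the whole batch is processed.
    scale = lr / len(data)
    for j in range(n_hidden):
        net["w_out"][j] -= scale * g_wo[j]
        net["b_hidden"][j] -= scale * g_bh[j]
        for i in range(n_in):
            net["w_hidden"][j][i] -= scale * g_wh[j][i]
    net["b_out"] -= scale * g_bo


rng = random.Random(0)  # fixed seed so the run is repeatable
n_in, n_hidden = 2, 3
net = {
    "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    "b_hidden": [rng.uniform(-1, 1) for _ in range(n_hidden)],
    "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
    "b_out": rng.uniform(-1, 1),
}
xor_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
initial_error = mean_squared_error(net, xor_data)
for _ in range(2000):
    train_epoch(net, xor_data, lr=0.5)
final_error = mean_squared_error(net, xor_data)
```

Comparing `initial_error` with `final_error` shows how much training reduced the error. How close the network gets to a perfect fit depends on the starting weights and the number of epochs, which is one reason a disciplined training methodology matters.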
Neural Network Model
[Figure: plot of Y and N example cases]
CMDynamix – Cluster Models
• Cluster-based models for forecasting, categorization, & exploratory data analysis
• Discovers complex, non-linear relationships from data by grouping similar cases together into clusters
• Flexibility for forecasting, outlier detection, pattern completion, error estimation, & other apps
• Best fit application types:
– Substantial set of example data
– Benefit from qualitative information about model forecasts
– Model needs to be extensible to different outputs
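One way to picture cluster-based forecasting is sketched below in Python. It assumes plain k-means for grouping similar cases, uses each cluster's mean output as the forecast, and reports distance to the nearest cluster center as a simple outlier signal. The algorithm choice, names, and data are illustrative assumptions, not a description of CMDynamix internals.

```python
import math


def nearest(centroids, point):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(c, point) for c in centroids]
    best = min(range(len(centroids)), key=distances.__getitem__)
    return best, distances[best]


def assign(centroids, points):
    """Group point indices by their nearest centroid."""
    members = [[] for _ in centroids]
    for index, point in enumerate(points):
        members[nearest(centroids, point)[0]].append(index)
    return members


def fit_cluster_model(points, outputs, initial_centroids, iterations=10):
    """Plain k-means; each cluster also stores the mean output of its members."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        members = assign(centroids, points)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(points[i][d] for i in m) / len(m) for d in range(len(c))) if m else c
            for c, m in zip(centroids, members)
        ]
    members = assign(centroids, points)
    cluster_outputs = [sum(outputs[i] for i in m) / len(m) if m else None for m in members]
    return centroids, cluster_outputs


def forecast(model, point):
    """Forecast = mean output of the nearest cluster; distance flags outliers."""
    centroids, cluster_outputs = model
    index, distance = nearest(centroids, point)
    return cluster_outputs[index], distance


# Hypothetical cases: two input factors per case and one output value.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
outputs = [10.0, 12.0, 11.0, 50.0, 52.0, 48.0]
model = fit_cluster_model(points, outputs, initial_centroids=[points[0], points[3]])

typical_value, typical_distance = forecast(model, (1.1, 1.0))
outlier_value, outlier_distance = forecast(model, (20.0, 20.0))
```

The same clusters could store the mean of any other output column, which is one reading of "extensible to different outputs". A large distance, as for the point (20, 20), suggests the case resembles nothing in the training data.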
Cluster Model
[Figure: plot of Y and N example cases]
DTDynamix – Decision Trees
• Decision tree models use IF-THEN rules to generate classifications and predictions
• CHAID algorithm derives rules from data via recursive partitioning
• Variable selection is performed as the model is learning
• Best fit application types:
– Substantial set of example data
– Model explainability is more important than statistical accuracy
– Data is categorical
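The recursive-partitioning idea can be sketched in Python as follows. This is a simplified, CHAID-style illustration only: it picks the split variable by the raw Pearson chi-square statistic and omits the category merging and adjusted p-values of full CHAID. The 3.84 threshold is the 5% critical value for one degree of freedom, so it suits only two-by-two tables; the field names and data are hypothetical.

```python
from collections import Counter


def chi_square(rows, feature, target):
    """Pearson chi-square statistic for the feature-by-target contingency table."""
    observed = Counter((r[feature], r[target]) for r in rows)
    feature_totals = Counter(r[feature] for r in rows)
    target_totals = Counter(r[target] for r in rows)
    n = len(rows)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for t, t_count in target_totals.items():
            expected = f_count * t_count / n
            stat += (observed[(f, t)] - expected) ** 2 / expected
    return stat


def build_tree(rows, features, target, min_stat=3.84):
    """Recursively split on the feature with the largest chi-square statistic."""
    majority = Counter(r[target] for r in rows).most_common(1)[0][0]
    scored = [(chi_square(rows, f, target), f) for f in features]
    best_stat, best = max(scored, default=(0.0, None))
    if best is None or best_stat < min_stat:
        return majority  # leaf: predict the most common class
    remaining = [f for f in features if f != best]
    branches = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, remaining, target, min_stat)
    return (best, branches, majority)


def classify(tree, row):
    """Follow IF-THEN branches until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, branches, majority = tree
        tree = branches.get(row[feature], majority)
    return tree


# Hypothetical categorical data: home ownership predicts response, region does not.
rows = [
    {"owns_home": "yes", "region": "east", "responded": "Y"},
    {"owns_home": "yes", "region": "west", "responded": "Y"},
    {"owns_home": "yes", "region": "east", "responded": "Y"},
    {"owns_home": "yes", "region": "west", "responded": "Y"},
    {"owns_home": "no", "region": "east", "responded": "N"},
    {"owns_home": "no", "region": "west", "responded": "N"},
    {"owns_home": "no", "region": "east", "responded": "N"},
    {"owns_home": "no", "region": "west", "responded": "N"},
]
tree = build_tree(rows, ["owns_home", "region"], "responded")
```

On this data the tree splits on `owns_home` and never uses `region`, which shows variable selection happening as part of learning. The resulting structure reads directly as rules such as IF owns_home = yes THEN responded = Y.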
Decision Tree Model
[Figure: plot of Y and N example cases]
FPDynamix – Fuzzy Logic Rules
• Fuzzy logic rules for describing first principles relationships
• Models can be discovered from data or be specified by the model analyst
• Model weights can be constrained, preserving model dynamics while adapting to data
• Best fit application types:
– Application dynamics are generally understood
– Limited example data
– Need to audit decision logic
– Model needs to adapt predictably to new data
[Figure: Low / Average / High fuzzy sets shown for three variables]
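A small Python sketch can make the Low / Average / High rule idea concrete. It assumes inputs scaled to the range 0 to 1, triangular-style membership functions, minimum as the AND operator, and a weighted-average combination of rule outputs (zero-order Sugeno style). These choices, the rules, and all names are hypothetical and are not taken from FPDynamix.

```python
def fuzzify(x):
    """Map a value scaled to 0..1 onto Low / Average / High memberships."""
    return {
        "Low": max(0.0, 1.0 - 2.0 * x),
        "Average": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "High": max(0.0, 2.0 * x - 1.0),
    }


RULES = [
    # (conditions, predicted response score, rule weight), all hypothetical
    ({"income": "High", "recency": "High"}, 0.9, 1.0),
    ({"income": "Average", "recency": "High"}, 0.6, 1.0),
    ({"income": "Low"}, 0.1, 1.0),
    ({"recency": "Low"}, 0.2, 0.5),
]


def infer(inputs, rules=RULES):
    """Combine all rules into one score; returns None if no rule fires."""
    memberships = {name: fuzzify(value) for name, value in inputs.items()}
    numerator = denominator = 0.0
    for conditions, score, weight in rules:
        # AND is taken as the minimum membership across the rule's conditions.
        strength = weight * min(memberships[var][label] for var, label in conditions.items())
        numerator += strength * score
        denominator += strength
    return numerator / denominator if denominator else None


high_income_recent = infer({"income": 1.0, "recency": 1.0})
low_income_recent = infer({"income": 0.0, "recency": 1.0})
```

Each rule stays readable and auditable, and adapting to data would only change the rule weights or scores. Constraining those values is what keeps the model's behavior predictable.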
Fuzzy Logic Model
[Figure: plot of Y and N example cases]
Genetic Algorithms
• Stochastic optimization algorithm based on Darwinian concepts of natural selection and survival of the fittest
• Population of solutions is generated and evaluated. Features of better solutions have a higher probability of surviving into future generations
• Best fit applications:
– Many solution parameters to search
– Highly non-linear dynamics
– No direct gradient information available
– No target information
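The generate, evaluate, and survive loop described above can be sketched in Python. This is a generic textbook-style genetic algorithm over bit strings, with tournament selection, one-point crossover, mutation, and elitism. The parameter values, the function names, and the toy fitness function (count of ones) are assumptions for illustration, not the Predictive Engines implementation.

```python
import random


def tournament(rng, population, fitness):
    """Pick the fitter of two randomly chosen solutions."""
    a, b = rng.choice(population), rng.choice(population)
    return a if fitness(a) >= fitness(b) else b


def evolve(fitness, n_genes, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Return (best solution, best fitness per generation)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_gen = [ranked[0][:]]  # elitism: the best solution always survives
        while len(next_gen) < pop_size:
            mother = tournament(rng, population, fitness)
            father = tournament(rng, population, fitness)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation flips each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy objective: maximize the number of ones. Only fitness values are used,
# with no gradient and no target output.
best, history = evolve(fitness=sum, n_genes=20)
```

Because the best solution is copied unchanged into each new generation, the recorded best fitness never decreases. Whether the search reaches the global optimum depends on the population size, mutation rate, and number of generations.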
Predictive Suite – Data Analysis
• Graphical Analysis
• Crosstab Analysis
• Statistical Analysis
Predictive Suite – Model Analysis
• ROC/Lift Analysis
• What-if Analysis
• Model Response Surface
• Model Sensitivities
In Summary…
• Predictive Engines Components
• Tailored Solutions
• State-of-the-Art Predictive Algorithms
• Data Mining Applications
• Extensive Industry Experience