2016 03-16 digital energy luncheon
Digital Energy Luncheon
March 16, 2016
Machine Learning: Fundamentals and E&P Applications
2
Introduction to Southwestern Energy
Southwestern Energy Company (NYSE: SWN) is a leading natural gas and oil company with operations predominantly in the United States, engaged in exploration, development and production activities, including related natural gas gathering and marketing.
Source: http://www.swn.com/
3
Digital Energy Luncheon
Machine Learning: Fundamentals and E&P Applications
Machine Learning encompasses data acquisition, transmission,
retention, analysis, and reduction. The expected outgrowth of 24x7
data systems and operations centers is Knowledge Engineering and
Data Intensive Analytics, also known as Machine Learning. This
presentation will develop Machine Learning concepts and apply them to
the Upstream O&G industry. Specific focus will be given to the
fundamental concepts and definitions of Machine Learning and to its
application.
4
Machine Learning
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
~Tom Mitchell
Source: Mitchell, T. (1997). Machine Learning, McGraw Hill.
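Mitchell's definition can be made concrete with a toy sketch. Here the task T is predicting a response from an input, the experience E is a growing set of observed examples, and the performance measure P is the mean absolute error on a fixed set of test points. The function true_response, the sample values, and the nearest-neighbour learner are illustrative assumptions and are not taken from the presentation.

```python
def true_response(x):
    """Hypothetical noise-free relationship the learner is trying to capture."""
    return 2.0 * x


def predict(memory, x):
    """1-nearest-neighbour: answer with the outcome of the closest remembered example."""
    _, nearest_y = min(memory, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def performance(memory, test_points):
    """P: mean absolute prediction error over the test points (lower is better)."""
    errors = [abs(predict(memory, x) - true_response(x)) for x in test_points]
    return sum(errors) / len(errors)


# E: examples arrive one at a time (assumed values, for illustration only).
experience_stream = [0, 10, 5, 2, 8, 1, 9, 3, 7, 4, 6]
# T: predict the response at points the learner has never seen.
test_points = [i + 0.5 for i in range(10)]

memory = []
history = []
for x in experience_stream:
    memory.append((x, true_response(x)))
    history.append(performance(memory, test_points))

# The error recorded after each new example never goes up in this setup.
print(history[0], history[-1])
```

In this sketch the program "learns" in Mitchell's sense because P, measured on T, gets better as E grows.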
5
Use Case #1 – Lateral Placement
Source: http://geology.com/articles/horizontal-drilling/
6
Precursors to Machine Learning
Predictive Analytics
• Focuses on Prediction
– Based on Known Properties
– Learned from Training Data
Data Mining
• Focuses on Discovery
– Unknown Properties in Data
– The Analysis Phase of Knowledge Discovery
Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data”
~Mark Reynolds
Source: Mark Reynolds, compilation
7
Machine Learning: Data into Wisdom
Source: Mark Reynolds, compilation
Disciplines: Seismic, Drilling, Completions, Production
Progression: Data --> Information (Visualization) --> Knowledge (Forensics) --> Understanding (Analysis & Mining) --> Wisdom (Anticipating Application)
Data feeds: RT Frac, Daily Rpts, Well Plan, RT Drill, Geo-steer, AFE, RT Prod, Reservoir
8
Machine Learning on the Hype Curve
9
Use Case #2 – Offset Torque & Drag
Source: Gefei Liu, PVI Connecting Dots with Lines Using Drilling Software, August 20, 2013 http://www.pvisoftware.com/blog/2013/08/
10
The Four Paradigms in O&G
• Empirical: O&G is where we found it
• Theoretical: O&G is where we expect it
• Computational: O&G is where we estimate it
• Data Exploration: O&G is where we infer it
Source: Mark Reynolds, compilation
11
Machine Learning in the 4th Paradigm
The Catalyst
• Data captured by instruments
• Data generated by simulations
• Data acquired by sensor networks
The Destination
• Solutions from data analysis
• Solutions from data mining
• Solutions from visualization
• Solutions from drill down
• Solutions for bottom line
• Solutions using eScience
“eScience is the set of tools and technologies to support data federation and collaboration”
~ Jim Gray
Source: eScience and the Fourth Paradigm: Data-Intensive Scientific Discovery and Digital Preservation, Tony Hey, Microsoft Research http://www.alliancepermanentaccess.org/wp-content/uploads/2011/12/apa2011/15_%28Nov11%29TonyHey-APA%20Meeting.pdf
12
Machine Learning in the 4th Paradigm
Acquire, Analyze, Annunciate, Archive, Analyze, Anticipate, Apply
Data --> Information (Visualization) --> Knowledge (Forensics) --> Understanding (Analysis & Mining) --> Wisdom (Anticipating Application)
• Creating Informational Accessibility and Transparency
• Discovering Experiential Performance Improvements
• Segmenting Processes and Process Results
• Replacing Human Decision w/ Automated Algorithms
• Innovating New Models, Products, Services
Source: Mark Reynolds, compilation
13
Modern Data Exploration
Unsupervised Learning
Supervised Learning
Reinforcement Learning
Semi-Supervised Learning
24/7
Predictive Analytics
Machine Learning
Data Mining
AI
Source: Mark Reynolds, compilation
14
Principal Concepts in Machine Learning
• Unsupervised Learning
– Data is unlabeled
• Supervised Learning
– Teach and train with data that is well labeled with a defined output
• Reinforcement Learning
– Validity of data alignment is served as feedback
• Semi-Supervised Learning
– Some of the data is labeled, some is unlabeled
Source: Mark Reynolds, compilation
15
Use Case #3 – Unsupervised Learning
Unsupervised Learning Torque increases in the curve
Source: Mark Reynolds, compilation
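As a rough illustration of unsupervised learning on torque data, the sketch below groups unlabeled torque readings into two clusters with a one-dimensional k-means. The readings, the choice of two clusters, and the min/max starting centroids are assumptions made for illustration; the slide does not say which algorithm was used.

```python
def kmeans_1d(values, max_iter=100):
    """Two-cluster k-means on a list of numbers; returns (centroids, labels)."""
    centroids = [min(values), max(values)]  # deterministic starting guess
    labels = []
    for _ in range(max_iter):
        # Assignment step: each reading joins the nearer centroid.
        labels = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            updated.append(sum(members) / len(members) if members else centroids[k])
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels


# Hypothetical torque readings: lower values first, higher values later in the curve.
torque = [5.0, 5.2, 4.8, 5.1, 9.0, 9.3, 8.8, 9.1]
centroids, labels = kmeans_1d(torque)
print(centroids, labels)
```

No labels are supplied; the split between the lower-torque and higher-torque groups comes only from the structure of the data.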
16
Textbook Process of Machine Learning
Phase 1) Learning: Training Data --> Pre-Processing --> Learning --> Error Analysis --> Model
Phase 2) Prediction: New Data --> Model --> Predictable Result
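The two phases on this slide can be sketched in a few lines. This is a minimal example that assumes a straight-line model fitted by ordinary least squares; the training values and the function names (preprocess, learn, error_analysis, predict) are hypothetical and chosen only to mirror the boxes in the diagram.

```python
import math


def preprocess(rows):
    """Pre-processing: drop rows with a missing input or output."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def learn(rows):
    """Learning: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def error_analysis(model, rows):
    """Error analysis: root-mean-square error of the model on the given rows."""
    slope, intercept = model
    squared = [(slope * x + intercept - y) ** 2 for x, y in rows]
    return math.sqrt(sum(squared) / len(rows))


def predict(model, x):
    """Prediction: apply the learned model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Phase 1) Learning (assumed training data, one row has a missing output)
training_data = [(1, 5), (2, 8), (3, None), (4, 14), (5, 17)]
clean = preprocess(training_data)
model = learn(clean)
rmse = error_analysis(model, clean)

# Phase 2) Prediction on new data
result = predict(model, 10)
print(model, rmse, result)
```

In practice the error analysis would be run on data held out from training; it is run on the training rows here only to keep the sketch short.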
17
Algorithmic Approaches
• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics
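To make the first entry in the list concrete, the sketch below fits a one-level decision tree (a decision stump), the simplest case of mapping an observation to a conclusion. The torque values, the in_curve labels, and the assumption that larger values indicate class 1 are all hypothetical.

```python
def fit_stump(values, labels):
    """Pick the threshold on one feature that misclassifies the fewest examples.

    Simplifying assumption: larger values point to class 1.
    """
    ordered = sorted(set(values))
    # Candidate thresholds are the midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_threshold, best_errors = None, len(values) + 1
    for threshold in candidates:
        errors = sum((v > threshold) != bool(lab) for v, lab in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


def classify(threshold, value):
    """Map a new observation to a conclusion using the learned threshold."""
    return 1 if value > threshold else 0


# Hypothetical labeled observations: torque reading and whether the bit is in the curve.
torque = [1, 2, 3, 4, 6, 7, 8, 9]
in_curve = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, training_errors = fit_stump(torque, in_curve)
print(threshold, training_errors, classify(threshold, 5.5))
```

A full decision tree repeats this split recursively on each side of the threshold and across several features.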
18
Use Case #4: Compositional Reservoir
SPE 154505
A novel approach for treating the phase stability and phase split problems in compositional reservoir simulation…
~Vassilis Gaganis, et al
Source: SPE 154505: Machine Learning Methods to Speed up Compositional Reservoir Simulation, June 2012
19
Machine Learning: The “Data Layer”
• Engineering the Source
– Signals, content, and characterizations
• Engineering the Data
– Address errant data
– Address valid spurious data
– Address data quality
• Engineering the Store
– Repository
– Recall and Reporting
– Representations
Data Acquisition
Data Transmission
Data Retention
Data Analysis
Data Reduction
Source: Mark Reynolds, compilation
20
Machine Learning: Data Diversity
• Macro (or field-level)
– Spatial
– Temporal
• Pad (or offset)
– Spatial
– Temporal
• Well (or wellbore)
– Spatial
– Temporal
• External
– Uploads
– Political, Climate, etc.
• The 3 Cs of Data Quality
– Consistency
– Correctness
– Completeness
– [#4] Currency
– [#5] Conformity
Source: Mark Reynolds, compilation
Data Diversity - Spatial, Temporal, Referential
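The first three Cs can be checked mechanically before any learning step. The sketch below is one possible interpretation: completeness means no missing fields, correctness means a value inside a plausible range, and consistency means timestamps that move forward for each well. The record layout, field names, and torque limits are assumptions made for illustration.

```python
REQUIRED_FIELDS = ("well", "t", "torque")  # hypothetical record layout


def audit(records, low=0.0, high=50.0):
    """Return the indices of records that fail each data-quality check."""
    issues = {"completeness": [], "correctness": [], "consistency": []}
    last_time = {}  # most recent accepted timestamp per well
    for index, record in enumerate(records):
        # Completeness: every required field must be present and not None.
        if any(record.get(field) is None for field in REQUIRED_FIELDS):
            issues["completeness"].append(index)
            continue
        # Correctness: the value must fall inside the assumed physical range.
        if not (low <= record["torque"] <= high):
            issues["correctness"].append(index)
        # Consistency: timestamps must strictly increase within a well.
        previous = last_time.get(record["well"])
        if previous is not None and record["t"] <= previous:
            issues["consistency"].append(index)
        else:
            last_time[record["well"]] = record["t"]
    return issues


records = [
    {"well": "A", "t": 0, "torque": 10.0},
    {"well": "A", "t": 1, "torque": None},   # missing value
    {"well": "A", "t": 2, "torque": -5.0},   # outside the assumed range
    {"well": "A", "t": 1, "torque": 11.0},   # timestamp goes backwards
    {"well": "B", "t": 0, "torque": 12.0},
]
issues = audit(records)
print(issues)
```

Currency and conformity could be added in the same style, for example by comparing timestamps to the current time or validating units against a schema.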
21
Machine Learning: The “Output Layer”
• Engineering the Store
– Data distribution
– Data staging
• Engineering the Recall
– Simple query
– Cube v Matrix
• Engineering the Use Case
– Destination: human
– Destination: machine
Classification
Regression
Clustering
Density Estimation
Dimensional Reduction
22
Use Case #5: Decline Curve Anomaly
Source: Mark Reynolds, compilation
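One way a decline curve anomaly might be detected is to fit a decline model to early production and flag later months that stray from it. The sketch below assumes an exponential decline, q(t) = qi * exp(-D * t), fitted by least squares on the logarithm of the rate. The production values, the 12-month fitting window, and the 20% tolerance are hypothetical, and the slide does not state which method was actually used.

```python
import math


def fit_exponential_decline(months, rates):
    """Least-squares fit of ln(q) = ln(qi) - D * t; returns (qi, D)."""
    logs = [math.log(q) for q in rates]
    n = len(months)
    mean_t = sum(months) / n
    mean_l = sum(logs) / n
    slope = (
        sum((t - mean_t) * (l - mean_l) for t, l in zip(months, logs))
        / sum((t - mean_t) ** 2 for t in months)
    )
    return math.exp(mean_l - slope * mean_t), -slope


def flag_anomalies(months, rates, qi, decline, tolerance=0.2):
    """Return the months whose rate deviates from the fitted curve by more than tolerance."""
    flagged = []
    for t, q in zip(months, rates):
        expected = qi * math.exp(-decline * t)
        if abs(q - expected) / expected > tolerance:
            flagged.append(t)
    return flagged


# Synthetic production history with one injected 40% drop at month 18.
months = list(range(24))
rates = [1000.0 * math.exp(-0.05 * t) for t in months]
rates[18] *= 0.6

# Fit on the first 12 months, then screen the whole history.
qi, decline = fit_exponential_decline(months[:12], rates[:12])
anomalies = flag_anomalies(months, rates, qi, decline)
print(qi, decline, anomalies)
```

Real production data is noisy, so a practical version would need a tolerance tuned to that noise and possibly a hyperbolic rather than exponential decline.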
23
The Fast Data ecosystem in O&G
Land
Drilling
Reservoir Completion
Water
Production
Steering Regulatory
Midstream
Source: Assorted web images
24
Security – OPC / SCADA / IIoT
Source: Industrial control systems and SCADA cyber-security, 11 August 2014, By Dr Richard Piggin http://eandt.theiet.org/magazine/2014/08/cyber-security-new-battlefront.cfm
25
Machine Learning must be Integrated
Systems & Knowledge Engineer
O&G Systems
Control Systems
Remote Systems
Information Systems
Embedded Systems
Robotic Systems
Data Fusion
Real-Time Systems
Look-Back Analysis
Look-Ahead Systems
Land and Regulatory
Geology Geophysics
Drilling Engineering
Completion Engineering
Production Engineering
Reservoir Engineering
Systems Engineering
Source: Mark Reynolds, compilation
26
Algorithmic Approaches (revisited)
• Decision Tree Learning
– Maps observation to conclusions
• Association Rule Learning
– Discovering interesting relations
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds policy to map states to desired outcome
• Representation Learning
– Principal component analysis
• Similarity & Metric Learning
– Pairs of examples train others
• Sparse Dictionary Learning
– Datum as linear combinations
• Genetic Algorithms
– Mimics natural heuristics
27
Keep Your Eye on the Prize
Data
Information
Knowledge
Understanding
Wisdom
Application
The question is NOT “How can we … ?”
But instead “What is the objective?”
( or “Why?” )
28
And – Keep Your Eye on the Machine
29
Mark Reynolds
Mark Reynolds Vitae
• Southwestern Energy
• Lone Star College
• Intent Driven Designs
• Scan Systems
• Sikorsky Aircraft
• General Dynamics
• Southwestern Energy Email
– [email protected]