Tao Zhao*, Vikram Jayaram, Bo Zhang and Kurt J. Marfurt, University of Oklahoma
Huailai Zhou, Chengdu University of Technology
Lithofacies Classification in the Barnett Shale Using Proximal Support Vector Machines
Outlines
Introduction
Theory and Formulations
Testing and Classification
Discussions
Conclusions
Acknowledgements
Introduction
What is the problem?
• Huge amount of data
• High dimensionality
• Nonlinear relation
Introduction
What is a proximal support vector machine (PSVM)?
Proposed by Fung and Mangasarian (2001, 2005)
A recent variant of the support vector machine (SVM) (Cortes and Vapnik, 1995)
A supervised machine learning technique that can recover the latent relation between existing properties and measurements
[Figure: Classification between male and female. Two people are plotted by height and hair length: P1 (6'2'', 1 in.) and P2 (5'7'', 20 in.).]
[Figure: Classification between male and female. A new person (an IT specialist, 5'8'' tall with 15 in. hair) cannot be placed in either class from height and hair length alone.]
Need more dimensions!
Introduction
Why do we use PSVM?
1. Explicit geologic meaning for each class
2. Faster than traditional SVM
3. Superior to artificial neural networks (ANNs)
Introduction
How do we use PSVM?
We applied PSVM to delineate shale and limestone in the Barnett Shale from both seismic and well log data.
[Figure: General stratigraphy of the Ordovician to Pennsylvanian section in the Fort Worth Basin (FWB) through a well in the study area (after Loucks and Ruppel, 2007).]
Theory and Formulations
Why do we use PSVM?
Unsupervised learning
Attributes of the ten samples:
Sample  Color         Sphericity
1       Red           High
2       Red           Mid-Low
3       Green         Mid-Low
4       Purple        Medium
5       Blue          Mid-Low
6       Yellow-Green  Low
7       Green         Low
8       Red-Yellow    Low
9       Blue-Purple   Medium
10      Red-Yellow    High
[Figure: The ten samples plotted by color (Red, Yellow, Green, Blue, Purple) against sphericity (Low, Medium, High). The true classes are Low, Medium, and High Vitamin C, but the unsupervised clusters can only be given tentative labels: Low Vitamin C? Medium Vitamin C? Medium-High Vitamin C? High Vitamin C?]
Theory and Formulations
Why do we use PSVM?
Supervised learning
[Figure: The same ten samples plotted by color (Red, Yellow, Green, Blue, Purple) against sphericity (Low, Medium, High). With supervised learning each cluster carries a definite label: Low Vitamin C, Medium Vitamin C, or High Vitamin C.]
Theory and Formulations
Fundamentals for PSVM
[Figure: Cartoon illustration of a 2D PSVM classifier]
[Figure: Cartoon illustration of a 3D PSVM classifier]
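The cartoons show the classifier only pictorially. As a minimal sketch of what a linear PSVM computes, assuming the formulation of Fung and Mangasarian (2001), the weights and offset come from a single regularized linear solve rather than from a quadratic program. The function names, the default value of `nu`, and the toy data are illustrative assumptions and are not taken from the slides.

```python
import numpy as np

def train_linear_psvm(A, d, nu=1.0):
    """Fit a linear PSVM (sketch after Fung and Mangasarian, 2001).

    A  : (m, n) array, one training sample per row
    d  : (m,) array of class labels, +1 or -1
    nu : weight on the fitting error (illustrative default)
    Returns the normal vector w and the offset gamma of the plane x.w = gamma.
    """
    m, n = A.shape
    E = np.hstack([A, -np.ones((m, 1))])      # E = [A, -e]
    lhs = np.eye(n + 1) / nu + E.T @ E        # (I / nu + E'E)
    rhs = E.T @ d                             # E'De, because De equals d
    z = np.linalg.solve(lhs, rhs)
    return z[:n], z[n]

def classify(X, w, gamma):
    """Label each row of X by the side of the plane x.w = gamma it falls on."""
    return np.where(X @ w - gamma >= 0, 1, -1)

# Toy 2D example with two well-separated groups (made-up numbers).
A = np.array([[2.0, 2.0], [3.0, 3.0], [2.0, 3.0],
              [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]])
d = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
w, gamma = train_linear_psvm(A, d)
labels = classify(A, w, gamma)
print(labels)
```

Because training reduces to one small linear system of size n + 1, this is consistent with the claim that PSVM is faster than a traditional SVM.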
Theory and Formulations
Mapping into higher dimensional space
[Figure: Cartoon illustration of a linearly inseparable problem. Class A lies on the circle $x^2 + y^2 = 1$ and class B on the circle $x^2 + y^2 = 2$. Each point $(x, y)$ is mapped to $(x, y, x^2 + y^2)$.]
These two classes are now separable by a plane in 3D.
[Figure: The mapped points $(x, y, x^2 + y^2)$ of class A ($x^2 + y^2 = 1$) and class B ($x^2 + y^2 = 2$), with the decision boundary drawn between them.]
[Figure: The decision boundary between class A and class B, shown for the points $(x, x^2 + y^2)$.]
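The mapping in this cartoon can be checked numerically. The sketch below places points on the two circles, lifts them with $(x, y) \mapsto (x, y, x^2 + y^2)$, and confirms that a horizontal plane separates the lifted classes. The number of points and the plane height of 1.5 are illustrative assumptions; any height strictly between 1 and 2 would do.

```python
import numpy as np

theta = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)

def circle(radius_squared):
    """Points (x, y) on the circle x^2 + y^2 = radius_squared."""
    r = np.sqrt(radius_squared)
    return np.column_stack([r * np.cos(theta), r * np.sin(theta)])

def lift(points):
    """Map each (x, y) to (x, y, x^2 + y^2)."""
    z = np.sum(points**2, axis=1)
    return np.column_stack([points, z])

class_a = circle(1.0)   # A: x^2 + y^2 = 1
class_b = circle(2.0)   # B: x^2 + y^2 = 2

lifted_a = lift(class_a)
lifted_b = lift(class_b)

# In (x, y) no straight line separates the two circles, but after lifting
# the third coordinate is about 1 for A and about 2 for B.
plane_height = 1.5      # assumed value, halfway between the two classes
a_below = bool(np.all(lifted_a[:, 2] < plane_height))
b_above = bool(np.all(lifted_b[:, 2] > plane_height))
print(a_below, b_above)  # True True
```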
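Testing and Classification
Seismic waveform classification
Binary classification between shale and limestone in a Barnett Shale play
[Figure: Each trace (t.1, t.2, …) is represented by a waveform with eight dimensions (1 to 8) and passed to the PSVM classifier, which labels it as shale or limestone.]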
Testing and Classification
Seismic waveform classification
Sample traces are selected by interpreters across the survey
[Figure: Time slice at 1376 ms showing the selected sample traces; waveforms are taken from a 14 ms time window.]
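As a sketch of how a windowed waveform becomes a feature vector, the code below cuts a 14 ms window out of one trace. It assumes a 2 ms sample interval, which is not stated in the slides but would turn a 14 ms window (both end samples included) into the eight dimensions shown on the previous slide. The function name, the start time of 1370 ms, and the synthetic trace are hypothetical.

```python
import numpy as np

def extract_window(trace, t_start_ms, window_ms, dt_ms, t0_ms=0.0):
    """Return the samples of `trace` from t_start_ms to t_start_ms + window_ms.

    trace  : 1D array of amplitudes, first sample recorded at t0_ms
    dt_ms  : sample interval in milliseconds
    Both end samples are included, so the result has window_ms / dt_ms + 1 values.
    """
    first = int(round((t_start_ms - t0_ms) / dt_ms))
    count = int(round(window_ms / dt_ms)) + 1
    return trace[first:first + count]

dt_ms = 2.0                                           # assumed sample interval
times = np.arange(0.0, 2000.0, dt_ms)
trace = np.sin(2.0 * np.pi * 30.0 * times / 1000.0)   # synthetic 30 Hz trace

features = extract_window(trace, t_start_ms=1370.0, window_ms=14.0, dt_ms=dt_ms)
print(features.shape)  # (8,)
```

Stacking one such vector per trace gives the sample matrix that a classifier is trained on.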
Testing and Classification
Seismic waveform classification
Testing the robustness
Percentage of traces  Number of        Number of       Correctness (%)
used in training      training traces  testing traces
10%                   16               145             83.45
20%                   32               129             87.6
30%                   48               113             84.1
40%                   64               97              80.41
50%                   80               81              90.12
50%                   81               80              93.75
60%                   97               64              93.75
70%                   113              48              93.75
80%                   129              32              90.63
90%                   145              16              93.75
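The bookkeeping behind such a robustness test can be sketched as follows. The total of 161 traces is read off the table (16 + 145); the rounding rule and both function names are assumptions, since the slides do not say how the splits were drawn.

```python
def split_counts(n_total, fraction):
    """Number of training and testing traces for a given training fraction."""
    n_train = round(n_total * fraction)
    return n_train, n_total - n_train

def correctness(predicted, actual):
    """Percentage of testing traces whose predicted label matches the true one."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)

n_total = 161   # 16 training + 145 testing traces in the first table row
for fraction in (0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9):
    print(fraction, split_counts(n_total, fraction))

# With 16 testing traces, one misclassified trace gives 15 / 16 = 93.75%,
# which is the value reported for the 90% row.
actual = ["shale"] * 8 + ["limestone"] * 8
predicted = ["shale"] * 8 + ["limestone"] * 7 + ["shale"]
print(correctness(predicted, actual))  # 93.75
```

With so few testing traces in the last rows, a single trace changes the correctness by several percent, which is worth keeping in mind when comparing rows.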
Testing and Classification
Seismic waveform classification
Classification result
[Figure: PSVM classification into shale and limestone, displayed along inline and crossline sections with a time axis from 1370 to 1384 ms. Labeled units: Marble Falls Limestone, Upper Barnett Shale, Forestburg Limestone, Lower Barnett Shale. North arrow; scale bar 0.5 miles.]
Testing and Classification
Well log classification
Well base map
[Figure: Base map with inline and crossline axes (ticks from 25 to 175 and from 25 to 200) showing wells A, B, C, and D, marked as training well or testing well. Scale bar 0.5 miles.]
Testing and Classification
Well log classification
Well log classification correlating with lithologic interpretation
Training correctness: 89%
Testing correctness: 88%
[Figure: Well log tracks from about 7800 to 8600 ft depth: P-wave velocity (5000 to 20000 ft/s), gamma ray (0 to 150 API), and density (1.5 to 3 g/cc), next to the lithology from well log interpretation and the lithology from PSVM (blue: limestone; green: shale). Labeled units: Marble Falls Limestone, Upper Barnett Limestone, Upper Barnett Shale, Forestburg Limestone, Lower Barnett Shale.]
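The three logs have very different units, so some scaling is usually applied before they are combined into one input vector per depth sample. The slides do not say how the inputs were scaled; the sketch below simply assumes min-max scaling with the display ranges of the log tracks. The dictionary keys, the function name, and the log values are made up for illustration.

```python
import numpy as np

# Display ranges of the three log tracks, used here as scaling limits.
# Using them for normalization is an assumption, not the authors' stated method.
LOG_RANGES = {
    "p_wave_ft_s": (5000.0, 20000.0),
    "gamma_ray_api": (0.0, 150.0),
    "density_g_cc": (1.5, 3.0),
}

def build_features(logs):
    """Stack the three logs into an (n_depths, 3) matrix scaled to 0..1."""
    columns = []
    for name, (low, high) in LOG_RANGES.items():
        values = np.asarray(logs[name], dtype=float)
        columns.append((values - low) / (high - low))
    return np.column_stack(columns)

# Three hypothetical depth samples.
logs = {
    "p_wave_ft_s": [12500.0, 20000.0, 5000.0],
    "gamma_ray_api": [75.0, 0.0, 150.0],
    "density_g_cc": [2.25, 3.0, 1.5],
}
X = build_features(logs)
print(X)
```

Each row of `X` is then one sample for the classifier, with the interpreted lithology at that depth as its label.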
Discussions
Seismic waveform classification
The boundary between the two PSVM classes closely matches the interpreted formation boundary.
A reliable classification rate can be achieved by training with as little as 0.2% of the data.
It can provide a reliable reference when human interpretation is tedious.
[Figure: A zoom-in view of the previous PSVM classification map, with the Forestburg Limestone, Upper Barnett Shale, and Lower Barnett Shale labeled. Scale bar 0.3 miles.]
Discussions
Well log classification
The blind-well testing correctness (88%) is close to the training correctness (89%), which indicates that the PSVM classifier generalizes to a well some distance away from the training well.
Three fundamental well logs are used as inputs instead of more advanced elastic properties, and they still yield a reliable classification.
It can provide a fast and reliable reference when human interpretation is tedious.
[Figure: A segment from the previous PSVM well log classification result]
Discussions
One step further?
Originally, SVMs were built to solve binary classification problems.
Multiclass PSVM has been proposed by researchers, and we improved the classification robustness.
We then applied multiclass PSVM to brittleness index estimation in the Barnett Shale, and it has provided promising results.
Discussions
Brittleness index estimation
[Figure: Brittleness index (BI) estimation using PSVM on well logs from four rock properties. Normalized brittleness index (0 to 0.7) plotted against depth (6600 to 8000 ft), with samples grouped into classes BI_N = 1 to BI_N = 10. Legend entries: BI_N, BI_C, σ.]
Discussions
Brittleness index estimation
[Figure: Estimated brittleness index (BI) using PSVM on seismic prestack inversion. Section of BI_C (color bar 0 to 10) against CDP number (30 to 180) and time t0 (1.2 to 1.4 s). Labeled horizons: Marble Falls, Upper Barnett, Forestburg, Lower Barnett, Viola. Scale bar 0.2 miles.]
Conclusions
PSVM lithofacies classification showed promising results on both seismic and well log data.
Multiclass PSVM classifiers are also available and ready for more complicated applications.
Brittleness index estimation demonstrates the capability of PSVM in 3D multi-attribute classification using a vector of seismic attributes.
We also anticipate comparisons between PSVM and other supervised (e.g., artificial neural networks, or ANN) and unsupervised (e.g., self-organizing maps, or SOM, and generative topographic mapping, or GTM) classification algorithms.
Acknowledgements
Thanks to Devon Energy for providing the data, to all sponsors of the Attribute-Assisted Seismic Processing and Interpretation (AASPI) consortium for their generous sponsorship, and to our colleagues for their valuable suggestions.
THANKS
Questions and suggestions?
References
Cortes, C. and V. Vapnik, 1995, Support-vector networks: Machine Learning, 20, 273-297.
Fung, G. and O. L. Mangasarian, 2001, Proximal support vector machine classifiers: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 77-86.
Fung, G. M. and O. L. Mangasarian, 2005, Multicategory proximal support vector machine classifiers: Machine Learning, 59, 77-97.
Loucks, R. G. and S. C. Ruppel, 2007, Mississippian Barnett Shale: Lithofacies and depositional setting of a deep-water shale-gas succession in the Fort Worth Basin, Texas: AAPG Bulletin, 91, 579-601.
Mangasarian, O. L. and E. W. Wild, 2006, Multisurface proximal support vector machine classification via generalized eigenvalues: IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 69-74.
Platt, J. C., N. Cristianini, and J. Shawe-Taylor, 1999, Large margin DAGs for multiclass classification: Advances in Neural Information Processing Systems, 12, 547-553.
Roy, A., B. J. Dowdell, and K. J. Marfurt, 2013, Characterizing a Mississippian tripolitic chert reservoir using 3D unsupervised and supervised multiattribute seismic facies analysis: An example from Osage County, Oklahoma: Interpretation, 1, SB109-SB124.
Roy, A., A. S. Romero-Peláez, T. J. Kwiatkowski, and K. J. Marfurt, 2014, Generative topographic mapping for seismic facies estimation of a carbonate wash, Veracruz Basin, southern Mexico: Interpretation, 2, SA31-SA47.
Torres, A. and J. Reveron, 2013, Lithofacies discrimination using support vector machines, rock physics and simultaneous seismic inversion in clastic reservoirs in the Orinoco Oil Belt, Venezuela: SEG Technical Program Expanded Abstracts 2013, 2578-2582.
Appendix
Multiclass classification?
How we assign a class to an unknown sample
1. Set class “A” as the pilot class and make all classes active.
2. Examine the binary PSVM classification factor (CF) of the current pilot class against every other active class.
3. If all CFs are positive, assign the current pilot class to this sample and exit.
4. Otherwise, find the class that corresponds to the most negative CF value, make that class the new pilot class, make the current pilot class inactive, and return to step 2.
Example of a classification factor table
       A      B      C      D
A            0.3   -1.2    2.3
B    -0.3           0.8   -1.1
C     1.2   -0.8          -1.9
D    -2.3    1.1    1.9
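The pilot-class procedure can be traced on the example classification factor table. The sketch below is one reading of the flowchart rather than the authors' code; the dictionary layout and the function name are illustrative. Each row of the table holds the CFs of that class, as pilot, against the other classes.

```python
# CF[pilot][other]: classification factor of the pilot class against another
# class, copied from the example table (the diagonal is empty).
CF = {
    "A": {"B": 0.3, "C": -1.2, "D": 2.3},
    "B": {"A": -0.3, "C": 0.8, "D": -1.1},
    "C": {"A": 1.2, "B": -0.8, "D": -1.9},
    "D": {"A": -2.3, "B": 1.1, "C": 1.9},
}

def assign_class(cf, first_pilot="A"):
    """Follow the pilot-class flowchart until one pilot has only positive CFs."""
    active = set(cf)                       # all classes start as active
    pilot = first_pilot
    while True:
        # CFs of the pilot against every other class that is still active
        rivals = {c: cf[pilot][c] for c in active if c != pilot}
        if all(value > 0 for value in rivals.values()):
            return pilot                   # all CFs positive: assign and exit
        challenger = min(rivals, key=rivals.get)   # most negative CF
        active.remove(pilot)               # current pilot becomes inactive
        pilot = challenger

print(assign_class(CF))  # D
```

For this table the pilot moves from A to C (CF of -1.2), then from C to D (CF of -1.9), and D is accepted because its only remaining active rival, B, has a positive CF of 1.1. Each pass retires one class, so at most one fewer binary comparison rounds than the number of classes is needed.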
Appendix
Multiclass classification?
Testing results for multiclass classification
Dataset       Sample size  Testing size  Dimension  Number of classes  nu     delta   Sample reduced to (%)  Training correctness  Testing correctness
Pendigits     7494         3498          16         10                 2000   0.0001  10                     97.72%                97.11%
Pendigits     7494         3498          16         10                 2000   0.0001  20                     99.25%                97.20%
Pendigits     7494         3498          16         10                 2000   0.0001  30                     99.56%                98.20%
Pendigits     7494         3498          16         10                 2000   0.0001  40                     99.64%                97.71%
Pendigits     7494         3498          16         10                 2000   0.0001  50                     99.73%                97.94%
letter_scale  15000        5000          16         26                 20000  0.1     10                     82.69%                82.06%
letter_scale  15000        5000          16         26                 20000  0.1     20                     89.70%                89.42%
letter_scale  15000        5000          16         26                 20000  0.1     30                     93.23%                91.86%
letter_scale  15000        5000          16         26                 20000  0.1     40                     94.83%                93.44%