PNN and inversion-B
Reveal Physical and Acoustic Attributes within Oil and Gas Industry
A probabilistic neural network (PNN) is a feedforward neural network, which
was derived from the Bayesian network and a statistical algorithm called Kernel
Fisher discriminant analysis. It was introduced by D.F. Specht in the early 1990s.
In a PNN, the operations are organized into a multilayered feedforward network
with four layers:
• Input layer
• Pattern (hidden) layer
• Summation layer
• Output layer
PNN is often used in classification problems. When an input is presented, the first
layer computes the distance from the input vector to the training input vectors.
This produces a vector whose elements indicate how close the input is to each
training input. The second layer sums the contribution for each class of inputs and
produces its net output as a vector of probabilities. Finally, a compete transfer
function on the output of the second layer picks the maximum of these
probabilities, and produces a 1 (positive identification) for that class and a 0
(negative identification) for non-targeted classes.
Input layer
Each neuron in the input layer represents a predictor variable. For categorical variables, N-1 neurons are used when there are N categories. The input layer standardizes the range of the values by subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer.
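The standardization described above can be sketched in a few lines of Python. This is only an illustration: the function name robust_standardize and the sample values are assumptions, and the "inclusive" quartile method is one of several reasonable conventions.

```python
import statistics

def robust_standardize(values):
    """Scale values by subtracting the median and dividing by the interquartile range."""
    median = statistics.median(values)
    # Quartiles of the sample; the "inclusive" method is an assumption made here.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero; cannot scale")
    return [(v - median) / iqr for v in values]

# Hypothetical predictor values with one outlier.
scaled = robust_standardize([2.0, 4.0, 6.0, 8.0, 100.0])
print(scaled)
```

Because the median and interquartile range are barely affected by the outlier, the remaining values keep a sensible spread after scaling.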
Pattern layer
This layer contains one neuron for each case in the training data set. It stores the values of the predictor variables for the case along with the target value. A hidden neuron computes the Euclidean distance of the test case from the neuron’s center point and then applies the RBF kernel function using the sigma values.
Summation layer
In a PNN there is one summation neuron for each category of the target variable. The actual target category of each training case is stored with each hidden neuron; the weighted value coming out of a hidden neuron is fed only to the summation neuron that corresponds to the hidden neuron’s category. The summation neurons add the values for the class they represent.
Output layer
The output layer compares the weighted votes for each target category accumulated in the summation layer and uses the largest vote to predict the target category.
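The four layers can be put together in a minimal sketch. It assumes a Gaussian RBF kernel with a single shared sigma and inputs that are already standardized; the function name pnn_classify, the facies labels and the sample points are hypothetical and only serve the illustration.

```python
import math
from collections import defaultdict

def pnn_classify(x, train_X, train_y, sigma=1.0):
    """Classify x with a minimal PNN; returns (predicted_class, class_probabilities)."""
    sums = defaultdict(float)
    for center, label in zip(train_X, train_y):
        # Hidden neuron: squared Euclidean distance to the stored training case,
        # passed through a Gaussian RBF kernel (kernel choice is an assumption).
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
        activation = math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation: the activation is added only to the neuron of its own class.
        sums[label] += activation
    total = sum(sums.values())
    if total == 0.0:
        raise ValueError("all kernel activations underflowed; try a larger sigma")
    probs = {label: s / total for label, s in sums.items()}
    # Output layer: the class with the largest accumulated vote wins.
    return max(probs, key=probs.get), probs

# Hypothetical two-attribute training cases for two classes.
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_y = ["shale", "shale", "sand", "sand"]
label, probs = pnn_classify((0.2, 0.1), train_X, train_y, sigma=0.3)
print(label, probs)
```

Every training case is kept as a neuron, which is why training is fast while classifying new cases costs one distance computation per stored case.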
Advantages
There are several advantages and disadvantages of using a PNN instead of a multilayer
perceptron.
• PNNs are much faster to train than multilayer perceptron networks.
• PNNs can be more accurate than multilayer perceptron networks.
• PNNs are relatively insensitive to outliers.
• PNNs generate accurate predicted target probability scores.
• PNNs approach Bayes optimal classification.
Disadvantages
• PNNs are slower than multilayer perceptron networks at classifying new cases.
• PNNs require more memory space to store the model.
It is stated that model-based seismic inversion has a robust mathematical platform.
Neural network analysis is perceived to operate as a kind of “black box”.
Is this correct?
The inversion technique, which is the most common method for estimating acoustic and physical attributes
from seismic data, is highly dependent on the selected initial model. Because of the inherent
non-uniqueness, the solution obtained is only one of many possible solutions that may be
equally valid. There is no reason to prefer a particular solution over any other.
For the case at hand, the neural network approach yields a solution that is geologically more meaningful,
because the procedure utilizes the available well log information to estimate the target parameters.
For those areas where the available well-log control is uniformly distributed, the neural network approach
could yield more meaningful impedance estimates that correlate well with the impedance logs. This lends
confidence to the seismic interpreters to believe the impedance estimates away from the control points.
Machine learning algorithms use computational methods to “learn”
information directly from data without assuming a predetermined
equation as a model. They can adaptively improve their
performance as you increase the number of samples available for
learning.
Classification, regression, and clustering are used to build predictive models and discover useful patterns in
observed data.
Use of machine learning tools to detect patterns and build predictive models from data sets.
Machine learning algorithms are used in applications such as:
• computational finance (credit scoring and algorithmic trading),
• image processing and computer vision (face recognition, object detection, object recognition),
• computational biology (tumor detection, drug discovery, and DNA sequencing),
• energy production (price and load forecasting),
• natural language processing, speech and image recognition, and
• advertising and recommendation systems.
Machine learning is an integral part of data analytics, which deals with
developing data-driven insights for better designs and decisions.
Build models that train a computer to classify data into different categories. This can help you analyze and visualize data more accurately.
Classification is used in areas such as image processing, among many others.
Common algorithms for performing classification include support vector machine (SVM), boosted and bagged decision trees, k-nearest neighbor, Naïve Bayes, discriminant analysis, logistic regression, and deep learning and neural networks.
Identify patterns and outliers in the data, and determine their interdependencies and geolocations.
TRAIN YOUR DATA with models.
Find natural groupings and patterns in data. Clustering
is used on unlabeled data to find natural groupings and
patterns. Applications of clustering include
pattern mining,
medical imaging, and
object recognition.
Common algorithms for performing clustering include
k-means and k-medoids, hierarchical clustering,
Gaussian mixture models, hidden Markov models,
self-organizing maps, fuzzy c-means clustering, and
subtractive clustering.
Organizations need to discover, optimize and deploy predictive models faster, more
efficiently and with a higher rate of success by analyzing multiple data sources.
This is required to improve business outcomes.
There is a buzz about Big Data, but the requirements for making sense of Big Data,
namely ANALYTICS, INTEGRATION within business applications, and DEPLOYMENT,
have been missed.
80% of the time is spent preparing data; 20% of the time is spent complaining about the
need to prepare the data.
• Large amounts of data associated with many variables (predictors)
• Present-day equations are not suited to the complexity of the data; iterating over
algorithms is required, which can be time-consuming
• Deep learning, machine learning and neural networks require significant technical
expertise
• There is no one-size-fits-all solution, so an iterative approach is required
• Which approach to take: clustering, classification, regression, neural network or
curve fitting algorithms?
• How best to integrate machine learning with data and deployment within Data
Analytics workflows?
PREPARATION
• Generate attributes from a 3D seismic volume
• Use wells to compute Acoustic Impedance logs
• Elastic Inversion of 3D Seismic volume
Both wells and seismic are used for training (impedance logs
and traces)
ANALYSIS
Established that inversion alone does not provide enough
temporal resolution to delineate reservoir quality variations.
Use of multi-attributes combined with inversion data in a
Probabilistic Neural Network to provide a high resolution
impedance 3D dataset.
PROCEDURE
1. Internal attributes taken from seismic data
2. External attributes from impedance data.
3. The attributes are normalized so input and output traces are
aligned.
4. Establish an operator which can predict log properties from
a seismic attribute.
5. A stepwise regression procedure is used to establish the best
correlation between internal (seismic data) and external
(inversion data) attributes.
• Internal attributes could be, but are not limited to:
• Integrate
• Quadrature Trace
• Raw Seismic
• Filtered data
• Second derivative
• Instantaneous phase
• Cosine Instantaneous phase
6. Establish training and validation errors in the neural
network study and select the best one for use in qualifying
reservoir quality variations.
(M.S. Rawat, Viswaja Devalla, T.K. Mathuria and U.S.D. Pandey, 2013)
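Several of the internal attributes named in the procedure can be derived from a single trace through its analytic signal. The sketch below assumes NumPy is available; the function names, the sample interval and the test trace are hypothetical, and the "integrate" attribute is taken here as a plain running sum, which may differ from the definition used in commercial packages. Filtered data is left out because the filter parameters are not given.

```python
import numpy as np

def analytic_signal(trace):
    """Analytic signal of a real trace, built in the frequency domain."""
    trace = np.asarray(trace, dtype=float)
    n = trace.size
    spectrum = np.fft.fft(trace)
    # Keep DC (and Nyquist), double positive frequencies, zero negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def internal_attributes(trace, dt):
    """Return a dict of simple single-trace attributes (dt in seconds)."""
    z = analytic_signal(trace)
    phase = np.angle(z)
    raw = z.real
    return {
        "raw_seismic": raw,
        "quadrature_trace": z.imag,
        "instantaneous_phase": phase,
        "cosine_instantaneous_phase": np.cos(phase),
        "integrate": np.cumsum(raw) * dt,  # running sum: an assumption
        "second_derivative": np.gradient(np.gradient(raw, dt), dt),
    }

# Hypothetical trace: a 25 Hz cosine sampled every 4 ms.
dt = 0.004
t = np.arange(200) * dt
trace = np.cos(2 * np.pi * 25.0 * t)
attrs = internal_attributes(trace, dt)
```

For a pure cosine the quadrature trace is the matching sine, which gives a quick sanity check of the construction.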
Multi-attributes, used as PNN inputs, must be interpreted within a geologic framework, not a statistical
one. Results must be geologically plausible and physically realistic before they can be used with
confidence for exploration and exploitation purposes.
Appropriate input seismic attributes are selected via forward step-wise regression, as too many input
seismic attributes yield spurious and noisy results (Kalkomey, 1997).
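Forward step-wise selection can be sketched as a greedy loop: at each step, add the attribute that most reduces the error of a least-squares fit. The code assumes NumPy and a plain linear fit with a constant term; the function names and the synthetic data are hypothetical and do not reproduce any particular software's procedure.

```python
import numpy as np

def fit_rms_error(A, y):
    """RMS residual of a least-squares fit of y on the columns of A plus a constant."""
    design = np.column_stack([A, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sqrt(np.mean((design @ coef - y) ** 2)))

def forward_stepwise(attributes, target, max_attributes):
    """Greedy attribute order and the training error after each addition."""
    selected, errors = [], []
    remaining = list(range(attributes.shape[1]))
    while remaining and len(selected) < max_attributes:
        # Try each unused attribute together with those already chosen.
        best = min(
            remaining,
            key=lambda j: fit_rms_error(attributes[:, selected + [j]], target),
        )
        selected.append(best)
        remaining.remove(best)
        errors.append(fit_rms_error(attributes[:, selected], target))
    return selected, errors

# Synthetic example: the target depends mostly on column 2, then on column 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 2] + 1.0 * X[:, 0] + 0.1 * rng.normal(size=200)
order, train_errors = forward_stepwise(X, y, max_attributes=3)
print(order, train_errors)
```

The training error can only go down as attributes are added, so it cannot by itself indicate when to stop adding them.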
The seismic attributes must be cross validated to avoid overtraining.
Cross validation systematically removes each well used in the training process from the training set.
The multi attribute transform is recalculated with the well absent, or hidden, from the training process.
The average predictive error of all the hidden wells is referred to as the validation error. The validation
error is an estimate of the error associated with applying the PNN to the entire seismic data volume.
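The hidden-well procedure can be sketched as follows. To keep the example short, a linear least-squares transform stands in for the PNN, which is an assumption; the function name and the synthetic wells are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def leave_one_well_out_error(well_attributes, well_targets):
    """Average RMS error over wells, each predicted by a transform trained on the others."""
    per_well = []
    for hidden in range(len(well_attributes)):
        # Training set: every well except the hidden one.
        train_A = np.vstack([a for i, a in enumerate(well_attributes) if i != hidden])
        train_y = np.concatenate([t for i, t in enumerate(well_targets) if i != hidden])
        design = np.column_stack([train_A, np.ones(len(train_y))])
        coef, *_ = np.linalg.lstsq(design, train_y, rcond=None)
        # Predict the hidden well and measure its error.
        hidden_y = well_targets[hidden]
        hidden_design = np.column_stack([well_attributes[hidden], np.ones(len(hidden_y))])
        residual = hidden_design @ coef - hidden_y
        per_well.append(float(np.sqrt(np.mean(residual ** 2))))
    return float(np.mean(per_well)), per_well

# Four synthetic wells, two attributes each, with an exact linear relationship.
rng = np.random.default_rng(1)
wells_A = [rng.normal(size=(50, 2)) for _ in range(4)]
wells_y = [2.0 * a[:, 0] - a[:, 1] + 0.5 for a in wells_A]
validation_error, per_well_errors = leave_one_well_out_error(wells_A, wells_y)
print(validation_error)
```

Repeating this calculation for an increasing number of attributes gives a validation error curve from which the number of attributes can be chosen.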
1. Additional seismic attributes are considered valuable as long as they
reduce the validation error (the thin line in the figure). In this case,
the optimum number of seismic attributes for PNN application is five.
2. Each attribute is further evaluated for its significance within the
established geologic framework
3. The data is trained over intervals of interest.
4. Quantitatively, the PNN does not sufficiently predict the original
Effective porosity values
5. Qualitatively, the PNN predicts the Effective Porosity curve frequency
differences between various facies of interest.
6. The PNN is applied to the 3D seismic data volume, yielding an
Effective Porosity 3D volume.
7. Values corresponding to the various facies give reason to create facies
trends that are reliable in the temporal domain.
1. First, single-attribute regression was performed on the data. Out of all the attributes,
the inverse of the Vp/Vs ratio gave the highest correlation with hydrocarbon volume, with a
coefficient of 0.64.
2. Then a combination of multi-attribute regression and probabilistic Neural Network
(PNN) method was used to derive a suitable relationship for predicting Hydrocarbon
Volume
3. A multi-attribute stepwise linear regression analysis was performed using Gas
Volume log at fifteen well locations.
4. Validation correlation is computed by excluding one well at a time from the training
data set, calculating the correlation at that well, and averaging the correlations
after repeating the procedure for all the wells.
5. Validation error was calculated against the number of attributes for the different
operator lengths. The plot illustrates that a thirteen-point operator gave the minimum
validation error with 12 attributes. The attributes were 1/(Vp/Vs), 1/(P-impedance),
Integrated absolute amplitude, Amplitude Envelope (P-impedance), Instantaneous
Phase, Integrate (Vp/Vs), Average Frequency (Vp/Vs), Apparent Polarity (Vp/Vs),
Instantaneous Frequency (P-impedance), Integrate (P-impedance), Cosine
Instantaneous Phase and Amplitude Envelope (Vp/Vs). The network derived from the
multi-attribute linear regression gave an average correlation of 0.80.
Profile showing HC volume along with actual log at one well location after PNN.
Plot of validation error vs. attributes for different operator lengths.
(Amit K. Ray and Samir Biswal, 2012)
Valioso Ltd
145-157 St John Street
EC1V4PW London, UK
Dekabristiv Street 17
75110 Kherson, Ukraine
Lilla Bommen 6
SE-411 04 Göteborg, Sweden
Phoenix Center, Calea Buzesti, nr.75-77, etaj 9
Sector 1, Bucharest
Romania
We are happy to help you!
www.Valioso.rocks