![Page 1: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/1.jpg)
![Page 2: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/2.jpg)
What is Intelligent Assistant?
1. A person who assists a specific person with
their business tasks
2. A software agent that can perform tasks or
services
3. Skynet
![Page 3: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/3.jpg)
SPARK—UNIVERSAL
COMPUTATION ENGINE FOR
PROCESSING
OIL INDUSTRY DATA
Andy Starzhinsky, Vice President
Yaroslav Nedashkovsky, System Architect
SoftEleganceData
![Page 4: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/4.jpg)
Data analysis: from data collection to
predictive analytics
1. Oil industry overview
2. Data Lake
3. Data Collection
4. Data analysis
![Page 5: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/5.jpg)
![Page 6: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/6.jpg)
Data Flow
![Page 7: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/7.jpg)
Challenges
- Subsurface monitoring
- Unification of data collection, monitoring and
analysis
- Predictive analytics
![Page 8: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/8.jpg)
Intelligent Assistance
- Failure Prediction
- Digital management
- Major overhaul
![Page 9: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/9.jpg)
2. Data Lake
![Page 10: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/10.jpg)
![Page 11: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/11.jpg)
- sensors readings
- flat files (for example las* file)
- legacy dataset
Data sources
Data storage
- Cassandra
- distributed file system (GlusterFS)
![Page 12: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/12.jpg)
Infrastructure
Cloud agnostic
![Page 13: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/13.jpg)
3. Data Collection
![Page 14: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/14.jpg)
Smart Well
![Page 15: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/15.jpg)
4. Data Analysis
![Page 16: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/16.jpg)
Add-hoc query with Spark Notebook
![Page 17: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/17.jpg)
Jupyter + Python for LAS file analysis
![Page 18: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/18.jpg)
Prognostic Health Monitoring
- determine remaining useful life
- predict failures before they occur
![Page 19: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/19.jpg)
PHM workflow
![Page 20: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/20.jpg)
Artificial lift
![Page 21: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/21.jpg)
Estimation:
1% of improvement in ESP (Electric Submersible Pumps)
performance world-wide
provide over 0.5 M additional barrels of oil / day
2 M oil wells in operation worldwide
1 M wells use some types of artificial lift
750,000 of the lifted wells use sucker-rod pumps
![Page 22: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/22.jpg)
Problem: road pump failures (surface, tubing,
down-hole) could lead ~ two weeks of down time
for oil producing
![Page 23: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/23.jpg)
Challenge: prevent/reduce downtime (especially
handle pre-failure state to help support
engineers service road pump), reduce cost
![Page 24: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/24.jpg)
Dream: build global failure prediction model
that could be scaled to all wells worldwide
![Page 25: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/25.jpg)
Workflow
![Page 26: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/26.jpg)
Training
- Logistic regression
- SVM
- Naive Bayes
- Decision Tree
![Page 27: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/27.jpg)
Technical debt
![Page 28: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/28.jpg)
Spark streaming - data prediction at runtime
//at driver - load model to broadcast variable
val alSrpPredictionModel = {
val model = *Model*.load("/datalake/models/al_srp/")
ssc.sparkContext.broadcast(model)
}
//at client – make predictions
val predictions = alSrpPredictionModel.value.transform(inputDataset)
predictions.select("id", "time", "prediction")
.collect()
.foreach { case Row(meterId: String, time: Long, prediction: Double) =>
//failure
if (prediction != NORMAL) {
session.execute(s"INSERT INTO meters.failures (time, meter_id, type)
VALUES ($time, '$meterId', $prediction)")
//send notification about failure
}
}
! Scheduling delay
! Processing time
![Page 29: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/29.jpg)
Intelligent Assistant
![Page 30: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/30.jpg)
Conclusion
- digitalization collection intelligent analysis => better
insight
- from artificial intelligent in Oil to Smart Cities, etc.
- is Spark a “Swiss knife” for Data Lakes? (SQL, batch,
MLlib, streaming)
![Page 31: Spark Summit EU talk by Yaroslav Nedashkovsky and Andy Starzhinsky](https://reader036.vdocument.in/reader036/viewer/2022062400/586f75821a28ab10258b609b/html5/thumbnails/31.jpg)
References
1. Predicting Failures from Oilfield Sensor Data using Time Series Shapelets -
http://www-scf.usc.edu/~chelmis/pubs/spe14.pdf
2. Failure Prediction for Rod Pump Artificial Lift Systems -
http://cdm15799.contentdm.oclc.org/cdm/ref/collection/p15799coll3/id/323256
3. Using ZigBee for Wireless Remote Monitoring and Control -
http://www.ethanpublishing.com/uploadfile/2015/0608/20150608015338765.pdf