High Performance Computing in Petroleum Exploration and Production: IMP experience Dr. Leonid Sheremetov [email protected] Mexican Petroleum Institute Awareness Raising Workshop, Mexico-city Nov. 23-24

Page 1: Leonid sheremetov

High Performance Computing in Petroleum Exploration and Production:

IMP experience

Dr. Leonid Sheremetov [email protected]

Mexican Petroleum Institute

Awareness Raising Workshop, Mexico-city Nov. 23-24

Page 2: Leonid sheremetov

RISC, Nov. 23-24 Mexico-city

Outline

HPC challenges in the Petroleum Industry

Mexican Petroleum Institute (IMP)
• IMP profile
• Research Program for Applied Mathematics and Computing (MAyC)
• High performance computing in IMP

Research agenda of HPC in MAyC
• Grid-based Simulation (dynamic data driven applications)
• Grid-based Distributed Data Mining
• Task assignment in desktop grids
• Adaptive grain parallelism
• Dynamic task distribution in multi-core clusters

Conclusions

Page 3: Leonid sheremetov

O&G Exploration and Production in Mexico and in the world

PEMEX – Mexican Oil Company:
• 4 regions, 14 assets, 2488 oil fields, 24645 wells

• PEMEX Technology Strategy: technical innovation and advanced decision making support
• Research funds: CONACYT-SENER Hidrocarburos and Energías Renovables

• Principal reservoirs decline (decreased recovery)
• Increased technical complexity of all processes (increased cost)
• Increased gap between acquired and utilized data

Page 4: Leonid sheremetov

O&G industry

Transparent Data and Information

Remote Operations

Shared Processes

Virtual Collaboration

Real Time Information about Reservoirs

Knowledge Management

Immersion Technologies

Integrated Supply Chains

Transactional Processes

Operational and Financial Reports

Operation Optimization

Reservoir modeling

Page 5: Leonid sheremetov

O&G industry (continued)

HF Data Historian

Multiple SCADA systems

SIOPDV

SAP

ADITEP

PEMEX Corporate DB

HF Data Historian

3-D Seismic/simulation

• computationally intensive tasks
• data intensive applications
• sensor intensive applications (i-Field)

Solving grand challenge applications using HPC

Page 7: Leonid sheremetov

Mexican Petroleum Institute (IMP)

• IMP is a public research centre
• IMP was founded on August 23, 1965
• Annual budget: about USD 300 million
• IMP objectives:
  • Research and Development
  • Application Technologies
  • Consulting
  • Education and Training (postgraduate program opened in 2003)

for PEMEX – Mexican Oil Company

Page 8: Leonid sheremetov

Research Program for Applied Mathematics and Computing

Founded in 2001

Contains:
• Researchers
• Developers
• Scientific Computing Lab

Main Research Areas:

Distributed Intelligent Computing
• data mining
• computational intelligence
• expert systems
• agent technology

Multiobjective Optimization
• logistics
• supply chain management

Simulation
• partial differential equations
• numeric methods

Page 9: Leonid sheremetov

Supercomputing in Mexico

Page 10: Leonid sheremetov

HPC in Mexico

CINVESTAV: Xiuhcóatl
• Processors: Intel, AMD, GPGPU
• Number of cores: 3480 (CPU)
• Real performance: 24.97 TFlops

UAM: AITZALOA
• Number of nodes: 270 (135 Twin)
• Processors: Intel Xeon Quad-Core at 3 GHz
• Number of cores: 2160 (540 Quad-Core CPU)
• Memory: 16 GB RAM per node
• Real performance: 18.4 TFlops

UNAM: KAN BALAM (HP CP 4000)
• Number of nodes: 342
• Processors: AMD Opteron
• Number of cores: 1368 CPU
• Memory: 3 Terabytes
• Real performance: 7.1 TFlops

For comparison: Fujitsu K computer, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect: 705,024 cores, 10,510 TFlops
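To put the figures on this slide side by side, the short calculation below sums the sustained performance of the three Mexican systems and compares it with the K computer. It is only a worked arithmetic sketch using the numbers quoted above; the variable names are illustrative.

```python
# Sustained ("real") performance figures quoted on the slide, in TFlops.
systems_tflops = {
    "CINVESTAV Xiuhcoatl": 24.97,
    "UAM Aitzaloa": 18.4,
    "UNAM Kan Balam": 7.1,
}
K_COMPUTER_TFLOPS = 10510.0

# Combined capacity of the three systems and its ratio to the K computer.
combined = sum(systems_tflops.values())
ratio = K_COMPUTER_TFLOPS / combined

for name, tflops in systems_tflops.items():
    factor = K_COMPUTER_TFLOPS / tflops
    print(f"{name}: {tflops:.2f} TFlops (K computer: {factor:.0f}x)")
print(f"Combined: {combined:.2f} TFlops (K computer: {ratio:.0f}x)")
```

The three systems together deliver roughly 50 TFlops, so the K computer figure is about two orders of magnitude larger.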

Page 11: Leonid sheremetov

Mexico and IMP in Top500

MAyC created

Page 12: Leonid sheremetov

Evolution of High Performance Platforms in the IMP

• 1968: IBM 1130
• 1972: IBM-360/44 for the Computer Centre and Centre for Geophysical Processing (analysis of seismic data for reservoir characterization)
• 1980: UNIVAC 1106 (design of oil platforms)
• 1982: UNIVAC 1100/82 (multiprocessor), VAX 750
• 1982: 1st distributed DB in Mexico
• 2000: Cray Origin 2000
• 2001: Research Program on Applied Mathematics and Computing (PIMAyC)
• 2001: Lufac Cluster with 256 nodes (2 CPUs each)
• 2009: Lufac Cluster (Villahermosa)
• 2011: Supercomputing Lab: Xeon X7500 CPU, 250 cores (in progress)

Estimated server and cluster capacity (2011): 0.4 TFlops

Page 13: Leonid sheremetov

Applications of HPC in the IMP

Computation intensive tasks
• Reservoir simulation
• Oceanographic modeling for offshore exploration
• Atmospheric modeling

Data intensive tasks
• 3-D seismic cubes pre-stack analysis and multi-attribute analysis
• Data mining

Computation and communication intensive tasks
• Collaborative engineering
• Nano characterization in 2D and 3D and nanochemical analysis (shared Lab.)

Page 14: Leonid sheremetov

Improved exploration and production (E&P) performance

• HPC seismic-to-simulation technologies seamlessly integrate geophysics, geology, and reservoir engineering in a unified earth model
• Schlumberger's Petrel™ Seismic Server analyzes terabytes of seismic survey data represented in 2-D and 3-D displays using Petrel Geophysics
• ECLIPSE® reservoir simulation software uses the power of HPC clusters to generate animated 3-D simulation models
• Schlumberger's software is optimized for Intel's Xeon multi-core architecture, working on advanced compiler and communication technology, such as the Intel® MPI Library 3.1 and the Intel® Compiler Suite, for high-performance cluster software available on the Microsoft® Windows® Compute Cluster Server

Page 15: Leonid sheremetov

Real Time Remote Control of a JEOL JEM 2200 FS Microscope Using Internet 2

• The IMP Ultra High Resolution Electron Microscopy Laboratory is one of the first shared Labs in Mexico. In collaboration with the UNAM Institute of Physics, it promotes the creation of national and international networks for multidisciplinary scientific research, sharing technological infrastructure through Internet 2.
• It provides nano characterization in 2D and 3D and nanochemical analysis
• Both computational and communication (12 Mb I2) intensive tasks
• Head of Lab.: Vicente Garibay Febles, [email protected]

Page 16: Leonid sheremetov

High resolution image of Pd-catalyst nanoparticles and its chemical analysis (EDS).

Page 17: Leonid sheremetov

Project CONACYT SENER-Hidrocarburos: Data Mining

Methods and Techniques of Computational Intelligence and Data Mining for Decision Making in Exploitation of Mature Fields

Project coordinator: Instituto Mexicano del Petróleo (MAyC)

Project collaborators: CINVESTAV, CIC-IPN, CIMAT, IIE, INAOE

Project dates: March 08, 2011 – March 07, 2013

[Figure: scatterplot for well JUJO-2A of Prod. Aforos (production from well tests) and Prod. Diaria Prom (average daily production) against Fecha (date), 1976 to 2009, each fitted by distance weighted least squares.]

Page 18: Leonid sheremetov

Project CONACYT SENER-Hidrocarburos: Data Mining

• Objective: develop and apply data mining and computational intelligence (DM&CI) techniques for the analysis of technical data on hydrocarbon exploitation, to support decision making and solution identification and so increase the efficiency of exploitation of mature fields
• Novel approach: top-down (inverse) modeling based on the analysis of dynamic oilfield data and reconstruction of the static characterization and hydro-geological reservoir models, for selection of poorly drained areas and recovery methods applying DM&CI
• Data: one oilfield – 9,464 files, > 50 GB (without seismic and simulation models) (2488 oil fields)

Page 20: Leonid sheremetov

What grids would we need?

Data grid:
• Support for large, distributed data repositories

Computational grid:
• Execution of high-end simulation models in parallel and distributed fashion

Knowledge grid:
• Add basic knowledge discovery mechanisms to a grid
• A grid architecture specialized for data mining

Page 21: Leonid sheremetov

Grid-based Simulation: Dynamic Data Driven Application Systems

• Formalized by Frederica Darema
• Data is fed into an executing application either as the data is collected or from a data archive.
• The simulation can then make predictions about the entity regarding how it will change and what its future state will be. The simulation is then continuously adjusted with data gathered from the entity. The predictions made by the simulation can then influence how and where future data will be gathered from the entity, in order to focus on areas of uncertainty.
• Production history data can be fed to the reservoir simulator to determine the reservoir description parameters from the given performance and to predict the performance of an oil field.
• Intelligent agents are suitable to make these decisions with regard to which data to absorb, when it should be absorbed, and how it should be absorbed.
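The feedback loop described on this slide can be illustrated with a toy example. The sketch below is not a reservoir simulator: it assumes a hypothetical exponential decline model q(t) = q0 * exp(-d * t), estimates the decline rate d from incoming production measurements by least squares, and lets the prediction error decide how soon the next measurement is taken. All names, the tolerance and the sampling rule are assumptions made for illustration.

```python
import math


class DeclineModel:
    """Toy simulator: exponential production decline q(t) = q0 * exp(-d * t)."""

    def __init__(self, q0, d_guess):
        self.q0 = q0
        self.d = d_guess
        self._sum_ty = 0.0  # running sum of t * ln(q / q0)
        self._sum_tt = 0.0  # running sum of t * t

    def predict(self, t):
        return self.q0 * math.exp(-self.d * t)

    def assimilate(self, t, q_measured):
        """Fold one measurement into the least-squares estimate of d."""
        y = math.log(q_measured / self.q0)
        self._sum_ty += t * y
        self._sum_tt += t * t
        self.d = -self._sum_ty / self._sum_tt


def run_dddas(model, measure, t_end, tolerance=0.05):
    """Data steers the simulation; the simulation's error steers the sampling."""
    t, step, history = 1.0, 4.0, []
    while t <= t_end:
        predicted = model.predict(t)
        observed = measure(t)
        rel_error = abs(predicted - observed) / observed
        model.assimilate(t, observed)
        history.append((t, rel_error, model.d))
        # Large mismatch: sample sooner. Good match: sample less often.
        if rel_error > tolerance:
            step = max(1.0, step / 2)
        else:
            step = min(8.0, step * 2)
        t += step
    return history


def true_field(t):
    """Hypothetical noise-free field with a decline rate of 0.03 per month."""
    return 1000.0 * math.exp(-0.03 * t)


model = DeclineModel(q0=1000.0, d_guess=0.10)
history = run_dddas(model, true_field, t_end=60.0)
print(f"estimated decline rate: {model.d:.4f} after {len(history)} measurements")
```

In a real setting the `measure` callback would read from sensors or a data archive, and the agents mentioned on the slide would replace the simple rule that halves or doubles the sampling step.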

Page 22: Leonid sheremetov

Distributed Data Mining on Knowledge Grids

• TeraGrid (San Diego Supercomputer Center, National Center for Supercomputing, Caltech, Argonne National Lab: scientific data sets mining)
• Knowledge Grid (Università di Catanzaro and DEIS, Università della Calabria, running over MIUR SP3 Italian national grid)
• Terra Wide Data Mining Testbed (National Center for Data Mining at the University of Illinois at Chicago)
• ADaM (University of Alabama in Huntsville: hydrology data mining)

IMP&PEMEX - Data mining algorithms and knowledge discovery processes are both compute and data intensive, therefore the Grid can offer a computing and data management infrastructure for supporting decentralized and parallel data analysis.

Adapted from: M. Cannataro, A. Congiusta, A. Pugliese, D. Talia, and P. Trunfio. Distributed data mining on grids: Services, tools, and applications. IEEE Transactions on Systems, Man, Cybernetics, Part B, 34(6), 2004.

Page 23: Leonid sheremetov

Distributed Data Mining for Modelling of Hydraulic Communication between Wells

• Principal components analysis
• Fuzzy clustering (fuzzy K-means)
• MAP Transform (trend analysis)
• See5 (decision trees and rulesets)
• WizWhy® (association rule mining), etc.
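As an illustration of one technique from this list, the sketch below implements plain fuzzy K-means (more commonly called fuzzy c-means) in pure Python. It is a generic textbook version, not the project's implementation; the sample points, the initial centers and the fuzzifier m = 2 are assumptions chosen only to make the example run.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Plain fuzzy c-means. Returns (centers, memberships of the last pass)."""
    centers = [tuple(c) for c in centers]
    exponent = 2.0 / (m - 1.0)
    dims = len(points[0])
    memberships = []
    for _ in range(iterations):
        # Membership step: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1))
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                 for d_k in dists]
            )
        # Center step: mean of all points weighted by u_ik ** m
        new_centers = []
        for k in range(len(centers)):
            weights = [u[k] ** m for u in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        centers = new_centers
    return centers, memberships


# Hypothetical two-attribute observations forming two loose groups.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
centers, memberships = fuzzy_c_means(points, centers=[(0.0, 0.0), (6.0, 6.0)])
for p, u in zip(points, memberships):
    print(p, [round(x, 3) for x in u])
```

Unlike hard clustering, every point receives a degree of membership in every cluster, which is useful when the grouping of wells is not clear-cut.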

Page 24: Leonid sheremetov

Scientific grids: research agenda

• In recent years, computational speed has been increasing geometrically, while communication speed has increased only linearly.
• The complexity of contemporary scientific applications, with their growing demand for computing power and access to larger datasets, is setting a trend towards the increased utilization of grids of desktop personal computers.
• The combination of many multicore computers in scientific grids demands a combination of fine and coarse grain parallelization.

Page 25: Leonid sheremetov

Desktop grids (in collaboration with CINVESTAV)

• Network topology depends upon the bandwidth available for parallel processes to communicate
• A novel task assignment scheme which takes the dynamic network topology into consideration has been developed*
• The approach is based on the Bandwidth-aware Bulk Synchronous Parallel Computer (BSP) computational model
• The force field method is used for synchronisation
• The algorithm was tested on grids composed of 1K nodes

*E. Wilson García and G. Morales-Luna, LNCS-3795

Page 26: Leonid sheremetov

Task assignment algorithm

Three types of applications were studied:
• High computation, low communication cost
• High computation, middle communication cost
• High computation, high communication cost

Many parallel applications fall into the 2nd category

Distributed data applications fall into the 3rd category
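The three application types can be related to each other through the standard BSP cost of a superstep, max work + g * max h-relation + l. The sketch below uses that textbook formula, not the bandwidth-aware variant of the cited work, and the workloads, g and l values are hypothetical numbers picked to produce a low, a middle and a high communication share.

```python
def superstep_cost(work, traffic, g, l):
    """Standard BSP cost of one superstep: max work + g * max h-relation + l.

    work[i]    -- local computation on processor i (operations)
    traffic[i] -- words sent or received by processor i (its h-relation)
    g          -- time per word (inverse bandwidth)
    l          -- barrier synchronisation latency
    """
    return max(work) + g * max(traffic) + l


def communication_share(work, traffic, g, l):
    """Fraction of the superstep spent communicating and synchronising."""
    total = superstep_cost(work, traffic, g, l)
    return (g * max(traffic) + l) / total


# Same computation, three hypothetical communication volumes.
work = [10_000, 9_500, 10_200, 9_800]
applications = {
    "low communication": [100, 80, 120, 90],
    "middle communication": [5_000, 4_000, 4_500, 4_800],
    "high communication": [40_000, 38_000, 35_000, 39_000],
}
for name, traffic in applications.items():
    share = communication_share(work, traffic, g=0.5, l=50.0)
    print(f"{name}: {share:.0%} of the superstep")

# Halving the available bandwidth doubles g and raises the cost.
slow = superstep_cost(work, applications["middle communication"], g=1.0, l=50.0)
fast = superstep_cost(work, applications["middle communication"], g=0.5, l=50.0)
print(f"middle case: cost {fast:.0f} at g=0.5, {slow:.0f} at g=1.0")
```

Because g reflects the bandwidth actually available between nodes, an assignment scheme that tracks changing bandwidth matters most for the second and third categories.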

Page 27: Leonid sheremetov

Adaptive grain parallelism

• Gmandel (http://gmandel.sf.net/) is a benchmark for computer infrastructure that generates images of fractals from the Mandelbrot set. The basic unit of measure is the MMIPS (Millions of Mandelbrot Iterations Per Second).
• Gmandel runs on Linux (or equivalent) on a single computer, a multiprocessor computer (most new multi-core PCs) or an Infiniband/Myrinet computer cluster.
• In each case, Gmandel can take advantage of multi-core technology through two parallel computing techniques: fine-grained, shared-memory parallelism and coarse-grained, distributed-memory parallelism. The first is accomplished with POSIX threads and the latter by means of MPI message passing (currently tested with mpich2, from ANL).
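The MMIPS measure and the notion of grain size can be shown with a small sketch. This is not Gmandel's code: it is a minimal Python stand-in that counts Mandelbrot iterations over an image split into blocks of rows, where the block height plays the role of the grain. The image size, iteration limit and worker count are arbitrary assumptions, and Python threads share one interpreter lock, so the example shows the decomposition rather than a real speed-up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def mandel_iterations(c, max_iter=200):
    """Number of iterations of z -> z*z + c performed before |z| exceeds 2."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n + 1
    return max_iter


def render_rows(rows, width, height, max_iter=200):
    """Count iterations for a block of image rows (one unit of work)."""
    total = 0
    for y in rows:
        for x in range(width):
            c = complex(-2.0 + 3.0 * x / width, -1.0 + 2.0 * y / height)
            total += mandel_iterations(c, max_iter)
    return total


def benchmark(width, height, grain, workers=4):
    """Split the image into blocks of `grain` rows and time the whole render."""
    blocks = [range(s, min(s + grain, height)) for s in range(0, height, grain)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(lambda rows: render_rows(rows, width, height), blocks))
    seconds = time.perf_counter() - start
    return total, total / seconds / 1e6  # (iterations, MMIPS)


fine_total, fine_mmips = benchmark(60, 40, grain=1)       # many small tasks
coarse_total, coarse_mmips = benchmark(60, 40, grain=10)  # few large tasks
print(f"fine grain:   {fine_total} iterations, {fine_mmips:.2f} MMIPS")
print(f"coarse grain: {coarse_total} iterations, {coarse_mmips:.2f} MMIPS")
```

Both runs perform exactly the same number of iterations; only the scheduling overhead per block differs, which is the trade-off that adaptive grain parallelism tunes.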

Page 28: Leonid sheremetov

Dynamic task distribution in HPC (in collaboration with CIC-IPN)

• The increased complexity of embedded devices has led to their verification consuming up to 70% of human and computational resources
• Dynamic planning and distribution of HDL models over a parallel simulation platform (clusters with multi-core nodes) is a challenging task
• Such a simulation platform is being developed by PhD student Josué Rangel González, in collaboration with the Embedded Systems Lab of CIC-IPN
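One simple way to picture dynamic distribution over a cluster with multi-core nodes is greedy list scheduling: each task is handed to whichever core becomes free first. The sketch below shows only this generic technique and makes no claim about the platform under development; the function name, the task costs and the cluster shape are hypothetical.

```python
import heapq


def distribute(task_costs, nodes, cores_per_node):
    """Greedy dynamic distribution: each task goes to the core that frees up first.

    Returns (schedule, makespan), where schedule holds tuples of
    (task id, node index, core index, start time).
    """
    # Heap entries: (time at which the core becomes free, node index, core index)
    cores = [(0.0, n, c) for n in range(nodes) for c in range(cores_per_node)]
    heapq.heapify(cores)
    schedule = []
    for task_id, cost in enumerate(task_costs):
        free_at, node, core = heapq.heappop(cores)
        schedule.append((task_id, node, core, free_at))
        heapq.heappush(cores, (free_at + cost, node, core))
    makespan = max(free_at for free_at, _, _ in cores)
    return schedule, makespan


# Hypothetical simulation task costs (arbitrary time units) on 2 nodes x 2 cores.
schedule, makespan = distribute([4, 3, 2, 1, 5, 2, 3], nodes=2, cores_per_node=2)
for task_id, node, core, start in schedule:
    print(f"task {task_id} -> node {node}, core {core}, starts at {start}")
print(f"makespan: {makespan}")
```

A real platform would also have to account for communication between simulated HDL modules, which this sketch ignores.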

Page 30: Leonid sheremetov

Conclusions

Infrastructure next steps:
• Cluster installation in the Supercomputing lab.
• 12 Mb I2 connection
• Integration to Delta Metropolitana HPC Grid initiative

Software, taking advantage of the infrastructure:
• Current platform: Landmark, Petrel, Eclipse, OFM, open source

Research agenda:
• Novel tasks and approaches enabled by HPC, increasing the efficiency of R&D to satisfy the needs of PEMEX

Page 31: Leonid sheremetov

MAyC at the IMP: publications & events

March 12-14 2012, Cancun, Mexico

Page 32: Leonid sheremetov

Leonid [email protected]

Thank You! Any Questions?