High Performance Computing in Petroleum Exploration and Production:
IMP experience
Dr. Leonid Sheremetov
Mexican Petroleum Institute
Awareness Raising Workshop, Mexico-city Nov. 23-24
RISC, Nov. 23-24 Mexico-city
Outline
• HPC challenges in the Petroleum Industry
• Mexican Petroleum Institute (IMP)
  • IMP profile
  • Research Program for Applied Mathematics and Computing (MAyC)
  • High performance computing in IMP
• Research agenda of HPC in MAyC
  • Grid-based Simulation (dynamic data driven applications)
  • Grid-based Distributed Data Mining
  • Task assignment in desktop grids
  • Adaptive grain parallelism
  • Dynamic task distribution in multi-core clusters
• Conclusions
O&G Exploration and Production in Mexico and in the world
PEMEX – Mexican Oil Company:
• 4 regions, 14 assets, 2488 oil fields, 24645 wells
• PEMEX Technology Strategy: technical innovation and advanced decision-making support
• Research funds: CONACYT-SENER Hidrocarburos and Energías Renovables
• Principal reservoirs are in decline (decreased recovery)
• Increased technical complexity of all processes (increased cost)
• Increased gap between acquired and utilized data
O&G industry
Transparent Data and Information
Remote Operations
Shared Processes
Virtual Collaboration
Real Time Information about Reservoirs
Knowledge Management
Immersion Technologies
Integrated Supply Chains
Transactional Processes
Operational and Financial Reports
Operation Optimization
Reservoir modeling
O&G industry (continued)
HF Data Historian
Multiple SCADA systems
SIOPDV
SAP
ADITEP
PEMEX Corporate DB
HF Data Historian
3-D Seismic/simulation
• Computationally intensive tasks
• Data intensive applications
• Sensor intensive applications (i-Field)
Solving grand challenge applications using HPC
Mexican Petroleum Institute (IMP)
• IMP is a public research centre
• IMP was founded on August 23, 1965
• Annual budget: about USD 300 million
• IMP objectives:
  • Research and Development
  • Application Technologies
  • Consulting
  • Education and Training (postgraduate program opened in 2003)
for PEMEX – Mexican Oil Company
Research Program for Applied Mathematics and Computing
Founded in 2001
Contains:
• Researchers
• Developers
• Scientific Computing Lab
Main Research Areas:
• Distributed Intelligent Computing
  • data mining
  • computational intelligence
  • expert systems
  • agent technology
• Multiobjective Optimization
  • logistics
  • supply chain management
• Simulation
  • partial differential equations
  • numerical methods
Supercomputing in Mexico
HPC in Mexico
CINVESTAV: Xiuhcóatl
• Processors: Intel, AMD, GPGPU
• Number of cores: 3480 (CPU)
• Real performance: 24.97 TFlops
UAM: AITZALOA
• Number of nodes: 270 (135 Twin)
• Processors: Intel Xeon Quad-Core at 3 GHz
• Number of cores: 2160 (540 Quad-Core CPUs)
• Memory: 16 GB of RAM per node
• Real performance: 18.4 TFlops
UNAM: KAN BALAM (HP CP 4000)
• Number of nodes: 342
• Processors: AMD Opteron
• Number of cores: 1368
• Memory: 3 Terabytes
• Real performance: 7.1 TFlops
Fujitsu K computer, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect: 705,024 cores, 10,510 TFlops
Mexico and IMP in Top500
MAyC created
Evolution of High Performance Platforms in the IMP
• 1968: IBM 1130
• 1972: IBM-360/44 for the Computer Centre and Centre for Geophysical Processing (analysis of seismic data for reservoir characterization)
• 1980: UNIVAC 1106 (design of oil platforms)
• 1982: UNIVAC 1100/82 (multiprocessor), VAX 750
• 1982: 1st distributed DB in Mexico
• 2000: Cray Origin 2000
• 2001: Research Program on Applied Mathematics and Computing (PIMAyC)
• 2001: Lufac Cluster with 256 nodes (2 CPUs each)
• 2009: Lufac Cluster (Villahermosa)
• 2011: Supercomputing Lab: Xeon X7500 CPU, 250 cores (in progress)
Estimated server and cluster capacity (2011): 0.4 TFlops
Applications of HPC in the IMP
• Computation intensive tasks
  • Reservoir simulation
  • Oceanographic modeling for offshore exploration
  • Atmospheric modeling
• Data intensive tasks
  • 3-D seismic cubes: pre-stack analysis and multi-attribute analysis
  • Data mining
• Computation and communication intensive tasks
  • Collaborative engineering
  • Nano characterization in 2D and 3D and nanochemical analysis (shared Lab.)
Improved exploration and production (E&P) performance
• HPC seismic-to-simulation technologies seamlessly integrate geophysics, geology, and reservoir engineering in a unified earth model
• Schlumberger’s Petrel™ Seismic Server analyzes terabytes of seismic survey data represented in 2-D and 3-D displays using Petrel Geophysics
• ECLIPSE® reservoir simulation software uses the power of HPC clusters to generate animated 3-D simulation models
• Schlumberger’s software is optimized for Intel’s Xeon multi-core architecture, working on advanced compiler and communication technology, such as the Intel® MPI Library 3.1 and the Intel® Compiler Suite, for high-performance cluster software available on the Microsoft® Windows® Compute Cluster Server
Real Time Remote Control of a JEOL JEM 2200 FS Microscope Using Internet 2
• The IMP Ultra High Resolution Electron Microscopy Laboratory is one of the first shared Labs in Mexico. In collaboration with the UNAM Institute of Physics, it promotes the creation of national and international networks for multidisciplinary scientific research, sharing technological infrastructure through Internet 2.
• It provides nano characterization in 2D and 3D and nanochemical analysis
• Both computational and communication (12 Mb I2) intensive tasks
• Head of Lab.: Vicente Garibay Febles
High resolution image of Pd-catalyst nanoparticles and its chemical analysis (EDS).
Project CONACYT SENER-Hidrocarburos: Data Mining
Methods and Techniques of Computational Intelligence and Data Mining for Decision Making in Exploitation of Mature Fields
Project coordinator: Instituto Mexicano del Petróleo (MAyC)
Project collaborators: CINVESTAV, CIC-IPN, CIMAT, IIE, INAOE
Project dates: March 08, 2011 – March 07, 2013
Figure: scatterplot of Prod. Aforos and Prod. Diaria Prom against Fecha (date), 28/08/1976 to 06/07/2009, each fitted by distance weighted least squares (data: JUJO-2A in PozosReconstruidosAforos-HistóricosProducción.stw, 3 variables, 9132 cases).
Project CONACYT SENER-Hidrocarburos: Data Mining
• Objective: develop and apply data mining and computational intelligence (DM&CI) techniques for the analysis of technical data on hydrocarbon exploitation, to support decision making and solution identification that increase the efficiency of exploitation of mature fields
• Novel approach: top-down (inverse) modeling based on the analysis of dynamic oilfield data and reconstruction of the static characterization and hydro-geological reservoir models, for selection of poorly drained areas and recovery methods applying DM&CI
• Data: one oilfield – 9,464 files, > 50 GB (without seismic and simulation models), out of 2488 oil fields
What grids would we need?
• Data grid: support for large, distributed data repositories
• Computational grid: execution of high-end simulation models in parallel and distributed fashion
• Knowledge grid:
  • Add basic knowledge discovery mechanisms to a grid
  • A grid architecture specialized for data mining
Grid-based Simulation: Dynamic Data Driven Application Systems
• Formalized by Frederica Darema
• Data is fed into an executing application either as the data is collected or from a data archive.
• The simulation can then make predictions about the entity regarding how it will change and what its future state will be. The simulation is then continuously adjusted with data gathered from the entity. The predictions made by the simulation can then influence how and where future data will be gathered from the entity, in order to focus on areas of uncertainty.
• Production history data can be fed to the reservoir simulator to determine the reservoir description parameters from the given performance and to predict the performance of an oil field.
• Intelligent agents are suitable to make these decisions with regard to which data to absorb, when it should be absorbed, and how it should be absorbed.
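The feedback loop described on this slide can be sketched in a few lines. This is a minimal illustration only, not the IMP simulator: it assumes a toy exponential decline model q(t) = q0 * exp(-d * t) in place of a reservoir simulator, and the names `fit_exponential_decline`, `dddas_loop` and the 5% `tolerance` are hypothetical. Steering of future data gathering is not modeled.

```python
import math


def fit_exponential_decline(times, rates):
    """Least-squares fit of ln(q) = ln(q0) - d*t; returns (q0, d)."""
    n = len(times)
    logs = [math.log(q) for q in rates]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope


def predict(q0, d, t):
    """Forecast the production rate at time t with the toy decline model."""
    return q0 * math.exp(-d * t)


def dddas_loop(stream, tolerance=0.05):
    """Consume (time, rate) measurements as they arrive.

    The model is re-calibrated only when its forecast for the newest
    measurement is off by more than `tolerance` (relative error).
    """
    times, rates = [], []
    q0, d = None, None
    recalibrations = 0
    for t, q in stream:
        times.append(t)
        rates.append(q)
        if len(times) < 2:
            continue  # need at least two points to fit two parameters
        if q0 is None or abs(predict(q0, d, t) - q) / q > tolerance:
            q0, d = fit_exponential_decline(times, rates)
            recalibrations += 1
    return q0, d, recalibrations


# Synthetic production history (assumed values, for illustration only).
true_q0, true_d = 1000.0, 0.1
stream = [(t, true_q0 * math.exp(-true_d * t)) for t in range(10)]
q0, d, n_recal = dddas_loop(stream)
```

With noise-free synthetic data the model is calibrated once and then keeps matching the incoming measurements; with real production history the re-calibration branch would fire whenever the field departs from the forecast.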
Distributed Data Mining on Knowledge Grids
• TeraGrid (San Diego Supercomputer Center, National Center for Supercomputing Applications, Caltech, Argonne National Lab: scientific data sets mining)
• Knowledge Grid (Università di Catanzaro and DEIS, Università della Calabria, running over the MIUR SP3 Italian national grid)
• Terra Wide Data Mining Testbed (National Center for Data Mining at the University of Illinois at Chicago)
• ADaM (University of Alabama in Huntsville: hydrology data mining)
IMP & PEMEX: data mining algorithms and knowledge discovery processes are both compute and data intensive; therefore the Grid can offer a computing and data management infrastructure for supporting decentralized and parallel data analysis.
Adapted from: M. Cannataro, A. Congiusta, A. Pugliese, D. Talia, and P. Trunfio. Distributed data mining on grids: Services, tools, and applications. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(6), 2004.
Distributed Data Mining for Modelling of Hydraulic Communication between Wells
• Principal components analysis
• Fuzzy clustering (fuzzy K-means)
• MAP Transform (trend analysis)
• See5 (decision trees and rulesets)
• WizWhy® (association rule mining), etc.
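Of the techniques listed, fuzzy K-means (also known as fuzzy c-means) is compact enough to sketch. The version below is the textbook algorithm in plain Python, not the project's implementation; the feature vectors stand in for hypothetical per-well attributes, and the deterministic initialisation is an assumption made so the example is reproducible.

```python
import math


def _memberships(points, centers, m, eps=1e-12):
    """Membership u[j][i] of point j in cluster i (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # eps avoids division by zero when a point coincides with a centre
        dists = [max(eps, math.dist(p, c)) for c in centers]
        rows.append([1.0 / sum((di / dk) ** exponent for dk in dists)
                     for di in dists])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Textbook fuzzy c-means with fuzzifier m > 1."""
    dim = len(points[0])
    # Deterministic initialisation (assumption): evenly spaced input points.
    step = max(1, len(points) // n_clusters)
    centers = [points[min(i * step, len(points) - 1)]
               for i in range(n_clusters)]
    for _ in range(n_iter):
        u = _memberships(points, centers, m)
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(dim)))
        centers = new_centers
    return centers, _memberships(points, centers, m)


# Two well-separated groups of hypothetical two-feature well descriptors.
wells = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, u = fuzzy_c_means(wells, n_clusters=2)
```

Unlike hard K-means, every well keeps a graded membership in each cluster, which suits borderline wells whose behaviour resembles more than one group.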
Scientific grids: research agenda
• In recent years, computational speed has been increasing geometrically, while communication speed has only increased linearly.
• The complexity of contemporary scientific applications, with increased demand for computing power and access to larger datasets, is setting a trend towards the increased utilization of grids of desktop personal computers.
• The combination of many multicore computers in scientific grids demands a combination of fine and coarse grain parallelization.
Desktop grids (in collaboration with CINVESTAV)
• Network topology depends on the bandwidth available for parallel processes to communicate
• A novel task assignment scheme which takes the dynamic network topology into consideration has been developed*
• The approach is based on the Bandwidth-aware Bulk Synchronous Parallel Computer (BSP) computational model
• The force field method is used for synchronisation
• The algorithm was tested on grids composed of 1K nodes
*E. Wilson García and G. Morales-Luna, LNCS-3795
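To make the idea of bandwidth-aware assignment concrete, here is a deliberately simple greedy sketch. It is not the scheme of the cited paper (which uses a BSP-based model and a force field method); it only shows how adding a transfer term, data divided by bandwidth, to the usual compute term changes where a task lands. All names, units and numbers are hypothetical.

```python
def assign_tasks(tasks, nodes):
    """Greedy list scheduling with a bandwidth term.

    tasks: {task_name: (work, data)}       work in ops, data in MB
    nodes: {node_name: (speed, bandwidth)} speed in ops/s, bandwidth in MB/s
    Returns (placement, makespan).
    """
    finish = {name: 0.0 for name in nodes}
    placement = {}
    # Place the heaviest tasks first (ties keep their input order).
    for task, (work, data) in sorted(tasks.items(), key=lambda kv: -kv[1][0]):

        def finish_time(node):
            speed, bandwidth = nodes[node]
            return finish[node] + work / speed + data / bandwidth

        best = min(nodes, key=finish_time)
        finish[best] = finish_time(best)
        placement[task] = best
    return placement, max(finish.values())


# Two equally fast desktops that differ only in available bandwidth.
nodes = {"fast_link": (1.0, 100.0), "slow_link": (1.0, 1.0)}
tasks = {"t1": (10.0, 50.0),   # data-heavy task
         "t2": (10.0, 0.0)}    # compute-only task
placement, makespan = assign_tasks(tasks, nodes)
```

The data-heavy task goes to the node with the better link and the compute-only task to the other one; a scheduler that ignored bandwidth would treat the two nodes as interchangeable.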
Task assignment algorithm
Three types of applications were studied:
• High computation, low communication cost
• High computation, middle communication cost
• High computation, high communication cost
Many parallel applications fall into the 2nd category
Distributed data applications fall into the 3rd category
Adaptive grain parallelism
• Gmandel (http://gmandel.sf.net/) is a benchmark for computer infrastructure that generates images of fractals from the Mandelbrot set. The basic unit of measure is the MMIPS (Millions of Mandelbrot Iterations Per Second).
• Gmandel runs on Linux (or equivalent) on a single computer, a multiprocessor computer (most new multi-core PCs) or an Infiniband/Myrinet computer cluster.
• In each case, Gmandel can take advantage of multi-core technology by using shared-memory, fine-grained and distributed-memory, coarse-grained parallel computing techniques. The first is accomplished with POSIX threads and the latter by means of MPI message passing (currently tested with mpich2, from ANL).
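The MMIPS measure and the notion of grain size can be illustrated with a small serial sketch. This is not Gmandel's code: the image window, the `grain` parameter (rows per work unit) and the function names are assumptions. Each band of rows is the unit that a thread (fine grain) or an MPI rank (coarse grain) could process independently; here the bands are simply run one after another.

```python
import time


def mandelbrot_iterations(c, max_iter):
    """Number of iterations of z -> z*z + c before |z| exceeds 2."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render_rows(rows, width, height, max_iter):
    """Total iteration count for one band of image rows (one work unit)."""
    total = 0
    for row in rows:
        y = -1.5 + 3.0 * row / (height - 1)
        for col in range(width):
            x = -2.0 + 3.0 * col / (width - 1)
            total += mandelbrot_iterations(complex(x, y), max_iter)
    return total


def benchmark(width=64, height=64, max_iter=100, grain=8):
    """Return (total iterations, MMIPS) for the given grain size."""
    bands = [range(start, min(start + grain, height))
             for start in range(0, height, grain)]
    started = time.perf_counter()
    total = sum(render_rows(band, width, height, max_iter) for band in bands)
    elapsed = time.perf_counter() - started
    return total, total / elapsed / 1e6


total_iterations, mmips = benchmark()
```

The total iteration count does not depend on `grain`; only the number and size of the work units change, which is the knob an adaptive-grain scheme would tune against scheduling and communication overhead.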
Dynamic task distribution in HPC (in collaboration with CIC-IPN)
• Increased complexity of embedded devices has led to their verification consuming up to 70% of human and computational resources
• Dynamic planning and distribution of HDL models over a parallel simulation platform (clusters with multi-core nodes) is a challenging task
• Such a simulation platform is being developed by PhD student Josué Rangel González, in collaboration with the Embedded Systems Lab of CIC-IPN
Conclusions
• Infrastructure next steps:
  • Cluster installation in the Supercomputing lab.
  • 12 Mb I2 connection
  • Integration to Delta Metropolitana HPC Grid initiative
• Software, taking advantage of the infrastructure:
  • Current platform: Landmark, Petrel, Eclipse, OFM, open source
• Research agenda:
  • Novel tasks and approaches enabled by HPC, increasing the efficiency of R&D to satisfy the needs of PEMEX
MAyC at the IMP: publications & events
March 12-14 2012, Cancun, Mexico