
Taming the Geoscience Data Dragon

Steve Darden and John Gillespie, FINDER Graphics Systems, Inc., Corte Madera, California, USA

LaRay Geist and Geoffrey King, BHP Petroleum (America) Inc., Houston, Texas, USA

Scott Guthery and Ken Landgren, Schlumberger Austin Systems Center, Austin, Texas, USA

John Pohlman, ExplorTech Computer Applications, Houston, Texas, USA

Samuel Pool, ARCO, Plano, Texas, USA

Dave Simonson, Consultant, Oakland, California, USA

Paul Tarantolo, Jr., Exxon Production Research Company, Houston, Texas, USA

Dan Turner, Petrotechnical Open Software Corporation, Houston, Texas, USA


Cross-disciplinary uses of oilfield information are changing the face of petroleum data management. Here is a look at exploration and production data management—where it stands today and where it needs to go.


The center of attention for the exploration and production geoscientist is the base map—a map of geologic or geophysical features, such as structure, fluid type or bedding tops. This map, built from cross sections based on log and seismic data, is used to locate new prospects and plan development of existing ones.

To produce this map in the early 1980s, the E&P geoscientist spent 70 to 80% of the time locating, sorting and reprocessing data and making names and scales of data consistent.1 At the end of the day, a map was produced by batch processing on a mainframe computer. Hours later, or the next morning, the geoscientist draped the map over a drafting table, beside previous maps. To do the analysis, the scientist looked at everything—compared maps, referred to seismic sections and logs that covered the office wall, and hunted through production and core reports. If the scientist could keep track of the data, he or she could assemble a clear mental picture of the reservoir and therefore generate the best possible map. Ultimately, the geoscientist would meet with specialists from other oilfield disciplines and, through a series of meetings, contribute to the integration of different map versions. For example, the geologist's map, built from well log correlation, would be integrated with the geophysicist's map, built from seismic data.

To produce a map in the 1990s, the E&P geoscientist sits before a workstation. Only the remnants of clutter remain. The drafting table is little used. Information from stacks of paper and film has been loaded into the computer. Porosity data are three keystrokes away rather than buried in a pile (next page). Seismic and log data pop up in separate windows on the same screen. The overnight batch job is replaced by an interactive map revision...let's see what happens if we assume the Dunlin formation here is 200 feet thick instead of 170 feet...The geoscientist no longer tracks data mentally; that is done by the workstation. He or she can devote more energy to data analysis and more to data management. Now data are shared across disciplines, and interdisciplinary integration is no longer a separate step. It is contained in the interpretation.

The workstation is the most visible product of a data revolution that, in just ten years, has affected every level of the E&P business.2 In many oil companies, centralized data processing is giving way, in whole or part, to smaller units of "project data" downloaded from mainframe computers to workstations. The user's demand for easier communication between application programs is compelling software vendors and oil companies to set aside rivalries and join forces to write industry standards for data formats, data exchange and computer graphics. Data organization systems are springing up that allow previously isolated data to be accessed from one workstation. Ten years ago, all this was mostly infeasible.



The old world of geoscience interpretation meets the new. Remnants of clutter remain around a demonstration workstation at FINDER Graphics in Corte Madera, California. Nearly all the data surrounding Arlene Fox and Eric Erickson have been loaded into the data base accessed by the workstation. Note on the far wall the "correlation" established between logs, using pushpins and rubberbands. The same sort of correlation has been migrated into the computer, using graphic devices. (Gary Wagner photo)

As a result of increasing computer power and versatility, the exploration and production data dragon has been caged but not tamed (see "Evolution of Petroleum Data Management," next page). In 1992, exploration and production data management stands at an uncertain and exciting threshold, marked by rapid change in three areas: computer hardware, computing standards and database development.


For help in preparation of this article, thanks to Jeffrey A. Brown and J. William Bradford, GeoQuest Systems, Inc., Houston, Texas, USA; Ron Dietsch and Lisa Stennes, Petroleum Information, Denver, Colorado, USA; Eric Erickson and Arlene Fox, FINDER Graphics Systems, Inc., Corte Madera, California, USA; Jerry House, Petroleum Information, Houston, Texas, USA; Mel Huszti, Gulf Canada Resources Limited, Calgary, Alberta, Canada; Stephen Kenny, Digitech Information Services Ltd., Calgary, Alberta, Canada; Pam Koscinski, Dwight's Energydata, Inc., Oklahoma City, Oklahoma, USA; Marc Lador, Petroconsultants, Geneva, Switzerland; Gary Meyers, ARCO Oil and Gas Company, Plano, Texas, USA; Phil Freeman, Eric J. Milton, Dwight V. Smith and

Hardware

Reservoir data today are stored and analyzed on four kinds of hardware systems: mainframe computers (including minicomputers and servers), workstations, a combination of mainframes and workstations, and small computers, usually workstations or personal computers (PCs) dedicated to specific tasks, such as seismic or log analysis.

Many see this as an unstable transition state, and predictions of the outcome vary.

Terry J. Sheehy, FINDER Graphics Systems, Inc., Houston, Texas, USA; Av Munger, Munger Oil Information Services, Los Angeles, California, USA; Bill Quinlivan, Schlumberger Austin Systems Center, Austin, Texas, USA; Ron Samuels, Dwight's Energydata, Inc., Dallas, Texas, USA; Bill Schork and Lisa Stewart, Schlumberger-Doll Research, Ridgefield, Connecticut, USA; Ron Uchida, FINDER Graphics Systems, Inc., Lakewood, Colorado, USA; Laramie M. Winczewski, Simon-Horizon Inc., Houston, Texas, USA.

The following marks and trademarks appear in this article: Cray (Cray Research Inc.), FINDER (Schlumberger), GeoShare (joint mark of Schlumberger and GeoQuest Systems, Inc.), Intel (Intel Corp.), Macintosh (Apple Computer, Inc.), MOTIF (Open Software Foundation, Inc.), ORACLE (ORACLE Corporation), UNIX (AT&T Bell Laboratories), VAXstation (Digital Equipment Corp.), VMS (Digital Equipment Corp.) and X Window System (Massachusetts Institute of Technology).

Makers of mainframes note the petroleum industry's long-standing commitment to the mainframe, citing its many advantages: computing power, capacity for data security, ease and reliability of data access (usually a single interface for all data), and ease of tracking data history. Workstation vendors see their tool as the mouse—limited in power, but flexible and inexpensive—that

(continued on page 44)


A list of acronyms used in this article appears on page 54.

1. Citerne A and Yu K: "Knowledge-Based Well Data Management and Graphical Interface," paper SPE 17611, presented at the SPE International Meeting on Petroleum Engineering, Tianjin, China, November 1-4, 1988.

2. About 50 companies offer hardware or software systems for the exploration and production environment. See Oil & Gas Journal 88, no. 11 (March 12, 1990): Special Supplement, 32-3–32-17.


[Timeline: Evolution of Petroleum Data Management, 1920-1960. The era moves from vacuum tubes to transistors; data storage evolves from paper and film through punch cards and paper tape, magnetic tape, drum memory and core memory to hard disk storage.]

• 1919  Munger Oil Information Service first to offer drilling & completion reports (only in Southern California, USA).
• 1924  Everett De Golyer discovers first oil field using single-fold seismic data, Nash salt dome, Brazoria County, Texas, USA.
• 1928  Petroleum Information Corp. (PI) started in Denver, Colorado, USA as weekly drilling report for Rockies only.
• 1934  Production reports1 offered (Texas Panhandle) by company that became Dwight's Energydata.
• 1945  Electronic Numerical Integrator and Computer (ENIAC) invented at University of Pennsylvania, USA, first fully functional electronic calculator.
• 1951  Ferranti Mark I, first commercially manufactured computer, is installed at Manchester University (UK).
• 1953  First commercial digital computer (IBM).
• 1955  Production information available in digital form (Lockwood).
• 1956  Petroconsultants S.A. starts monthly scouting report service in Cuba; expanded to world outside North America in 1962.

1. Production reports typically include data on monthly and cumulative production, lease data, total depth, completion dates and number of wells on the lease.
2. Scout reports, or "tickets," typically include all drilling-related data from the time the well is permitted through to completion.

Computing for Oil—A Short History of Data in the Petroleum Industry

Over the past 40 years, the life of the geoscientist has been changed largely by advances in information science—how data about the earth are acquired, processed, stored, retrieved and analyzed. Before computers, prospects were found mainly by surface reconnaissance and crude seismic surveys. This required a tremendous number of people, to both acquire and sort through data. Armchair exploration using commercial data services—scout tickets and other well data—became possible in the 1930s, but not practical until the 1950s, and was labor-intensive until the 1970s.

The first use of computers in the petroleum industry was far removed from exploration. In the late 1950s, computers proved their mettle in helping solve differential equations describing the flow of oil through the reservoir. An early application for exploration was analysis of a small number of grids for 2D reservoir modeling. Computational power jumped a level in the early 1960s, when discrete transistor technology arrived. For reservoir modeling, this allowed adding a third dimension and increasing the number of grids, which improved resolution. But, more significantly, for the first time earth properties with nonlinear characteristics, like relative permeability and capillary pressure, could be modeled.

Computers moved from novelty to commercial popularity with the introduction of the IBM 360 series in 1964, and geoscientists began moving data from bookshelves and file cabinets into computers. They built well files that had mainly raw log data, but also some interpreted data. To evaluate a prospect, they would take a seismic section and correlate it by hand to the sonic log. There was no automated correlation, and the computer well files mimicked the paper well files from which they descended. Later, computer filing grew more sophisticated and data bases evolved within each discipline: geology, geophysics, drilling, petrophysics and reservoir engineering. By the 1970s, it was standard to manually tap information from these separate computer data bases. Now, exploration didn't rely mainly on expensive field reconnaissance.


[Timeline, continued: Evolution of Petroleum Data Management, 1960-1990. The era moves from integrated circuits to microcircuitry; data storage evolves through magnetic tape, drum memory, solid state memory, hard disk, floppy disk, mass storage unit (IBM, 1976), video and optical storage.]

• Early 1960s  Emergence of 2D multifold seismic data.
• 1962  First digital scout2 reports (PI).
• Mid- to late 1960s  Wireline logs and seismic data on paper computer tape or cards. First capability to store and print multiple copies of logs.
• Early 1970s  Intergraph first company to adapt interactive computer display technology for oil industry application.
• 1971  Intel invents microprocessor.
• 1972  Scout reports available nationwide [USA] (PI).
• 1973  International well and concession data supplied on computer tape (Petroconsultants).
• 1976  Delivery of first Cray-1 supercomputer. First commercial satellite telecommunications of log data.
• Late 1970s  Gulf Oil's ISIS (Interactive Seismic Interpretation System), first on-screen integration of two disciplines: seismics and wireline logs.
• 1978  National Production System (NPS), a common format for production data and histories, established by 13 major oil companies. First on-line system for conveying production data worldwide via phone lines (Dwight's).
• 1987  RISC technology becomes commercial.
• Late 1980s  Interactive color graphics capability available for ~$20,000, making it commercial for the first time.
• 1989  Development of International Relational Information System (IRIS 21) relational data base for E&P data (Petroconsultants). Raster images of entire logs on diskette, optical disk or magneto-optical disk (PI).
• 1990  Scout information on CD-ROM (Dwight's).

It could also be performed in the computer room with purchased data.

These first discipline-specific data bases were hierarchical and so only relationships between data sets anticipated by the author of the hierarchy could be considered. The production data base, for example, might have been organized by lease block only, not by hydrocarbon type, production volume or completion date. If you wanted to map productivity from the Wilcox formation, you would have to manually identify the Wilcox in each well in the petrophysics data base, then compare that manually with each well in the production data base—and the two data bases might have been in different buildings, on separate computers and using different software.

These data bases were part of the established isolation of disciplines. By and large, the production engineer never talked with the exploration engineer. They worked for the same company, but belonged to separate organizations, with separate cost control points, separate budgeting issues and often different management points that converged only at the level of chairman. Today, this separation has largely disappeared, due in part to a force from outside the industry.

In the 1970s in California's Silicon Valley, the first papers were published on relational data base technology, and by the early 1980s it was becoming the rage in the computer industry.1 Although searches on relational data bases were initially slower than on hierarchical ones, the idea of searching by "relations" fostered a new way of thinking about database design—you don't need to anticipate all the relations between data when designing the data base. Through the 1980s, this technology trickled into the oil industry, which had become thoroughly computerized. Relational technology made possible data integration across disciplines—core and log porosities, for instance, were combined in the same data base.

1. The seminal paper that started relational technology was by Edgar Codd of IBM's research laboratory in San Jose, California, USA: Codd EF: "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM 13, no. 6 (June 1970): 377-387. An update on this work is Codd EF: The Relational Model for Database Management. Reading, Massachusetts, USA: Addison-Wesley Publishing Company, 1990.


replaced the brontosaurus mainframe—powerful, but cumbersome and expensive.

Users are taking several roads. A handful emphasize mainframes, some are going completely to workstations, and the majority are trying to mix workstations and mainframes. The consensus seems to be that mainframes will not disappear, but their role may change from data analysis to data archiving and control. The main exceptions are reservoir simulation and seismic data processing, which benefit from being performed with conventional microcomputers linked in parallel or with supercomputers.

Urgently needed is a reliable link between mainframes and workstations, since most oil companies have both technologies and need a way to get the most from each. Many companies have assembled in-house links that operate with varying degrees of efficiency and reliability (right). Often, however, mainframes and workstations cannot communicate easily, if at all.

Much work remains before the two technologies can be married in a standard, nonproprietary way. Meanwhile, a leading effort for a proprietary linkage is a project organized by International Business Machines (IBM) and consisting of IBM and a consortium of oil companies. The project is an attempt to develop a large IBM mainframe-based relational data management system with a mechanism for workstations to upload and download data (see "Database Development," page 49, for discussion on data bases).

The data base is unique in that it will be several times larger than any existing oilfield data base, able to hold information from several fields. Because this system is the first Boeing 747 in an industry that has so far evolved only to the Piper Cub, IBM faces several new challenges. A technical challenge, given the size of the data base and nature of its contents, will be a design that permits establishing meaningful relations between data. How the mainframe will communicate with workstations is also uncertain. Many oil companies and workstation vendors want IBM to design the system to be compatible with their own systems. This would simplify, for example, networking, checking of data in and out of the mainframe and data representation.

The shift to workstations also affects people doing E&P work. As oil companies try to maximize their resources, managers welcome workstations for their economic as well as technical benefits. Buying and maintaining workstations is significantly less expensive than buying and maintaining mainframe computers. Additionally, workstations let geoscientists take greater responsibility for data organization and quality, formerly the work of data managers or geotechnicians.

The Battle for Computing Standards

Linguists estimate that when Christopher Columbus landed in the New World 500 years ago, Native Americans spoke 1000 separate, mutually unintelligible languages. Today's E&P geoscientist, casting around for a data system, may feel like Columbus (see "Misconceptions about Data Management," next page and "Matching Needs and Capabilities," page 47). Among the numerous standards, only a few are recognized conventions—and sometimes within these are several incompatible versions.

The current generation of exploration software consists mostly of workstation- and PC-based, stand-alone applications and

[Figure: pair-wise data transfer between a mainframe and workstation applications (Mapping, Log Analysis, Seismic Interpretation), each with its own application programs, application/data interface and data files/data base.]

Transfer of data by links between pairs of computers. The advantages of this approach are that it does not hinder creation of new application programs; it isolates the impact of change; and it allows tight integration of programs within a discipline. The disadvantages are that it does not provide integrated access to data, requires transfer software for each link pair, and makes tracking of data history and consistency difficult. The names of the programs are taken as examples; they could be any type. (From Guthery et al, reference 3.)

Slowly, holes appeared in the walls separating types of data and separating E&P organizations.

This incremental change was given a boost, ending disciplinary isolation, by the oil price collapse of 1986. Suddenly, fewer resources were available to solve a growing number of E&P problems. Management consequently had to revise business strategies, and one practice that emerged was the interdisciplinary team. Today, the idea of the geophysical team solving a problem on its own, independent of the geologist and reservoir engineer, is almost nonexistent. But before the interdisciplinary teams can work efficiently, a final challenge remains—finding a way to join data from different disciplines.

Again, the roots of the solution came from outside the industry. From the ashes of the price collapse rose a few companies offering relational database technology for the petroleum industry, among them, Ingres Corporation (purchased by Ask Computer Systems Inc.), Oracle Systems Corp. and Software AG. These data bases enabled geoscientists to relate data that previously had not been recognized as related, and allowed users to define the relationship between the data. For example, this allowed log values to constrain a seismic inversion.

Along with the introduction of this new capability, in the late 1980s, came the declining cost of storage media, which increased the amount of on-line information. And the more information at the geoscientist's fingertips, the more obvious the need for a way to organize it and make it accessible between different hardware and software configurations.

This brings us to the present, and the efforts of standards groups to define an open systems architecture that is acceptable to both vendors and clients (see pages 46-48 of main text). The goal now is to make all forms of data available to any geoscientist working on any workstation, seamlessly and effortlessly.

For further reading:
Augarten S: Bit by Bit: An Illustrated History of Computers. New York, New York, USA: Ticknor & Fields, 1984.
Computer Basics: Understanding Computers. Alexandria, Virginia, USA: Time-Life Books, 1989.


their proprietary data bases.3 While these tools are powerful within their own realm, they can neither talk to each other nor easily share data. Routine analysis usually requires several software applications, often from different vendors, and the user must manually reformat the output from one application before loading it into the next. To make matters worse, applications often reside on different computer systems, so data must be moved from system to system, and in many cases must be rekeyed when data conversion is not performed automatically. Even if two systems can be linked, data sharing is often slow because the underlying proprietary data base is structured to run only its specific application as fast as possible.

This lack of communication emerged because software developers seek speed through proprietary solutions. Furthermore, until recently, both oil companies and vendors tended not to share solutions because of the perceived risk of revealing proprietary algorithms.4 The direct effect is a higher cost of software engineering—each software vendor must design user interfaces, graphics standards and database architecture. And this is often at the expense of analytical capability. The indirect effects for the user are less efficient use of software programs, higher training and support costs and time-consuming data management.

Geoscience computing entered the 1990s with a lack of standards—such as for user interface, data exchange and graphics. Fortunately, by the mid-1980s, the computer industry at large had recognized the futility of a proliferation of standards, and that without a cooperative effort, everyone would lose their competitive advantage. Standards groups sprang up for operating systems,5 quarter-inch tape manufacturers and

Misconceptions about Data Management

• Buying a data base or data management system is enough. Buying a system is the beginning, not the end, of problem solving. Once a system is acquired, there are other issues to confront: How are data converted and loaded? Who will perform quality checks before data are loaded? What will be required to ensure that the new system interfaces with existing systems, or other systems that might be acquired in the near future? What kind of training is available? What are the maintenance agreements?

Loading a large data base is a major task. In the early 1980s, Exxon Production Research built a seismic interpretation data base. Locating, converting and loading data for a relatively small project—100 seismic lines—took months. During database construction, a geoscientist working by hand could finish an interpretation sooner than another geoscientist awaiting completion of the data base. But after the data base was complete, mapping took half as long.

• All types of information can be dealt with the same way. Certain systems may improve productivity only for certain tasks. Most information systems are good at only one or two tasks. One oil company bought a mapping system that was designed for cultural data used by city planners and highway engineers. They found, however, that well data were so dense that data "overposting" produced an uninterpretable image on the screen. The vendor was not responsible for the problem—its system was working as intended. The oil company had to donate to the vendor millions of dollars in software engineering to solve the problem.

• If data are in the data base, they must be right. Many geoscientists can tolerate working with paper copies of data that contain errors. They perform quality assurance on the fly, or compensate in their interpretation. But once data become digitized from paper to an electronic file, the perception is that the data must be correct—"otherwise, somebody wouldn't have put it in the data base." It is valuable to recognize that data are data, no matter the form. A valuable component of the data base is a record of who loaded and checked the data.

Checking of data requires not only attention to detail but a wide-ranging imagination.

3. Guthery S, Landgren K and Waagbo K: "Integrated Data Access for Geoscience Interpretation Systems," GEOBYTE 5, no. 5 (October/November 1990): 38-41.

4. Schwager RE: "Petroleum Computing in the 1990s: The Case for Industry Standards," GEOBYTE 6, no. 1 (February/March 1991): 9-13.

5. An operating system is a software family that lets information flow through the computer as fast as possible but in an orderly fashion. It also serves as a translator or interpreter for the user. A user doesn't talk to a computer, but to an operating system, which commands the hardware to fulfill the user's request. Tasks of an operating system include:
• Controlling movement of data into and out of computer memory
• Supervising operation of the central processing unit
• Sending data to peripheral devices, such as monitors and printers
• Keeping data and programs uniquely identified.
In oilfield computers, common operating systems are UNIX, Disk Operating System (DOS), and Virtual Memory System (VMS).



Examples of errors found in vendor data include February having 30 days, an offshore platform with the Kelly bushing at 2000 feet, and a 1000-foot well with TD at 5000 feet. In the future, much of the checking for plausibility will be automatic; a minimal sketch of what such automated checks might look like appears after this list.

• Just having a geographic information system (GIS) or data base will increase productivity. The learning curve for workstations can be on the order of months to years. One seismic workstation vendor's own survey found that less than 1% of the users of its system ranked themselves as "experienced," even after eight years. One reason is that understanding the nuances of workstation operation takes time. Another reason is that most users are occasional users, who tend to learn the machine well enough for their job but not to the point of mastery. The learning curve is shrinking, as new machines become easier to use, but they still require some computer background. An average user of the FINDER system, for example, can get the basics in about one week—given a background in UNIX and SQL and some experience with ORACLE data bases.

A second part of this misconception is that the only advantage of a GIS system is the ability to work faster. True, the geoscientist can spend less time looking for data and therefore produce an interpretation faster. But an often overlooked advantage is that a more thorough interpretation can be performed in the same amount of time as when done on a mainframe.

• Information management is somebody else's job. Anyone using a computer that is not a stand-alone system is managing shared information, and information management is part of his or her job. This attitude helps prevent development of an adversarial relationship between people using an information system and people managing it. Every explorationist is responsible for the quality of data that go into the interpretation. Data loading and quality assurance may be subcontracted, but the explorationist is responsible for correctly placing the wells, scaling the logs and picking the formation tops.


It is in the best interest of the company if the explorationist's needs and understanding of the data are clearly communicated to people managing the data base.

• A data base is an application. Workstation application products, such as for log interpretation or mapping, have an underlying internal data base that is largely invisible to the user. Consequently, the concept of a data base is often not appreciated. The data base itself is not an application but a reservoir of information accessed through an application.

• A central (or a local) data base is preferred. The conflict in the choice of central vs. local is between the need to freely make changes in the data and the need to prevent changes that destroy the accuracy of data. Central data bases have higher security, but can inhibit interactive interpretation. One solution to these conflicting needs is the use of a database management system that facilitates the automatic creation of local data bases, devoted to the study area. This allows the geoscientist to manipulate a subgroup of data, without altering the integrity of the central data base.

• Graphics capabilities are of foremost importance. Good graphics capability in a workstation is only one factor in its overall performance. Often overlooked in judging the day-to-day practicality of a workstation is ease of data loading—not something vendors rush to demonstrate on the convention floor. Nevertheless, finding and loading data take a large chunk of the geoscientist's time, and the easier the loading, the sooner data analysis can begin.

• Speed is of the essence. Speed is important in an application, but less so for a data base. Database software and hardware can be tuned to a specific application, increasing database speed 20 to 200 times over that of an off-the-shelf product. Also, workstation hardware is likely to be upgraded before workstation software, so that in two years, there is a good chance the user will be running the same software—or an updated version—on hardware that is twice as fast.
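The automatic plausibility checking anticipated above (under "If data are in the data base, they must be right") can be pictured with a few simple rules. The following is a minimal, hypothetical sketch, not taken from any vendor system; the record fields and thresholds are assumptions chosen only to mirror the example errors quoted in that item.

    # Minimal sketch of automated plausibility checks on a well record.
    # Hypothetical field names and limits; a real system would draw its
    # rules from the operator's own data dictionary.
    import calendar

    def check_well_record(rec):
        """Return a list of plausibility warnings for one well record."""
        warnings = []

        # A completion date of February 30 cannot exist on the calendar.
        year, month, day = rec["completion_date"]
        if day > calendar.monthrange(year, month)[1]:
            warnings.append("completion date does not exist")

        # A Kelly bushing 2000 ft above sea level is implausible offshore.
        if rec["offshore"] and rec["kb_elevation_ft"] > 200:
            warnings.append("Kelly bushing elevation too high for offshore platform")

        # Total depth should not exceed the well's reported drilled depth.
        if rec["td_ft"] > rec["drilled_depth_ft"]:
            warnings.append("TD greater than drilled depth")

        return warnings

    # The three vendor-data errors quoted in the text, in one record.
    bad_record = {
        "completion_date": (1990, 2, 30),
        "offshore": True,
        "kb_elevation_ft": 2000,
        "td_ft": 5000,
        "drilled_depth_ft": 1000,
    }
    for warning in check_well_record(bad_record):
        print("WARNING:", warning)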

database programs. By the time the problem of multiple standards surfaced in petroleum computing, the broader computer industry's efforts could be used as a guide.

As geoscientists mix interpretation from the four major disciplines—geology, geophysics, petrophysics and reservoir engineering—a foremost need is for a standard exploration and production data model: a system for organizing and defining geoscience information and interrelationships between data to model the reservoir in a way that fulfills a need (above).6

A typical question in assembling a data model is the definition of a so-called major element. For example, is (A) a seismic line part of a seismic survey or is (B) a seismic survey part of a seismic line? A computer system using a model that organizes data as (A) cannot easily exchange information with a computer system using a model that organizes data as (B).
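To make the (A)-versus-(B) distinction concrete, here is a minimal sketch, not drawn from any actual vendor model, of how the two organizations might look as data structures. The entity and attribute names are invented for illustration; the point is that a program written against one nesting cannot read the other without a translation step.

    # Organization (A): a seismic line is part of a seismic survey.
    survey_a = {
        "survey_name": "SURVEY-91",          # hypothetical names throughout
        "lines": [
            {"line_name": "SURVEY-91-001", "traces": 2400},
            {"line_name": "SURVEY-91-002", "traces": 2380},
        ],
    }

    # Organization (B): the survey is just an attribute carried by each line.
    lines_b = [
        {"line_name": "SURVEY-91-001", "survey_name": "SURVEY-91", "traces": 2400},
        {"line_name": "SURVEY-91-002", "survey_name": "SURVEY-91", "traces": 2380},
    ]

    # A program written for model (A) navigates survey -> lines; one written
    # for model (B) must instead group lines by their survey attribute.
    def lines_in_survey_a(survey):
        return [line["line_name"] for line in survey["lines"]]

    def lines_in_survey_b(lines, survey_name):
        return [l["line_name"] for l in lines if l["survey_name"] == survey_name]

    assert lines_in_survey_a(survey_a) == lines_in_survey_b(lines_b, "SURVEY-91")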

Flaws in data models often become apparent when a type of data does not fit the model or when relations between data cannot be accommodated by the structure of the model—for example, horizontal wells may require new ways of computing log data, such as varying the measured value with horizontal distance rather than vertical depth. A data model may have nothing to do with computers. Computers provide only a representation of that model, such as a relational or hierarchical database structure.

Because there are so many ways to define oilfield data and relations between them, oilfield data models are complex, attempting to cover as many phenomena and data permutations as possible. Through the 1980s, the move toward integrating data across disciplines made it clear that a cooperative effort involving vendors and users was necessary to unify the field, or at least reduce the number of geoscience data models.


A central part of an E&P data model is the subsurface model, which is used to guide the search for, and development of, petroleum reserves. The subsurface model is developed with information from the four surrounding disciplines. (Adapted from Berkhout et al, reference 6.)

[Figure elements: the subsurface model linked to geopolitical data, production data, well data (logs, cores, cuttings) and seismics (surveys, cross sections); a small chart of barrels of oil per day (0-60) against year (1982-1990).]


The IBM effort at linking mainframes and workstations depends ultimately on developing a satisfactory data model. Two more sweeping efforts have been initiated by the Petrotechnical Open Software Corporation (POSC), based in Houston, Texas, USA, and the Public Petroleum Data Model (PPDM), based in Calgary, Alberta, Canada. Both groups are not-for-profit organizations.

The mission of POSC is to "...define, develop and deliver, through an open process, an industry standard, open systems software integration platform for petroleum upstream technical computing applications." The PPDM seeks to "create an industry standard means of assembling, remembering and communicating data related to the petroleum industry."

POSC, founded in October 1990, is a corporation funded by 35 companies from Europe and North America representing the E&P industry, oilfield services, software and computer manufacturers and government research establishments. POSC acts as a kind of United Nations for geodata standards, soliciting computer-based technology from across the industry to be adopted in whole or part in a POSC "software integration platform." To date, 20 companies have submitted 34 models for consideration by POSC.

The POSC platform constitutes an E&P data model, a user interface with a "common-look-and-feel," software offerings and a set of specifications and reference codes that will allow vendor products to be compared with selected industry standards.7 Its first release—a model for data exchange and user interface—is scheduled for the third quarter of 1992. The group's $6.5 million budget funds a full-time staff of ultimately 44, and is supported by member companies expected to number 40. Although the group currently operates solely from Houston, its board of directors has authorized the opening of a second office in the UK.

The success of POSC is linked to the emergence of the so-called open systems environment: software and hardware design that permits any application to run on any hardware.8 Although "open systems" is on the lips of many software engineers, the oil industry has reason for some resistance against it—a large investment in proprietary software and hardware. POSC sees part of its job as showing how these existing systems can be adapted to talk with each other and with new systems.

POSC cites several advantages to the open systems environment. It is in line with

Matching Needs and Capabilities
Questions to Ask When Selecting an E&P Information System

A chief cause of dissatisfaction with information systems—from data bases to integrated interpretation packages—is that the buyer had neither the time nor expertise to fully evaluate the capabilities and limitations of the system. Answering several questions while shopping can help the buyer clarify his needs and find the product that best satisfies them.

• What is going to be the present and future use of the system? Who are the users?
• What do I want to store and why do I want to store it that way?
• Does the data base have facilities that permit easy transfer of data between different data bases—between the corporate, basin and project data bases?
• Can the data base exchange information with software applications we use?
• Can the data base connect with any Structured Query Language (SQL) data base or product, such as spread sheets or geoscience applications?
• What is the size of the business unit I will be working within—all the wells in the Adriatic? The Niger delta? The Eastern Hemisphere?
• What is the form of data to be used—a monthly report? A weekly report?
• How is information accessed? Are privileges limited or unlimited? Do users need simultaneous access to the same data? Will I need a systems manager?
• Where will information be stored—centrally, locally or regionally? Will I need one big disk drive?
• Will workstations be networked together? If so, is network software available and how well does it perform?
• What platform(s) will the vendor support, and when?
• What data input formats does the vendor support? Are they compatible with those that I have or am likely to get?
• How will I load the various types of data that I expect to put on the system? How does the system handle paper data?
• Do I need satellite or phone links?
• Will it interface with existing internal systems (for lease administration, production accounting, asset accounting, etc) or systems we plan to acquire? On what standards is the system based?
• What is the upgrade path for this system? In two years will I feel stuck with this system, or will it be able to grow to meet future needs?
• Can the data base be easily customized to fit the needs of different operating groups?
• How thorough and helpful is the vendor's user support?



the oil company trend of buying and modifying off-the-shelf software, rather than developing it in-house, which has become too expensive. It also prevents "boxlock"—when the time comes to retire a computer, the replacement won't necessarily tie the user to a single hardware supplier. POSC expects that these advantages will outweigh one drawback that may be associated with open-system architecture, compromised speed. Proponents of open systems maintain that this problem will be minimized by hardware advances—an often quoted figure is doubling of computational speed every 12 to 18 months.

The PPDM group, founded in 1989, is a volunteer organization currently of 20

6. For an example of an E&P data model, see Berkhout AJ, Smeets G and Ritsema I: "A Strategy for Developing a Standard E&P Data Model," Geophysics: The Leading Edge of Exploration (September 1991): 33-37.

7. Turner D: "Petrotechnical Open Software Corp.—A Way Forward," GEOBYTE 5, no. 5 (October/November 1990): 36-37.

8. Wilson DC and King JW: "An Integrated Geoscience and Reservoir Engineering Software System," Advances in Reservoir Technology: Characterization, Modeling & Management, Royal College of Physicians, Edinburgh, UK, February 21-22, 1991.



paying subscribers comprising oil companies and vendors of applications and hardware.9 Membership is $900, which is used by the group for printing and other operational expenses. There are no paid employees.

The PPDM is a standardized, nonproprietary database format that serves as a framework on which to build petroleum computer application programs. Like POSC's model, the PPDM is released in stages. The first database format, called a schema, was released in April 1990. Subsequent versions are revised, based on user feedback, and released periodically. The model currently covers exploration and production data, including production and core analysis, openhole pressure and drillstem test data. Plans call for pipeline, reserve and economic data, lease information (expiration dates, price paid) and seismic and log data.

Since its first release, the PPDM has become the de facto standard in Calgary. Only a small percentage of purchasers of the model are from outside Canada. In hopes of expanding the model's appeal, parts of it have been submitted to POSC.

The proliferation of workstations has stimulated demand for other kinds of computer standards for petroleum applications. Here are highlights of a few.10

• User interface. This is perhaps most in need of a standard. The only standards that exist today govern the structure of what things look like on the screen. An important standard for the oil industry in this respect is OSF/MOTIF, which gives a common look to the screen. Each window has a similar configuration and border; it is moved around the screen and closed in the same way; it has buttons that the user clicks on to operate certain functions. Programmers have called for an application style guide, defining standards for screen layout, menu bars, dialogue boxes and data input fields. Standards are also needed to allow the user to select a project and automatically start required applications and access data bases.11

• Data conversion. The first generation of workstations were cumbersome because data often had to be converted to a format acceptable to the program, which was often specific to the workstation. This extra step is removed in the emerging generation of workstations, which are capable, for example, of reading standard log tape formats.

A subset of the data conversion problem is data compression, particularly of seismic data. Some precision is lost in going from 32-bit to 8-bit seismic data. In some workstations, compression changes seismic amplitudes, resulting in anomalies in amplitude mapping (a brief numerical sketch of this effect follows this list).

• Data definition. Trouble is in the making when a user tries to integrate data from two systems that do not use coincidental definitions of fundamental concepts. For example, is bottomhole temperature static or circulating? Is porosity log- or core-derived? An effort to address this long-standing problem is the Petroleum Industry Data Dictionary (PIDD). The PIDD has been assembled over the past few years by the American Association of Petroleum Geologists (AAPG) working with representatives of 17 companies and 2 United States government agencies. Although the effort is entirely US-based, the dictionary is intended to standardize industry terms for what things are called, how they are named in the data base. The 2448-term glossary is being serialized in the petroleum computing magazine of the AAPG, GEOBYTE, and is available for $10 on diskette.12 The next step for the PIDD is creation of a standard number of bytes—physical computer characters—for data referenced by the glossary terms.

• Data sharing. As oil companies search for the best workstation, they often end up with four, five or more mutually exclusive platforms. Proprietary programs are written to link these resources with each other or with the mainframe. These links are often highly specific and expensive to develop and maintain (page 44). An alternative approach to connecting applications has been developed, called the GeoShare data exchange standard. It is not a pair-wise link, but a bus to which any number of applications can be connected. In order for an application to

make a connection through the GeoShare standard, the application developer must write a link between the application's proprietary data base and the GeoShare standard. The standard is based on the American Petroleum Institute (API) standard for data exchange known as Recommended Practice 66 (RP 66) or Digital Log Interchange Standard (DLIS).13
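The precision loss in 8-bit seismic storage mentioned under data conversion can be illustrated numerically. The sketch below uses a generic linear quantization of 32-bit amplitudes to 8 bits; it is an assumed scheme for illustration, not the compression method of any particular workstation, but it shows how weak amplitudes collapse onto a single coded value, the kind of distortion that can show up in amplitude maps.

    # Illustrative linear quantization of seismic amplitudes to 8 bits.
    def compress_8bit(trace):
        peak = max(abs(a) for a in trace) or 1.0
        # Map [-peak, +peak] onto integer codes in the range -127..127.
        return [round(a / peak * 127) for a in trace], peak

    def decompress_8bit(codes, peak):
        return [c / 127 * peak for c in codes]

    trace = [0.0012, -0.0009, 0.8500, -0.0011, 0.0010]   # one strong reflector
    codes, peak = compress_8bit(trace)
    for original, restored in zip(trace, decompress_8bit(codes, peak)):
        print(f"{original:+.4f} -> {restored:+.4f}")
    # All the weak amplitudes near 0.001 come back as exactly 0.0:
    # their relative differences are gone after the 8-bit round trip.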

The first release of the GeoShare standard, in spring of 1991, is a program and blueprint for a two-way bridge connecting applications with each other, with a data base or both, regardless of platform (next page, top). The specification is being written by GeoQuest Systems, Inc. and Schlumberger.

The GeoShare system is analogous to the clipboard in the Macintosh environment. To move text from one program to another, you can move it temporarily into a clipboard. The GeoShare standard largely defines the format of data that can go in the clipboard. But it is more complex. The Mac simply has to define each letter in each word. In the GeoShare standard, however, the sender and receiver have to agree not only on the definition of a letter or number, but on the definition of a fault, horizon or a log. It therefore conducts not only data but a dictionary of words to define the data in a singular way. Later, the GeoShare standard is expected to include graphics interface standards that will permit shipping images between different workstations.
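One way to see why a bus such as GeoShare appeals to developers is simply to count translators. With pair-wise links, every pair of applications needs its own transfer software; with a bus, each application needs only one read/write half-link to the common format. The arithmetic below is generic, not a claim about any particular product.

    # Number of translators needed to connect n applications.
    def pairwise_links(n):
        # one dedicated link for each pair of applications
        return n * (n - 1) // 2

    def bus_half_links(n):
        # one half-link per application to the shared exchange format
        return n

    for n in (4, 8, 16):
        print(f"{n:2d} applications: {pairwise_links(n):3d} pair-wise links, "
              f"{bus_half_links(n):2d} half-links to a bus")
    # 16 applications need 120 pair-wise links but only 16 half-links.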

Several major oil companies believe they can utilize the GeoShare standard to achieve data integration approximating the "tight" integration obtained with a common data base (see "Some Data Management Strategies," page 51). By configuring their systems as shown on the next page, they can move study data from the common data base to the selected workstation as needed and therefore keep only one master copy with the most current interpretation. Upon completion of analysis on the workstation, the results are then moved from the workstation into the common data base. The sooner interpreted data are moved back into the common data base, the less chance for confusion over the location of the correct version of the interpretation.



[Figure: Data Integration. Applications (Geologic Modeling, Log Analysis, Seismic Interpretation, Seismic Modeling, Cross Sections, Mapping, Production/Accounting, Lease/Contract/Administration) connect through read and write application half-links to the GeoShare Data Exchange Standard; Cross Sections and Mapping have tight integration with the Shared Data Base.]

Data integration today, before industry-wide adoption of a common data model. Applications can talk with each other or a central, shared data base through the GeoShare data exchange standard. This is a form of loose integration. Note that applications can talk with each other without having to pass through the shared data base, but at the end of an interpretation cycle, the interpretation can be downloaded to the shared data base. In this example, mapping and cross section applications bypass the GeoShare link and are tightly integrated with the data base.

A Data System Comparison

• Data base. Public library: books, magazines, newspapers, microforms. Oilfield data system: seismic, log, production and geopolitical data, etc.
• DB Manager. Public library: librarian. Oilfield data system: DB manager or committee.
• Quality control. Public library: staff evaluates condition of holdings, arranging for repair of damaged items; correctly reshelves holdings. Oilfield data system: geotechnicians, users, data base administrators, or all three; data are checked for plausibility, correct scaling of logs, placement of wells, formation top picks, etc.
• Database management system. Public library: Dewey decimal system, card/computer catalog, holdings checkout. Oilfield data system: relational tables, Structured Query Language, central processing unit.
• Hard copy device. Public library: photocopying machine, printer. Oilfield data system: printer, plotter.

Another approach to data sharing is the OpenWorks program, being developed by Landmark Graphics Corp. There are several differences between OpenWorks and the GeoShare standard. Whereas the GeoShare standard can be viewed basically as a bridge, OpenWorks is a link plus an attempt to establish a unified user environment, a database management structure and interprocess communications. Although OpenWorks has been made available to third-party software developers, its architecture optimizes Landmark applications.

Database Development

A data base is a collection of information and a database management system is an organizing system for storing, accessing and revising information that reflects the needs of the user (see "A Data System Comparison," above, right). The key idea is that the database organization satisfies a basic need, otherwise it's just an unsorted pile of information. Structure of the data base must also accommodate the data model.


9. Rhynes PJ: "PRISM: Petroleum and Resource Industry Information Storage and Management," GEOBYTE 5, no. 5 (October/November 1990): 31-35. The PPDM was founded by Applied Terravision Systems, Digitech Information Services Ltd., FINDER Graphics Systems, Inc. and Gulf Canada Resources Limited.

10. Nation L: "Data Standards Work Progresses," AAPG Explorer (August 1991): 9.

The first computer data bases were simply electronic means of storing what was in a filing cabinet. The main advantage was that the computer could locate data faster. Like the first generation of application software, oil companies mainly generated them in-house. Database technology has evolved to the point where the state of the art is

11. Schwager, reference 4.

12. "Petroleum Industry Data Dictionary—Part 1," GEOBYTE 6, no. 3 (February/March 1991): 17-32. The rest of the dictionary will be published in three subsequent issues. For diskettes of the dictionary, contact Pam Koscinski, Dwight's Energydata, Inc., P.O. Box 270295, Oklahoma City, OK 73137. Phone: (1) 405-948-7008; fax: (1) 405-948-6053.

multiple and networked systems, supplied by a vendor, capable of understanding the organization of different data bases on different machines and keeping a master list of everything stored.14 (See "Bringing Data Management into the 90s: The BHP Case Study," page 52.)


13. Although the name DLIS suggests that the format is good only for logs, it is in fact appropriate for exchange of any kind of scientific data.

14. Gillespie JG: "Database Strategy for an Exploration Workstation," presented at the ASEG/SEG International Geophysical Conference and Exhibition, Adelaide, Australia, February 17, 1988.


Comparison of hierarchical and relational data base structures for the same information used in a drilling operation. To find wells intersecting a gas zone completed by XYZ company in November 1990, the programming interfaces would be as follows. (Adapted from Date CJ: An Introduction to Database Systems. Reading, Massachusetts, USA: Addison-Wesley Publishing Co., 1977.)

Hierarchical Data Base (linked records):
    Field:        FIELD_NAME  FIELD_TYPE  REGION  well_ptr  nxt_field
    Well:         WELL_ID  OPERATOR  COMPL_DATE  zone_ptr  nxt_well
    Porous_zone:  ZONE_NAME  TOP  BOT  POROSITY  FLUID  nxt_zone

Relational Data Base (tables):
    Field:        FIELD_NAME  FIELD_TYPE  REGION
    Well:         WELL_ID  OPERATOR  COMPL_DATE  FIELD_NAME
    Porous_zone:  ZONE_NAME  TOP  BOT  POROSITY  FLUID  WELL_ID

For a hierarchical data base:

    next_field:
        get first/next FIELD
        if not found then goto exit
    next_well:
        get first/next WELL where WELL.OPERATOR = "XYZ" and
            WELL.COMPL_DATE = "NOV 1990"
        if not found then goto next_field
        get first/next POROUS_ZONE where POROUS_ZONE.FLUID = "GAS"
        if found then print WELL.WELL_ID
        goto next_well
    exit:

For a relational data base, a query interface might be: "Print WELL_IDs for wells intersecting gas zones completed by XYZ Company in November 1990."

    select unique Well.WELL_ID
    from Well, Porous_Zone
    where Well.OPERATOR = "XYZ" and
          Well.COMPL_DATE = "Nov 1990" and
          Porous_Zone.FLUID = "GAS" and
          Well.WELL_ID = Porous_Zone.WELL_ID

There are two main types of data bases, classified by the structure they use to organize information: hierarchical and relational (right). A rapidly evolving variation on these two themes, called object-oriented methodology, is just becoming commercial.15 Today, it operates mainly as an overlay on top of a relational foundation, but increasingly it may function as a separate approach to which relational, hierarchical or flat-file structures are subsets.

A hierarchical data base organizes data like the branches of a tree. To reach a certain leaf, one must always ascend the trunk and follow the same set of branches. Or in computer vernacular, a hierarchical system is a list from which an item is selected. This item refers to another list, and another list and so on until the end point is reached.

This system is well suited to data that are always used in a fixed structure. Information contained in well log headers, which is always presented in the same order and format, is often represented hierarchically. The advantage of this structure is its speed for a particular item. A disadvantage is absence of a shortcut to the end point. In a strict hierarchical system, one must always ascend and descend the same path.

A relational data base is more like a series of file drawers than a tree. Data items are loaded one at a time into lists and sublists, in a way that allows each item to be located independently. For example, a list of wells might be made up of one actual list of records, but indexed in several ways: by operator, by hydrocarbon type and by completion date. The relation between items is built with the query, which tells the computer how to order data coming out. Most commercial data bases are relational, with some sections arranged hierarchically and some in other formats. Two advantages of relational technology are the simplicity of its internal structure, and the ease with which this structure can be modified, with little or no effect on supporting software.

Relational database technology became practical in the mid-1970s, when IBM introduced Structured Query Language (SQL), which has become the de facto industry standard.16 This language enables communication between data bases, or between applications and data bases, that adhere to the SQL standard.



15. Brewer KE and Pritchard RJ: "Integrated Technical Computing of the 1990's," paper SPE 20359, presented at the 5th SPE Petroleum Computer Conference, Denver, Colorado, USA, June 25-28, 1990.
Rumbaugh JE, Blaha MR, Premerlani WJ, Eddy F and Lorensen W: Object-Oriented Modeling and Design. New York, New York, USA: Prentice-Hall, 1991.
Fong E, Kent W, Moore K and Thompson C: X3/SPARC/DBSSG/OODBTG Final Report. National Institute of Standards and Technology draft, September 17, 1991. (This is a review draft of a study by the object-oriented database technology group. This landmark report states that object-oriented technology and methodology is ready for use now by government agencies. For copies, contact Elizabeth Fong, NIST, Building 225, Room A226, Gaithersburg, Maryland 20899, USA. Phone: (1) 301-975-3250.)

16. Although IBM invented the SQL principle, its first commercial implementation was in 1979 by Oracle Corporation.

Relational database technology still leads the way into the 1990s, but another approach, called object-oriented technology, is drawing attention. This is considered by some to be a new way to access data, but by others to be a layer that fits on top of a relational structure, allowing a new way to group data.

The definition of an object-oriented data base is still uncertain, but it is generally agreed that the basic unit in an object-oriented approach is an "object," which is managed as a single entity. An example is a seismic section. In a conventional relational data base, to call up a seismic section on the screen, the user locates and requests the various traces before the section can appear. In an object-oriented system, the user asks for the section by name or location, and the computer locates the data and applications needed to draw the section. Another example of an object would be all the well logs from a traverse across a reservoir, named, for instance, A-A'. By simply marking A-A' on the screen, the user can call up all the logs without knowing anything about the underpinnings of data in the data base. The logs are treated as objects. Some object-oriented products are coming out now, but the technology isn't expected to become significant until the mid-1990s.
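To make the idea concrete, here is a minimal Python sketch of how an object-oriented system might bundle a seismic section's traces with the operation that draws them; the class, names and data are hypothetical illustrations, not any commercial product's design.

# A seismic section is requested by name; the object itself knows which
# traces it is made of and how to assemble them for display.
class SeismicSection:
    def __init__(self, name, trace_ids, trace_store):
        self.name = name                  # e.g., the traverse label "A-A'"
        self.trace_ids = trace_ids        # identifiers of the member traces
        self.trace_store = trace_store    # where the trace data actually live

    def traces(self):
        # The object, not the user, resolves its own underpinnings.
        return [self.trace_store[t] for t in self.trace_ids]

    def display(self):
        for trace in self.traces():
            print(self.name, trace)       # stand-in for a real plotting routine

catalog = {"A-A'": SeismicSection("A-A'", ["t1", "t2"],
                                  {"t1": [0.10, 0.20], "t2": [0.30, 0.40]})}
catalog["A-A'"].display()                 # one request brings up the whole section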

Data bases have recently been challenged by growth in the amount and complexity of data and by demand for new connections between data types. Two notable developments to meet these challenges are graphics interaction and mixed-media storage.

In the last three years, advances in graphics interaction have enabled computers to emulate the way geoscientists think: in maps. About 50 companies today are making and selling mapping systems under the generic name geographic information system (GIS). Most of these first-generation systems are designed to handle 2D representations of the earth's surface for use, for example, in city planning and forestry projects. These GISs are limited, however, in that they cannot gracefully satisfy the demands of the E&P industry, which needs to apply GIS principles to model the subsurface in three dimensions. These systems also have no understanding of E&P data needs. Users have to write software to connect data descriptions of leases, seismic surveys, well logs and so on to the GIS's recognition of lines, points and polygons.


The new generation of GISs can model the subsurface in 3D and have a complete comprehension of E&P entities. They understand the data model used to store data in the data base, and can translate it into the lines, points and polygons that allow a map to be built to represent the subsurface data. Old GISs cannot make this translation between the data model and the map. Users of the new generation can launch searches of the data base from a map. The new GIS shows on the map all wells that satisfy a query, such as gas wells the XYZ company completed in April 1990. By clicking on a well, the user can bring up text annotations, such as who logged the well, what logs are available, total depth and bottomhole temperature.
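In spirit, such a map-driven query is an attribute filter whose results come back as map coordinates to highlight. The short Python sketch below illustrates the idea; the well records, field names and query function are hypothetical, not any GIS vendor's interface.

# Hypothetical well records: attributes plus map coordinates.
wells = [
    {"id": "W-1", "operator": "XYZ", "fluid": "GAS",
     "completed": "1990-04", "x": 10.2, "y": 4.7},
    {"id": "W-2", "operator": "ABC", "fluid": "OIL",
     "completed": "1990-04", "x": 11.0, "y": 5.1},
]

def wells_matching(operator, fluid, completed):
    # Attribute part of the query; a real GIS would also apply spatial limits
    # taken from the area displayed on the base map.
    return [w for w in wells
            if w["operator"] == operator
            and w["fluid"] == fluid
            and w["completed"] == completed]

for w in wells_matching("XYZ", "GAS", "1990-04"):
    print(w["id"], "at", (w["x"], w["y"]))    # points to highlight on the map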

The latest advance in database technology, although still in commercial infancy, is mixed-media capability, which has developed in response to the demand for different kinds of data. Until recently, data bases were limited mainly to digitized data, such as seismic maps, log curves and tabular data. With mixed-media storage, any kind of information can be accessed: text, tables, raster images of downhole data, such as logs and core photos, as well as video and audio information. Interpretations can be stored as well as raw data.

With mixed media loaded into the data base, the geoscientist can draw a polygon around a set of wells and type a query, "show core information." A text track appears with the list of wells cored, or cored wells appear in a new color or a new symbol. Then the user can query again, "show Pierre formation," and up comes a core image, with thin section data and the core analyst's interpretation and notes: paleontology, sedimentary structures, grain description, and so on.

Advances have also taken place in the relational database capabilities of workstations. Some newer data management systems, such as that used in the FINDER system, have the capability to withdraw two kinds of data in one swipe: bulk data (sometimes called vector data) and parametric data. Bulk data are large chunks of data, composed of many smaller data items that are usually retrieved in a specific order, for example, a sequence of values such as a porosity curve or a seismic trace. Parametric data are single pieces of data, such as those included in a log header: mud density, bottomhole temperature, field name, and so on. In most data bases, bulk and parametric data are stored separately and retrieved separately; the user must ask separately for a log curve and a header. The FINDER system, because it stores some bulk data as if they were single data items, allows the user to access both types of data with one command. Storing some data in bulk, such as well depths, permits faster access because the computer searches one already ordered entity, rather than searching each depth and having to impose order on it.
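The distinction can be pictured with a small Python sketch; the record layout and field names below are hypothetical illustrations, not the FINDER system's actual storage scheme.

# Hypothetical log record mixing parametric and bulk data.
log_record = {
    # Parametric data: single values, typically from the log header.
    "field_name": "XYZ",
    "mud_density": 1.12,
    "bottomhole_temp": 185.0,
    # Bulk data: an entire ordered curve kept as one item, so it can be
    # fetched in a single, already ordered piece.
    "porosity_curve": [0.18, 0.17, 0.19, 0.21, 0.20],
}

def get_log(record):
    # One request returns header values and the curve together.
    return record["field_name"], record["mud_density"], record["porosity_curve"]

print(get_log(log_record))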

Some Data Management Strategies

Where data management stands today might be described as "loose" integration between platforms, and where it wants to go as "tight" integration. While the distinction may be mostly blurred, there are some clear differences. The main difference is in the number of occurrences of a given data item. In a tightly integrated system, there is one data store; in a loosely integrated system there may be many. In general, today there is often tight integration of application programs within an existing platform of a single vendor. The long-term desire is for tight integration between platforms while, in the short term, there is a need for loose integration between platforms.

To illustrate the difference between tight and loose integration, consider two seismic interpretation systems. One is used for seismic structural interpretation, the other for seismic inversion. It may be possible to move a copy of the seismic section from the interpretation to the inversion program, perform an inversion, then move it back to the interpretation program to interpret an inverted

(continued after the sidebar)



Bringing Data Management into the 90s: The BHP Case Study

BHP's exploration technical systems overview (figure): the 1980s work flow, in which seismic interpretation, digitized horizon picks and well completion reports fed the MAGIC data base and separate well and log data bases, with base maps, hand- and computer-contoured maps, log displays and cross sections finished by manual drafting; and the 1990s work flow, in which seismic interpretation (Landmark SeisWorks module), geological interpretation (CogniSeis Development DLPS interpretation package and Landmark StratWorks module) and mapping (Zycor) all work from the FINDER exploration data base, producing computer-contoured maps, log displays and cross sections, with drafting reduced to minor alterations. The original diagram distinguishes manual from computerized tasks.

In 1987, the Australian energy company, Broken Hill Proprietary (BHP), recognized the need to replace its Mapping and Geophysical Interpretation Computer (MAGIC), an in-house system for exploration and production file management. MAGIC was a suite of applications programs and a data store that was mainly a repository for seismic data. Its main limitation was that it operated only in batch mode, not interactively.

To assess its needs, BHP surveyed users throughout the company on requirements for software applications, data processing and data management. A significant finding was that users wanted closer interaction with their data. They complained of a lack of control of data interpretation and processing functions, and wanted data to be more accessible and more easily modified.

At the same time, the company recognized the industry trend toward workstations and user access from the desktop, which fit well with the user request. BHP also had the desire to store, catalog and make available E&P data of all types, not just seismic data.

Once BHP identified the company-wide requirements, it looked for systems to meet these needs. The field was quickly narrowed to the Vortext system from Aangstrom Precision Corp. in Michigan, the Exploration and Production Office System/Geoscience Information System (EPOS/GSIS) from the TNO Institute of Applied Geoscience in The Netherlands, and the FINDER data management system from FINDER Graphics Systems Inc. in California.


The FINDER system was found to have the greatest potential to satisfy BHP's requirements, and the initial acquisition was made in 1989. Since that purchase, BHP has installed the system in offices in Melbourne, Australia; Houston, Texas, USA; London, England; and Calgary, Canada.

Currently, BHP stores data in the FINDER system as basin-wide data bases. In the Gulf of Mexico, for example, the data base has about 7800 seismic lines representing 141,700 line miles [228,000 km]; 202,600 well locations with 91,300 scout tickets; and cultural, lease and other types of data that allow easy generation of base maps. In the Melbourne office, the Timor Sea data set comprises 5400 seismic lines representing about 123,000 line kilometers [76,400 miles], 200 wells, and cultural, permit and interpretative data.

The hardware environments at BHP's two main sites are based on two different distributed computing models. In Houston, the FINDER system runs on seven stand-alone VAXstations with access via a network from 19 X Window System terminals, which are for graphics display. In Melbourne, there is a central file server with seven dataless VAXstations and X Window System terminals attached to the server. Because of the experience gained in these test configurations, future implementations will probably be based on a central server for each business unit with dataless workstations and X Window System terminals.

The working environment is flexible. In some cases, subsets of data are extracted as project data bases within the FINDER system. At other times, users who want regional context maps work directly with the entire basin's data set. Within this framework, the FINDER system provides much of the routine functionality that BHP had with its MAGIC system, plus the ability to


interact with data. For capabilities not part of the FINDER system, such as seismic interpretation or geologic modeling, BHP has purchased packages from other vendors, including Zycor Inc., Landmark Graphics Corp. and Sierra Geophysics, Inc. BHP has also written routines to work with the FINDER system, such as a program that digitizes horizon picks marked on paper displays of seismic sections.

Through the FINDER system, BHP intends to integrate different types of data and data from many sources, and, most importantly, make those data available to the user via workstations and X Window System terminals. It is hoped that this flexibility will permit integration of geology and geophysics in ways either not possible in the past, or too difficult and, hence, impractical.

There have been several keys to BHP's success with the FINDER system. Early in the process, BHP acquired the FINDER Software Development Facility (source code for modifying the FINDER system) and assigned dedicated in-house support staff to the FINDER system. This has allowed BHP to work closely with the FINDER development staff and has given BHP the ability to add specialized features and functionality to the system.

Each business unit at BHP Houston has a data administrator whose sole responsibilities are loading data, performing quality assurance, and correcting and updating the FINDER data bases. The data administrators work with the explorationists on quality assurance, but the ultimate responsibility for the integrity of data lies with the explorationists.

Data loading and verification is an essential function that BHP finds to be the most problematic. Moving data from old data bases or magnetic tape into the FINDER system often requires reformatting. Even new data can require considerable manipulation because the organization and definition of data may vary. For example, commercial scout tickets from various vendors often don't agree on the name of the well operator. Vendors may furnish the name of the operator at the time of drilling or during production. Both are valid. The explorationist usually wants the operator's name at the time of drilling, because well logs will use this name, whereas the production engineer usually wants to know the name of the operator of the producing well. Each well contains dozens of data fields with similar possibilities for ambiguity. With about 47,000 wells in just the offshore portion of BHP's Gulf of Mexico data base, the size of this problem can be daunting.

Even when each data field is well defined, data often have to be manually reorganized to fit the format specifications of existing data loaders. If this task is too large or cumbersome, custom data loaders must be written to convert the vendor data format into the database format. These loaders are written either by FINDER Graphics or BHP. Throughout the process, BHP draws on the expertise of its data administrators, who have learned from experience which data from which vendors or areas need special attention.

BHP’s long-term goal is to simplify the every-

day work flow, allowing professionals to be more

productive and more creative, by reducing the

time spent searching, accumulating, requesting

and moving data.



section. Since there are now two copies of the section, the two systems are said to be loosely integrated. In a tightly integrated system, processing and interpretation systems would obtain data from a common data store (possibly distributed across multiple computing sites or machines), with a common file format, and would not require reformatting of data. Data handling is faster and the border between computers and applications becomes invisible to the user.
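The contrast can be caricatured in a few lines of Python; the data store, the doubling stand-in for inversion and the function names are purely illustrative assumptions, not how any of these products actually work.

# One common data store shared by the interpretation and inversion programs.
shared_store = {"section_42": [0.10, 0.20, 0.30]}

def invert(trace):
    return [v * 2 for v in trace]      # stand-in for a real inversion step

def invert_tight(section_id):
    # Tight integration: both applications read and write the single store,
    # so there is only ever one copy of the section.
    shared_store[section_id] = invert(shared_store[section_id])

def invert_loose(section_id):
    # Loose integration: a copy is exported (and usually reformatted),
    # processed on its own, and must later be reloaded and reconciled.
    exported_copy = list(shared_store[section_id])
    return invert(exported_copy)

invert_tight("section_42")
second_copy = invert_loose("section_42")   # now two versions exist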

There are good reasons why tight integration will be slow in coming. Tight integration of applications written for independent systems requires modification of existing applications to permit them to use the common data store. It also requires agreement on a universal model for data exchange, one of the long-term projects of POSC.

Loose integration between domains requires a mechanism for moving data from one to the other. The advantage of this approach is that it does not require rewriting programs. The GeoShare data exchange standard, which uses RP66/DLIS as the common data format, is one effort in this direction, requiring only the writing of a link to the GeoShare standard. Additionally, oil companies are interested in prolonging the value of their investment in proprietary systems. Therefore, the immediate efforts, such as Schlumberger's geoscience data bus implementation using the GeoShare standard, are for loose integration of independent systems, while a hopeful eye is kept on POSC's development of a universal system permitting tight integration.

The shift toward workstations also calls into question the means of data security and control, which are often deeply rooted in organizational culture. Where should data be kept: centralized, decentralized or in regional data centers? Who controls data quality? How can consistent quality of all data be assured? Who decides what goes back into the data base, and how? Today, just as no two oil companies have the same computer systems, no two companies approach these challenges the same way.

A leading debate in data management is how to cope with simultaneous alteration of the same data set. Imagine two geoscientists working concurrently on different aspects of the same project, using some of the same well data. They have downloaded data from the master data base, revised and corrected them, and made interpretations. When interpreted data are ready for storage in the master data base, how is it decided which version goes in?

There are various solutions. In BP USA, automatic data revision is anathema. When someone changes data in the master data base, the computer produces a report stating that there are two versions of the same thing and listing the changes. It asks whether those versions should be merged or supplanted. This determination is made by the database manager. The manager cannot be an expert in petrophysics, sedimentology, geophysics and petrology in every province, but he or she is still responsible for the integrity of the master data base. The manager therefore sits down with the geoscientists and, donning the hats of diplomat, defender of data and geoscientist, negotiates which version of what to keep. BP finds that the successful data manager must be high enough in the organization to wield recognized authority, yet close enough to the field to know the area in question.
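The flagging step amounts to comparing the incoming version with the stored one, listing the fields that differ and deferring the decision to a person. A short Python sketch of the idea follows; the record layout and field names are hypothetical, not BP's actual software.

def change_report(master, incoming):
    # List every field where the incoming interpretation differs from the master.
    return {field: (master.get(field), incoming.get(field))
            for field in set(master) | set(incoming)
            if master.get(field) != incoming.get(field)}

master_rec = {"well_id": "W-1", "top_dunlin": 8450, "porosity": 0.18}
new_rec = {"well_id": "W-1", "top_dunlin": 8470, "porosity": 0.18}

differences = change_report(master_rec, new_rec)
if differences:
    print("Two versions of W-1 exist; differing fields:", differences)
    # The data base manager, not the program, decides whether the versions
    # are merged or one supplants the other.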

ARCO is installing a data management system in its Plano, Texas, USA, facility and expects to use a slightly different solution. The data management plan calls for downloading project data from the mainframe to the FINDER system. When an interpretation is complete, it is screened at three levels before being installed in the corporate data base. First, the project manager approves all formation top picks and performs a general quality check. The appropriate operations center then checks data pertaining to its area, such as well location and geopolitical data. Finally, with recommendations of the


previous two checkers attached, the interpretation reaches the computing services group in Plano, which then performs quality checks as if it were vendor data, and installs the interpretation in the corporate data base. If the computing group questions the data, it negotiates a solution with the project manager, the operations group, or both.

Managing simultaneous alteration is one challenge in this early stage of technology transfer to workstations. The major challenge, in using workstations to reduce finding and development costs, is development and acceptance of a standard E&P data model. Once this hurdle is cleared, only minor obstacles in the path to seamless integration of geoscience data remain. —JMK

Acronyms

AAPG: American Association of Petroleum Geologists
API: American Petroleum Institute
CD-ROM: Compact disk read-only memory
CPU: Central processing unit
DB: Data base
E&P: Exploration and production
GIS: Geographic information system
OSF: Open Software Foundation
PC: Personal computer
PIDD: Petroleum Industry Data Dictionary
POSC: Petrotechnical Open Software Corporation
PPDM: Public Petroleum Data Model
RISC: Reduced instruction set computer
SQL: Structured Query Language