reproducible computational experiments using madagascar software package sergey fomel bureau of...

28
Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied Inverse Problems Vancouver BC June 29, 2007 http://rsf.sf.net/

Upload: oliver-parks

Post on 27-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

Reproducible Computational Experiments Using MADAGASCAR Software Package

Sergey FomelBureau of Economic Geology

University of Texas at Austin

Applied Inverse Problems

Vancouver BC

June 29, 2007

http://rsf.sf.net/

Page 2: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Principles of Scientific Software

EncapsulationFile FormatsTestingReproducibilityMaintenance

Page 3: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Principles of Scientific Software

EncapsulationFile FormatsTestingReproducibilityMaintenance

Page 4: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Encapsulation

Information hiding (Parnas, 1972) Separation of concerns (Dijkstra, 1974)

Separate physics from mathematics

A is physics Going from b to is mathematics

x̂ argmin Ax -b R x

Page 5: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Example: Velocity Transform

Page 6: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Physics of Velocity Transform

Page 7: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Page 8: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Page 9: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Encapsulation in Programming

Separation of concerns– Classes or templates (C++)– Function pointers (C)– Function interfaces (Fortran-90)

/* initialize velocity transform (A) */ veltran_init (true, x0, dx, nx, s0, ds, nv, o1, d1, nt, s02, anti, psun1, psun2);

/* least-squares minimization of |A x – b|^2, x=vscan, b=cmp */sf_solver (veltran_lop, sf_cgstep, ntv, ntx, vscan, cmp, niter,

"err", error, "nmem", 0, "nfreq", miter, "mwt", mask, "end");

Page 10: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Encapsulation in UNIX

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because

that is a universal interface.

Page 11: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Encapsulation in UNIX Shell

bash$ sfveltran < cmp.rsf > vtran.rsf adj=y v0=1 dv=0.025 nv=60bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60sfdottest: L[m]*d=21665.9sfdottest: L'[d]*m=21665.9bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60sfdottest: L[m]*d=21906.2sfdottest: L'[d]*m=21906.2bash$ sfconjgrad sfveltran < cmp.rsf > vtran.rsf niter=3 v0=1 dv=0.025 nv=60 sfconjgrad: iter 1 of 3sfconjgrad: grad=6.36797e+09sfconjgrad: iter 2 of 3sfconjgrad: grad=1.39068e+09sfconjgrad: iter 3 of 3sfconjgrad: grad=7.50257e+08

Page 12: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Principles of Scientific Software

EncapsulationFile FormatsTestingReproducibilityMaintenance

Page 13: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

The Art of UNIX Programming

(Raymond, 2004) To design a perfect anti-Unix, make all file

formats binary and opaque, and require heavyweight tools to read and edit them.

If you feel an urge to design a complex binary file format, or a complex binary application protocol, it is generally wise to lie down until the feeling passes.

Page 14: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

RSF (Regularly Sampled Format)

SEPlib (Stanford Exploration Project) Data separated from text headers Conceptually N-dimensional hypercubes Multiple files for complex geometries Not application specific

Data

n1=1000 in=“/path/data.rsf@”n2=500 n3=100 d1=0.001 d2=0.1 o2=1

Page 15: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Principles of Scientific Software

EncapsulationFile FormatsTestingReproducibilityMaintenance

Page 16: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Testing

Test-driven development (Beck, 2003) YAGNI principle

– Always implement things when you actually need them, never when you just foresee that you need them.

In scientific software development, tests are computational experiments

Page 17: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Testing with SCons

Software Construction Replacement for “make”

– reliable and extensible dependency analysis

– configuration files are Python scripts

– cross-platform– open-sourcehttp://www.scons.org

Page 18: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

SConstruct File

# Mobil AVO CMP gather 807 at well4 locationFetch('cmp807_raw.HH','rad')

# PreprocessingFlow('cmp','cmp807_raw.HH', 'dd form=native | tpow tpow=2 | mutter half=n v0=1.3 tp=0.2')Plot('cmp','grey title="Input CMP Gather" ‘)

# Velocity TransformFlow('veltran','cmp','veltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y')Plot('veltran','grey title="Velocity Scan" ')

# Display Side by SideResult('veltran','cmp veltran','SideBySideAniso')

Page 19: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Experimenting with SCons

bash$ sconsretrieve(["cmp807_raw.HH"], [])< cmp807_raw.HH sfdd form=native | sftpow tpow=2 | sfmutter half=n v0=1.3 tp=0.2 > cmp.rsf< cmp.rsf sfgrey title="Input CMP Gather" > cmp.vpl< cmp.rsf sfveltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y > veltran.rsf< veltran.rsf sfgrey title="Velocity Scan" > veltran.vplvppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vplbash$ sed s/Velocity/Slowness/ < SConstruct > SConstruct2bash$ mv SConstruct2 SConstructbash$ scons < veltran.rsf sfgrey title=“Slowness Scan" > veltran.vplvppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vpl

Page 20: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Principles of Scientific Software

EncapsulationFile FormatsTestingReproducibilityMaintenance

Page 21: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Reproducible Research at Stanford

(Knuth, 1992)– A computer program should be written with

human readability as a primary goal. (Claerbout and Karrenbach, 1992)

– The purpose of reproducible research is to facilitate someone going a step further by changing something.

(Buckheit and Donoho, 1995)– An article about computational science in a

scientific publication is not the scholarship itself, it is merely advertising of the scholarship.

Page 22: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Reproducible Experiments

Within the world of science, computation is now rightly seen as a third vertex of a triangle complementing experiment and theory. However, as it is now often practiced, one can make a good case that computing is the last refuge of the scientific scoundrel […] Where else in science can one get away with publishing observations that are claimed to prove a theory or illustrate the success of a technique without having to give a careful description of the methods used, in sufficient detail that others can attempt to repeat the experiment? (LeVeque, 2006)

Page 23: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Page 24: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Page 25: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Principles of Scientific Software

EncapsulationFile FormatsTestingReproducibilityMaintenance

Page 26: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Maintenance

Computational experiments that are not continuously maintained loose reproducibility.– Regression testing (Brooks, 1975)

Contribute computational software and experiments to a community-maintained repository to enable research productivity.

Page 27: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Open Science

Page 28: Reproducible Computational Experiments Using MADAGASCAR Software Package Sergey Fomel Bureau of Economic Geology University of Texas at Austin Applied

http://rsf.sourceforge.net/

Conclusions

Principles of Scientific Software– Encapsulation– File Formats– Testing– Reproducibility– Maintenance

Madagascar software package– Open source, open community, open science

http://rsf.sf.net/