Post-Processing Cactus Data Wolfgang Kastaun AEI Hannover ET school, Lisbon, Sep. 2018





Post-Processing

General tasks

- Read different data formats from simulation
- Analyse data (e.g. statistics, integrals)
- Visualize data
- Need to combine different tools
- Need custom infrastructure to move data between tools


Visualization Tools

Package      Good for      Interface              License
Matplotlib   1D, 2D        Python                 Free
Bokeh        1D, 2D        Python                 Free
Plotly       1D, 2D        Python                 Free
(P)YGraph    1D evolution  GUI                    Free
R            1D, 2D        Own language           Free
Gnuplot      simple 1D     archaic                Free
Mathematica  1D, 2D        Own language           Non-Free
Yt           2D, 3D        Python                 Free
VisIt        3D            GUI, Python            Free
VTK          3D            C++, Python, Java, ..  Free
DataVault    3D            GUI                    Free
Blender      Raytracing    GUI, Python            Free

⇒ Once data can be imported in Python, you can do anything.
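
As a small illustration of this point, the sketch below reads a plain-text table of numbers into Python lists and computes simple statistics. It assumes whitespace-separated columns with '#' comment lines; the function name `read_columns` and the sample data are made up for this example and are not part of any Cactus tool.

```python
import io

def read_columns(stream):
    """Parse whitespace-separated numeric columns, skipping '#' comments."""
    rows = []
    for line in stream:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            rows.append([float(x) for x in line.split()])
    # Transpose the rows into one list per column.
    return [list(col) for col in zip(*rows)]

# Hypothetical two-column time series (time, value); not real simulation output.
sample = io.StringIO(u"# time value\n0.0 1.0\n0.5 3.0\n1.0 2.0\n")
t, v = read_columns(sample)
mean_v = sum(v) / len(v)
print("max = %g, mean = %g" % (max(v), mean_v))
```

Once the columns are ordinary Python sequences, they can be passed to any plotting or analysis package from the tables above.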


Visualization Tools

Package      Good for      How to import Cactus data
Matplotlib   1D, 2D        PostCactus
(P)YGraph    1D evolution  Supports Cactus
R            1D, 2D        ?
Gnuplot      simple 1D     Only plain text format
Mathematica  1D, 2D        SimulationTools
Yt           2D, 3D        SimulationIO
VisIt        3D            Plugin
VTK          3D            PostCactus or VisIt
DataVault    3D            Conversion tool
Blender      Raytracing    PostCactus + VTK


Data Analysis Tools

Mathematica

- Non-Free
- Very specialized programming language
- Powerful, well-tested, very robust
- Access Cactus data via SimulationTools module
- Integration, differentiation, Fourier analysis, ODE solvers, statistics, vectors, matrices
- Huge collection of mathematical methods
- Notebook interface


Data Analysis Tools

Python based

- Python is a free, well-known general purpose language
- Numerical capabilities via numpy + scipy packages
- Great for reading files (text, HDF5, JSON, ..)
- Access Cactus data via PostCactus package
- Integration, differentiation, Fourier analysis, ODE solvers, statistics
- Notebook web interfaces: Jupyter, Sage
- Sage Math: Python environment for numerics
- Sage Manifolds: Differential Geometry support
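
The numerical capabilities listed above can be sketched with numpy alone. The example uses a synthetic 50 Hz sine wave in place of a simulation time series; the signal and all variable names are chosen only for illustration.

```python
import numpy as np

# Synthetic stand-in for a simulation time series: a 50 Hz sine wave.
t = np.linspace(0.0, 1.0, 1001)
y = np.sin(2.0 * np.pi * 50.0 * t)

# Differentiation: central differences in the interior of the array.
dydt = np.gradient(y, t)

# Integration of y**2 with the trapezoidal rule, written out explicitly.
integral = np.sum(0.5 * (y[1:] ** 2 + y[:-1] ** 2) * np.diff(t))

# Fourier analysis: locate the dominant frequency of the real signal.
spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(t), d=t[1] - t[0])
f_peak = freqs[np.argmax(spectrum)]

# Basic statistics.
print("mean = %.3g, std = %.3g, peak at %.1f Hz" % (y.mean(), y.std(), f_peak))
```

ODE solvers and more specialized integration routines live in scipy rather than numpy and are not shown here.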


Workflows

Jupyter notebook

- Specialized web server
- Can be run locally (laptop) or remotely (cluster)
- Running remotely needs some setup
- Support for many languages, e.g. Python, R, Julia
- Interactive Python coding via web interface
- Self-documenting workflow
- Notebooks are files, can be shared
- Can embed plots created on the fly
- Can use version control (diffs are ugly though)
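
One common way to make notebook diffs less ugly is to remove cell outputs before committing. The sketch below does this for a notebook held as a Python dictionary, assuming the version 4 notebook JSON layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"). The function name and the tiny example notebook are made up for illustration.

```python
import json

def strip_outputs(notebook):
    """Return a copy of a notebook dict (nbformat 4) without cell outputs."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned

# Tiny hand-written notebook used only for illustration.
nb = {
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
        {"cell_type": "code", "metadata": {}, "source": ["1 + 1"],
         "execution_count": 3,
         "outputs": [{"output_type": "execute_result",
                      "data": {"text/plain": ["2"]},
                      "execution_count": 3, "metadata": {}}]},
    ],
}
clean = strip_outputs(nb)
print(json.dumps(clean, indent=1, sort_keys=True))
```

For a real notebook, the dictionary would come from `json.load` on the .ipynb file, and the cleaned version would be written back with `json.dump`.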


Workflows

Python scripts

- Great to automate tasks
- More reproducible
- Trivial to run on a cluster
- Easily version-controlled
- Typical use case: figure in publication


Workflows

VisIt

- Explore data interactively
- Easily make movies
- Difficult to repeat stuff
- Restricted by GUI
- Can be scripted though

IPython

- Interactive Python command line
- Convenient tab-completion, history, “magic” commands
- Good for quick one-time tasks


Workflows

Version control

- git or mercurial (hg)
- Don’t use svn, darcs, or even CVS (Ew!)
- Use on all scripts, articles, talks
- Can be used on notebooks
- Not suitable for large simulation data (maybe with git-lfs or git-annex)
- Second purpose: sync between machines
- Central repo hosting: GitHub, GitLab, Bitbucket


Workflows

Anaconda

- Installing a Python environment can be tricky
- Cannot install everything because of dependency collisions, e.g. Python 2 versus Python 3
- Create virtual environments
- Need specialized package management
- Anaconda does both


Workflows

Docker

- OS inside OS
- Cheaper than virtual machines
- Provides well defined environments (images)
- Build on one machine, use everywhere
- No more OS updates breaking scripts
- Can use Anaconda and Jupyter inside container
- Can make snapshots of ongoing work


Cactus Simulation Folders

- Typically contains one subfolder for each restart
- Restarts overlap in time
- Folder structure up to user
- Data stored as separate files for different data types
- No infrastructure for metadata
- Many data formats based on the HDF5 file format
- Different file formats in use for the same type of data
- Different variants of each format
- Grid and scalar data: one file per variable or one file per group

Total mess, impossible to use directly
⇒ Need abstraction layer


Cactus Simulation Folders

Typical restart folder contents

- Parfile: *.par
- Logfiles for normal output (*.out) and errors (*.err)
- Scalar data time series: *..asc
- Reduction results time series: *.minimum.asc (minimum), *.norm2.asc (2-norm), etc.
- 3D grid data: *.xyz.h5, *_file*.h5
- 2D cuts, e.g. *.xy.h5
- 1D cuts, e.g. *.x.h5 and/or *.x.asc
- Multipole data: mp_*_l*_m*_r*.asc or mp_*.h5
- GW data: mp_psi4_l*_m*_r*.asc or mp_psi4.h5
- Black hole properties: BH_diagnostics.ah*.gp
- Apparent horizon shape: h.t*.ah2.gp
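
To show how such naming conventions can be handled programmatically, here is a minimal sketch that sorts file names into the categories listed above using shell-style patterns. The pattern table, the `classify` function, and the example file names are assumptions made for this illustration (the underscores in the patterns are assumed, and `*.maximum.asc` is an assumed further reduction). This is not how PostCactus itself identifies files.

```python
import fnmatch

# (pattern, description) pairs, checked in order; the first match wins.
PATTERNS = [
    ("*.par", "parameter file"),
    ("*.out", "log file (normal output)"),
    ("*.err", "log file (errors)"),
    ("mp_psi4*", "GW multipole data"),
    ("mp_*", "multipole data"),
    ("BH_diagnostics.ah*.gp", "black hole properties"),
    ("h.t*.ah*.gp", "apparent horizon shape"),
    ("*.xyz.h5", "3D grid data"),
    ("*.xy.h5", "2D cut"),
    ("*.[xyz].h5", "1D cut"),
    ("*.[xyz].asc", "1D cut"),
    ("*..asc", "scalar time series"),
    ("*.minimum.asc", "reduction time series"),
    ("*.maximum.asc", "reduction time series"),  # assumed, not listed above
    ("*.norm2.asc", "reduction time series"),
]

def classify(filename):
    """Return a description of the data a file name suggests, or 'unknown'."""
    for pattern, description in PATTERNS:
        if fnmatch.fnmatchcase(filename, pattern):
            return description
    return "unknown"

# Hypothetical file names, following the conventions listed above.
for name in ["bns.par", "rho.minimum.asc", "alp..asc", "rho.xy.h5",
             "mp_psi4_l2_m2_r500.00.asc", "BH_diagnostics.ah1.gp"]:
    print("%-28s %s" % (name, classify(name)))
```

The order of the table matters: the GW pattern must come before the general multipole pattern, because every `mp_psi4*` file also matches `mp_*`.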


PostCactus Framework – Overview

- Access Cactus data formats from Python
    - Transparent merging of restarts
    - Hides technicalities (e.g. data formats, extensions)
- Also provides some analysis tools
    - Time series: differentiation, resampling, FFT
    - Gravitational waves: strain, spectra
    - Grid data: interpolation, histograms, percentiles, gradient
- Some helper functions to integrate with matplotlib and VTK
    - Simplify 2D data color and contour plots
    - VTK: isosurfaces, volume rendering, field lines
- History: grown from piles of postprocessing scripts
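
What merging of restarts involves can be sketched as follows. Since restarts overlap in time, some policy is needed for the overlap; this sketch assumes that samples from the later restart replace those of the earlier one. The function name and the policy are illustrative assumptions and may differ from what PostCactus does internally.

```python
def merge_restarts(segments):
    """Join (times, values) segments from consecutive restarts into one series.

    Where two restarts overlap in time, samples from the later restart
    replace those of the earlier one (an assumed policy).
    """
    # Process non-empty segments in order of their first time stamp.
    ordered = sorted((s for s in segments if s[0]), key=lambda s: s[0][0])
    times, values = [], []
    for seg_t, seg_v in ordered:
        # Drop already merged samples at or after the start of this segment.
        while times and times[-1] >= seg_t[0]:
            times.pop()
            values.pop()
        times.extend(seg_t)
        values.extend(seg_v)
    return times, values

# Two hypothetical restarts; the second resumes from a checkpoint at t = 2.
run1 = ([0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0])
run2 = ([2.0, 3.0, 4.0, 5.0], [12.0, 13.5, 14.0, 15.0])
t, v = merge_restarts([run1, run2])
print(t)
print(v)
```

The result is a single series with strictly increasing times, which is what later analysis steps such as differentiation or FFT require.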


PostCactus – Supported Data Types

- Grid data
    - 1D, 2D, 3D HDF5 (1 file per variable only)
    - 1D ASCII
- Scalar data
    - ASCII format (1 file per variable or per group)
    - Scalar, min, max, norms
    - Integrals from norms (needs grid volume)
    - Transparent decompression
- Multipole data: ASCII, HDF5
- GW signal from Ψ4 multipoles or WaveExtract (deprecated)
- Apparent horizon data
    - AHFinderDirect, QuasiLocalMeasures, IsolatedHorizons (deprecated)
    - Horizon shape: ASCII format
- Partial support for parameter files
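
The remark that integrals from norms need the grid volume can be made concrete. The sketch assumes that the 1-norm reduction is the volume-weighted average of |f| and that the 2-norm is the square root of the average of f², so that multiplying by the grid volume recovers the integral. This normalization is an assumption and should be checked against the documentation of the reduction thorn in use; the function names are made up.

```python
def integral_from_norm1(norm1, volume):
    """Integral of |f| over the grid from its 1-norm and the grid volume."""
    return norm1 * volume

def integral_of_square_from_norm2(norm2, volume):
    """Integral of f**2 over the grid from its 2-norm and the grid volume."""
    return norm2 ** 2 * volume

# Check on a uniform 1D grid with cell-centred samples of f(x) = x on [0, 2].
n = 1000
dx = 2.0 / n
f = [(i + 0.5) * dx for i in range(n)]
volume = n * dx
norm1 = sum(abs(x) for x in f) / n             # average of |f|
norm2 = (sum(x * x for x in f) / n) ** 0.5     # root mean square of f

print(integral_from_norm1(norm1, volume))            # exact integral is 2
print(integral_of_square_from_norm2(norm2, volume))  # exact integral is 8/3
```

For a non-negative quantity such as a density, the 1-norm route gives the integral of the quantity itself.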


PostCactus – Limitations

- Does not support one file per group HDF5 format
- Reads scalar data only in ASCII format
- More data format readers in preparation
- Parameter file language not fully supported
- No MPI support for postprocessing
- Not ready for Python 3


PostCactus Framework – Installation

Required packages

- Python 2.7
- h5py
- NumPy, SciPy

Recommended packages

- IPython, Jupyter notebook server
- Matplotlib
- Sphinx (Documentation)
- ffmpeg (Movies)
- VTK (3D)

Easiest way to install dependencies: Anaconda Python distribution
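
Whether the Python packages listed above are available in the active environment can be checked with a few lines. This is a generic sketch, not part of PostCactus; the module names are the usual import names of the packages. ffmpeg is an external program rather than a Python module and is therefore not covered.

```python
import importlib

REQUIRED = ["h5py", "numpy", "scipy"]
RECOMMENDED = ["IPython", "matplotlib", "sphinx", "vtk"]

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # A broken installation can raise errors other than ImportError;
            # any failure is treated as "not usable" here.
            missing.append(name)
    return missing

def report():
    """Print which required and recommended modules are missing."""
    print("missing required:    %s" % missing_modules(REQUIRED))
    print("missing recommended: %s" % missing_modules(RECOMMENDED))

if __name__ == "__main__":
    report()
```

An empty list for the required packages means the environment has what the installation step needs.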


PostCactus Framework – Installation

- Download from public repository

    hg clone https://bitbucket.org/DrWhat/pycactuset

- If using Anaconda, activate environment

    source activate <your environment name>

- Install Python package

    cd pycactuset/PostCactus
    python setup.py install

  Can use the --prefix= option to install in a custom location

- Build documentation

    cd doc
    make html

See also https://bitbucket.org/DrWhat/pycactuset/wiki/Home


PostCactus – Gallery

- Cactus GW data → PostCactus → Matplotlib
- Instantaneous frequency using PostCactus timeseries

[Figure: gravitational wave strain h+ and h× (axis label: h r_ex / (100 Mpc)) and instantaneous frequency f [kHz], both versus retarded time (t − r) [ms]]
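
The instantaneous frequency mentioned above is commonly obtained as the time derivative of the phase of the complex signal, divided by 2π. The following numpy sketch applies this to a synthetic chirp. It does not use the PostCactus timeseries interface, and the sign convention of the phase, the function name, and the test signal are assumptions for illustration.

```python
import numpy as np

def instantaneous_frequency(t, h):
    """Frequency of a complex signal h(t) from the derivative of its phase."""
    phase = np.unwrap(np.angle(h))   # continuous phase in radians
    return np.gradient(phase, t) / (2.0 * np.pi)

# Synthetic chirp standing in for a complex strain; not real simulation data.
t = np.linspace(0.0, 0.02, 4001)                 # time in seconds
f_true = 1000.0 + 50000.0 * t                    # frequency in Hz, rising linearly
phase_true = 2.0 * np.pi * (1000.0 * t + 25000.0 * t ** 2)
h = np.exp(1j * phase_true)

f_inst = instantaneous_frequency(t, h)
print("f at start: %.1f Hz, f at end: %.1f Hz" % (f_inst[0], f_inst[-1]))
```

Phase unwrapping only works if the signal is sampled finely enough that the phase changes by less than π between samples, which holds for the sampling chosen here.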


PostCactus – Gallery

- Cactus GW data → PostCactus → Matplotlib
- Spectrum using PostCactus GW utilities

[Figure: effective strain spectrum h̃_eff versus frequency f [kHz], with a second panel showing retarded time (t − r) [ms] versus f [kHz]]


PostCactus – Gallery

- Cactus 2D + BH data → PostCactus → Matplotlib
- 2D horizon cuts done by PostCactus

[Figure: panels in the x-y plane (axes x [km], y [km]) at t = 3.0 ms, t = 8.9 ms and t = 15.0 ms, color scale log10(ρ [g/cm³])]


PostCactus – Gallery

- Cactus 3D data → PostCactus → VTK
- Fieldline integration and selection by custom Python code


PostCactus – Gallery

Cactus 3D data → PostCactus → VTK → Blender


PostCactus – Utilities

- simsync: transfers specified variables of a simulation
- pardiff: parses two parameter files and prints differences
- simrep: automated generation of HTML reports for runs
- simvideo: make movies
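
To illustrate the idea behind a parameter file comparison, here is a minimal sketch. It is not the pardiff implementation. It assumes simple single-line assignments of the form `Thorn::parameter = value` with '#' comments, and does not handle multi-line strings or '#' characters inside quoted values. The function names and the two example files are made up.

```python
import re

# Matches lines such as:  Thorn::parameter = value
ASSIGNMENT = re.compile(
    r"^\s*([A-Za-z_]\w*(?:::\w+(?:\[\d+\])?)?)\s*=\s*(.*?)\s*$")

def parse_parfile(text):
    """Parse single-line 'name = value' assignments, ignoring '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]
        match = ASSIGNMENT.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params

def diff_params(a, b):
    """Return sorted (name, value_a, value_b) tuples for differing parameters."""
    names = sorted(set(a) | set(b))
    return [(n, a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)]

# Two hypothetical parameter files with made-up values.
par1 = """
# parameter file A
Cactus::cctk_final_time = 1000
Carpet::max_refinement_levels = 6
"""
par2 = """
Cactus::cctk_final_time = 2000   # longer run
Carpet::max_refinement_levels = 6
IO::out_dir = "data"
"""
differences = diff_params(parse_parfile(par1), parse_parfile(par2))
for name, value_a, value_b in differences:
    print("%s: %s -> %s" % (name, value_a, value_b))
```

A parameter present in only one file appears with `None` on the other side.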


SimVideo framework

- Produce movies from Cactus data
- Movies implemented as Python modules
- Each contains code to plot a single frame and load the required data
- Support for matplotlib and VTK
- Uses ffmpeg to assemble frames
- Code not parallel yet
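
The frame assembly step can be sketched as a call to ffmpeg from Python. This is a generic example, not the SimVideo code. The frame naming pattern, the output name, and the choice of H.264 encoding are assumptions, and the `libx264` encoder must be available in the installed ffmpeg build.

```python
import subprocess

def ffmpeg_command(frame_pattern, output, fps=25):
    """Build an ffmpeg call that encodes numbered PNG frames as H.264 video."""
    return [
        "ffmpeg", "-y",             # overwrite the output file if it exists
        "-framerate", str(fps),     # input frame rate
        "-i", frame_pattern,        # e.g. frames/frame_%05d.png
        "-c:v", "libx264",          # H.264 encoder
        "-pix_fmt", "yuv420p",      # widely supported pixel format
        output,
    ]

def assemble_movie(frame_pattern, output, fps=25):
    """Run ffmpeg; raises CalledProcessError if the encoding fails."""
    subprocess.check_call(ffmpeg_command(frame_pattern, output, fps))

# Only build and show the command here; no frames exist in this example.
cmd = ffmpeg_command("frames/frame_%05d.png", "movie.mp4", fps=30)
print(" ".join(cmd))
```

Keeping the command construction separate from its execution makes it easy to inspect or log the exact call before a long encoding job starts.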


SimRep framework

- Automatic generation of HTML reports from simulation data
- Modular design, easy to design own report pages
- Python-based document description language
- Can run arbitrary postprocessing scripts to get plots
- Available modules
    - Logfiles
    - Global quantities (total baryon mass, max density, lapse, ..)
    - Constraint violation
    - Performance (rudimentary, only memory and speed)
    - GW signal