Astronomical Tiled Image Compression How & Why

Post on 24-Dec-2015


Page 1:

Astronomical Tiled Image Compression: How & Why

Page 2:

Authors:

Rob Seaman, NOAO

Bill Pence, NASA/GSFC

Rick White, STScI

Mark Dickinson, NOAO

Frank Valdes, NOAO

Nelson Zárate, NOAO

Page 3:

Statement of problem

No one compression is always best

New instruments and survey programs will dwarf data sets that have come before

Observatories' data storage costs

Transport latency & bandwidth challenge not just budgets, but technology and human patience

The bottom line is data handling throughput, not static storage

Page 4:

Host level compression

Per-file gzip compression

Contents of file are opaque

Speed of compression

Speed of decompression

Size of output

Limited support for on-the-fly decompression
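
To illustrate why the contents of a gzip'ed file are opaque, here is a minimal sketch in Python. It builds a simplified FITS-style header (80-character cards padded to a 2880-byte block) and shows that a keyword can be found in the plain bytes but not in the gzip stream until the whole thing is decompressed. The helper names `fits_card` and `minimal_header` are hypothetical, and the card formatting is a simplification rather than a full FITS writer.

```python
import gzip


def fits_card(keyword, value, comment=""):
    """Format one 80-character header card (simplified layout)."""
    card = f"{keyword:<8}= {value:>20} / {comment}"
    return card[:80].ljust(80)


def minimal_header(cards):
    """Join cards, append END, and pad with spaces to a 2880-byte block."""
    text = "".join(cards) + "END".ljust(80)
    padded_len = -(-len(text) // 2880) * 2880  # round up to a full block
    return text.ljust(padded_len).encode("ascii")


header = minimal_header([
    fits_card("SIMPLE", "T", "conforms to FITS"),
    fits_card("BITPIX", 16, "bits per pixel"),
    fits_card("NAXIS", 0, "no data array"),
])

packed = gzip.compress(header)

# The keyword is searchable in the plain header ...
print(b"BITPIX" in header)
# ... but the gzip stream has to be decompressed before it can be read.
print(b"BITPIX" in packed)
print(b"BITPIX" in gzip.decompress(packed))
```

Any tool that wants a single keyword from a host-compressed file pays for decompressing the file first, which is the cost the bullets above point at.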

Page 5:

How

FITS tile compression convention

Provides a general framework

Supports any compression algorithm that can operate on multidimensional image sections

FITS headers remain readable

Access to individual FITS HDUs

Files are still FITS
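
The tiling idea can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: it uses `zlib` as a stand-in for the per-tile algorithm, keeps the compressed tiles in a Python list rather than in a FITS binary table, and the function names `compress_tiles` and `read_row` are hypothetical. The point it shows is that each tile is compressed independently, so one image section can be read back without touching the rest.

```python
import struct
import zlib


def compress_tiles(image, tile_rows=1):
    """Compress a 2-D list of 16-bit integers as independent row tiles."""
    tiles = []
    for start in range(0, len(image), tile_rows):
        rows = image[start:start + tile_rows]
        flat = [pix for row in rows for pix in row]
        raw = struct.pack(f">{len(flat)}h", *flat)  # big-endian 16-bit
        tiles.append(zlib.compress(raw))
    return tiles


def read_row(tiles, row, width, tile_rows=1):
    """Decompress only the tile that holds the requested image row."""
    tile_index, offset = divmod(row, tile_rows)
    raw = zlib.decompress(tiles[tile_index])
    flat = struct.unpack(f">{len(raw) // 2}h", raw)
    return list(flat[offset * width:(offset + 1) * width])


# A small synthetic 32 x 64 image, split into tiles of 4 rows each.
image = [[(x * y) % 100 for x in range(64)] for y in range(32)]
tiles = compress_tiles(image, tile_rows=4)

print(len(tiles))                                             # number of tiles
print(read_row(tiles, 10, width=64, tile_rows=4) == image[10])
```

Because the tiles are independent, the choice of algorithm per tile is a detail of the packing step; the reader only needs to know which tile holds the pixels it wants.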

Page 6:

Limitations

Only partially supported by IRAF

Supported by CFITSIO, but caveats:

Not idempotent, even a losslessly compressed file would suffer keyword changes

Original convention covered only per-HDU issues, e.g., compressing a SIF produced same binary table as MEF original

Only application was the limited imcopy example program

Unsupported algorithms

Page 7:

Improvements

fpack compression tool

Compress images in-place

Multi-image archives for efficiency

Idempotent

Supports FITS Checksum

Applications layered on CFITSIO access compressed files and file archives transparently

Support for Hcompress

General purpose option for adaptively scaling input data.

Page 8:

fpack / funpack

fpack, a FITS tile-compression engine. Version 0.8.2 (25 September 2006)

usage: fpack [-r|-p|-g|-h] [-w|-t <axes>] [-n <bits>] [-v] [-Etc] <FITS>

Flags must appear (separately) before filenames:
    -r          Rice compression [default], or
    -p          PLIO compression, or
    -g          GZIP (per-tile) compression
    -h          Hcompress compression
    -w          override tile size to be whole image, or
    -t <axes>   comma separated list of tile sizes [default=row]
    -n <bits>   noise bits to preserve for real pixels [default=4]
    -v          verbose
    -F          clobber output [default overwrites input in-place]
    -K          keep (don't delete, overwrite or change) input files
    -A <file>   write (append or clobber) output to single file, or
    -P <pre>    prepend <pre> to create separate output filenames
    -L          list and validate contents, files unchanged
    -H          print this message
    -V          print version number
    <FITS>      FITS files or extensions to pack
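
As a usage illustration, the sketch below assembles an fpack command line in Python from the flags listed in the usage message above. It only builds and prints the argument list; it does not run fpack. The function `fpack_command` and the file name `image.fits` are hypothetical, and the flags are taken from the version 0.8.2 message shown here, so other releases may differ.

```python
import shlex

# Algorithm flags as listed in the usage message for version 0.8.2.
ALGORITHM_FLAGS = {"rice": "-r", "plio": "-p", "gzip": "-g", "hcompress": "-h"}


def fpack_command(files, algorithm="rice", tile=None, whole_image=False,
                  noise_bits=None, verbose=False):
    """Build an fpack argument list; each flag is separate and precedes files."""
    if tile and whole_image:
        raise ValueError("-w and -t are alternatives; choose one")
    args = ["fpack", ALGORITHM_FLAGS[algorithm]]
    if whole_image:
        args.append("-w")
    if tile:
        args += ["-t", ",".join(str(n) for n in tile)]
    if noise_bits is not None:
        args += ["-n", str(noise_bits)]
    if verbose:
        args.append("-v")
    return args + list(files)


cmd = fpack_command(["image.fits"], algorithm="hcompress", tile=(256, 256))
print(shlex.join(cmd))  # fpack -h -t 256,256 image.fits
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` later without shell quoting problems.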

Page 9:

… & Why

Preserve the scientific integrity of processed astronomical data sets

Native integer data products permit lossless compression techniques for neutral effect, or

May benefit from lossy compression for high compression factors

Processing, pipeline or hands-on, often creates floating point

Choose lossy compression, or

Scale data into integers
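
The "scale data into integers" option can be sketched as a linear mapping onto 16-bit integers, using the FITS convention that the physical value equals BZERO + BSCALE × stored value. This is a minimal sketch under an assumption: it spreads the full data range across the 16-bit range, which is only one simple way to pick the scale and is not necessarily the scheme the authors use. The function names are hypothetical.

```python
def scale_to_int16(pixels):
    """Map floats onto signed 16-bit integers with a linear scale."""
    lo, hi = min(pixels), max(pixels)
    bscale = (hi - lo) / 65535 or 1.0   # avoid a zero scale for flat data
    bzero = lo + 32768 * bscale         # lo maps to -32768, hi to 32767
    ints = []
    for p in pixels:
        i = round((p - bzero) / bscale)
        ints.append(max(-32768, min(32767, i)))  # clamp against rounding
    return ints, bscale, bzero


def restore(ints, bscale, bzero):
    """Invert the scaling: physical = BZERO + BSCALE * stored."""
    return [bzero + bscale * i for i in ints]


pixels = [100.25, 101.5, 99.75, 250.0, 100.0]
ints, bscale, bzero = scale_to_int16(pixels)
back = restore(ints, bscale, bzero)

# The round trip is lossy, but the error per pixel is at most half a step.
worst = max(abs(a - b) for a, b in zip(pixels, back))
print(worst <= bscale / 2 + 1e-9)
```

Once the pixels are integers, a lossless integer coder can be applied on top, so the only loss is the quantization step chosen here.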

Page 10:

Compression statistics

Additional cost for gzip’ed floating point output from pipeline is $2.86 per image versus Rice compressed integers.

Page 11:

Benefits

Reduced:

Disk space

Bandwidth

Latency

Remove need to decompress

Pack multiple files for efficient transport

Headers remain readable

Individual HDUs are accessible

Choice of algorithm isn’t fixed

Page 12:

DMS architecture

Benefits NSA, NHPP, NVO portal

No need for ASCII header files

Smaller footprint

Faster replication

Files remain FITS throughout

Extends upstream into domes

Extends downstream to users

Compression can be free or better than free