astrophysical surveys: visualization/data managements
DESCRIPTION
Astrophysical Surveys: Visualization/Data Managements. Peter Nugent (LBNL/UCB). “Current” Optical Surveys. Photometric: Palomar Transient Factory, La Silla Supernova Search, SkyMapper, PanSTARRS. Spectroscopic: SDSS III (BOSS, SEGUE-II, APOGEE, MARVELS).
TRANSCRIPT
Astrophysical Surveys: Visualization/Data Managements
Peter Nugent (LBNL/UCB)
INT Exascale Workshop
“Current” Optical Surveys
Photometric:
Palomar Transient Factory
La Silla Supernova Search
SkyMapper
PanSTARRS
Spectroscopic:
SDSS III (BOSS, SEGUE-II, APOGEE, MARVELS)
All of these surveys span astrophysics from planets to cosmology, from the static to the transient universe.
PTF: Competition
The competition was two wide-field multi-color surveys with cadences that were either unpredictable (SkyMapper) or ranged from days to weeks (PanSTARRS) in a given filter.
How could we do something better/different?
- Start quickly - P48” coupled with the CFHT12k camera
- Don’t do multiple colors
- Explore the temporal domains in unique ways
- Take full advantage of the big-iron at Super-Computing Centers
- Get all the science we possibly can out of this program
Thus we need the capability of providing immediate follow-up of unique transients, using 4 to 10-m class telescopes.
Transient Phase-Space
PTF Science
PTF Key Projects
Various SNe                     Dwarf novae
Transients in nearby galaxies   Core collapse SNe
RR Lyrae                        Solar system objects
CVs                             AGN
AM CVn                          Blazars
Galactic dynamics               LIGO & Neutrino transients
Flare stars                     Hostless transients
Nearby star kinematics          Orphan GRB afterglows
Rotation in clusters            Eclipsing stars and planets
Tidal events                    H-alpha sky-survey
The power of PTF resides in its diverse science goals and follow-up.
PTF (2009-2013)
92 Mpixels, 1” resolution, 60s exposures - 128MB data per minute
2 cadences, SN & Dynamic, with g in dark time & R in bright time
PTF Science
The power of PTF resides in its diverse science goals and follow-up.
And so does the cost…these resources are an order of magnitude more expensive than the survey itself.
PTF Pipeline
PTF Subtractions
4096 X 2048 CCD images - over 3000 per night
[Figure panels: New, Ref, Sub; label: moon]
Pipeline
[Diagram components: Observatory; Data Transfer Nodes; NERSC Global Filesystem (170 TB); Processing/db and Subtractions on Carver (3200 core IBM); Science Gateway Nodes 1 and 2; PTF Collaboration via Web]
Data rates shown: 128 MB/90s (50 GB/night); 512 MB/90s (200 GB/night)
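The nightly totals in the pipeline diagram follow from the per-exposure volume and the 90 s cadence. A minimal sketch of that arithmetic, assuming about 10 hours of observing per night (the night length is an assumption, not a figure from the slides) and decimal units (1 GB = 1000 MB):

```python
def nightly_volume_gb(mb_per_exposure, cadence_s, night_hours=10.0):
    """Estimate one night's data volume in GB (1 GB = 1000 MB).

    night_hours is an assumed observing window, not a number from the talk.
    """
    exposures = night_hours * 3600.0 / cadence_s
    return mb_per_exposure * exposures / 1000.0


if __name__ == "__main__":
    # The two per-exposure volumes quoted in the diagram.
    for mb in (128, 512):
        print(f"{mb} MB/90 s -> {nightly_volume_gb(mb, 90):.1f} GB/night")
```

Under these assumptions the two rates give about 51 GB and 205 GB per night, in line with the rounded 50 GB/night and 200 GB/night on the slide.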
PTF Database
              R-band   g-band
images        1.07M    166k
subtractions  810k     50k
references    18.6k    3.0k
Candidates    487M     22.2M
Transients    29117    1670*
The db selection (psql) and schema were based on getting the fastest and cheapest solution to last the length of the survey and serve the science needs.
PTF Sky Coverage
[Figure: sky coverage map; scale ticks at 0, 10, 100, 1000]
To date:
• 1200 Spectroscopically typed supernovae
• 105 Galactic Transients
• 104 Transients in M31
PTF: Real or Bogus
PTF produces 1 million candidates during a typical night:
• Most of these are not real
  - Image artifacts
  - Misalignment of images due to poor sky conditions
  - Image saturation from bright stars
• 50k are asteroids
• 1-2k are variable stars
• 100 supernovae
• 3-4 new, young supernovae or other explosions
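The nightly candidate budget above can be turned into rough odds. This is only a sketch of the arithmetic: the quoted ranges are replaced by their midpoints, and the 3-4 young explosions are treated as a subset of the real detections, both of which are assumptions.

```python
# Nightly counts quoted on the slide; ranges are replaced by midpoints (an assumption).
NIGHTLY_CANDIDATES = 1_000_000
REAL_COUNTS = {
    "asteroids": 50_000,
    "variable stars": 1_500,  # slide says 1-2k
    "supernovae": 100,
}
YOUNG_EXPLOSIONS = 3.5  # slide says 3-4


def real_fraction(total, real_counts):
    """Fraction of nightly candidates that are real astrophysical objects."""
    return sum(real_counts.values()) / total


def one_in(total, count):
    """Express a count as '1 in N' odds among all candidates."""
    return total / count


if __name__ == "__main__":
    frac = real_fraction(NIGHTLY_CANDIDATES, REAL_COUNTS)
    odds = one_in(NIGHTLY_CANDIDATES, YOUNG_EXPLOSIONS)
    print(f"real fraction: {frac:.2%}")
    print(f"young explosions: about 1 in {odds:,.0f} candidates")
```

With these midpoints roughly 5% of candidates are real, and a new young explosion is about one candidate in a few hundred thousand, which is why automated vetting is needed.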
Real or Bogus
4096 X 2048 CCD images - over 3000 per night
[Figure label: moon]
Real or Bogus
230 bogus candidates, 2 variable stars, 4 asteroids and the youngest Type Ia supernova observed to date.
PTF10ygu: Caught 2 days after explosion
Real or Bogus
Scanners helped vet over 1000 candidates, and weights/biases were determined for each scanner.
Real or Bogus
There is a natural cut at a value of 0.1, but we go to 2 w/ 0.07 to be safe.
Users…
Citizen Scientists…
http://supernova.galaxyzoo.org is now up and running!
A beta version appeared last year to support the SN Ia program in PTF and a WHT spectroscopy run. I spent a week with the folks at Oxford setting up the db and giving them training sets of good and bad candidates. They did the rest…
1200 members of Galaxy Zoo screened all the candidates between Aug 1 and Aug 12 in 3 hrs. The top 50 hits were all SNe/variable stars and they found 3 before we did. They scanned ~25,000 objects - 3 objects/min. They now do ~200 nightly and we have 15,000 users.
Robot
A robot (built by Josh Bloom at UCB) queries the db every 20 min, compares new transients with archival information to ascertain their likely nature, and publishes the resulting classifications to the collaboration.
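One polling cycle of such a robot might look like the sketch below. Only the 20-minute period comes from the slide. The row layout, the name-keyed archive (standing in for a real positional catalog lookup) and the labeling rules are hypothetical illustrations, not the UCB implementation.

```python
from datetime import datetime, timedelta

POLL_INTERVAL = timedelta(minutes=20)  # polling period quoted on the slide


def new_transients(rows, last_poll):
    """Return rows loaded into the db after the previous poll."""
    return [r for r in rows if r["loaded_at"] > last_poll]


def likely_nature(archival):
    """Toy rules mapping an archival match to a label (hypothetical rules)."""
    if archival is None:
        return "unknown"
    kind = archival.get("type")
    if kind == "qso":
        return "AGN"
    if kind == "star":
        return "varstar"
    if kind == "galaxy":
        return "SN candidate"
    return "unknown"


def poll_once(rows, archive, last_poll):
    """One polling cycle: label everything loaded since last_poll."""
    return {
        r["name"]: likely_nature(archive.get(r["name"]))
        for r in new_transients(rows, last_poll)
    }


if __name__ == "__main__":
    now = datetime(2010, 1, 1, 8, 0)
    rows = [
        {"name": "cand-A", "loaded_at": now - timedelta(minutes=5)},
        {"name": "cand-B", "loaded_at": now - timedelta(minutes=45)},
    ]
    archive = {"cand-A": {"type": "galaxy"}}
    print(poll_once(rows, archive, now - POLL_INTERVAL))
```

Only `cand-A` is newer than the previous poll, so it is the only row labeled in this cycle.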
Robot
Complications to traditional methods include varying uncertainties in data, non-structured temporal sequence (bad weather, etc.), differing levels of historical information (in SDSS or not, known host in NED, etc.)
And this is just for stars…we also have ones for SNe, AGN…
Turn-around
The scanning is handled in three ways:
(1) Individuals can look through anything they want and save things to the PTF database
(2) SN Zoo
(3) A UCB machine learning algorithm is applied to all candidates, and reports are generated on the best targets and what they are likely to be (SN, AGN, varstar) by comparison to extant catalogs as well as the PTF reference catalog. These come out ~15 min after a group of subtractions is loaded into the database.
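The comparison to extant catalogs in item (3) rests on positional cross-matching. A minimal sketch using the haversine formula for angular separation follows; the 2 arcsecond match radius, the dictionary layout and the catalog labels are assumptions for illustration, not values from the talk.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcseconds between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to guard against rounding just above 1 for antipodal points.
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a)))) * 3600.0


def cross_match(candidate, catalog, radius_arcsec=2.0):
    """Return the label of the nearest catalog entry within the radius, else None."""
    best_label, best_sep = None, radius_arcsec
    for entry in catalog:
        sep = separation_arcsec(candidate["ra"], candidate["dec"],
                                entry["ra"], entry["dec"])
        if sep <= best_sep:
            best_label, best_sep = entry["label"], sep
    return best_label


if __name__ == "__main__":
    catalog = [
        {"ra": 10.0, "dec": 0.0, "label": "AGN"},
        {"ra": 10.01, "dec": 0.0, "label": "varstar"},
    ]
    candidate = {"ra": 10.0, "dec": 1.0 / 3600.0}  # 1 arcsec north of the first entry
    print(cross_match(candidate, catalog))
```

A linear scan like this is fine for a handful of candidates. At survey scale one would expect a spatial index in the database instead, which is a design choice outside what the slides specify.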
On June 3, 2010 we were able to photometrically screen 4 SN candidates with the Palomar 60” telescope in g, r and i-band (50% of the time on P60 is devoted to this) within 2.5 hrs of discovery on the Palomar Schmidt and take spectra of them at Keck the same night. Now a nightly occurrence.
Robot - 10vdl
Discovery and follow-up of PTF 10vdl, a SN II.
PTF Totals
Transients = 1220
Papers = 20
In addition to these we have followed 2 triggers from IceCube and one from LIGO.
We estimate that at the end of the survey we will have 40B detections in the individual images and 40B detections in the deep co-additions.
Historical Data
The DeepSky program was started in response to the needs of several astrophysics projects hosted at NERSC. The result of this project is an all-sky digital image based upon the point-and-stare observations taken via the Palomar-QUEST Consortium and the SN Factory & Near Earth Asteroid Team. This data (over 9 million CCD images) spans 7 years and almost 20,000 square degrees, with typically 10-100 pointings on a particular part of the sky.
Historical Data (Own)
Deep co-additions of many images go into making a reference image. Requires HPC to handle the sheer volume of data and the solution to large linear systems to normalize the images to each other before the co-additions. Science can be had from either the reference or the individual images that went into it.
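Normalizing images to each other can be posed as a least-squares problem: each measured magnitude is modeled as a true star magnitude plus a per-image offset. The toy sketch below solves for the offsets by alternating averages instead of the direct large sparse solve an HPC pipeline would use, so it illustrates the idea and is not the DeepSky or PTF method. The function name and data layout are hypothetical.

```python
def solve_zero_points(observations, n_images, n_stars, n_iter=100):
    """Least-squares photometric offsets per image, with image 0 as the reference.

    observations: list of (image_index, star_index, measured_magnitude),
    modeled as measured = true_star_magnitude + image_offset.
    """
    offsets = [0.0] * n_images
    for _ in range(n_iter):
        # Step 1: best star magnitudes given the current offsets.
        star_sum = [0.0] * n_stars
        star_cnt = [0] * n_stars
        for i, j, mag in observations:
            star_sum[j] += mag - offsets[i]
            star_cnt[j] += 1
        stars = [s / c if c else 0.0 for s, c in zip(star_sum, star_cnt)]
        # Step 2: best offsets given those star magnitudes.
        img_sum = [0.0] * n_images
        img_cnt = [0] * n_images
        for i, j, mag in observations:
            img_sum[i] += mag - stars[j]
            img_cnt[i] += 1
        offsets = [s / c if c else 0.0 for s, c in zip(img_sum, img_cnt)]
        # Remove the overall degeneracy by pinning image 0 to zero.
        reference = offsets[0]
        offsets = [z - reference for z in offsets]
    return offsets


if __name__ == "__main__":
    true_offsets = [0.0, 0.3, -0.2]
    true_stars = [15.0, 16.0, 17.0, 18.0]
    visible = {0: [0, 1, 2], 1: [1, 2, 3], 2: [0, 2, 3]}  # stars seen per image
    obs = [(i, j, true_stars[j] + true_offsets[i])
           for i, js in visible.items() for j in js]
    print(solve_zero_points(obs, n_images=3, n_stars=4))
```

Once the offsets are known, each image can be shifted onto the common scale before the pixels are co-added. The overlap between images is what ties the system together, so images that share no stars with the rest cannot be normalized this way.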
SNe Ia
810 SNe Ia discovered to date.
The cosmology dataset (total of 153) has been followed with a ~2-3 day cadence in gri with LT and FTN/FTS, and with PTF in R/g-band.
PTF Totals
Near Future
Next Generation Transient Survey (aka PTF-II)
- Upgrade to 5X PTF: 36 sq. deg. (~ 1 billion pixels)
- Would like to explore the sky on 100s timescales
- Turnaround in 10-20 minutes with list of new candidates
- Ingest SDSS, BOSS, NED, etc. catalogs to refine our understanding of these candidates in real-time
- Able to handle Advanced LIGO, neutrino detectors, etc.
Bottlenecks…
[Diagram components: Observatory; Data Transfer Nodes; NERSC Global Filesystem (240 TB); Processing/db and Subtractions on Carver; Science Gateway Nodes 1 and 2; PTF Classification]
Link rates shown: 2.5 MB/s; 1 GB/s; 12 MB/s; 4 MB/s (crude); 0.5 MB/s (full)
Bottlenecks…crude vs. real
[Figure: brightness vs. time; annotation: 5- data in db]
Heavy Random I/O
The SC09 Storage Challenge allowed us to couple the SDSS db and the PTF candidate db to ask the question: which objects that we think are quasars in the static SDSS data vary like one in the PTF data? PTF db is now 250GB and growing nightly…
Heavy Random I/O + analytics
Aster Data provides a parallel db solution that also allows us to embed many of our machine learning algorithms. Already handle PB datasets.
Likely will couple both solutions (Aster + SSD).
Conclusions - Future
LSST - 15TB data/night
Only one 30-m telescope
Conclusions - Future
Scaling issues:
• Much larger volume of data (100X)
• Much larger community (100X)
• Greater science scope, not just transients
• Much larger set of historical resources one would wish to query against
• Follow-up resources (All the 8-10-meter & the 30-meter telescopes) much more expensive