Gaia, next frontier in Astronomy
Jose Hernandez
Gaia Data and Calibration Engineer
European Space Astronomy Centre (ESAC)
Madrid, Spain
Big Data in The Scientific World
• IT has evolved at an impressive rate over the last decades
• In parallel, and partly driven by this, the IT resources needed have grown at a similar pace
• When Gaia was being designed it was not clear whether IT could cope with the data processing needs…
Thinking in Big Terms
• Astronomy data is also growing at an incredible rate:
– More and better instruments
– In space missions, more bandwidth and on-board storage
– Better IT for data processing and analysis
• Lots of data waiting out there: there are more stars in the Universe than grains of sand on all the beaches on Earth together:
– Our Galaxy: 100,000,000,000 stars
– Universe: 100,000,000,000 galaxies
– 10²² stars: would need 1 yottabyte to store the positions
– Yet there are more molecules in a glass of water…
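
The yottabyte figure can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted on this slide and the decimal definition of a yottabyte (10²⁴ bytes); the variable names are illustrative.

```python
# Back-of-the-envelope check of the star count and storage figures on the slide.
STARS_PER_GALAXY = 10**11      # "Our Galaxy" figure from the slide
GALAXIES_IN_UNIVERSE = 10**11  # "Universe" figure from the slide
YOTTABYTE = 10**24             # bytes, decimal (SI) definition

total_stars = STARS_PER_GALAXY * GALAXIES_IN_UNIVERSE

# Storage budget per star implied by "1 yottabyte to store the positions".
bytes_per_star = YOTTABYTE // total_stars

print(f"stars in the Universe: 10^{len(str(total_stars)) - 1}")
print(f"implied bytes per star: {bytes_per_star}")
```

One yottabyte therefore corresponds to a budget of 100 bytes per star, roughly a dozen 8-byte numbers each.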
• Gaia is the next astrometry mission of the European Space Agency
• Will be launched towards the end of 2013
• Main objectives:
– Build a 3D map of the Galaxy
– History and evolution of the Milky Way
– Stellar astrophysics
– Multiple systems, exoplanets
– Solar System asteroids
– General Relativity
Vault of Heaven
• We have stereoscopic vision: each eye perceives a flat image and our brain builds a 3D image
• This doesn't work for the stars because they are very far away
• But we can compare the images taken from two opposite points of Earth's orbit
• We need to measure extremely small angles
• It took a long time (until 1838) to learn the distance to the nearest stars
• ESA's Hipparcos satellite catalogued 120,000 stars
• Gaia will be a giant leap in the field:
– 1000 million stars and 100 times more precise
– 1 gigapixel camera
– Will observe the stars from different directions and at different times
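
Comparing images from two opposite points of the orbit is the parallax method. As a minimal sketch (not Gaia's processing code): if the parallax angle p is taken as half the apparent shift between the two viewpoints and expressed in arcseconds, the distance in parsecs is 1/p. The function name, the sample angles, and the rounded light-year conversion are illustrative choices.

```python
LIGHT_YEARS_PER_PARSEC = 3.2616  # approximate conversion factor

def distance_parsec(parallax_arcsec: float) -> float:
    """Distance in parsecs from a parallax angle in arcseconds (small-angle approximation)."""
    if parallax_arcsec <= 0:
        raise ValueError("parallax must be positive")
    return 1.0 / parallax_arcsec

# Illustrative angles only, not measurements of real stars.
for p in (1.0, 0.1, 0.001):
    d = distance_parsec(p)
    print(f"p = {p} arcsec -> {d:.0f} pc (about {d * LIGHT_YEARS_PER_PARSEC:.0f} light-years)")
```

The smaller the angle, the larger the distance, which is why reaching more distant stars requires measuring ever smaller angles.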
One Giga Pixel Camera
[Figure: focal plane layout with the direction of star image motion marked. Labelled CCD groups: Sky Mapper CCDs, Astrometric Field CCDs, Blue Photometer CCDs, Red Photometer CCDs, Radial Velocity Spectrometer CCDs. Dimension labels: 104.26 cm and 42.35 cm.]
106 CCDs, 938 million pixels, 2800 cm²
• Two telescopes with a 35 m focal length
• 10 mirrors
• Common focal plane
• Prisms in front of the CCDs to determine the colors
• A spectrometer to measure the radial velocity of the brightest stars
• Not all of the CCD image is read and sent to Earth
• The onboard Video Processing Units read the areas containing the star images
• Data is stored compressed in the onboard memory
• Data is sent to ground at night (50 GB per day)
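
The idea of reading only the areas that contain star images can be illustrated with a toy example. This is a hypothetical sketch, not the on-board algorithm: the threshold detection, the square window, the image size, and the function names are all assumptions chosen for illustration.

```python
# Toy illustration of windowed readout: keep only small windows around bright pixels.
# Hypothetical sketch; the real on-board detection and window shapes are not described in the slides.

def find_bright_pixels(image, threshold):
    """Return (row, col) of every pixel whose value exceeds the threshold."""
    return [(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value > threshold]

def cut_window(image, row, col, half_size):
    """Return the square window centred on (row, col), clipped at the image edges."""
    top = max(row - half_size, 0)
    left = max(col - half_size, 0)
    bottom = min(row + half_size + 1, len(image))
    right = min(col + half_size + 1, len(image[0]))
    return [line[left:right] for line in image[top:bottom]]

# 6 x 8 image of flat background (value 1) with two made-up "stars".
image = [[1] * 8 for _ in range(6)]
image[2][3] = 90
image[4][6] = 70

windows = [cut_window(image, r, c, half_size=1)
           for r, c in find_bright_pixels(image, threshold=50)]

pixels_kept = sum(len(w) * len(w[0]) for w in windows)
pixels_total = len(image) * len(image[0])
print(f"kept {pixels_kept} of {pixels_total} pixels in {len(windows)} windows")
```

Only the pixels inside the windows would need to be stored and downlinked; the empty background is discarded.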
Gaia Data Processing
• DPAC is a European consortium in charge of the data processing
• 440 members
• 75 European institutions
• Six Data Processing Centres:
– Barcelona
– Cambridge
– Geneva
– Madrid (ESAC)
– Toulouse
– Torino
• Multidisciplinary teams:
– Astrometry experts
– IT engineers
– Calibration engineers
– Mathematicians, statisticians
– Coordinators
– Computers, email, Skype, meetings…
Lennart Lindegren, Lund (Sweden)
Java Workshop, Toulouse
Mare Nostrum, Barcelona
Data Processing
• Very complex due to the large amounts of data and the precision needed
• As daily data arrives, preliminary processing is run to verify the performance
• Once we have data for the whole sky we start the global data processing
• The process will then be iterated, including more data and more centres
• By mid-mission there will be an intermediate catalogue
• Final catalogue towards 2021
After Gaia
• Europe will continue to lead astrometry
• Biggest camera ever flown into space
• Reference catalogue in astronomy for the decades to come
• Astronomy will be different after Gaia
• But surely the best discovery will be something that we cannot imagine today
Data Management System
• On the ESA part of the processing we have two main demands:
– Processing of the daily data received from the satellite
– Global processing of the data accumulated (over 6 months, 1 year, 2 years, …) in an iterative manner
• Need to select data using a few configurable patterns
• Daily processing has more stringent robustness requirements
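
To get a feel for the volumes behind the accumulation periods listed above, the sketch below multiplies the 50 GB per day downlink figure quoted earlier in the talk by the length of each period. It assumes a constant daily rate, 365-day years, and decimal units, and it counts only the compressed data received from the satellite, not the processed products derived from it.

```python
GB_PER_DAY = 50  # downlink volume quoted earlier in the talk

def accumulated_tb(days: float, gb_per_day: float = GB_PER_DAY) -> float:
    """Accumulated volume in terabytes (1 TB = 1000 GB) at a constant daily rate."""
    return days * gb_per_day / 1000

for label, days in (("6 months", 365 / 2), ("1 year", 365), ("2 years", 2 * 365)):
    print(f"{label}: about {accumulated_tb(days):.1f} TB received")
```

Each iteration of the global processing has more accumulated data to work through than the previous one.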
Data Management System
• For the last 10 years we have been testing with an increasing amount of data
• Systems are all in Java, using Beowulf clusters and NetApp storage
• One of the critical issues compromising scalability is the I/O
• Daily processing needs to be finished on time
• Global processing needs to read large amounts of data (50 TB) many times
Data Management System
• Since 2008 we have had a fruitful partnership with InterSystems
• A very good symbiosis
• Developed prototypes and brought together the InterSystems/NetApp/ESA experts
• Work very closely with InterSystems engineers: fast turnaround in implementing new features, problem solving, …
• Now all our systems in production have migrated to Caché