Automated Classification of HETDEX Spectra
Ted von Hippel (U Texas, Siena, ERAU)
Penn State HETDEX Meeting May 19-20, 2011
Outline
• HETDEX, with stable instrumentation and lots of data, is ideally suited to automated classifiers
• ANNs as classifiers
• examples: stars and planets
• how ANNs might fit into the HETDEX pipeline and what they might do for the project
Example: Stellar Classification
• spectra span a wide range of patterns within three physical dimensions (temperature, pressure, heavy element abundance).
• have many thousands of new spectra and want to quickly determine their classification (interpolated position in parameter space) – goals are statistics, rare types, anomalies
[Figure: spectra laid out in parameter space; axes: parameter 1, parameter 2]
inverse problem: recover location in parameter space from observed spectra?
[Figure: spectra laid out in parameter space; axes: parameter 1, parameter 2]
reduced wavelength range, reduced spectral resolution, decreased signal-to-noise
more difficult inverse problem
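The three degradations named on this slide (narrower wavelength range, lower spectral resolution, lower signal-to-noise) can be sketched as below. This is a minimal illustration, not the procedure used in the talk: the function name `degrade_spectrum`, the Gaussian smoothing kernel, the noise model, and all numerical values are assumptions chosen for the example.

```python
import numpy as np

def degrade_spectrum(wave, flux, wave_min, wave_max, kernel_pix, snr, rng):
    """Crop the wavelength range, smooth to lower resolution, and add noise.

    All parameters are hypothetical knobs for this sketch.
    """
    # 1. Reduced wavelength range: keep only pixels inside [wave_min, wave_max]
    keep = (wave >= wave_min) & (wave <= wave_max)
    w, f = wave[keep], flux[keep]

    # 2. Reduced spectral resolution: convolve with a Gaussian of sigma = kernel_pix pixels
    half = int(4 * kernel_pix)
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / kernel_pix) ** 2)
    kernel /= kernel.sum()
    smooth = np.convolve(f, kernel, mode="same")

    # 3. Decreased signal-to-noise: add Gaussian noise scaled to the median flux
    sigma = np.median(np.abs(smooth)) / snr
    noisy = smooth + rng.normal(0.0, sigma, size=smooth.shape)
    return w, noisy

# Toy spectrum: flat continuum with one absorption line (illustrative values only)
wave = np.linspace(3500.0, 5500.0, 2001)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 4340.0) / 5.0) ** 2)

rng = np.random.default_rng(0)
w, noisy = degrade_spectrum(wave, flux, 4000.0, 5000.0, kernel_pix=3.0, snr=20.0, rng=rng)
```

Running a classifier on spectra degraded to different `kernel_pix` and `snr` values is one way to measure how quickly the inverse problem becomes harder.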
How to classify?
• classical, Chi-by-eye approach?
• cross correlation?
• Artificial Neural Networks (ANN)
Artificial Neural Networks
• embed expertise without being an expert
• multi-dimensional interpolator
• which data properties correlate with which classification parameters?
• uses entire spectral range of input data, unbiased by preconceived notions of utility
• best fit / global minimum?
• can be Bayesian classifier
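[Figure: spectra training (goal). Network diagram with an input layer (nodes 1, 2, 3, ..., n-2, n-1, n, plus a bias node), hidden layer(s), and an output layer; connections carry weights w1 ... wn; node activation f = (1 + e^(-∑ws))^(-1); target outputs: 0.000, 1.000, 0.000, 0.000, 0.000]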
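The node activation in the diagram, f = (1 + e^(-∑ws))^(-1), is the logistic sigmoid applied to a weighted sum of the inputs s. A forward pass through one hidden layer can be sketched as follows. The layer sizes, the random weights, and the names `forward`, `w_hidden`, and `w_out` are assumptions for illustration; the talk does not specify the network dimensions.

```python
import numpy as np

def sigmoid(x):
    # f = (1 + e^(-x))^(-1), the activation shown on the slide
    return 1.0 / (1.0 + np.exp(-x))

def forward(spectrum, w_hidden, w_out):
    """Propagate one spectrum through input -> hidden -> output layers."""
    s = np.append(spectrum, 1.0)      # append the bias node to the input layer
    hidden = sigmoid(w_hidden @ s)    # weighted sum over inputs, then sigmoid
    h = np.append(hidden, 1.0)        # append the bias node to the hidden layer
    return sigmoid(w_out @ h)         # one activation per output class

# Hypothetical sizes: 50 spectral pixels, 8 hidden nodes, 5 output classes
rng = np.random.default_rng(1)
n_pix, n_hidden, n_class = 50, 8, 5
w_hidden = rng.normal(0.0, 0.1, size=(n_hidden, n_pix + 1))
w_out = rng.normal(0.0, 0.1, size=(n_class, n_hidden + 1))

spectrum = rng.random(n_pix)          # stand-in for a normalized spectrum
outputs = forward(spectrum, w_hidden, w_out)
```

With untrained random weights the five outputs carry no meaning; training adjusts the weights so the outputs approach the target vector.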
[Figure: spectra training (1st iteration). Same network diagram; outputs after the first iteration: 0.211, 0.018, 0.411, 0.301, 0.077]
[Figure: spectra classification (nth iteration). Same network diagram; outputs after the nth iteration: 0.003, 0.807, 0.101, 0.054, 0.008]
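The progression across these three diagrams (a target vector, scattered outputs after the first iteration, outputs close to the target after many iterations) is what iterative weight adjustment produces. A minimal sketch using plain backpropagation with a mean-squared-error loss is given below. The training algorithm, the toy spectra, the network size, and the learning rate are all assumptions for illustration; the slides do not state which training scheme was used.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_pix, n_hidden, n_class, n_per_class = 40, 6, 5, 20
pix = np.arange(n_pix)

# Toy "spectra": each class has one absorption line at a different pixel.
# The continuum is subtracted so the inputs are small numbers.
centers = np.linspace(5, 35, n_class)
X, T = [], []
for k, c in enumerate(centers):
    for _ in range(n_per_class):
        line = -0.8 * np.exp(-0.5 * ((pix - c) / 2.0) ** 2)
        X.append(line + rng.normal(0.0, 0.05, n_pix))
        target = np.zeros(n_class)
        target[k] = 1.0               # target vector such as (0, 1, 0, 0, 0)
        T.append(target)
X, T = np.array(X), np.array(T)
Xb = np.hstack([X, np.ones((len(X), 1))])   # bias column on the input layer

W1 = rng.normal(0.0, 0.1, (n_pix + 1, n_hidden))
W2 = rng.normal(0.0, 0.1, (n_hidden + 1, n_class))

def forward(Xb, W1, W2):
    H = sigmoid(Xb @ W1)
    Hb = np.hstack([H, np.ones((len(H), 1))])  # bias column on the hidden layer
    return Hb, sigmoid(Hb @ W2)

losses = []
rate = 0.5                            # hypothetical learning rate
for it in range(3000):
    Hb, Y = forward(Xb, W1, W2)
    err = Y - T
    losses.append(np.mean(err ** 2))
    # Backpropagation for squared error with sigmoid units
    # (constant factors are absorbed into the learning rate)
    d_out = err * Y * (1.0 - Y)
    d_hid = (d_out @ W2[:-1].T) * Hb[:, :-1] * (1.0 - Hb[:, :-1])
    W2 -= rate * Hb.T @ d_out / len(Xb)
    W1 -= rate * Xb.T @ d_hid / len(Xb)

_, Y_final = forward(Xb, W1, W2)
print("loss at first iteration:", losses[0])
print("loss at last iteration: ", losses[-1])
print("outputs for first training spectrum:", np.round(Y_final[0], 3))
```

Plain gradient descent can settle into a local rather than global minimum, which is the concern raised by the "best fit / global minimum?" bullet earlier in the talk.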
Example: Planetary Classification Problem
• spectra for objects (planets) that belong in a multidimensional model parameter space (abundances of a range of atmospheric gases)
• test ability to recognize spectra in this parameter space as a function of data quality
1994MNRAS.269...97V
[Figure: classification scheme. stars: T, log(g), Z; other continuum: SFR, age; emission line(s): HII, AGN; additional inputs: +morphology, +photometry]
Automated Classification Could …
• hot stars and weak-lined stars as continuum calibrators (varying throughput affects window function)
– many degrees of freedom in the instrument + unknown LAE environmental effects
– channel-to-channel: optical or electronic effects
– field-size effects: pupil efficiency (PSF, guiding drift)
– time-dependent effects: temperature drifts in instrument, gunk on optical surfaces, electronics drifts
– astrophysical effects: reddening changes over field (may be able to use nearly all stars for this)
Automated Classification Could …
• stellar science:
– statistical population studies, WD search, extremely metal-poor stars, outer halo stars, C-studies via G-band, EHB stars, very rare stellar types
• continuum galaxies
– classify by SFR/age to study as a function of redshift, clustering
• AGN:
– ANN good at digging out low S/N versions with a known recovery and contamination fraction; possibly faster than template matching
• unusual objects discovery potential
– automated classifier looks through data in real time and flags poor matches to training library, yet at good S/N
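The last point, flagging spectra that match the training library poorly despite good S/N, can be sketched as a simple cut on the classifier outputs. The function name `flag_unusual` and both thresholds are hypothetical; in practice they would have to be calibrated on real data. The output vectors are reused from the network diagrams earlier in the talk purely as example numbers.

```python
import numpy as np

def flag_unusual(outputs, snr, min_confidence=0.5, min_snr=10.0):
    """Flag spectra with no confident class even though the data quality is good.

    outputs: array of shape (n_spectra, n_classes) with ANN output activations
    snr:     array of shape (n_spectra,) with per-spectrum signal-to-noise
    Both thresholds are assumed values for this sketch.
    """
    outputs = np.asarray(outputs, dtype=float)
    snr = np.asarray(snr, dtype=float)
    best = outputs.max(axis=1)                 # strongest class activation
    return (best < min_confidence) & (snr >= min_snr)

outputs = np.array([
    [0.003, 0.807, 0.101, 0.054, 0.008],  # one clearly preferred class
    [0.211, 0.018, 0.411, 0.301, 0.077],  # no clearly preferred class
    [0.211, 0.018, 0.411, 0.301, 0.077],  # same outputs, but at low S/N
])
snr = np.array([25.0, 30.0, 3.0])

flags = flag_unusual(outputs, snr)
print(flags)  # [False  True False]
```

Only the second spectrum is flagged: the first is a confident match, and the third is ambiguous but too noisy for the ambiguity to be interesting.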