Principal Components Analysis Download
Principal Component Analysis of SDSS Stellar Spectra

We apply Principal Component Analysis (PCA) to ~100,000 stellar spectra obtained by the Sloan Digital Sky Survey (SDSS). In order to avoid strong non-linear variation of spectra with effective temperature, we bin the sample into 0.02 mag wide intervals of the g-r color (-0.20 < g-r < 0.90, roughly corresponding to MK spectral types A3 to K3) and find that in each bin the first four eigenspectra are sufficient to describe the observed spectra within the measurement noise. We make publicly available the resulting high signal-to-noise mean spectra and the other three eigenspectra. These data can be used to generate high quality spectra for an arbitrary combination of effective temperature, metallicity, and gravity within the parameter space probed by the SDSS. We analyze correlations of eigencoefficients with metallicity and gravity estimated by the Sloan Extension for Galactic Understanding and Exploration (SEGUE) Stellar Parameters Pipeline. The SDSS stellar spectroscopic database and the PCA results presented here offer a convenient method to classify new spectra, to search for unusual spectra, to train various spectral classification methods, and to synthesize accurate colors in arbitrary optical bandpasses.
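The color binning described above can be illustrated with a short Python sketch. It only encodes the numbers quoted in the abstract (0.02 mag wide bins over -0.20 < g-r < 0.90). The function name `color_bin` and the zero-based numbering are assumptions of this sketch; the bin numbers stored in master.dat may follow a different convention.

```python
import math

# Values quoted in the abstract: 0.02 mag wide bins over -0.20 < g-r < 0.90.
GR_MIN, GR_MAX, BIN_WIDTH = -0.20, 0.90, 0.02
N_BINS = round((GR_MAX - GR_MIN) / BIN_WIDTH)  # 55 bins


def color_bin(g_minus_r):
    """Return a zero-based bin index for a g-r color, or None if out of range.

    Zero-based numbering is an assumption of this sketch, not necessarily
    the convention used for the bin numbers in master.dat.
    """
    if not (GR_MIN < g_minus_r < GR_MAX):
        return None
    # The small epsilon guards against floating-point error at bin edges.
    index = math.floor((g_minus_r - GR_MIN) / BIN_WIDTH + 1e-9)
    return min(index, N_BINS - 1)
```

For example, `color_bin(0.0)` falls in the eleventh bin (index 10), and a color outside the sampled range returns `None`.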

Here is the link to the paper (AJ 139, 1261-1268):
McGurk, Kimball & Ivezić (2010)

Public Content

We are making the results of our research public through the release of four files. Using the master.dat file, the basic IDL read-in program master.pro, and the eigenspectra files in es.tar.zip provided below, anyone can reconstruct any of our spectra or attempt to fit our eigenspectra to an arbitrary spectrum. Additionally, we include gap-corrected versions of 790 of the ~100,000 spectra in our sample.
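As a rough illustration of fitting eigenspectra to an arbitrary spectrum, the Python sketch below uses ordinary least squares. This is a choice made for the sketch, not necessarily the procedure used in the paper or in master.pro. The function name `fit_eigenspectra`, the synthetic arrays, and the placeholder wavelength grid are all hypothetical; real use would need the spectrum resampled onto the wavelength grid of the eigenspectra for the appropriate color bin.

```python
import numpy as np


def fit_eigenspectra(flux, mean_spectrum, eigenspectra):
    """Least-squares fit of flux ~ a * mean_spectrum + sum_k c_k * eigenspectra[k].

    flux, mean_spectrum: 1-D arrays on a shared wavelength grid.
    eigenspectra: 2-D array with one row per remaining eigenspectrum.
    Returns (coefficients, model); coefficients[0] scales the mean spectrum.
    """
    basis = np.vstack([mean_spectrum, eigenspectra]).T  # shape (n_pix, n_basis)
    coeffs, *_ = np.linalg.lstsq(basis, flux, rcond=None)
    return coeffs, basis @ coeffs


# Synthetic stand-ins for one bin's mean spectrum and three eigenspectra.
wave = np.linspace(3800.0, 9200.0, 500)  # placeholder wavelength grid
mean_spec = 1.0 + 0.1 * np.sin(wave / 500.0)
eigs = np.vstack([np.sin(wave / 300.0),
                  np.cos(wave / 200.0),
                  np.sin(wave / 100.0)])

# Build a noise-free test spectrum from known coefficients, then recover them.
true_coeffs = np.array([2.0, 0.3, -0.1, 0.05])
flux = true_coeffs @ np.vstack([mean_spec, eigs])
coeffs, model = fit_eigenspectra(flux, mean_spec, eigs)
```

With noisy data, the residual `flux - model` gives a quick check of whether four components describe the spectrum to within the measurement noise.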

Spectral data
master.dat (text file, 17 Mb) contains data for the ~100,000 SDSS stars included in the Principal Component Analysis. For each star we provide the MJD, plate, fiber, the bin number used for our analysis, the metallicity, effective temperature, gravity, two radial velocity measurements, the SDSS u, g, r, i, and z magnitudes and their errors, and the calculated PCA eigencomponents and normalizations.

IDL Program
master.pro (text file, 5.5 Kb) is a basic IDL program that performs several simple tasks using master.dat. It can read in master.dat and return all of the data in a structure. The user can also input a star's MJD, plate, and fiber and get back that star's data, as well as a plot of the reconstructed spectrum or variables containing the reconstructed spectrum. master.example at the bottom of the page illustrates simple ways to use master.pro.

Eigenspectra Files
es.tar.zip (2.1 Mb) is a gzipped tarball containing a folder with the eigenspectra produced from the Principal Component Analysis for each temperature/color bin (for details of the binning system, please see our paper above). There are four eigenspectra for each bin, with the first eigenspectrum being the average spectrum of the bin.
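To make the role of the four eigenspectra concrete, here is a minimal Python sketch of a PCA reconstruction. It assumes the common convention in which a spectrum is the bin's mean spectrum plus a weighted sum of the other three eigenspectra, scaled by a per-star normalization. That convention, the function name `reconstruct`, and the toy numbers are assumptions of this sketch; master.pro and the paper define the exact form used for the released data.

```python
import numpy as np


def reconstruct(mean_spectrum, eigenspectra, coefficients, normalization=1.0):
    """Return normalization * (mean_spectrum + coefficients @ eigenspectra).

    The formula is an assumed convention, not taken from master.pro.
    """
    mean_spectrum = np.asarray(mean_spectrum, dtype=float)
    eigenspectra = np.asarray(eigenspectra, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    if eigenspectra.shape != (coefficients.size, mean_spectrum.size):
        raise ValueError("expected one eigenspectrum row per coefficient")
    return normalization * (mean_spectrum + coefficients @ eigenspectra)


# Toy three-pixel example with made-up numbers.
mean = [1.0, 2.0, 3.0]
eigs = [[1.0, 0.0, -1.0],
        [0.0, 1.0, 0.0],
        [1.0, 1.0, 1.0]]
coeffs = [0.5, -1.0, 0.1]
spectrum = reconstruct(mean, eigs, coeffs, normalization=2.0)
# spectrum is approximately [3.2, 2.2, 5.2]
```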

Gap-Corrected Spectra
gapcorr.tar.gz (28 Mb) is a gzipped tarball containing a folder with the 790 gap-repaired spectra produced from our Principal Component Analysis program. Each spectrum contains the wavelength and flux. The list of spectra that were gap-repaired can be found here (24 Kb).

Instructions

As a reminder, for the es.tar.zip file, type
gunzip es.tar.zip   # to unzip
tar -xvf es.tar       # to extract the tarball (it will already be in a folder so you don't have to make one)
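For readers who prefer to do the extraction from a script, the following Python sketch is one possible alternative to the two commands above. It relies only on the statement that es.tar.zip is a gzipped tarball. The function name `extract_gzipped_tarball` is hypothetical, and the sketch should be used only on archives from a trusted source.

```python
import tarfile


def extract_gzipped_tarball(archive_path, dest_dir="."):
    """Extract a gzip-compressed tar archive, whatever its file suffix.

    Returns the list of member names found in the archive.
    """
    # Mode "r:gz" forces gzip handling, so a ".tar.zip" name is not a problem.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names
```

A call such as `extract_gzipped_tarball("es.tar.zip")` would unpack the eigenspectra folder into the current directory.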

The following example file shows simple ways to use master.pro:
master.example (1.1 Kb)