CALCOS performs the basic calibration of COS data, producing a flat-fielded 2-D image and a 1-D flux-calibrated spectrum. The high-level portion of CALCOS uses three classes: Association, Observation, and Calibration. The Association class contains a list of Observation instances and information about the relationships between files in the association. It checks the header keywords of all the input files for consistency, for example that they were taken with the same detector and grating, and that they use the same reference files. The Observation class contains information for an individual input file. The Calibration class contains the high-level calibration methods. The lower-level functions that do the actual calibration are procedural.
Further information can be found in Hodge (2002).
For time-tag data, tabular format is retained for the basic calibration steps. The thermal and geometric corrections are applied by changing the X and Y pixel coordinates of each photon event. Orbital and heliocentric Doppler corrections are done by changing the pixel coordinate in the dispersion direction (X for FUV, Y for NUV). Bad regions on the detector, bad time intervals, and events rejected because the pulse height is out of range are flagged by setting a bit in the data quality column. Flat field and deadtime corrections are applied by assigning a value in a weight column. After applying these corrections, the results are written to an output events table, which is similar in format to the input events table but with additional columns and different data types for some columns. The corrected table of photon events is also binned into an image. For each row in the table, the pixel nearest to the corrected X,Y position of the photon is incremented by the value in the weight column.
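The binning step described above can be sketched as follows. This is only an illustration, not the CALCOS code: it uses NumPy in place of numarray, the function name `bin_events` is hypothetical, and dropping events that fall outside the image is an assumption.

```python
import numpy as np

def bin_events(x, y, weight, shape):
    """Bin weighted photon events into a 2-D image indexed as [Y, X]."""
    image = np.zeros(shape, dtype=np.float64)
    # Nearest pixel to each corrected event position.
    ix = np.floor(np.asarray(x) + 0.5).astype(np.intp)
    iy = np.floor(np.asarray(y) + 0.5).astype(np.intp)
    # Assumption: events that land outside the image are ignored.
    inside = (ix >= 0) & (ix < shape[1]) & (iy >= 0) & (iy < shape[0])
    # np.add.at accumulates correctly when several events hit one pixel;
    # plain fancy-index "+=" would count each pixel only once.
    np.add.at(image, (iy[inside], ix[inside]), np.asarray(weight)[inside])
    return image
```

Each event contributes its weight rather than a unit count, so the flat field and deadtime corrections carry through to the image.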
For accum image data, if the FUV detector was used, the image will be temporarily converted into a pseudo-time-tag list. For each count in the raw image, the X and Y coordinates of the pixel will be appended to in-memory lists; the time of arrival of the photon cannot be known, so there is no list of times. A pseudo-random number on the interval is added to each pixel coordinate to reduce aliasing effects. Thermal and geometric corrections are then applied using the same code as for time-tag data. The lists of X and Y coordinates are then binned back into an image, and subsequent calibration steps (flat field, deadtime) are performed on the image. The heliocentric correction for accum data is done by writing the radial velocity to a header keyword, which is later used during 1-D spectral extraction to shift the wavelengths.
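A minimal sketch of the conversion from an accum image to pseudo-time-tag coordinate lists is given below. It uses NumPy rather than numarray, the function name is hypothetical, and the dither interval of [-0.5, 0.5) pixel is an assumption made for this example only.

```python
import numpy as np

def image_to_pseudo_events(image, rng):
    """Expand an image of counts into X and Y lists, one entry per count."""
    counts = np.asarray(image).astype(np.intp)
    iy, ix = np.nonzero(counts)
    n = counts[iy, ix]
    # Repeat each pixel coordinate once per count in that pixel.
    x = np.repeat(ix, n).astype(np.float64)
    y = np.repeat(iy, n).astype(np.float64)
    # Assumed dither interval: [-0.5, 0.5) pixel, to reduce aliasing
    # when the shifted coordinates are binned back into an image.
    x += rng.uniform(-0.5, 0.5, size=x.size)
    y += rng.uniform(-0.5, 0.5, size=y.size)
    return x, y

rng = np.random.default_rng(0)
x, y = image_to_pseudo_events([[0, 2], [3, 0]], rng)
```

There is no time list, as noted above; only the X and Y lists are built, and they can be passed to the same thermal and geometric correction code used for time-tag data.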
After the basic calibration described above, a 1-D spectrum is extracted from the calibrated image. A wavelength is assigned to each pixel of the extracted spectrum, and the wavelength scale can be shifted to correct for an error in the grating adjustment (see below). One exposure yields two or three noncontiguous sections of spectrum. The FUV detector consists of two separate ``segments'' with a narrow gap between them. For NUV, three separate ``stripes'' of the spectrum are focussed onto the detector at one time. The gap between NUV stripes is large, approximately twice the length of a stripe. Each extracted 1-D spectrum is stored as one row in the output table, i.e., two rows for FUV and three rows for NUV, with the wavelength, flux, etc., stored as arrays in each row.
The mechanism to select or reposition a grating is not perfectly repeatable. For each exposure at a given position, the offset from the nominal position is determined using a ``wavecal,'' an exposure using an internal emission-line lamp. A cross correlation of the 1-D extracted wavecal spectrum with a template spectrum gives the offset. The individual spectra within an exposure (for the two FUV detector segments or the three NUV spectral stripes) are processed independently, rather than averaging them to yield one offset per exposure. If multiple wavecal observations are available and two wavecals bracket a science observation, the offset to apply to the wavelength scale for the science observation is determined by linear interpolation with time. Two diagnostic tests have been implemented, to catch gross errors such as a failure of the grating mechanism: (1) the width of the cross correlation should be small, comparable to the spectral resolution; and (2) the maximum value of the cross correlation should be much larger than the median value.
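The offset determination and the interpolation between bracketing wavecals might look like the sketch below. It is an illustration under stated assumptions: NumPy is used instead of numarray, both function names are hypothetical, the offset is found only to the nearest pixel, and the two diagnostic tests are not included.

```python
import numpy as np

def find_offset(spectrum, template):
    """Return (shift, xc): the lag in pixels at which the wavecal spectrum
    best matches the template, and the full cross correlation."""
    xc = np.correlate(spectrum, template, mode="full")
    # In "full" mode the lags run from -(len(template) - 1) to len(spectrum) - 1.
    lags = np.arange(-(len(template) - 1), len(spectrum))
    return int(lags[np.argmax(xc)]), xc

def interpolate_offset(t_sci, t_wavecal, offsets):
    """Linearly interpolate wavecal offsets to the time of a science exposure."""
    return float(np.interp(t_sci, t_wavecal, offsets))
```

A positive shift here means the spectrum is displaced toward larger pixel numbers relative to the template. The returned cross correlation array could be used to check its width and its maximum-to-median ratio.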
The default observing mode is to take multiple exposures at four slightly different positions, offset in the dispersion direction, to smooth out flat-field irregularities and avoid detector defects. It is also possible to take multiple exposures at the same grating position, or to take single exposures. When multiple exposures were taken, an additional output file will be written that contains the averages of the individual 1-D extracted spectra. The flat-fielded images will also be averaged, for repeated exposures at the same grating position.
CALCOS is written in Python, with some C code for row-by-row or pixel-by-pixel operations. The PyFITS module is used for FITS file I/O. The numarray module supports efficient array operations, and array arithmetic is as simple as scalar arithmetic. An image data array is represented in PyFITS as a numarray object, and a table is a 1-D array of records (rows). Each column of a table is a numarray object (or chararray for text strings). The numarray `strides' attribute allows column access directly within the table data, without making a separate copy of the column. The total size and structure of a numarray object are fixed when the array is created (e.g., the columns and their data types, and the number of rows). Rather than actually deleting rows that are rejected during processing, CALCOS flags them as bad in the data quality column, because this doesn't change the number of rows; it also has the advantage of being reversible.
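The table model described above can be illustrated with a NumPy structured array, which here stands in for the numarray record array. The column names, the flag value, and the pulse height limits are all hypothetical, chosen only for this sketch.

```python
import numpy as np

# Hypothetical data quality bit for "pulse height out of range".
DQ_PHA_OUT_OF_RANGE = 512

def make_events(pha_values):
    """Build a small events table; its size and columns are fixed at creation."""
    dtype = [("x", "f4"), ("y", "f4"), ("pha", "i2"), ("dq", "i2")]
    events = np.zeros(len(pha_values), dtype=dtype)
    events["pha"] = pha_values
    return events

def flag_pha(events, pha_min, pha_max):
    """Flag rows with out-of-range pulse height instead of deleting them."""
    bad = (events["pha"] < pha_min) | (events["pha"] > pha_max)
    events["dq"][bad] |= DQ_PHA_OUT_OF_RANGE

def clear_pha_flag(events):
    """Undo the flagging; this is possible because no rows were removed."""
    events["dq"] &= ~DQ_PHA_OUT_OF_RANGE

events = make_events([1, 12, 30, 15])
pha_column = events["pha"]  # a strided view into the table, not a copy
flag_pha(events, pha_min=4, pha_max=26)
```

Because `pha_column` is a view, its stride equals the size of one table row in bytes, and changes made through it are visible in the table itself.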
Most arithmetic operations use numarray. There are some operations that are not sufficiently generic to have been implemented in numarray, however, and for these a C extension module was written. A comparison of the flat field calibration step for time-tag and for accum data illustrates this issue. Time-tag data are corrected individually for each photon event (table row), and the entire procedure uses C code. For each row, the pixel coordinates of the event are read from the X and Y columns. The coordinates are rounded to the nearest integer, and the value is taken at that pixel in the flat field reference image. The weight for the event is then set equal to the reciprocal of the flat field value. A combination of C code and numarray operations is used for accum data. The first step is to convolve the flat field image in the dispersion direction, to account for the orbital Doppler shift during the exposure. (The on-board software shifts the pixel location before incrementing the image array in memory, so the flat field should be shifted by the same amount before being applied to the image.) This is done using C code, because it is a 1-D convolution of a 2-D image. After this, however, the actual flat-field calibration is just a division of two arrays, using numarray arithmetic. If the flat field is a subset of the entire detector (which would be the case for FUV data), just the matching subset of the input image will be corrected by using standard Python slice notation.
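The two flat field paths can be compared in the following sketch. It is written entirely in NumPy, whereas CALCOS uses C code for the per-event loop and the convolution, so it shows the logic rather than the implementation. All function names, the kernel, and the subset offsets `x0` and `y0` are hypothetical, and events are assumed to lie within the flat field image.

```python
import numpy as np

def timetag_flat_weights(x, y, flat):
    """Per-event weight: reciprocal of the flat field at the nearest pixel."""
    ix = np.floor(np.asarray(x) + 0.5).astype(np.intp)
    iy = np.floor(np.asarray(y) + 0.5).astype(np.intp)
    return 1.0 / flat[iy, ix]

def convolve_dispersion(flat, kernel, axis=1):
    """1-D convolution of a 2-D image along the dispersion axis.
    The kernel describing the Doppler smearing is an assumed input."""
    return np.apply_along_axis(
        lambda line: np.convolve(line, kernel, mode="same"), axis, flat)

def accum_flat_correct(image, flat, x0=0, y0=0):
    """Divide the matching subset of the image by the flat field."""
    ny, nx = flat.shape
    out = np.array(image, dtype=np.float64)  # work on a copy
    out[y0:y0 + ny, x0:x0 + nx] /= flat      # slice selects the subset
    return out
```

For accum data the convolved flat field would be passed to `accum_flat_correct`; pixels outside the subset covered by the flat field are left unchanged.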
Hodge, P. 2002, in ASP Conf. Ser., Vol. 281, Astronomical Data Analysis Software and Systems XI, ed. David A. Bohlender, Daniel Durand and T. H. Handley (San Francisco: ASP), 273