
[Figure: Distribution of QSO probabilities. x-axis: Probability (QSO); y-axis: Number; legend: Non-QSOs, QSOs.]

Preliminary Census of ROSAT Bright Sources: Results from ClassX

R.L. White, A.A. Suchkov, R.J. Hanisch, M. Postman, M.E. Donahue (STScI), T.A. McGlynn, L. Angelini, M.F. Corcoran, S.A. Drake, W.D. Pence, N. White, E.L. Winter (NASA/GSFC), F. Genova, F. Ochsenbein, P. Fernique, S. Derriere (CDS), W. Voges (MPE)

Abstract. ClassX is being developed as a system for automated classification of X-ray sources within the Virtual Observatory environment. Its core is a network of classifiers “trained” using diverse data sets for X-ray sources of known object type and their optical, infrared, radio, etc. counterparts. The network is integrated in the ClassX pipeline with a search engine that queries remote multi-wavelength data repositories, using systems such as the CDS VizieR service, to get data (in the VOTable format) for the sources to be classified.

In this paper we present a preliminary census from ClassX of previously unclassified X-ray sources observed with the ROSAT PSPC. The early results include the finding that our sources are dominated by QSOs (Fig. 1), in contrast to the star-dominated samples that were used to train our classifiers. The ClassX census appears to be consistent with expectations when one considers the fainter population of sources being studied compared with previously classified objects.

Fig. 1. Class distribution. Comparison of class fraction for X-ray sources classified in the WGACAT (blue) and sources previously unclassified (red) for which GSC2 counterparts were found. Each panel presents results from a different classifier: “X-ray–optical” (xo9a_xo), “X-ray” (xo9a_x), and “optical” (xo9a_o). The green histogram at the bottom shows the distribution of the classes from the original WGACAT classification for the training set. Note that while the training set of known classifications was dominated by stars, the most common class among the newly classified objects is QSO, followed by galaxy clusters. This change in population is expected since the unidentified X-ray sources are generally fainter. That our classifier is able to respond to the changing population is encouraging, as it is generally a challenging problem to classify a set of objects that differs substantially from the training set.

Fig. 6. Near IR colors for newly classified sources. Same as in Fig. 5 but for previously unclassified sources. Classes are from the “X-ray” classifier.

Fig. 5. Near IR colors. 2MASS J-K color distribution of X-ray sources with previously known classifications in the WGACAT.

Fig. 2. Classification probabilities. Class probability distribution for previously unclassified X-ray sources with GSC2 and 2MASS counterparts (from the classifier trained using only X-ray data). ClassX provides classification probabilities for every class for each source. The plot shows the distribution of QSO probabilities for all objects (gray) and objects classified as QSOs (red), meaning that QSO is the highest probability class. The probabilities are relatively low because the QSO and AGN classes are so similar.
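
As a minimal sketch of the rule described in this caption (a source is assigned the class with the highest probability), the following Python uses hypothetical class names and probabilities that are not taken from the ClassX output. It also illustrates why the winning probability can be low when two similar classes share the probability mass.

```python
def classify(probabilities):
    """Return (best_class, best_probability) for one source.

    `probabilities` maps class name -> probability; the layout is an
    assumption made for this sketch, not the ClassX data format.
    """
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


# Hypothetical probabilities for a single source (illustrative only).
source = {"Star": 0.10, "QSO": 0.38, "AGN": 0.33, "Galaxy": 0.12, "Cluster": 0.07}

label, p = classify(source)
print(label, p)

# QSO wins with a modest probability because AGN takes a similar share.
print(source["QSO"] + source["AGN"])
```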

Fig. 7. Mean IR & X-ray colors. Mean 2MASS J-K color (upper panel) and mean X-ray “color” (lower) for classified sources (classes based on the WGACAT, blue) and unclassified sources (classes from the X-ray classifier, red).

Reference

A.A. Suchkov, T.A. McGlynn, L. Angelini, M.F. Corcoran, S.A. Drake, W.D. Pence, N. White, E.L. Winter, R.J. Hanisch, R.L. White, M. Postman, M.E. Donahue, F. Genova, F. Ochsenbein, P. Fernique, & S. Derriere, 2002. Automated Object Classification with ClassX, astro-ph/0210407

Introduction: ClassX classifiers. Classification of observed astronomical objects plays a major role in converting observational data into science. It is also trickier than one might guess because the class categories often overlap: the same object can be called a star and a white dwarf, a galaxy and an AGN, an AGN and a QSO, etc. The situation gets even more complicated when the same object is viewed with different instruments: for instance, at the position of an X-ray cluster of galaxies, an optical counterpart from, say, GSC2, would typically be a galaxy rather than a combined entity called a cluster of galaxies. These and similar conceptual issues related to the ClassX project were discussed earlier by Suchkov et al. (2002).

This paper. In this paper, we classify previously unidentified ROSAT sources with several different classifiers, each trained with a different set of parameters. For instance, the training of the “X-ray” classifier involves X-ray magnitudes but not optical and infrared magnitudes, while the training of the “X-ray and optical” classifier involves both X-ray and optical magnitudes. Fig. 1 compares the class frequencies of two samples of X-ray sources, with classifications from three classifiers.

Data. The WGA catalog of X-ray sources from ROSAT PSPC observations contains 36995 sources for which we found optical counterparts within 30 arcsec in the GSC2, with both F and J magnitudes. Of those, 6505 sources were classified in the WGACAT; we used this sample to train our classifiers. The classifiers were then applied to the remaining 30490 sources to determine the object type (class) associated with these previously unclassified sources (see Fig. 1).
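
The counterpart search and the sample split described above can be sketched as follows. This is a brute-force illustration under stated assumptions: the poster gives only the 30 arcsec radius, so choosing the nearest counterpart, the dictionary layout, and the example positions are all hypothetical. Only the three source counts come from the text.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec between two positions given in
    degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0


def nearest_counterpart(xray_pos, optical_catalog, radius_arcsec=30.0):
    """Return the closest optical entry within the radius, or None.
    Picking the nearest match is an assumption of this sketch."""
    best, best_sep = None, radius_arcsec
    for entry in optical_catalog:
        sep = angular_separation_arcsec(xray_pos[0], xray_pos[1],
                                        entry["ra"], entry["dec"])
        if sep <= best_sep:
            best, best_sep = entry, sep
    return best


# Hypothetical optical entries: A is 18 arcsec away, B is 72 arcsec away.
optical = [{"id": "A", "ra": 10.0, "dec": 0.005},
           {"id": "B", "ra": 10.0, "dec": 0.02}]
match = nearest_counterpart((10.0, 0.0), optical)

# Sample sizes quoted in the text: matched sources minus the training set.
n_matched, n_training = 36995, 6505
n_unclassified = n_matched - n_training
```

A production cross-match over tens of thousands of sources would normally use a spatial index rather than this quadratic loop.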

AAS Meeting 201, January 5 – 9, Seattle, WA

Class properties of previously unclassified sources. Not surprisingly, the unclassified sources are on average fainter, which implies that the respective class objects are on average more distant or less luminous. We expect some systematic differences between classes in the classified and unclassified samples. Illustrations of such differences can be found in the figures presented here. For example, QSOs are much more common at fainter magnitudes, which accounts for the large increase in the fraction of QSOs compared with the training set.

Observational biases. Class properties in the classified and unclassified samples are also different because of different source detectability in different bands. Bluer 2MASS colors are found in the unclassified sample because at faint magnitudes detections in the K band are possible only for bluer sources (Fig. 7, upper panel). Similarly, fainter sources are softer in the X-ray because detections in the hard band, x3, are available only for softer sources (Fig. 7, lower panel).

Future work. Clearly these statistical checks on the properties of different classes are not a substitute for checking the accuracy of the classifications using spectroscopically identified sources. In the near future we plan detailed comparisons of our classes with external data such as SDSS.

Highlights

Compared with the previously classified objects, for the newly classified sources:

• QSOs and clusters of galaxies are much more common (whereas stars dominate the training set).

• All classes in the newly classified sample are softer in X-rays (except for OF stars).

• All classes in the newly classified sample are bluer in the 2MASS bands.

• Class QSO is the “softest” and much softer than class AGN in both samples.

• Class AGN is the reddest and much redder than class QSO in both samples.

• AGNs, galaxies, & clusters of galaxies all show bimodal IR color distributions.

• In the infrared, AGNs, galaxies, and clusters of galaxies are dominated by the group of blue 2MASS counterparts as opposed to the group of red counterparts in the classified sample.

[Figure: P(Star) vs. P(QSO or AGN). x-axis: P(QSO or AGN); y-axis: P(Star); legend: QSO+AGN, Stars, Other.]

Validation of ClassX classification. We explore the validity of the ClassX classification using a variety of checks on the internal and external consistency of the classification results. Figs. 5 and 6 display the distribution of the 2MASS J-K color, a parameter that was not used in the training or classification of the sources. Comparing the two figures, we notice a number of common features. For example, the distribution of AGNs is obviously bimodal in both the classified and unclassified samples, which isolates two groups, blue and red, centered at ~0.7 and ~1.4 (although the relative prominence of the two groups is different for the two samples). This kind of consistency suggests that the classifier does indeed do a good job statistically in identifying AGNs among X-ray sources.

Fig. 7 shows the class variation of the mean infrared and X-ray colors for classified and unclassified sources. There is a remarkable consistency between the two samples in the color variation from class to class: classes that are redder/softer in the classified sample are also redder/softer in the unclassified sample. Again, this is indicative of a substantial degree of reliability of the ClassX classification.
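
One way to express the consistency check described above is to compute the mean color per class in each sample and test whether the classes rank in the same order. The sketch below is illustrative only: the function names and the J-K values are hypothetical and are not measurements from the poster.

```python
from collections import defaultdict


def mean_by_class(rows):
    """Mean value per class from (class_name, value) pairs."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cls, value in rows:
        sums[cls] += value
        counts[cls] += 1
    return {cls: sums[cls] / counts[cls] for cls in sums}


def same_ordering(means_a, means_b):
    """True if the classes common to both samples rank identically by mean."""
    common = sorted(set(means_a) & set(means_b))
    return sorted(common, key=means_a.get) == sorted(common, key=means_b.get)


# Hypothetical J-K colors (illustrative only): the second sample is bluer
# overall, but the class-to-class ordering is unchanged.
classified = [("Star", 0.5), ("Star", 0.7), ("QSO", 0.9), ("QSO", 1.1),
              ("AGN", 1.4), ("AGN", 1.6)]
unclassified = [("Star", 0.4), ("Star", 0.5), ("QSO", 0.8), ("QSO", 0.9),
                ("AGN", 1.2), ("AGN", 1.3)]

means_classified = mean_by_class(classified)
means_unclassified = mean_by_class(unclassified)
consistent = same_ordering(means_classified, means_unclassified)
```

Agreement in rank order is a weaker statement than agreement in value, which matches the text: the unclassified sample can be systematically bluer while preserving the class-to-class trend.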

Fig. 4. Separation in probabilities. The total probabilities P(Star) vs. P(QSO) are plotted for the combined classes of Fig. 3. Note that the histograms in Fig. 3 result from summing this distribution along the y direction. Most of the stars are very well separated from the other classes, as are many of the QSOs+AGN. Objects near the intersections and boundaries are difficult to classify.


Fig. 3. Combining class probabilities. The class probabilities can be usefully combined to compare groups of similar classes. Here the normal stellar classes have been combined into a single “Stars” class, QSOs & AGNs have been combined into a second class, and the X-ray Binary, Galaxy and Cluster classes are left unchanged. Now the QSOs/AGNs (red) separate very well from the stars (blue).
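
The combination described in this caption amounts to summing the probabilities of the member classes within each group. A minimal sketch follows, assuming a dictionary of per-class probabilities. The stellar class names and all numerical values are hypothetical placeholders, not the class list used by ClassX.

```python
# Grouping used for illustration; the stellar class names are hypothetical.
GROUPS = {
    "Stars": ("Star (early)", "Star (late)"),
    "QSO+AGN": ("QSO", "AGN"),
}


def combine(probabilities, groups=GROUPS):
    """Sum member-class probabilities per group; classes that belong to
    no group are passed through unchanged."""
    combined, grouped = {}, set()
    for name, members in groups.items():
        combined[name] = sum(probabilities.get(m, 0.0) for m in members)
        grouped.update(members)
    for cls, p in probabilities.items():
        if cls not in grouped:
            combined[cls] = p
    return combined


# Hypothetical probabilities for one source (illustrative only).
source = {"Star (early)": 0.05, "Star (late)": 0.10, "QSO": 0.40, "AGN": 0.30,
          "Galaxy": 0.10, "Cluster": 0.05, "X-ray Binary": 0.0}
merged = combine(source)
```

Because the groups are disjoint, the combined probabilities still sum to the same total as the original ones, while the QSO+AGN group now carries a much larger probability than either class alone.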

[Figure: Distribution of combined QSO+AGN probabilities. x-axis: Probability (QSO or AGN); y-axis: Number; legend: Stars, QSO/AGNs, Other.]