
Review article

Scales for the identification of adults with attention deficit hyperactivity disorder (ADHD): A systematic review

Abigail Taylor a, Shoumitro Deb b,*, Gemma Unwin c

a Milton Keynes Hospital NHS Foundation Trust, Standing Way, Eaglestone, Milton Keynes MK6 5LD, UK
b University of Birmingham, The Barberry-National Centre for Mental Health, 25 Vincent Drive, Edgbaston, Birmingham B15 2FG, UK
c University of Birmingham, School of Psychology, Edgbaston, Birmingham B15 2TT, UK

Contents

1. Introduction
1.1. Scale development and items
1.2. Type of scales
1.3. Study quality
1.4. Comparison groups
1.5. Objectives
2. Methods
2.1. Psychometric properties
2.1.1. Reliability
2.1.2. Validity
3. Results
3.1. Study quality

Research in Developmental Disabilities 32 (2011) 924–938

ARTICLE INFO

Article history:

Received 16 November 2010

Received in revised form 20 December 2010

Accepted 27 December 2010

Available online 12 February 2011

Keywords:

Attention deficit hyperactivity disorder

Rating scales

Adults

ABSTRACT

Attention deficit hyperactivity disorder (ADHD) is prevalent in the adult population. The associated co-morbidities and impairments can be relieved with treatment. Therefore, several rating scales have been developed to identify adults with ADHD who may benefit from treatment. No systematic review has yet sought to evaluate these scales in more detail. The present systematic review was undertaken to describe the properties, including psychometric statistics, of the currently available adult ADHD rating scales and their scoring methods, along with the procedure for development. Descriptive synthesis of the data is presented and study quality has been assessed by an objective quality assessment tool. The properties of each scale are discussed to make judgements about their validity and usefulness. The literature search retrieved 35 validation studies of adult ADHD rating scales and 14 separate scales were identified. The majority of studies were of poor quality and reported insufficient detail. Of the 14 scales, the Conners’ Adult ADHD Rating Scale and the Wender Utah Rating Scale (short version) had more robust psychometric statistics and content validity. More research into these scales, with good quality studies, is needed to confirm the findings of this review. Future studies of ADHD rating scales should be reported in more detail so that further reviews have more support for their findings.

© 2010 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +44 0 121 414 7130; fax: +44 0 121 301 2351.

E-mail addresses: [email protected] (A. Taylor), [email protected] (S. Deb), [email protected] (G. Unwin).


0891-4222/$ – see front matter © 2010 Elsevier Ltd. All rights reserved.

doi:10.1016/j.ridd.2010.12.036


3.2. Scale development and items
3.3. Type of scales
3.4. Completion methods
3.5. Scoring methods
3.6. Detailed explanation of study methodology
3.7. Representative population
3.8. Comparison groups
3.9. Psychometric properties
3.9.1. Reliability
3.9.2. Validity
4. Discussion
Acknowledgements
References

1. Introduction

Attention deficit hyperactivity disorder (ADHD) is a developmental disorder of childhood onset. Functional impairments associated with ADHD include poor work and school performance (Daley, 2006). Adults with ADHD have a high rate of psychiatric co-morbidity, particularly substance abuse and antisocial personality disorder (Mannuzza, Klein, & Bessler, 1998; Weiss, Hechtman, Milroy, & Perlman, 1985; Zwi & York, 2004).

The Diagnostic and Statistical Manual 4th edition (DSM-IV; American Psychiatric Association; APA, 1994) provides clinical criteria for diagnosis of ADHD. It is estimated that around 3–5% of children and about 1–3% of adults have ADHD (Fayyad et al., 2007; Kessler et al., 2006; Polanczyk & Rohde, 2007). A meta-analysis of follow-up studies concluded that in up to 65% of children with ADHD, symptoms and impairments may persist into adulthood (Biederman, Faraone, & Mick, 2006). A systematic review of 37 intervention studies found that psychostimulants and some antidepressants have a beneficial effect on adult ADHD symptoms (Wilens, Spencer, & Biederman, 2001).

Rating scales have been shown to be useful in identifying and screening for ADHD in children (Daley, 2006). Several scales have been developed and validated specifically to identify adults with ADHD, as the disorder has a different symptom profile in adults than in children (Biederman et al., 2006; Zwi & York, 2004).

The accuracy of a rating scale can be affected by its content validity, including the types of items included in the scale and the method of completion. A valid rating scale should be well designed and fit for purpose. Below, we describe criteria which could be used to assess the usefulness and validity of rating scales.

1.1. Scale development and items

Clinically valid scales are likely to be those based on standard diagnostic criteria (for example the DSM-IV criteria) in combination with other criteria that are specific to the adult population. The DSM-IV criteria were developed in field trials studying children with ADHD and were validated in a large group of children (American Psychiatric Association, 1994; Mannuzza, 2003). Studies which have used the DSM-IV criteria in adults have found that they can also be useful for assessment of adult ADHD (Mannuzza et al., 1998; Polanczyk & Rohde, 2007; Weiss et al., 1985). However, exclusive use of the DSM-IV criteria could restrict the evaluation of adult ADHD to a limited symptom list. Follow-up studies, such as that by Biederman et al. (2006), show that inattentive symptoms persist into adulthood more than hyperactivity and impulsivity. Compared with children, adults face more social situations in which impairment can manifest, such as the workplace, home life, friendships and marriage.

1.2. Type of scales

Assessment of current symptoms is important to demonstrate that the patient currently suffers from impairment. However, as ADHD is a developmental disorder, in order for an adult to receive a DSM-IV diagnosis of ADHD, there needs to be evidence of the presence of ADHD symptoms in childhood (American Psychiatric Association, 1994). As the patient is required to recall symptoms and behaviour from as young as 5 years of age, recall bias could affect the reliability of retrospective scales, particularly with ADHD patients (Mannuzza, Klein, Klein, Bessler, & Shrout, 2002). Neuropsychological tests such as the Wechsler Memory Scale, where participants are asked to remember as much as they can about a short story they have been read, show that ADHD adults have impaired short term and long term memory recall (Pollak, Kahana-Vax, & Hoofien, 2008). This will affect the patient’s recall of both childhood and adulthood symptoms.

The relationship between self and informant symptom ratings, and the accuracy of these reports, are unclear. There may be several reasons for the discrepancies between self and informant reports. Informant reports may be unreliable. Parents may be unaware of delinquent behaviour in youth, as shown by Du Paul et al. (2001). Belenduik, Clarke, Chronis, and Raggi (2007) suggested that patients with ADHD may conceal symptoms from friends, family and co-workers to “get on with their lives” and not jeopardise their jobs. They suggested that inattentive symptoms were the easiest to conceal, perhaps explaining why inattentive symptoms have the greatest reporting discrepancy. Informants may be unaware of more internal symptoms such as emotional problems.

Conversely, self reports may be inaccurate as adults with ADHD may be unaware of externally manifested symptoms, such as fidgeting, as these have become a natural part of their behaviour. In the same way, adults may have adapted to their symptoms and therefore do not feel they are problematic. As self and informant ratings of symptoms can differ, good quality scales should have both informant and self rated versions for assessment of both adult and childhood symptoms (Barkley, Fischer, Smallish, & Fletcher, 2002).

1.3. Study quality

There should be sufficient detail in reporting the study methods to make a judgement on study quality, including how samples were selected and how the tests were administered. The Quality Assessment for Diagnostic Accuracy Studies (QUADAS; Whiting, Rutjes, Reitsma, Bossuyt, & Kleijnen, 2003) can be used to judge the quality of a study. The QUADAS stipulates that studies should include sufficient detail regarding the study methods, including how samples were selected. Samples should be representative of the population that the scale is intended for and there should be comparison groups. Scales should be compared against the gold standard reference test. Study samples should be sufficiently large to provide statistical power.

1.4. Comparison groups

Ideally, scale validation studies should include a population based sample of adults clinically diagnosed with ADHD and a matched control group of participants taken from the same population of adults who do not have a diagnosis of ADHD. Samples should include those with co-morbidities such as substance abuse and depression. Mehringer et al. (2002) suggested that symptoms of cocaine withdrawal in patients with substance abuse could mimic ADHD. Patients with psychiatric conditions, without ADHD, tend to score highly on ADHD rating scales (McCann, Scheele, Ward, & Roy-Byrne, 2000; Mehringer et al., 2002; Ward, Wender, & Reimherr, 1993). Therefore, it is important to determine how well scales perform in these populations in terms of their discriminant validity. Comparison groups may reduce the effects of confounding variables on scale scores.

Gender can affect ADHD scale scores independent of disease status. For instance, delinquency symptoms are reported more often in males than females (Conners, Erhardt, Epstein, et al., 1999; Erhardt, Epstein, Conners, Parker, & Sitarenios, 1999; Young, 2004). However, some studies describe no differences between male and female symptom report (Belenduik et al., 2007; Heiligenstein, Conyers, Berns, & Smith, 1998; Mancini, Ameringen, Oakman, & Figueiredo, 1999; Ward et al., 1993). Younger people report more symptoms than older people (Conners, Erhardt, & Sparrow, 1999; Heiligenstein et al., 1998; Murphy & Barkley, 1996; Solanto, Etefia, & Marks, 2004; Young, 2004). It is not clear whether this is a cohort effect, in that people of one generation do not self report “hyperactivity” in childhood, for instance, because it was perceived to be normal, or whether it is due to the attrition of childhood memories. ADHD symptoms may also decline with age, in disproportion to impairment (Biederman et al., 2006; Mannuzza, 2003; Weiss et al., 1985).

1.5. Objectives

Although there are a number of literature reviews available on scales for screening for ADHD in adults (Adler, Shaw, Sitt, Maya, & Ippolito, 2009; Faraone & Antshel, 2008; Murphy & Adler, 2004), no systematic reviews have been published. Therefore, we have carried out a systematic review in order to identify and analyse all studies validating rating scales used to identify or screen for adults with ADHD.

2. Methods

The review protocol was developed in accordance with guidelines and advice from the Centre for Reviews and Dissemination (CRD), UK (Khan & Kleijnen, 2001). Suitable papers were identified by searching four online medical journal databases, namely MEDLINE (1950 – June 2010), CINAHL (1981 – June 2010), EMBASE (1980 – June 2010) and PsycINFO (1967 – June 2010). The terms used to search each database are included in Appendix A.

Search results were documented in a Reference Manager 11 database. The software was used to search for and remove duplicated articles. Titles were then scrutinised in order to remove obviously inappropriate articles according to the inclusion and exclusion criteria (see Table 1).

Two researchers (AT & GU) independently applied inclusion and exclusion criteria to the abstracts and then to the full text articles. Pre-piloted inclusion/exclusion forms were used to document and guide this process. The primary reviewer (AT) completed data extraction on the final selection of articles using pre-piloted data extraction forms. The properties of the studied scales, sample demographics and the study findings were extracted. A second reviewer (GU) independently conducted data extraction on five of these articles. The articles for dual data extraction were chosen by a consensus decision with a third reviewer (SD) as these were thought to be the most salient articles. Data synthesis was descriptive. The findings from each included study were tabulated to describe the scales and assess the psychometric properties of each scale. Study quality was assessed using the QUADAS.


2.1. Psychometric properties

Psychometric statistics objectively demonstrate the reliability and validity of scales. Reliable scales are consistent and reproducible. Valid scales truthfully measure the underlying concept that they are designed to measure, in this case, ADHD.

2.1.1. Reliability

One of the main statistics used to assess reliability is internal consistency. Internal consistency demonstrates how well related the items in a scale are, and that all items in a subscale measure the same concept. Cronbach’s alpha is one measure of internal consistency and the minimal acceptable level is 0.7 (Field, 2005). Split-half reliability has also been used to assess internal consistency, as it measures the correlation between two halves of a scale. The concordance between patient and informant scores shows how well informant ratings agree with self ratings. Although this is related to inter-rater reliability, it is not the same, as the patient is completing the scale about themselves. This can be measured by Cohen’s kappa. Cohen’s kappa of 0.6–0.74 denotes good agreement and 0.75 and upwards is excellent (Field, 2005).
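As a minimal sketch of the two reliability statistics described above, the following Python computes Cronbach’s alpha from a respondents-by-items table and Cohen’s kappa for paired self and informant ratings. The function names and the example data are illustrative assumptions and are not taken from any of the reviewed studies.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items in the scale
    item_variances = [variance(item) for item in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who classified the same subjects."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical data: four respondents answering three items on a 0-3 scale.
responses = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]
print(round(cronbach_alpha(responses), 2))

# Hypothetical self versus informant classification (1 = symptom present).
self_report = [1, 1, 0, 0]
informant_report = [1, 0, 0, 0]
print(cohen_kappa(self_report, informant_report))
```

Against the thresholds quoted from Field (2005), an alpha of at least 0.7 would be read as acceptable and a kappa of 0.6 or more as good agreement.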

Pearson’s correlation and the intra-class correlation coefficient (ICC) are also used to assess patient-informant concordance. Test–retest reliability demonstrates the stability of scale measurements over time and is often assessed in normative samples, as it is assumed that scores would be stable, as there is no pathology present.
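To illustrate how Pearson’s correlation can express concordance between two sets of scores (self versus informant, or test versus retest), a short Python sketch follows. The function name and the scores are hypothetical and serve only as an example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical total scores for five patients and their informants.
self_scores = [30, 42, 25, 38, 47]
informant_scores = [28, 35, 27, 33, 40]
print(round(pearson_r(self_scores, informant_scores), 2))
```

A high correlation shows only that the two sets of scores rise and fall together. Informants could still score systematically lower than patients, which is one reason agreement statistics such as the ICC are reported alongside it.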

2.1.2. Validity

A reliable scale is not necessarily a valid scale. Even if scores on scales are consistent and reproducible, scales are not useful unless they measure the underlying pathology adequately. Factor analysis can be considered part of validation as it demonstrates that items in, for instance, a hyperactivity subscale, measure only hyperactivity and not other concepts. Factor variance, as a percentage, shows to what extent score variation is due to actual differences in pathology rather than chance.

Construct validity demonstrates that a scale measures the underlying construct of ADHD and does not measure unrelated constructs. Scores on ADHD scales should correlate with scores on general psychiatric scales. However, it is important to ensure that ADHD rating scales are not merely assessing general psychopathology. Therefore, the correlation should not be too great.

Concurrent validity demonstrates how well scale ratings agree with a gold standard such as the DSM-IV diagnostic interview. This is arguably one of the more important aspects of scale validation. There is little point in using a rating scale which has no correlation with the gold standard assessment, as this scale would be invalid. Cohen’s kappa, Pearson’s correlation and ICC can be used to measure concurrent validity.

Sensitivity and specificity are related to concurrent validity, but provide more information about the accuracy of the scale. Sensitivity shows how well scales identify true cases and specificity shows how well scales identify true non-cases. Total classification accuracy (TCA) is a measure of the overall diagnostic accuracy of the scale, and shows the percentage of both cases and non-cases correctly diagnosed by the scale.

Positive predictive value (PPV) is the proportion of those who screen positive who actually have the disease, and negative predictive value (NPV) is the proportion of those who screen negative who are true non-cases.
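The accuracy statistics defined above all derive from the same two-by-two table of scale result against gold standard diagnosis. The Python sketch below shows the arithmetic. The function name and the counts are hypothetical and do not come from any reviewed study.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy statistics from a 2x2 table of scale result versus gold standard.

    tp: true positives, fp: false positives, fn: false negatives, tn: true negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),         # true cases correctly identified
        "specificity": tn / (tn + fp),         # true non-cases correctly identified
        "tca": (tp + tn) / (tp + fp + fn + tn),  # total classification accuracy
        "ppv": tp / (tp + fp),                 # screen positives who are true cases
        "npv": tn / (tn + fn),                 # screen negatives who are true non-cases
    }


# Hypothetical validation sample: 50 adults with ADHD and 50 without.
for name, value in diagnostic_accuracy(tp=45, fp=20, fn=5, tn=30).items():
    print(f"{name}: {value:.2f}")
```

PPV and NPV depend on how common ADHD is in the sample, so values obtained in a clinic sample with many cases will not carry over directly to population screening.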

Receiver operating characteristic (ROC) analysis assesses the accuracy of scales with continuous variables. A graph of sensitivity against 1 − specificity gives the area under the curve (AUC), which can demonstrate the accuracy of a scale.
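One way to see what the AUC expresses is that it equals the probability that a randomly chosen case scores higher on the scale than a randomly chosen non-case, with ties counted as one half. The Python sketch below computes the AUC in that way rather than by plotting the curve. The function name and the scores are illustrative assumptions.

```python
def auc(case_scores, noncase_scores):
    """Area under the ROC curve, computed by comparing every case/non-case pair."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0   # case correctly ranked above non-case
            elif case == noncase:
                wins += 0.5   # tie counts as half
    return wins / (len(case_scores) * len(noncase_scores))


# Hypothetical total scale scores for adults with and without ADHD.
adhd_scores = [52, 61, 47, 58]
control_scores = [30, 49, 35, 41]
print(auc(adhd_scores, control_scores))
```

An AUC of 0.5 corresponds to a scale that separates cases from non-cases no better than chance, and 1.0 to perfect separation.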

3. Results

Fig. 1 shows the results of the search at each stage in terms of the numbers of identified articles. Thirty-five validation studies were identified for review. A summary of each of the 14 adult ADHD scales identified is presented in Table 2. The characteristics of these are shown in Table 3.

3.1. Study quality

Only the test procedures and the scale properties were well described in all studies. Recruitment methods, how indeterminate results were dealt with, and withdrawals from the study were not well explained in nearly all of the studies. Only one study stated whether or not scale scores were blinded from interviewers and vice versa (Kooij et al., 2004).

Table 1
Inclusion and exclusion criteria.

Inclusion criteria
Study design: studies investigating a structured symptom or behaviour based scale, of childhood or current symptoms, for diagnosis, screening or identification of ADHD in adults
Participants: adults (18 years and over) with ADHD
Outcomes: psychometric properties of rating scales (validity, reliability, factor analysis, sensitivity, specificity, internal consistency, etc.)

Exclusion criteria
Publication: foreign language studies
Scales: neuropsychological functioning scales/tests or quality of life scales; scales assessing personality traits


Two studies had unrepresentative samples, namely, healthy college/university students or an all female population (Belenduik et al., 2007; Young, 2004). A further 17 provided insufficient detail to determine sample representativeness.

Many studies used small samples of fewer than 100 participants, except for selected studies of the Wender Utah Rating Scale (WURS), Current Symptoms Scale (CSS), Conners’ Adult ADHD Rating Scale (CAARS), Young Adult Rating Scale (YARS), Attention Deficit Scales for Adults (ADSA), ADHD Rating Scale (ADHD-RS) and the Caterino Scale. Only Rossini and O’Connor (1995) undertook a power calculation before recruitment.

3.2. Scale development and items

Scale items are based on ADHD symptoms, behaviours and difficulties. The Assessment of Hyperactivity and Attention (AHA), CSS, ADHD-RS, Adult Self Report Scale (ASRS-18) and Symptom Inventory (SI) are based entirely on the DSM-IV ‘A’ criteria. These scales contain the 18 DSM-IV ‘A’ criteria, which have been reworded so that they can be included in a rating scale. The Adult Rating Scale (ARS) contains 25 items, based on the DSM-III-R criteria. During the development of the ASRS, several items were identified as ADHD symptoms which could not be mapped onto the DSM-IV criteria, and were therefore discarded (Adler et al., 2006).

Other scales have attempted to mitigate the potential restrictions of DSM-IV by combining these items with other criteria or developing an entirely new set of criteria. The ADSA and Adult Problems Questionnaire (APQ) items were developed from interviews of ADHD adults. The CAARS items were developed from childhood rating scales using the DSM-IV criteria and the Utah criteria. The WURS was developed using the Utah criteria. Similarly, the Brown Attention Deficit Disorder Scales (BADDS) items are based on DSM-IV and the author’s own published studies of ADHD. The YARS uses 17 of the 18 DSM-IV ‘A’ criteria and includes seven of the authors’ own items, which they considered to tap into educational difficulties. The Young Adult Questionnaire (YAQ) items were chosen from a literature review of adult ADHD symptoms (Young, 2004).

3.3. Type of scales

Ten of the scales assess current symptoms. Four of the scales (WURS, YAQ, ADHD-RS and AHA) either enquire about childhood symptoms separately or use the same scale items to retrospectively assess childhood symptoms, in the same way that the DSM-IV criteria are used to diagnose both adults and children.

All of the scales were designed for adults to report their own symptoms. However, six scales (CAARS, YAQ, ADHD-RS, CSS, AHA, and WURS) derive an informant version from the self report symptoms, to be completed by a spouse or co-worker for current symptoms, or a parent or teacher, if available, for childhood symptoms.

However, studies have shown a lack of concordance between self and informant reports of symptoms (Fossati et al., 2001; Zucker, Morris, Ingram, Morris, & Bakeman, 2002). Informants tended to report fewer inattentive symptoms. Similar results were found by Zucker et al. (2002), and Fossati et al. (2001). Conversely, Murphy and Schachar (2000) found no difference between informant and self report symptoms on a brief ADHD questionnaire. In all of these studies, it is unclear whose report is more accurate, as no objective measures, such as an assessment of attention span, were conducted.

Database search: MEDLINE 657, EMBASE 721, CINAHL 45, PsycInfo 476 (all databases: 1899)
582 duplicates removed, leaving 1317
1240 excluded on title; 77 abstracts obtained
33 excluded on abstract; 44 full texts obtained
12 excluded on full text, leaving 32
1 excluded on full text after discussion; 31 included
4 cross-references identified; 35 studies

Fig. 1. Summary of the search process at each stage.


3.4. Completion methods

In all but one of the scales, symptom frequency is rated on a Likert scale, from never/very rarely, to always/very often. The language used for each scale is slightly different; a score of two on one scale does not necessarily equate with a score of two on another. Nine of the scales are based on a four point (0–3) Likert scale, two are five point (0–4), one is five point (1–5) and one is an eight point Likert scale (0–7). The AHA is the only scale which is scored differently. Items are worded as they appear in the DSM-IV, for example “I often feel restless”, and the patient merely answers “yes” or “no”.

3.5. Scoring methods

Several different scoring methods have been employed in these scales. The main scoring method is a symptom count. Where patients have a clinically significant symptom occurrence, for instance they answered very often or often (usually 2 or 3), this is scored as a positive symptom. The patient scores one point for each positive symptom. In the case of the DSM-IV criteria based scales, this gives a maximum total score of 18 (9 per subscale). A continuous scoring method involves summation of the actual item responses; for instance, if a patient has answered two to all 18 questions, they will receive a score of 36.
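The two scoring methods described above can be sketched in a few lines of Python. The function names are hypothetical, and the threshold of 2 for a positive symptom is an assumption that matches the 0–3 Likert scales described in the text; individual scales may define a positive symptom differently.

```python
def symptom_count(responses, threshold=2):
    """Symptom count: one point for each item rated at or above the threshold."""
    return sum(1 for response in responses if response >= threshold)


def continuous_score(responses):
    """Continuous score: the sum of the actual item responses."""
    return sum(responses)


# A patient who answers 2 to all 18 items of a DSM-IV based scale (0-3 Likert).
responses = [2] * 18
print(symptom_count(responses))     # 18 positive symptoms
print(continuous_score(responses))  # 36, as in the example in the text
```

The two methods can rank patients differently: a patient who answers 3 to six items and 0 to the rest has a symptom count of 6 and a continuous score of 18, whereas a patient who answers 1 to every item has a symptom count of 0 and the same continuous score of 18.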

Alternatively, a continuous scoring method has been used either exclusively or as a complement to the symptom count method. The patient’s actual item responses are summed (e.g. 2 + 3 + 4). This can be a more useful scoring method as the score is compared to means for that particular population, which gives the score a population specific context. All of these scoring methods take a clinically significant score to be 1.5 standard deviations (SD) above the presented mean. This is a commonly used cut-off point for rating scales, although 2 SD above the mean has also been used. The CSS and ADHD-RS have

Table 2
Studies included in the systematic review.

Each entry gives the scale (with the date when the scale was first published), followed by the studies.

Wender Utah Rating Scale (WURS) (Ward et al., 1993)
(1) Ward et al. (1993); (2) Stein et al. (1995); (3) Rossini and O’Connor (1995); (4) Weyandt et al. (1995); (5) Mancini et al. (1999); (6) McCann et al. (2000); (7) Fossati et al. (2001); (8) Wierzbicki (2005); (9) Belenduik et al. (2007)

Adult Rating Scale (ARS) (Weyandt et al., 1995)
(4) Weyandt et al. (1995); (10) McCann and Roy-Byrne (2004)

Current Symptoms Scale (CSS) (Barkley & Murphy, 1998)
(11) Murphy and Barkley (1996); (12) Heiligenstein et al. (1998); (13) O’Donnell et al. (2001); (14) Zucker et al. (2002); (15) Aycicegi et al. (2003); (33) Katz, Petscher, Welles, and Welles (2009)

Conners’ Adult ADHD Rating Scale (CAARS) (Conners, Erhardt, & Sparrow, 1999)
(16) Conners, Erhardt, Epstein, et al. (1999); (17) Erhardt et al. (1999); (18) Solanto et al. (2004); (19) Cleland, Magura, Foote, Rosenblum, and Kosanke (2006); (9) Belenduik et al. (2007); (20) Kooij et al. (2008); (34) Adler et al. (2008)

Adult Problems Questionnaire (APQ) (De Quiros & Kinsbourne, 2001)
(21) De Quiros and Kinsbourne (2001)

Young Adult Rating Scale (YARS) (Du Paul et al., 2001)
(22) Du Paul et al. (2001)

Assessment of Hyperactivity and Attention (AHA) (Mehringer et al., 2002)
(23) Mehringer et al. (2002)

Attention Deficit Scales for Adults (ADSA) (Triolo & Murphy, 1996)
(24) West, Mulsow, and Arredondo (2003); (10) McCann and Roy-Byrne (2004); (25) Dowson et al. (2004); (26) West, Mulsow, and Arredondo (2007)

ADHD Rating Scale (ADHD-RS) (Du Paul, Power, Anastopoulos, & Reid, 1998)
(27) Kooij et al. (2004); (20) Kooij et al. (2008)

Brown Attention Deficit Disorder Scales (BADDS) (Brown, 1996)
(18) Solanto et al. (2004); (20) Kooij et al. (2008)

Symptom Inventory (SI) (McCann & Roy-Byrne, 2004)
(10) McCann and Roy-Byrne (2004)

Young Adult Questionnaire (YAQ) (Young, 2004)
(28) Young (2004)

Adult Self Report Scale (ASRS) (Adler et al., 2006)
(29) Kessler et al. (2005); (30) Adler et al. (2006); (31) Reuter et al. (2006); (32) Kessler et al. (2007)

Caterino Scale (Caterino et al., 2009)
(35) Caterino et al. (2009)

The articles highlighted in bold text are those retrieved through cross-references and not in the original database search.


Table 3
Summary of characteristics of adult ADHD scales.

1a. WURS-61 (Long)
Type of scale: Self report of childhood and current symptoms
Items: 61
Completion method: Symptom frequency is rated on a 5 point Likert scale (0–4)
Scoring method: Sum actual answer responses (score 0, 1, 2, 3 or 4) per item
Cut off scores: No cut off scores have been reported owing to the weaker psychometric properties compared with the 25-item scale
Score range: 0–244
Scale development: Items were taken from Wender’s “Minimal Brain Dysfunction in Children”

1b. WURS-25 (short)
Type of scale: Self and informant report of childhood and current symptoms
Items: 25
Completion method: Symptom frequency is rated on a 5 point Likert scale (0–4)
Scoring method: Sum actual item responses (score 0, 1, 2, 3 or 4) per item
Cut off scores: >36 if depression is present; >46 if depression is absent
Score range: 0–100
Scale development: 25 items from long WURS which had the highest mean difference between ADHD and non-ADHD participants. The higher cut off score in depression was based on a study using the scale in this population

2. ARS
Type of scale: Self report of current symptoms
Items: 25
Completion method: Symptom frequency is rated on a 4 point Likert scale (0–3)
Scoring method: Sum actual item responses (score 0, 1, 2 or 3) per item
Cut off scores: 31
Score range: 0–75
Scale development: Items derived from DSM-III-R criteria. Designed to be a similar format to the original children’s ADHD-RS

3. CSS
Type of scale: Self report of current symptoms
Items: 18 (9 inattention + 9 hyperactivity/impulsivity)
Completion method: Symptom frequency is rated on a 4 point Likert scale (0–3)
Scoring method: Score 1 for a positive symptom rating (answered 2 or 3 to an item). Also, continuous scoring where actual item responses are summed (as for the ARS and WURS)
Cut off scores: 6/9 on one or both subscales, or 1.5 SD above the mean total score for age/sex
Score range: 0–18 (symptom rating); 0–54 (summed)
Scale development: Taken directly from the 18 DSM-IV ‘A’ criteria. Has also been used for retrospective childhood symptom report

4a. CAARS (Long)
Type of scale: Self and informant report of current symptoms
Items: 66 (42 items in 4 subscales + 18 DSM-IV items + 12 item ADHD index). Some items tap into more than 1 subscale
Completion method: Symptom frequency is rated on a 4 point Likert scale (0–3)
Scoring method: Actual responses are entered onto a scoring sheet. T scores are then obtained from the scoring sheet as per age and gender. T scores are then compared to the normative T value (T = 50)
Cut off scores: T > 65
Score range: 0–198 (T = 0–100)
Scale development: Developed 93 items from children’s rating scale and Utah criteria in 9 domains. After factor analysis of these 93 items, 42 were chosen. Gender and age specific scores were obtained. An inconsistency index is included to ensure that the scale was completed honestly. T scores between 50 and 65 are considered borderline and require interpretation by a trained clinician

4b. CAARS (Short)
Items: 26 (20 items in 4 subscales + 12 item ADHD index). Some items tap into both subscales
Score range: 0–78 (T = 0–100)
Scale development: 20 items were selected from the 42 subscale items in the long CAARS that discriminated ADHD the best

5. APQ
Type of scale: Self report of current symptoms
Completion method: Symptom frequency is rated on a 4 point Likert scale (0–3)
Scoring method: Sum responses on individual items (0–3) and divide by the number of items
Cut off scores: No scale specific cut offs are presented. However, a score of 2.5/3 on 3 of the items may be used
Score range: 0–3
Scale development: Pool of common symptoms was identified by ADHD adults. Items which tapped into the Utah criteria and DSM-IV were chosen

A. Taylor et al. / Research in Developmental Disabilities 32 (2011) 924–938

| Scale | Type of scale | Items | Completion method | Scoring method | Cut off scores | Score range | Scale development |
|---|---|---|---|---|---|---|---|
| 6. YARS | Self report of current symptoms | 24 (17 out of the 18 DSM-IV criteria + 7 items addressing difficulties encountered at college) | Symptom frequency is rated on a 4 point Likert scale (0–3) | Score 1 for a positive symptom rating (answered 2 or 3 to an item) | No scale-specific cut off scores are presented | 0–24 | The investigators constructed the scale based on 17 DSM-IV criteria. Included 7 items that the authors considered to reflect specific difficulties encountered by ADHD adults in college or university. A cut-off score of 1.5 SD above the mean may be used though the authors do not include scale-specific cut-off scores |
| 7. AHA | Self and informant report of current and childhood symptoms | 18 (2 subscales of 9 inattention + 9 hyperactivity/impulsivity items); items from DSM-IV | Symptoms are rated ‘Yes’ or ‘No’ as to whether or not they were present in childhood and adulthood | Score 1 for a positive symptom rating (… item) | 4/9 adult symptoms + 6/9 childhood symptoms (on one or both subscales) | 0–18 | The items were taken directly from the DSM-IV criteria. Both childhood and adulthood symptoms on the AHA are required for a diagnosis of ADHD |
| 8. ADSA | Self report of current symptoms | 54 | Symptom frequency is rated on a 5 point Likert scale | Actual answers are entered onto a scoring sheet. T scores are then obtained from the scoring sheet for each subscale and the total. T scores are then compared with the normative T values (T = 50) | If the Total T >60 (total = 161), patient is likely to have ADHD. If T >70 (total = 181), highly likely to have ADHD | 54–270 (T = 0–100) | Interviewed adults with attention problems. The authors used this information to construct 9 subscales based on their clinical experience (not factor analysis), namely attention, interpersonal, disorganisation, co-ordination, academic theme, emotive, long term, childhood, and negative social. Additionally, an inconsistency index is included |
| 9. ADHD-RS | Self and informant report of current and childhood symptoms | 18 adult items (2 subscales; 9 inattention + 9 hyperactivity) + 3 childhood items | Symptom frequency is rated on a 4 point Likert scale (0–3) | Score 1 for a positive symptom rating (answered 2 or 3 to an item). Also continuous scoring where actual item responses are summed | 4/9 adult symptoms on one or both subscales + 3/3 on childhood items; 1.5 SD above age group mean (continuous) | 0–18 (symptom counts); 0–54 (summed) | Adaptation of children’s ADHD-RS which was taken from the DSM-IV criteria. Both the child and adult symptoms on the ADHD-RS are required for a diagnosis of ADHD |
| 10. BADDS | Self report of current symptoms | 40 items in 5 subscales | Symptom frequency is rated on a 4 point Likert scale (0–3) | Actual answers are entered onto a scoring sheet. T scores are then obtained from the scoring sheet for each subscale and the total | T = 50 | 0–120 (T = 0–100) | Items are based on DSM-IV criteria and the author’s own observations of ADHD from several published studies. Scale was piloted and the data were published in the manual. Only one cut off as scores did not vary by age or gender. Five subscales are organisation/work, attention, energy/effort, mood, and memory |
| 11. SI | Self report of current symptoms | 18 (2 subscales; 9 inattention + 9 hyperactivity/impulsivity) | Symptom frequency is rated on a 4 point Likert scale (0–3) | Score 1 for a positive symptom rating (answered 2 or 3 to an item) | 6/9 on one or both subscales | 0–18 | The authors used the DSM-IV criteria to develop a scale for use in their own ADHD clinic |
| 12. YAQ | Self report of childhood symptoms | 112 (4 subscales) | Symptom frequency is rated on an 8 point Likert scale (1–8) | Sum responses (score 1–8 per item), divide by the number of items in that subscale | None presented. Can use 1.5 SD above presented mean | 1–8 per subscale | The author conducted a literature review of ADHD symptoms which were likely to diagnose ADHD and co-morbid factors. Four subscales are ADHD symptoms, emotional, delinquency, and social |


Table 3 (Continued )

| Scale | Type of scale | Items | Completion method | Scoring method | Cut off scores | Score range | Scale development |
|---|---|---|---|---|---|---|---|
| 13a. ASRS-18 (long) | Self report of current symptoms | 18 (2 subscales; 9 inattention + 9 hyperactivity) | Symptom frequency is rated on a 5 point Likert scale (0–4) | Score 1 for a positive symptom rating (answered 2, 3 or 4 to an item). Also continuous scoring where actual item responses are summed | 9/18 across both subscales, or 21/36 on either subscale | 0–18 (symptom count); 0–72 (summed) | An item pool of ADHD symptoms was generated. Mapped onto DSM-IV criteria. Psychiatrists chose the items which best fitted the DSM-IV criteria |
| 13b. ASRS-6 (short) | Self report of current symptoms | 6 | Symptom frequency is rated on a 5 point Likert scale (0–4) | Score 1 for a positive symptom rating (answered 2, 3 or 4 to an item). Also continuous scoring where actual item responses are summed | 4/6 or 14/24 | 0–6 (symptom count); 0–24 (summed) | Six items from the 18 item ASRS that had the same strength of association with clinician diagnosis as the 18 item ASRS. The six items with the most stable psychometric properties were chosen for the short ASRS |
| 14. Caterino Scale | Self/informant + Child/adult | 18 (in 4 scales, self report of child and adult symptoms) | Rated on a 3 point scale (0–2) in 4 situations: as a child, at work, at home, in social settings | Continuous scoring, summation of actual responses | None presented | 0–144 | Based on DSM-IV, psychologists chose behaviours that best met the DSM-IV criteria. Factor analysed in a group of children and adults |


Table 4

Psychometric statistics retrieved from the studies sorted by scale.

| Scale | Internal consistency: α | Internal consistency: SH | Inter-informant reliability: κ | Inter-informant reliability: r | Inter-informant reliability: ICC | Test–retest: ICC | Test–retest: r | Sensitivity* % | Specificity* % | TCA* % | PPV* % | NPV* % | AUC* | Con κ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| WURS-61 | 0.69–0.91 | – | – | – | – | 0.68 | 0.68–0.90 | – | – | – | – | – | – | – |
| WURS-25 | 0.86–0.92 | 0.35–0.90 | 0.72 | – | 0.88 | 0.74 | 0.62–0.98 | 96 | 96 | – | – | – | – | – |
| WURS-C+A | 0.95 | – | – | – | – | – | – | 73 | 58 | 64.5 | – | – | – | – |
| ARS | 0.89 | 0.86 | – | – | – | – | 0.80 | 92 | 33 | – | – | – | – | – |
| CSS-C | 0.75–0.91 | – | 0.30–0.31 | 0.55–0.57 | – | – | 0.82 | 22–43 | 96–100 | – | 76–100 | 32–71 | – | – |
| CSS-C+A | – | – | 0.32–0.35 | 0.56–0.65 | – | – | – | – | – | – | – | – | – | – |
| CAARS | 0.74–0.92 | – | – | – | – | – | 0.80–0.91 | 82 | 87 | 85 | 87 | 83 | – | 0.67 |
| APQ | – | – | – | – | – | – | – | 83 | 90 | – | – | – | – | – |
| YARS | 0.86 | – | – | – | – | – | – | – | – | – | – | – | – | – |
| AHA | – | – | – | – | – | – | – | 80–84 | 60–67 | 70 | 67 | 75 | 0.79 | 0.40 |
| ADSA | 0.70–0.93 | 0.92 | – | – | – | – | – | 58–81 | 46–94 | 71–83 | 78 | 87 | – | – |
| ADHD-RS | 0.76–0.88 | – | – | – | – | – | – | 71 | 67–77 | – | – | – | 0.72–0.76 | – |
| BADDS | 0.69–0.81 | – | – | – | – | – | – | 84–92 | 33 | 74 | 76 | 67 | – | – |
| SI | 0.91 | – | – | – | – | – | – | 78 | 54 | – | – | – | – | – |
| YAQ | 0.50–0.98 | – | – | 0.20–0.77 | – | – | – | – | – | – | – | – | – | – |
| ASRS-18 | 0.75–0.89 | – | – | – | – | – | – | 56 | 98 | 96 | 25–82 | 98.3 | 0.77 | 0.22–0.60 |
| ASRS-6 | 0.63–0.72 | – | – | – | – | – | 0.47–0.77 | 39–69 | 88–100 | 84–98 | 24–57 | 94–97 | 0.79–0.84 | 0.21–0.52 |
| Caterino Scale | 0.81–0.91 | – | – | 0.519–0.661 | – | – | – | 0.94 | 0.87 | – | 0.87 | 0.93 | – | – |

C – Childhood Symptoms; A – Adult Symptoms; α – Cronbach’s alpha; SH – Split half reliability; κ – Cohen’s kappa; r – Pearson’s correlation coefficient; ICC – intra-class correlation coefficient; TCA – total classification accuracy; PPV – positive predictive value; NPV – negative predictive value; AUC – area under the curve; Con – concurrent validity; * – at given cut-off scores (see Table 2).


age and gender dependent cut-off scores. This allows cut-off scores to be adjusted depending on the demographics of the patient being assessed.

Another scoring method, an extension of the continuous scoring method, is used in the CAARS, ADSA and BADDS, where T values are used to determine the clinical significance of a patient’s score. The score on each subscale and the total score are entered onto a graphical scoring sheet. Different sheets are used depending on the patient’s age and gender. The sheet then shows the corresponding T value for that patient’s score. On the CAARS, a T score greater than 65 is clinically significant. On the BADDS, a T score of 50, and on the ADSA a T score of 60–70, may signify ADHD. As with the continuous scoring method, T values allow patients’ scores to be referenced to population means.
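The relationship between a raw score and a T value can be sketched with the conventional T-score definition (normative mean 50, standard deviation 10). This is only an illustration: the published manuals use printed scoring sheets per age and gender group, and the normative mean and SD below are hypothetical values, not figures taken from any manual.

```python
def t_score(raw_score: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw scale score to a T score.

    Uses the conventional definition T = 50 + 10 * z, where z is the raw
    score standardised against a normative group. norm_mean and norm_sd
    are assumed to come from the norms for the patient's age and gender.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw_score - norm_mean) / norm_sd


# Hypothetical normative values for one age/gender group.
example = t_score(raw_score=30, norm_mean=18, norm_sd=8)
print(example)  # 65.0, i.e. 1.5 SD above the normative mean
```

Under this definition a T score of 65 corresponds to a raw score 1.5 standard deviations above the normative mean, which is the same distance used by the continuous-scoring cut-offs described above.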

3.6. Detailed explanation of study methodology

In 15 studies, it was not clear how participants were recruited from the population. It is unclear in some cases whether or not participants were excluded on the basis of psychiatric co-morbidity. All of the studies explained the test procedures in detail. In particular, the scales themselves were discussed in detail, including how the items were developed, what the items are and how they are scored.

3.7. Representative population

Healthy university populations, or similarly unrepresentative populations, were used to validate the scale in 16 of the studies. Whilst these samples could have been representative, the authors did not discuss whether or not this was the scale’s intended population, except in the case of the YARS, which was designed for use in university students (Du Paul et al., 2001). The Belenduik et al. (2007) study was considered to have the most unrepresentative population as only females were included.

3.8. Comparison groups

Only four studies used matched ADHD and non-ADHD groups, matched by age and gender (Conners, Erhardt, & Sparrow, 1999; De Quiros & Kinsbourne, 2001; Dowson et al., 2004; O’Donnell, McCann, & Pluth, 2001). Nine studies used unmatched control groups (Caterino, Gomez-Benito, Balleurka, & Amador-Campos, 2009; Erhardt et al., 1999; Fossati et al., 2001; Kessler et al., 2007; McCann et al., 2000; Mehringer et al., 2002; Solanto et al., 2004; Ward et al., 1993; Young, 2004). These included controls with psychiatric co-morbidity, which allows evaluation of the effects of co-morbidities on scale scores.

Twenty-two studies did not have control comparison groups, i.e. an ADHD group and a non-ADHD group. Some of these studies did use mixed groups, in which some participants had ADHD and some did not, but they did not report separately on these groups and analyses covered the entire sample.

3.9. Psychometric properties

The psychometric properties of the scales (including different versions of a scale, such as the 25-item and 61-item versions of the WURS) are summarised in Table 4. For some scales, different values of psychometric statistics are reported in different studies and these are presented in Table 4. Some psychometric statistics were not available for some scales. Therefore, no data could be entered for these in Table 4.

3.9.1. Reliability

All but the APQ, AHA and CSS-C (childhood version) have internal consistency data. The WURS-C+A (childhood + adult versions) and the SI have the highest Cronbach’s alphas (>0.90). The ADSA has the highest split half reliability (>0.90).
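For readers unfamiliar with the statistic, Cronbach’s alpha can be computed from item-level responses as sketched below. This is a minimal illustration using the standard formula and sample variances; the function name and the toy data are assumptions for the example and do not come from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Transpose so that each tuple holds all respondents' scores on one item.
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Toy data: four respondents answering three items on a 0-3 scale.
toy = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 3, 3]]
print(round(cronbach_alpha(toy), 2))
```

Alpha approaches 1 when items rise and fall together across respondents, which is why it is read as a measure of internal consistency.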

Cohen’s kappa for concordance has been calculated for three scales. The WURS-25 has the highest Cohen’s kappa, at 0.72. The CSS-C+A has the highest Pearson’s concordance, at r = 0.56–0.65. Only one study used ICC; the WURS-25 has excellent concordance with an ICC of 0.88 (Ward et al., 1993).

Test–retest reliability was assessed over different time periods varying from one week to two months. Test–retest reliability assessed by ICC was calculated for two scales. The WURS-25 has a test–retest reliability ICC of 0.74. The WURS-61 has an ICC of 0.68. Six scales have test–retest reliability measured in Pearson’s coefficients. The CSS-C, ARS and CAARS have high test–retest reliability, with Pearson’s coefficients of 0.80 or greater.

3.9.2. Validity

Factor analysis was undertaken for 11 scales. The WURS-61 has the highest variance explained by the five factor structure, at 71%. The next highest was the WURS-25, which has a factor structure variance of 60%. Many of the studies assessed construct validity and 10 of the scales have construct validity correlations presented. A wide range of different measures have been used.

Of the four scales that have concurrent validity data, the CAARS performed the best, with a Cohen’s kappa of 0.67. Cohen’s kappa for the ASRS-18 is low at 0.22–0.59, ICC is 0.84 and agreement between scale items and interview items is 43–72%. Cohen’s kappa is 0.40 for the AHA. For the ADSA, Pearson’s correlation between the total ADSA score and the DSM-IV score is 0.22–0.51.


Only 13 of the 17 scale versions have sensitivity scores. The WURS-25, ARS and BADDS have excellent sensitivity (>84%). The ASRS-18, CSS-A, WURS-25 and APQ have excellent specificity (>90%), and the CAARS and ASRS-6 have good specificity (87–88%).

Only seven scales have TCA calculated. The ASRS-18 has the highest TCA at 96%, followed by the CAARS (85%) and the ASRS-6 (84%).

Seven scales have PPV and NPV calculated. The CAARS has the highest PPV at 87%. The ASRS-18 and the ASRS-6 have an NPV greater than 90%. The CAARS and ADSA have NPV greater than 80%.
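The diagnostic accuracy statistics reported here all derive from the same 2 × 2 table of scale result against reference diagnosis. A minimal sketch of the standard definitions follows; the counts are hypothetical and are not taken from any of the reviewed studies.

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard accuracy statistics from a 2 x 2 table.

    tp: scale positive, reference diagnosis positive
    fp: scale positive, reference diagnosis negative
    fn: scale negative, reference diagnosis positive
    tn: scale negative, reference diagnosis negative
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # proportion of true cases detected
        "specificity": tn / (tn + fp),  # proportion of non-cases correctly cleared
        "tca": (tp + tn) / total,       # total classification accuracy
        "ppv": tp / (tp + fp),          # probability a positive result is a true case
        "npv": tn / (tn + fn),          # probability a negative result is a true non-case
    }


# Hypothetical sample: 50 adults with ADHD and 50 without.
stats = diagnostic_accuracy(tp=40, fp=5, fn=10, tn=45)
for name, value in stats.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are properties of the scale at a given cut-off, whereas PPV and NPV also depend on how common ADHD is in the sample, which is one reason values differ between clinic and population studies.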

AUC has only been calculated for four scales. Field (2005) states that AUC should be at least 0.89 but none of the studies reached this threshold. The AUC for the ADHD-RS, ASRS-18, ASRS-6 and AHA are all between 0.72 and 0.79.

Overall, the WURS-25 has the best combination of psychometric properties, followed by the CAARS and the ASRS-18. However, the WURS-25 has only moderate split half reliability. The CAARS and ASRS-18 have moderate Cronbach’s alphas. Additionally, the ASRS-18 has only moderate sensitivity, positive predictive value and concurrent validity. The APQ, ARS and SI have some good psychometric properties but do not perform as well in other areas.

As can be seen from Table 4, the WURS-25 has a high internal consistency (0.86–0.92), the highest patient-informant concordance (Cohen’s kappa 0.72; ICC 0.88), a high test–retest reliability (ICC 0.74), a high factor structure variance (60%, showing that variation in score is explained by variation in pathology rather than due to chance), 85% sensitivity and >90% specificity at its given cut off. The CAARS and ASRS-18 also perform well, although they have only moderate internal consistencies. The APQ, ARS and SI have some good psychometric properties but perform poorly in other areas.

There are insufficient psychometric properties published for some scales. No internal consistency data are published for the CSS, APQ or the AHA. Test–retest reliability is only available for the WURS, CAARS, CSS and ASRS-6. Table 4 highlights the gaps in psychometric data published.

The CAARS and WURS are well designed scales with good content validity. Both scales have had factor analysis conducted; therefore, only relevant items are included. Both use a combination of items, including the Utah criteria and the DSM-IV criteria.

4. Discussion

The main findings of this study are discussed below. Whilst some of the scales are similar, each has its own particular strengths and weaknesses. By nature, the language used in rating scales can be vague (Barnes, Cerrito, & Levi, 2003). Likert scales ask patients to assess how “often” they manifest a symptom. However, there is no standardised reference point for “often.” For instance, one patient may consider often to be once a week, whilst their informant considers it to be once a day. Rating scales often give no advice to people completing them as to what “often” should mean.

No studies were excluded on the basis of study design or because they were of poor quality as only 35 studies were retrieved. It is possible that many of these studies would have been excluded based on the quality analysis. Studies were not assigned a quality score to “weigh” their findings as this can be inappropriate for diagnostic accuracy studies, particularly where meta-analysis is not undertaken. Different methods for study weighting can produce very different conclusions (Whiting, Harbord, & Kleijnen, 2005).

Indeterminate results in these studies were likely to have been caused by incomplete or incorrectly completed scales. Some studies do report having incomplete scales and these results were removed from the analysis. However, the incomplete scales could provide useful information. For instance, where questions are left out, it could indicate they were poorly understood by the participants. No details were given about missing data or extra participants.

For the symptom count scoring method, a cut off of six of nine symptoms of either inattention or hyperactivity, as specified in DSM-IV, is used to identify adults with ADHD in many of the DSM-IV criteria based scales. This may be inappropriate for adults, as follow up studies show that whilst the number of ADHD adults with “clinically significant” persistent symptoms can be as low as 40%, as many as 90% of these adults still show significant impairment and may benefit from treatment (Biederman, Mick, & Faraone, 2000; Polanczyk & Rohde, 2007). Heiligenstein et al. (1998) found that a cut off of four DSM-IV symptoms was best at discriminating ADHD in adults. Other studies using the DSM-IV criteria based scales have used similarly lower cut off scores (Kessler et al., 2005; Kooij et al., 2004, 2008; Mehringer et al., 2002).
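The effect of lowering the cut off can be illustrated with a short sketch of symptom count scoring for an 18-item DSM-IV based scale rated 0–3, where a response of 2 or 3 counts as a positive symptom (as described for the CSS in Table 3). The function names and the response pattern are hypothetical and chosen only to show a case that is missed at six symptoms but identified at four.

```python
def symptom_count(responses: list[int], positive_from: int = 2) -> int:
    """Count items rated at or above the positive threshold (2 or 3 on a 0-3 scale)."""
    return sum(1 for r in responses if r >= positive_from)


def meets_cutoff(inattention: list[int], hyperactivity: list[int], cutoff: int = 6) -> bool:
    """True if either 9-item subscale reaches the symptom count cut off."""
    return (
        symptom_count(inattention) >= cutoff
        or symptom_count(hyperactivity) >= cutoff
    )


# Hypothetical adult respondent: 4 positive inattention symptoms, 2 hyperactive.
inattention = [3, 2, 2, 0, 1, 2, 0, 1, 0]
hyperactivity = [1, 0, 2, 0, 0, 1, 0, 0, 3]

print(meets_cutoff(inattention, hyperactivity, cutoff=6))  # False
print(meets_cutoff(inattention, hyperactivity, cutoff=4))  # True
```

The same responses therefore change classification purely as a function of the chosen cut off, which is why the threshold needs to be validated in adult samples rather than carried over from the childhood criteria.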

Many of the studies used small samples, particularly where sub-group analyses have taken place. Whilst it can be acceptable to validate scales in small populations, large samples are needed to confirm the results of these validity studies. Large samples will be more representative of the population and will reduce the likelihood that any observed group differences are due to chance. Only Rossini and O’Connor (1995) undertook a power calculation for their sample size. The most widely studied scales (WURS, CAARS and CSS), unsurprisingly, had the largest overall sample sizes. The YARS was validated in a large sample of 1209 participants.

Only 18 studies used the gold standard clinical interview for comparison, although it is important to note that the interview itself has not been extensively validated. In six studies, scale scores were used as part of the ADHD diagnosis, which is unacceptable (Aycicegi, Dinn, & Harris, 2003; Kooij et al., 2004; McCann & Roy-Byrne, 2004; McCann et al., 2000; Reuter, Kirsch, & Hennig, 2006; Solanto et al., 2004). Verification bias could have been introduced in a further six studies as only those who scored highly on the rating scale were interviewed (Conners, Erhardt, & Sparrow, 1999; De Quiros & Kinsbourne, 2001; Dowson et al., 2004; Erhardt et al., 1999; McCann et al., 2000; Weyandt, Linterman, & Rice, 1995).


Out of the 14 scales identified, the short version WURS and the CAARS have the best psychometric properties. They are also the most widely studied rating scales. However, these results should be interpreted with caution. Many of the studies were of poor quality; for instance, unrepresentative samples were used and in some cases there was no gold standard reference test for comparison. Unfortunately, a large proportion of these studies were deemed to be of poor quality because of poor reporting.

The CAARS and the WURS are not based exclusively on the DSM-IV criteria, unlike many of the other scales. The current systematic review, therefore, provides some evidence that the current DSM-IV criteria for adult ADHD are perhaps not specific enough for accurate diagnosis in this group, and that other symptoms should be considered. It supports the suggestion that field trials for DSM-IV criteria in adults with ADHD should be conducted, so that future diagnosis of adult ADHD can be improved (Zwi & York, 2004).

The findings of this study should be considered in the context of the strengths and weaknesses of its design. The protocol was developed in accordance with guidelines from the Centre for Reviews and Dissemination; therefore, it does have several strengths (Khan & Kleijnen, 2001). The most important medical databases were used for the search. Search terms were piloted and the search had a high sensitivity (88%) but a low specificity (2%).

However, no grey literature (e.g. conference reports and unpublished research) was retrieved. Therefore, identification bias could have arisen as only positive research might have been identified. Cross-references were, however, retrieved as appropriate. Also, a meta-analysis of data was not possible.

To the authors’ knowledge, the present paper is the only published systematic review of adult ADHD rating scales. A number of journalistic reviews of adult ADHD scales have already been published (Adler et al., 2009; Faraone & Antshel, 2008; Murphy & Adler, 2004). Whilst these reviews have significant value, and have reached similar conclusions, their literature searches were not as systematic as in this paper, and therefore, may not include all the published literature (Murphy & Adler, 2004). These results provide evidence that further research into the validity of adult ADHD rating scales is required. Further research on the WURS and CAARS should be conducted with large samples in order to confirm previous findings. Rating scales which performed well in some areas, such as the ASRS, SI, ARS, Caterino Scale and APQ, may be useful but cannot be reliably used as they have not been independently validated in good quality studies.

Many different psychometric statistics were retrieved for this study. In future, other systematic reviews could focus on one or more of these statistics in more detail, so that a meta-analysis could be performed. For instance, sensitivity and specificity are good measures of diagnostic accuracy which can be easily compared. A meta-analytic or large scale systematic review may provide further support for these results.

Rating scales only provide a snapshot of a patient’s life, based on a restrictive symptom list. Therefore, rating scales should be used along with other methods of information gathering, such as a direct examination of patients, direct interview with the patients and, where possible, informants, and careful examination of case notes, before a clinical diagnosis is made. Rating scales alone, particularly the case detection instruments, should not be relied upon for making a diagnosis.

Conflicts of interest

None.

Acknowledgement

Gemma Unwin is currently supported by the Baily Thomas Charitable Fund.

Appendix A. Search terms

1. attention deficit hyperactivity disorder$.mp. or exp Attention Deficit Disorder with Hyperactivity/
2. ADHD.mp.
3. attention deficit disorder$ with hyperactivity.mp.
4. hyperkinetic syndrome$.mp.
5. hyperkinetic disorder$.mp.
6. attention deficit disorder$.mp.
7. minimal brain dysfunction.mp.
8. psychiatric status rating scale$.mp.
9. self report$.mp.
10. self disclosure.mp. or exp Self Disclosure/
11. psychiatric diagnos$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
12. question$.mp.
13. instrument$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
14. screening.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
15. screening tool$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
16. screening scale.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
17. diagnostic tool.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
18. diagnostic scale.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
19. assessment$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
20. diagnos$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
21. exp Diagnosis/
22. valid$.mp. or exp “Reproducibility of Results”/
23. reliab$.mp.
24. psychometric propert$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
25. reproducib$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
26. specific$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
27. sensitiv$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
28. internal$ consisten$.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
29. inter-rater.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
30. test–retest.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
31. inter-informant.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
32. factor analysis.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
33. exp Adult/ or adult$.mp.
34. 1 or 2 or 3 or 4 or 5 or 6 or 7
35. Questionnaires/ or exp Psychiatric Status Rating Scales/ or rating scale$.mp.
36. *Psychometrics/cl, di, sn [Classification, Diagnosis, Statistics & Numerical Data]
37. 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15 or 16 or 17 or 18 or 19 or 20 or 21 or 35
38. 22 or 23 or 24 or 25 or 26 or 27 or 28 or 29 or 30 or 31 or 32 or 36
39. 33 and 34 and 37 and 38

References

Adler, L., Faraone, S., Spencer, T., Frederick, W., Reimherr, F., Glatt, S., et al. (2008). The reliability and validity of self- and investigator ratings of ADHD in adults.Journal of Attention Disorders, 11, 711.

Adler, L., Shaw, D., Sitt, D., Maya, E., & Ippolito, M. M. (2009). Issues in the diagnosis and treatment of adult ADHD by primary care physicians. Primary Psychiatry,16, 57–63.

Adler, L. A., Spencer, T., Faraone, S. V., Kessler, R. C., Howes, M. J., Biederman, J., et al. (2006). Validity of pilot adult ADHD Self-Report Scale (ASRS) to rate adultADHD symptoms. Annals of Clinical Psychiatry, 18, 145–148.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (DSM-VI) (4th ed.). Washington, DC, USA: American PsychiatricAssociation.

Aycicegi, A., Dinn, W. M., & Harris, C. L. (2003). Assessing adult attention-deficit/hyperactivity disorder: A Turkish version of the Current Symptoms Scale.Psychopathology, 36, 160–167.

Barkley, R. A., Fischer, M., Smallish, L., & Fletcher, K. (2002). The persistence of attention-deficit/hyperactivity disorder into young adulthood as a function ofreporting source and definition of disorder. Journal of Abnormal Psychology, 111, 279–289.

Barkley, R. A., & Murphy, K. R. (1998). Attention deficit/hyperactivity disorder: A clinical workbook (2nd ed.). New York: Guildford Press.Barnes, G. R., Cerrito, P. B., & Levi, I. (2003). An Examination of the variability of understanding of language used in ADHD behavior rating scales. Ethical Human

Sciences and Services, 5, 195–208.Belenduik, K. A., Clarke, T. L., Chronis, M. A., & Raggi, V. L. (2007). Assessing the concordance of measures used to diagnose adult ADHD. Journal of Attention

Disorders, 10, 276–287.Biederman, J., Faraone, S. V., & Mick, E. (2006). The age-dependent decline of attention deficit hyperactivity disorder: A meta-analysis of follow-up studies.

Psychological Medicine, 36, 159–165.Biederman, J., Mick, E., & Faraone, S. V. (2000). Age-dependent decline of symptoms of attention deficit hyperactivity disorder: Impact of remission definition and

symptom type. American Journal of Psychiatry, 157, 816–818.Brown, T. S. (1996). Brown attention deficit disorder scales. TX: The Psychological Corporation.Caterino, L., Gomez-Benito, J., Balleurka, N., & Amador-Campos, J. (2009). Development and validation of a scale to assess the symptoms of attention-deficit/

hyperactivity disorder in young adults. Psychological Assessment, 21, 152–161.Cleland, C., Magura, S., Foote, J., Rosenblum, A., & Kosanke, N. (2006). Factor structure of the Conners Adult ADHD Rating Scale (CAARS) for substance users.

Addictive Behaviors, 31, 1277–1282.Conners, C. K., Erhadt, D., Epstein, J. N., Parker, J. D., Sitarenios, G., & Sparrow, E. (1999). Self ratings of ADHD symptoms in adults I: Factor structure and normative

data. Journal of Attention Disorders, 3, 141–151.Conners, C. K., Erhardt, D., & Sparrow, E. (1999). Conners’ adult ADHD rating scales. New York, USA: Technical Manual: Multi-Health Systems.Daley, D. (2006). Attention deficit hyperactivity disorder: A review of the essential facts. Child Care, Health and Development, 32, 193–204.De Quiros, G. B., & Kinsbourne, M. (2001). Analysis of self-ratings on a behavior questionnaire. Annals of the New York Academy of Sciences, 931, 140–147.Dowson, J. H., McLean, A., Bazanis, E., Toone, B., Young, S., Robbins, T. W., et al. (2004). The specificity of clinical characteristics in adults with attention-deficit/

hyperactivity disorder: A comparison with patients with borderline personality disorder. European Psychiatry, 19, 72–78.Du Paul, G. J., Power, T. J., Anastopoulos, A. D., & Reid, R. (1998). ADHD rating scale-IV: Checklists norms and clinical interpretation. New York: Guildford Press.Du Paul, G. J., Schaughency, E. A., Weyandt, L. L., Tripp, G., Kiesner, J., Ota, K., et al. (2001). Self-report of ADHD symptoms in university students: Cross-gender and

cross-national prevalence. Journal of Learning Disabilities, 34, 370–379.Erhardt, D., Epstein, J. N., Conners, C. K., Parker, J. D., & Sitarenios, G. (1999). Self-ratings of ADHD symptoms in adults II: Reliability, validity, and diagnostic

sensitivity. Journal of Attention Disorders, 3, 153–158.Faraone, S. V, & Antshel, K. M. (2008). Diagnosing and treating attention-deficit/hyperactivity disorder in adults. World Psychiatry, 7, 131–136.Fayyad, J., De Graaf, R., Kessler, R., Alonso, J., Angermeyer, M., Demyttenaere, K., et al. (2007). Cross-national prevalence and correlates of adult attention-deficit

hyperactivity disorder. British Journal of Psychiatry, 190, 402–409.


Field, A. P. (Ed.). (2005). Discovering statistics using SPSS. London, UK: SAGE Publications.

Fossati, A., Ceglie, A. D., Acquarini, E., Donati, D., Donini, M., Novella, L., et al. (2001). The retrospective assessment of childhood attention deficit hyperactivity disorder in adults: Reliability and validity of the Italian version of the Wender Utah Rating Scale. Comprehensive Psychiatry, 42, 326–336.

Heiligenstein, E., Conyers, L. M., Berns, A. R., & Smith, M. A. (1998). Preliminary normative data on DSM-IV attention deficit hyperactivity disorder in college students. Journal of American College Health, 46, 185–188.

Katz, N., Petscher, Y., Welles, T., & Welles, T. (2009). Diagnosing attention-deficit hyperactivity disorder in college students: An investigation of the impact of informant ratings on diagnosis and subjective impairment. Journal of Attention Disorders, 13, 277.

Kessler, R. C., Adler, L. A., Ames, M., Demler, O., Faraone, S. V., Hiripi, E., et al. (2005). The World Health Organization adult ADHD self-report scale (ASRS): A short screening scale for use in the general population. Psychological Medicine, 35, 245–256.

Kessler, R. C., Adler, L., Barkley, R., Biederman, J., Conners, C. K., Demler, O., et al. (2006). The prevalence and correlates of adult ADHD in the United States: Results from the national comorbidity survey replication. American Journal of Psychiatry, 163, 716–727.

Kessler, R. C., Adler, L. A., Gruber, M. J., Sarawate, C. A., Spencer, T., & Van Brunt, D. L. (2007). Validity of the World Health Organization Adult ADHD Self-Report Scale (ASRS) Screener in a representative sample of health plan members. International Journal of Methods in Psychiatric Research, 16, 52–65.

Khan, K. S., & Kleijnen, K. (2001). Undertaking systematic reviews of research on effectiveness: CRD’s guidance for those carrying out or commissioning reviews. CRD Report 4 (2nd ed.). UK: York, York Publishing Services.

Kooij, J. J. S., Boonstra, A. M., Swinkels, S. H., Bekker, E. M., de Noord, I., & Buitelaar, J. K. (2008). Reliability, validity, and utility of instruments for self-report and informant report concerning symptoms of ADHD in adult patients. Journal of Attention Disorders, 11, 445–458.

Kooij, J. J. S., Buitelaar, J. K., VanDenOord, E. J., Furer, J. W., Rijnders, C. A., & Hodiamont, P. P. (2004). Internal and external validity of attention-deficit/hyperactivity disorder in a population-based sample of adults. Psychological Medicine, 35, 817–827.

Mancini, C., Ameringen, M. V., Oakman, J. M., & Figueiredo, D. (1999). Childhood attention deficit/hyperactivity disorder in adults with anxiety disorders. Psychological Medicine, 29, 515–525.

Mannuzza, S. (2003). Persistence of attention-deficit/hyperactivity disorder into adulthood: What have we learned from the prospective follow-up studies? Journal of Attention Disorders, 7, 93–100.

Mannuzza, S., Klein, R. G., & Bessler, A. (1998). Adult psychiatric status of hyperactive boys grown up. American Journal of Psychiatry, 155, 493–498.

Mannuzza, S., Klein, R. G., Klein, D. F., Bessler, A., & Shrout, P. (2002). Accuracy of adult recall of childhood attention deficit hyperactivity disorder. American Journal of Psychiatry, 159(11), 1882–1888.

McCann, B. S., & Roy-Byrne, P. (2004). Screening and diagnostic utility of self-report attention deficit hyperactivity disorder scales in adults. Comprehensive Psychiatry, 45, 175–183.

McCann, B. S., Scheele, L., Ward, N., & Roy-Byrne, P. (2000). Discriminant validity of the Wender Utah Rating Scale for attention-deficit/hyperactivity disorder in adults. Journal of Neuropsychiatry and Clinical Neurosciences, 12, 240–245.

Mehringer, A. M., Downey, K. K., Schuh, L. M., Pomerleau, C. S., Snedecor, S. M., & Schubiner, H. (2002). The Assessment of Hyperactivity and Attention (AHA): Development and preliminary validation of a brief self-assessment of adult ADHD. Journal of Attention Disorders, 5, 223–231.

Murphy, K. R., & Adler, L. A. (2004). Assessing attention deficit/hyperactivity disorder in adults: Focus on rating scales. Journal of Clinical Psychiatry, 65, 12–17 (S).

Murphy, K., & Barkley, R. A. (1996). Prevalence of DSM-IV symptoms of ADHD in adult licensed drivers: Implications for clinical diagnosis. Journal of Attention Disorders, 1, 147–161.

Murphy, P., & Schachar, R. (2000). Use of self-ratings in the assessment of symptoms of attention deficit hyperactivity disorder in adults. American Journal of Psychiatry, 157(7), 1156–1159.

O’Donnell, J. P., McCann, K. K., & Pluth, S. (2001). Assessing adult ADHD using a self-report symptom checklist. Psychological Reports, 88, 871–881.

Polanczyk, P., & Rohde, L. A. (2007). Epidemiology of attention deficit/hyperactivity disorder across the lifespan. Current Opinion in Psychiatry, 20, 386–392.

Pollak, Y., Kahana-Vax, G., & Hoofien, D. (2008). Retrieval processes in adults with ADHD: A RAVLT study. Developmental Neuropsychology, 33, 62–73.

Reuter, M., Kirsch, P., & Hennig, J. (2006). Inferring candidate genes for Attention Deficit Hyperactivity Disorder (ADHD) assessed by the World Health Organisation Adult ADHD Self-Report Scale (ASRS). Journal of Neural Transmission, 113, 838–929.

Rossini, E. D., & O’Connor, M. A. (1995). Retrospective self-reported symptoms of attention-deficit hyperactivity disorder: Reliability of the Wender Utah Rating Scale. Psychological Reports, 77, 751–754.

Solanto, M. V., Etefia, K., & Marks, D. J. (2004). The utility of self-report measures and the Continuous Performance Test in the diagnosis of ADHD in adults. CNS Spectrums, 9, 649–659.

Stein, M. A., Sandoval, R., Szumowski, E., Roizen, N., Reinecke, M. A., Blondis, T. A., et al. (1995). Psychometric characteristics of the Wender Utah Rating Scale (WURS): Reliability and factor structure for men and women. Psychopharmacology Bulletin, 31, 425–433.

Triolo, S. J., & Murphy, K. R. (1996). Attention Deficit Scales for Adults (ADSA): Manual for scoring and interpretation. UK: Bristol, Taylor and Francis.

Ward, M. F., Wender, P. H., & Reimherr, F. W. (1993). The Wender Utah Rating Scale: An aid in the retrospective diagnosis of childhood attention deficit hyperactivity disorder. American Journal of Psychiatry, 150, 885–890.

Weiss, G., Hechtman, L., Milroy, T., & Perlman, T. (1985). Psychiatric status of hyperactives as adults: A controlled prospective 15-year follow-up of 63 hyperactive children. Journal of the American Academy of Child Psychiatry, 24, 211–220.

West, S. L., Mulsow, M., & Arredondo, R. (2003). Factor analysis of the Attention Deficit Scales for Adults (ADSA) with a clinical sample of outpatient substance abusers. American Journal of Addiction, 12, 159–165.

West, S. L., Mulsow, M., & Arredondo, R. (2007). An examination of the psychometric properties of the Attention Deficit Scales for Adults with outpatient substance abusers. American Journal of Drug and Alcohol Abuse, 33, 755–764.

Weyandt, L. L., Linterman, I., & Rice, J. A. (1995). Reported prevalence of attentional difficulties in a general sample of college students. Journal of Psychopathology and Behavioural Assessment, 17, 293–304.

Whiting, P., Harbord, R., & Kleijnen, J. (2005). No role for quality scores in systematic reviews of diagnostic accuracy studies. BMC Medical Research Methodology, 5, 19–25.

Whiting, P., Rutjes, A. W., Reitsma, J. B., Bossuyt, P. M., & Kleijnen, J. (2003). The Development of QUADAS: A tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology, 2, 1–13.

Wierzbicki, M. (2005). Reliability and validity of the Wender Utah Rating Scale for college students. Psychological Reports, 96, 833–839.

Wilens, T. E., Spencer, T. J., & Biederman, J. (2001). A review of the pharmacotherapy of adults with attention-deficit/hyperactivity disorder. Journal of Attention Disorders, 5, 189–202.

Young, S. (2004). The YAQ-S and YAQ-I: The development of self and informant questionnaires reporting on current adult ADHD symptomatology, comorbid and associated problems. Personality and Individual Differences, 35, 1211–1223.

Zucker, M., Morris, M. K., Ingram, S. M., Morris, R. D., & Bakeman, R. (2002). Concordance of self- and informant ratings of adults’ current and childhood attention-deficit/hyperactivity disorder symptoms. Psychological Assessment, 14, 379–389.

Zwi, M., & York, A. (2004). Attention deficit hyperactivity disorder in adults; validity unknown. Advances in Psychiatric Treatment, 10, 248–269.

