    Evaluation in Education, 1982, Vol. 5, pp. 149-164. 0191-765X/82/020149-1658.00/0. Printed in Great Britain. All rights reserved. Copyright © 1982 Pergamon Press Ltd.

    CHAPTER 4

    PASSING SCORE AND LENGTH OF A MASTERY TEST

    Wim J. van der Linden

    Technische Hogeschool Twente, Postbus 217, 7500 AE Enschede, The Netherlands

    ABSTRACT

    A classical problem in mastery testing is the choice of passing score and test length so that the mastery decisions are optimal. This problem has been addressed several times from a variety of viewpoints. In this paper the usual indifference zone approach is adopted with a new criterion for optimizing the passing score. It appears that, under the assumption of the binomial error model, this yields a linear relationship between optimal passing score and test length, which subsequently can be used in a simple procedure for optimizing the test length. It is indicated how different losses for both decision errors and a known base rate can be incorporated in the procedure, and how a correction for guessing can be applied. Finally, the results in this paper are related to results obtained in sequential testing and in the latent class approach to mastery testing.

    The notion of a mastery test has arisen in the context of modern learning strategies such as learning for mastery and individualized instruction, where at several points in the instructional process decisions have to be made whether students have reached certain learning objectives or not. In most instances, this involves the administration of criterion-referenced tests and the use of decision rules assuming the form of passing scores on the test. Students with test scores exceeding the passing score are considered to have reached the learning objectives (the "masters"); they are allowed to proceed with the unit or to take up a subsequent course. Students below the passing score (the "nonmasters") have to relearn the unit and to prepare for a new test.

    A usual conceptualization in the area of mastery testing is that of tests as samples randomly drawn from a domain of tasks covering a well-defined learning objective. Mostly, the concern is then with the proportion of correct item responses, $\pi$, say, to be expected if the entire domain were administered. Let $\pi_m$ denote the passing score on this domain score variable (the "mastery score"), $X$ the number of items correct, and $c$ the passing score on the test. A student is a true master if $\pi \ge \pi_m$ and a nonmaster otherwise, but mastery is declared if $X \ge c$ and nonmastery if $X < c$. A classical problem in mastery testing is to choose a value $n^*$ for the test length $n$ and a value $c^*$ for the passing score on the test $c$ such that, for a given value of $\pi_m$, the mastery decisions are optimal.

    Several authors have addressed the above problem, all using one of the binomial models for relating test scores, $X$, to domain scores, $\pi$. Millman (1972, 1973), for example, assumes that the simple binomial model

    $$p(x) = \binom{n}{x}\pi^{x}(1-\pi)^{n-x} \tag{1}$$

    can be used for this purpose and provides tables which, for a chosen value of $\pi_m$, test length, and passing score, display the probability that a person with a given domain score is classified correctly or incorrectly. Using these tables, it is possible to optimize passing score and test length simultaneously for a selected domain score. Comparable approaches have been followed by Klauer (1972) and Kriewall (1972).
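    The kind of table entry described above can be sketched in a few lines of Python. This is only an illustration of the binomial model (1), not Millman's actual tabulation; the function names and the example values (10 items, passing score 8, mastery score .80) are assumptions made for the sketch, and mastery is taken to be declared when $X \ge c$.

```python
from math import comb


def binom_pmf(x: int, n: int, pi: float) -> float:
    """Binomial probability (1) of x correct items on an n-item test."""
    return comb(n, x) * pi**x * (1.0 - pi) ** (n - x)


def prob_mastery_declared(n: int, c: int, pi: float) -> float:
    """P(X >= c) for an examinee with domain score pi."""
    return sum(binom_pmf(x, n, pi) for x in range(c, n + 1))


def prob_correct(n: int, c: int, pi: float, pi_m: float) -> float:
    """Probability of a correct decision, given the mastery score pi_m."""
    p_pass = prob_mastery_declared(n, c, pi)
    return p_pass if pi >= pi_m else 1.0 - p_pass


# Illustrative values only: 10 items, passing score 8, mastery score .80.
for pi in (0.60, 0.70, 0.80, 0.90):
    print(pi, round(prob_correct(10, 8, pi, 0.80), 3))
```

    Repeating the loop over several values of n and c gives a table from which passing score and test length can be compared for a selected domain score.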

    Fhaner (1974) introduced the notion of an indifference zone in the present problem. An indifference zone arises when the mastery score, $\pi_m$, is replaced by an interval, $(\pi_0, \pi_1)$, so that examinees with $\pi \ge \pi_1$ are considered masters, those with $\pi \le \pi_0$ nonmasters, and we are indifferent with respect to examinees with $\pi_0 < \pi < \pi_1$. The interval may be taken symmetric about $\pi_m$, but this is not necessary. For true masters and nonmasters the probability of a misclassification is largest for domain scores $\pi_1$ and $\pi_0$, respectively. Fhaner proposed as a solution to choose the minimum value of $n$ and a value of $c$ for which the probabilities of misclassification

    $$\alpha = \sum_{x=c}^{n}\binom{n}{x}\pi_0^{x}(1-\pi_0)^{n-x} \tag{2}$$

    and

    $$\beta = \sum_{x=0}^{c-1}\binom{n}{x}\pi_1^{x}(1-\pi_1)^{n-x} \tag{3}$$

    are not larger than preassigned values $\alpha^*$ and $\beta^*$. This is no closed-form solution, and binomial tables must be entered to find the optimal values of $n$ and $c$. It is possible to use a normal approximation, however, and in that case a closed-form solution is obtained (Fhaner, 1974). Wilcox (1976) has adopted the same approach and has suggested computer search routines using the incomplete beta function to find the solution for the binomial case.

    As van den Brink and Koele (1980) have pointed out, it is possible to correct the above solution for the possibility of guessing on multiple-choice or true-false items. To perform this correction, they adopt the knowledge or random guessing model and simply replace the parameter $\pi$ in the binomial model by

    $$\pi_g + g(1 - \pi_g), \tag{4}$$

    $\pi_g$ being the domain score corrected for guessing and $g$ the guessing parameter. For the latter they suggest the use of the reciprocal of the number of item alternatives, $q^{-1}$.
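    A search of the kind Fhaner and Wilcox describe can be sketched as below. This is a plain exhaustive scan over $(n, c)$ using the sums (2) and (3) directly; it is not Wilcox's incomplete-beta routine, and the function names, the `max_n` cap, and the example values are assumptions made for illustration.

```python
from math import comb


def alpha(n: int, c: int, pi0: float) -> float:
    """Equation (2): P(X >= c) for a nonmaster at the boundary pi0."""
    return sum(comb(n, x) * pi0**x * (1 - pi0) ** (n - x) for x in range(c, n + 1))


def beta(n: int, c: int, pi1: float) -> float:
    """Equation (3): P(X <= c - 1) for a master at the boundary pi1."""
    return sum(comb(n, x) * pi1**x * (1 - pi1) ** (n - x) for x in range(0, c))


def fhaner_search(pi0, pi1, alpha_star, beta_star, max_n=500):
    """Smallest n (with the lowest admissible c) meeting both error bounds."""
    for n in range(1, max_n + 1):
        for c in range(0, n + 2):  # c = n + 1 means nobody passes
            if alpha(n, c, pi0) <= alpha_star and beta(n, c, pi1) <= beta_star:
                return n, c
    return None  # no solution within max_n items


# Illustrative call with a wide zone so that the answer is small.
print(fhaner_search(0.5, 0.9, 0.2, 0.2))
```

    Note that every trial value of n requires a scan over c, which is the cost the procedure developed in this paper is meant to avoid.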

    Novick and Lewis (1974) and de Gruijter (1979) present a Bayesian approach to the present problem, extending the model with the beta distribution as a prior for the binomial parameter $\pi$.

    An extensive review of methods for determining the length of a mastery test is given in Wilcox (1980).

    As several authors (e.g., Wilcox, 1976) have noted, different preassigned values $\alpha^*$ and $\beta^*$ can be selected to allow for differences in loss between misclassifying a true master and a nonmaster. In this paper, we will take a somewhat different approach and represent possible differences in loss not indirectly, by manipulating both probabilities of misclassification, but via the introduction of explicit parameters. But before doing so, we will prepare this approach and consider the case of a decision-maker who is indifferent to both classification errors. It appears that in this case there is a simple linear relation between the optimal passing score and test length. This can be utilized to find the solution in this particular case but also plays an important part in the more general case of different losses.

    INDIFFERENCE TO BOTH CLASSIFICATION ERRORS

    In the Wilcox solution a number P* , ½

    We first assume $n$ to be fixed and look for the value of $c$ minimizing the new target function defined in (5). Doing so, the constant factor 1/2 may be ignored. Adding terms to the first sum in (5) and subtracting these from the second yields

    $$\sum_{x=0}^{n}\binom{n}{x}\pi_1^{x}(1-\pi_1)^{n-x} - \sum_{x=c}^{n}\binom{n}{x}\left[\pi_1^{x}(1-\pi_1)^{n-x} - \pi_0^{x}(1-\pi_0)^{n-x}\right]. \tag{6}$$

    Since the first sum is equal to 1, the value of $c$ for which (6) is minimal depends only on the bracketed factor in the second sum. We know that the binomial probability function has a monotone likelihood ratio in $x$ (Ferguson, 1967, section 5.2), which implies that the ratio $\pi_1^{x}(1-\pi_1)^{n-x} / \pi_0^{x}(1-\pi_0)^{n-x}$ is monotone increasing in $x$. So there is a value of $x$ for which the sign of this factor changes from negative to positive. If we set $c$ equal to this value, the second sum in (6) contains all positive terms and (6) is minimal. Thus, (6) is minimal for the value $c^*$ obeying

    $$\pi_0^{c^*}(1-\pi_0)^{n-c^*} = \pi_1^{c^*}(1-\pi_1)^{n-c^*}.$$

    Taking logarithms of both sides and simplifying, it appears that

    $$\frac{c^*}{n} = \frac{\ln\dfrac{1-\pi_0}{1-\pi_1}}{\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}}. \tag{7}$$

    This result is most interesting: The left-hand side is the optimal value of $c$ expressed as a relative score. The right-hand side is a constant which is independent of test length and only a function of the boundary values of the indifference zone. Thus, whenever an indifference zone is established, we can easily compute (7) from its boundary values and immediately know the optimal passing score for any test length.
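    The result in (7) can be checked numerically. The sketch below assumes that the target function (5) is the unweighted average of the two misclassification probabilities (2) and (3), which is what the manipulation leading to (6) and the constant factor 1/2 suggest; the function names and the zone (.75, .85) with $n = 20$ are illustrative choices.

```python
from math import ceil, comb, log


def relative_passing_score(pi0: float, pi1: float) -> float:
    """Right-hand side of (7): the optimal passing score as a fraction of n."""
    return log((1 - pi0) / (1 - pi1)) / log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))


def average_error(n: int, c: int, pi0: float, pi1: float) -> float:
    """Assumed form of (5): mean of the false negative and false positive rates."""
    false_neg = sum(comb(n, x) * pi1**x * (1 - pi1) ** (n - x) for x in range(0, c))
    false_pos = sum(comb(n, x) * pi0**x * (1 - pi0) ** (n - x) for x in range(c, n + 1))
    return (false_neg + false_pos) / 2


def brute_force_passing_score(n: int, pi0: float, pi1: float) -> int:
    """Passing score minimizing the average error, found by trying every c."""
    return min(range(0, n + 2), key=lambda c: average_error(n, c, pi0, pi1))


ratio = relative_passing_score(0.75, 0.85)
print(ratio, ceil(20 * ratio), brute_force_passing_score(20, 0.75, 0.85))
```

    For this zone the ratio is about .803, so a 20-item test gives a non-integer value of about 16.06, and the exhaustive search agrees with the first integer above it.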

    We now use the linear relation between $c^*$ and $n$ to find an optimal value, $n^*$, for the latter and thereby follow a simple procedure analogous to the one in the Wilcox solution. First, a number $P^*$, ½ < $P^*$

    smallest value of $n$ for which (5) is not larger than $P^*$. Second, the ratio $c^*/n$ is computed from the indifference zone via (7). Third, a trial value for $n^*$ is chosen, and (de)cumulative binomial tables are entered with this value and the implied value of $c^*$ to compute (5). Fourth, if this computation yields a value smaller than $P^*$, a lower trial value for $n^*$ is selected and step three is carried out again. For values of (5) larger than $P^*$, a larger trial value is selected. This process is repeated until the smallest value of $n$ is found for which (5) is not larger than $P^*$. This is $n^*$.

    In the above procedure, trial values for $n^*$ may be chosen that do not yield an integer value for $c^*$. As follows from (6), in that case the first integer value above $c^*$ must be used. (The choice of the integer value below $c^*$ would imply adding negative terms to the second sum in (6), making this suboptimal.)
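    On a computer the trial-and-adjust steps can be replaced by a simple upward scan over n, which finds the same smallest admissible test length. The sketch below again assumes that (5) is the average of (2) and (3); the function names, the `max_n` cap, and the bound .30 are illustrative, and rounding up with `ceil` implements the "first integer above $c^*$" rule.

```python
from math import ceil, comb, log


def passing_score(n: int, pi0: float, pi1: float) -> int:
    """First integer at or above c* = n times the constant in (7)."""
    ratio = log((1 - pi0) / (1 - pi1)) / log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))
    return ceil(n * ratio)


def average_error(n: int, c: int, pi0: float, pi1: float) -> float:
    """Assumed form of (5): mean of the two misclassification probabilities."""
    false_neg = sum(comb(n, x) * pi1**x * (1 - pi1) ** (n - x) for x in range(0, c))
    false_pos = sum(comb(n, x) * pi0**x * (1 - pi0) ** (n - x) for x in range(c, n + 1))
    return (false_neg + false_pos) / 2


def smallest_test_length(pi0, pi1, p_star, max_n=1000):
    """Smallest n whose single implied passing score keeps (5) at or below p_star."""
    for n in range(1, max_n + 1):
        c = passing_score(n, pi0, pi1)
        if average_error(n, c, pi0, pi1) <= p_star:
            return n, c
    return None  # bound not reached within max_n items


print(smallest_test_length(0.75, 0.85, 0.30))
```

    Only one passing score is evaluated per trial value of n, which is the practical gain of the linear relation in (7).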

    Although binomial tables for values of $n$ up to 150 are available (Ordnance Corps, 1952), tables in most textbooks do not go further than $n = 20$. It is known, however, that indifference zone methods are rather conservative and that, for strong requirements on (5) or narrow indifference zones, values of $n^*$ larger than 20 can be expected. (For an impression, see Table 1 in Fhaner, 1974.) When longer tests are needed and no special tables are available, one has to resort to a computer or a calculator for the above procedure. The programming involved is comparatively simple, though, and some calculators even have facilities for computing binomial probabilities.

    Another possibility is to use an approximation to the binomial distribution function which is simple enough for hand calculation. A straightforward approximation, based on the central limit theorem, is to replace (5) by

    $$\left[\Phi\!\left(\frac{c/n - \pi_1}{[\pi_1(1-\pi_1)/n]^{1/2}}\right) + \Phi\!\left(\frac{\pi_0 - c/n}{[\pi_0(1-\pi_0)/n]^{1/2}}\right)\right]\Big/\,2, \tag{8}$$

    $\Phi$ denoting the standard normal distribution function. Using this normal approximation, we need not compute (8) completely for each trial value for $n^*$. Substituting (7) into (8), it appears that this can be written as

    $$\left[\Phi(an^{1/2}) + \Phi(bn^{1/2})\right]/2, \tag{9}$$

    with

    $$a = [\pi_1(1-\pi_1)]^{-1/2}\left[\frac{\ln\dfrac{1-\pi_0}{1-\pi_1}}{\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}} - \pi_1\right] \tag{10}$$

    and

    $$b = [\pi_0(1-\pi_0)]^{-1/2}\left[\pi_0 - \frac{\ln\dfrac{1-\pi_0}{1-\pi_1}}{\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}}\right]. \tag{11}$$

    The last two expressions depend only upon the indifference zone boundaries, $\pi_0$ and $\pi_1$. Once $a$ and $b$ are calculated from these boundaries, the iterative procedure can be applied directly to (9). The reader who is familiar with the cumulative normal distribution can use well-known reference values as, for example, $\Phi(-1) = .1587$, $\Phi(0) = .5000$, and $\Phi(1) = .8413$, to quickly establish whether trial values for $n^*$ meet the restriction $P^*$ imposed on (9).
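    A minimal sketch of the hand calculation follows. It assumes that $a$ and $b$ are what one obtains by substituting the constant in (7) for $c/n$ in a normal approximation without continuity correction; the function names and the zone (.75, .85) are illustrative.

```python
from math import erf, log, sqrt


def phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_constants(pi0: float, pi1: float):
    """Constants a and b: functions of the zone boundaries only."""
    ratio = log((1 - pi0) / (1 - pi1)) / log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))
    a = (ratio - pi1) / sqrt(pi1 * (1 - pi1))
    b = (pi0 - ratio) / sqrt(pi0 * (1 - pi0))
    return a, b


def approx_error(n: int, pi0: float, pi1: float) -> float:
    """Expression (9): normal approximation to the average error for length n."""
    a, b = normal_constants(pi0, pi1)
    return (phi(a * sqrt(n)) + phi(b * sqrt(n))) / 2


a, b = normal_constants(0.75, 0.85)
print(round(a, 4), round(b, 4))
for n in (25, 50, 100):
    print(n, round(approx_error(n, 0.75, 0.85), 3))
```

    Because both constants are negative for any zone with $\pi_0 < \pi_1$, the approximation falls steadily as n grows, so the smallest admissible n is easy to bracket.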

    It is known that the normal approximation in (8) can be less accurate, notably when it is used for approximating tail probabilities of skew binomial distributions. This situation arises when both indifference zone boundaries are larger than .70, say, and strong requirements are imposed on (5). A variety of better approximations are given in Molenaar (1973). When choosing one, we are, however, faced with a dilemma. Generally, the more accurate the approximation, the more cumbersome its calculation. Most approximations can be used in combination with iterative procedures for test length determination only if one has access to a computer, but in that case the procedure can as well be carried out directly with (5). A reasonably accurate approximation to (5), which is not too complex, is

    $$\left[\Phi\left\{(4c+3)^{1/2}(1-\pi_1)^{1/2} - (4n-4c-1)^{1/2}\pi_1^{1/2}\right\} + \Phi\left\{(4n-4c-1)^{1/2}\pi_0^{1/2} - (4c+3)^{1/2}(1-\pi_0)^{1/2}\right\}\right]\Big/\,2 \tag{12}$$

    (for a discussion and some numerical results, see Molenaar, 1973, eq. 3.20, pp. 111-114). It is recommended that this approximation be used when strong requirements are imposed on (5) and the choice of the indifference zone entails skew binomial distributions. In order to reduce the calculations, a good strategy is to use (8) until it gives a solution and next to use (12) to find out whether it can be improved.

    As noted before, the worth of the procedure proposed in this paper lies in the ease with which binomial tables can be consulted. It utilizes a simple linear relation between optimal passing score and test length so that for each trial value for $n^*$ results for only one passing score need to be obtained. In the other indifference zone methods several trial values for $c^*$ must be tested for each trial value for $n^*$ until the combination $(c^*, n^*)$ meeting the requirements is found. The fact that $c^*$ has a simple relation to $n$ gives (7) value in its own right. It can be used, for example, to find optimal passing scores on new tests when test length has to be fixed for some practical reason, or to establish whether passing scores that have already been used in practice satisfy the optimality condition considered in this paper. As will be shown in the next section, another advantage of the present procedure is the possibility of incorporating different losses for false positive and false negative decisions.

    DIFFERENT LOSSES FOR BOTH DECISION ERRORS

    So far it has been assumed that the loss incurred for a false positive decision (granting the mastery status to a nonmaster) is equal to the loss for a false negative decision (granting the nonmastery status to a master). We now assume that both losses take on different values and incorporate this in the procedure by replacing (5) by

    $$\left[\ell_1\sum_{x=0}^{c-1}\binom{n}{x}\pi_1^{x}(1-\pi_1)^{n-x} + \ell_0\sum_{x=c}^{n}\binom{n}{x}\pi_0^{x}(1-\pi_0)^{n-x}\right](\ell_0+\ell_1)^{-1}, \tag{13}$$

    where $\ell_0$ is the loss of misclassifying a nonmaster and $\ell_1$ of a master.

    Following the same derivation as before, (13) is minimal for the value $c^{\circ}$ given by

    $$\frac{c^{\circ}}{n} = \frac{\ln\dfrac{1-\pi_0}{1-\pi_1}}{\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}} + \frac{\ln\lambda}{n\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}}, \tag{14}$$

    $\lambda$ denoting the loss ratio $\ell_0/\ell_1$.

    Comparing (14) with (7), several things can be noted: The right-hand side of (14) displays an additive structure consisting of two different parts. The first part is equal to (7), and thus again a constant dependent only upon the indifference zone boundaries; the second part represents the influence of the loss ratio on the optimal passing score and is, as opposed to the first part, dependent on test length. When different losses for both misclassifications have to be reckoned with, the optimal passing score (expressed as a relative score) is thus equal to the one for the case of equal losses plus a test length dependent correction. For loss ratios larger than one this correction is positive, while it is negative for ratios smaller than one.

    It should be noted that in the second term of (14) test length figures only in the denominator. This implies that the longer the test is, the smaller the absolute size of the influence of the loss ratio on the optimal relative passing score will be. Table 4.1 shows this for loss ratio values from 1:4 to 4:1. For example, the relative passing score on a 10-item test must be raised by .173 to account for a loss ratio $\lambda = 3$, while this is only .035 for a 50-item test.

    TABLE 4.1: Increase of Optimal Relative Passing Score Produced by Loss Ratios Unequal to One for Some Values of $n$ and $(\pi_0, \pi_1) = (.75, .85)$

                     $\lambda = \ell_0/\ell_1$
     n      .25     .33     .50     1       2       3       4
    10    -.218   -.173   -.109    .000    .109    .173    .218
    20    -.109   -.086   -.054    .000    .054    .086    .109
    30    -.073   -.056   -.036    .000    .036    .056    .073
    40    -.054   -.043   -.027    .000    .027    .043    .054
    50    -.044   -.035   -.022    .000    .022    .035    .044
    60    -.036   -.029   -.018    .000    .018    .029    .036
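    The entries of Table 4.1 follow from the second term of (14) alone. A short sketch, with an illustrative function name, that recomputes the two values quoted in the text:

```python
from math import log


def relative_correction(n: int, lam: float, pi0: float, pi1: float) -> float:
    """Second term of (14): shift in c/n caused by the loss ratio lam."""
    return log(lam) / (n * log(pi1 * (1 - pi0) / (pi0 * (1 - pi1))))


# Indifference zone of Table 4.1.
for n in (10, 50):
    print(n, round(relative_correction(n, 3.0, 0.75, 0.85), 3))
```

    Because the correction is proportional to $\ln\lambda$, reciprocal loss ratios give entries of equal size and opposite sign, as the table shows.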

    In view of the determination of optimal test length, it is helpful to rewrite (14) into

    $$c^{\circ} = n\,\frac{\ln\dfrac{1-\pi_0}{1-\pi_1}}{\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}} + \frac{\ln\lambda}{\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}}. \tag{15}$$

    This expression again shows a linear relation between $c^{\circ}$ and $n$. It has (7) as slope and this time a non-zero intercept which is a function of the loss ratio.

    Table 4.2 shows values of this intercept for loss ratio values from 1:4 to 4:1 and indifference zones which can often be encountered in the practice of mastery testing. The entries in this table are thus to be added to the optimal passing score for the equal loss case, $c^*$, when loss ratios unequal to one are used.

    The determination of optimal test length proceeds along the same lines as in the previous section. First, the number $P^{\circ}$ is selected as the upper bound to (13). Its minimum value is no longer equal to 1/2. (In the previous section, this value could always be realized by randomly assigning the examinees to the mastery and nonmastery state.) Now it is equal to

    $$\max\{\ell_0/(\ell_0+\ell_1),\ \ell_1/(\ell_0+\ell_1)\}, \tag{16}$$

    these two values being obtained by always assigning the examinees to the mastery and nonmastery state, respectively. Second, the slope and intercept in (15) are computed. (For the latter Table 4.2 can be used.) Third, a trial value for $n^{\circ}$, the optimal test length, is chosen, and the associated value of $c^{\circ}$ is computed from (15). Binomial tables are entered with the value of $c^{\circ}$ to obtain the (de)cumulative probabilities in (13), and once these are found (13) is computed. Fourth, the value computed for (13) is compared with $P^{\circ}$. If it is smaller (larger) than $P^{\circ}$, a lower (larger) trial value for $n^{\circ}$ is selected, and the previous step is repeated. The process is stopped when the smallest value for $n$ is met for which (13) is not larger than $P^{\circ}$. This is $n^{\circ}$.
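    The steps with unequal losses can be sketched in the same way as for the equal-loss case. The code below evaluates (13) at the passing score implied by (15), rounded up to the next integer and clamped to the range 0 to n + 1, and scans n upward; the function names, the `max_n` cap, and the chosen losses and bound are illustrative assumptions.

```python
from math import ceil, comb, log


def log_odds_ratio(pi0: float, pi1: float) -> float:
    """Common denominator in (14) and (15)."""
    return log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))


def intercept(lam: float, pi0: float, pi1: float) -> float:
    """Intercept of (15), the quantity tabulated in Table 4.2."""
    return log(lam) / log_odds_ratio(pi0, pi1)


def passing_score(n: int, lam: float, pi0: float, pi1: float) -> int:
    """First integer at or above the value given by (15), kept within 0..n+1."""
    slope = log((1 - pi0) / (1 - pi1)) / log_odds_ratio(pi0, pi1)
    return min(max(ceil(n * slope + intercept(lam, pi0, pi1)), 0), n + 1)


def weighted_error(n, c, loss0, loss1, pi0, pi1):
    """Target function (13)."""
    false_neg = sum(comb(n, x) * pi1**x * (1 - pi1) ** (n - x) for x in range(0, c))
    false_pos = sum(comb(n, x) * pi0**x * (1 - pi0) ** (n - x) for x in range(c, n + 1))
    return (loss1 * false_neg + loss0 * false_pos) / (loss0 + loss1)


def smallest_test_length(loss0, loss1, pi0, pi1, bound, max_n=1000):
    """Smallest n for which (13), at the implied passing score, is within bound."""
    lam = loss0 / loss1
    for n in range(1, max_n + 1):
        c = passing_score(n, lam, pi0, pi1)
        if weighted_error(n, c, loss0, loss1, pi0, pi1) <= bound:
            return n, c
    return None


print(round(intercept(3.0, 0.75, 0.85), 3))
print(smallest_test_length(3.0, 1.0, 0.75, 0.85, bound=0.20))
```

    The first printed value can be compared with the (.75, .85) row of Table 4.2.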

    TABLE 4.2: Increase of Optimal Passing Score Produced by Loss Ratios Unequal to One for Some Indifference Zones

                                  $\lambda = \ell_0/\ell_1$
    $(\pi_0, \pi_1)$     .25      .33      .50      1       2       3       4
    (.60, .65)         -6.491   -5.144   -3.245    .000   3.245   5.144   6.491
    (.60, .70)         -3.138   -2.486   -1.569    .000   1.569   2.486   3.138
    (.60, .75)         -2.000   -1.585   -1.000    .000   1.000   1.585   2.000
    (.65, .70)         -6.073   -4.813   -3.037    .000   3.037   4.813   6.073
    (.65, .75)         -2.891   -2.291   -1.445    .000   1.445   2.291   2.891
    (.65, .80)         -1.807   -1.432   -0.903    .000   0.903   1.432   1.807
    (.70, .75)         -5.516   -4.371   -2.758    .000   2.758   4.371   5.516
    (.70, .80)         -2.572   -2.038   -1.286    .000   1.286   2.038   2.572
    (.70, .85)         -1.562   -1.238   -0.781    .000   0.781   1.238   1.562
    (.75, .80)         -4.819   -3.819   -2.409    .000   2.409   3.819   4.819
    (.75, .85)         -2.180   -1.727   -1.090    .000   1.090   1.727   2.180
    (.75, .90)         -1.262   -1.000   -0.631    .000   0.631   1.000   1.262
    (.80, .85)         -3.980   -3.154   -1.990    .000   1.990   3.154   3.980
    (.80, .90)         -1.710   -1.355   -0.855    .000   0.855   1.355   1.710
    (.80, .95)         -0.890   -0.705   -0.442    .000   0.442   0.705   0.890

    When a normal approximation is needed, we replace (8), analogous to (13), by the weighted average of (de)cumulative normal probabilities. Substitution of (14) in this new target function results in

    $$\left[\ell_1\Phi(an^{1/2} + cn^{-1/2}) + \ell_0\Phi(bn^{1/2} - dn^{-1/2})\right](\ell_0+\ell_1)^{-1}, \tag{17}$$

    with $a$ and $b$ given by (10) and (11), respectively, and $c$ and $d$ by

    $$c = [\pi_1(1-\pi_1)]^{-1/2}\,\ln\lambda \Big/ \ln\frac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)} \tag{18}$$

    and

    $$d = [\pi_0(1-\pi_0)]^{-1/2}\,\ln\lambda \Big/ \ln\frac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}. \tag{19}$$

    In the above procedure, we first compute $a$, $b$, $c$ and $d$ from $\pi_0$, $\pi_1$, and $\lambda$, and next substitute our trial values for $n^{\circ}$ directly into (17). If necessary, we can use the approximation in (12) to find out whether the solution thus obtained can be made more accurate.
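    The substitution that leads to (17) can be checked numerically. The sketch below evaluates the weighted normal target once through the four constants and once directly from the relative passing score in (14); it assumes a normal approximation without continuity correction and the sign convention in which the second argument is $bn^{1/2} - dn^{-1/2}$. The function names are illustrative, and the constants called $c$ and $d$ in the text are named `c_const` and `d_const` to keep them apart from the passing score.

```python
from math import erf, log, sqrt


def phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def target_via_constants(n, loss0, loss1, pi0, pi1):
    """Weighted normal target written with the constants a, b, c and d."""
    big_l = log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))
    ratio = log((1 - pi0) / (1 - pi1)) / big_l
    s0, s1 = sqrt(pi0 * (1 - pi0)), sqrt(pi1 * (1 - pi1))
    a, b = (ratio - pi1) / s1, (pi0 - ratio) / s0
    c_const = log(loss0 / loss1) / (big_l * s1)
    d_const = log(loss0 / loss1) / (big_l * s0)
    root = sqrt(n)
    first = loss1 * phi(a * root + c_const / root)
    second = loss0 * phi(b * root - d_const / root)
    return (first + second) / (loss0 + loss1)


def target_direct(n, loss0, loss1, pi0, pi1):
    """Same target, computed from the relative passing score in (14)."""
    big_l = log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))
    rel = log((1 - pi0) / (1 - pi1)) / big_l + log(loss0 / loss1) / (n * big_l)
    z1 = (rel - pi1) / sqrt(pi1 * (1 - pi1) / n)
    z0 = (pi0 - rel) / sqrt(pi0 * (1 - pi0) / n)
    return (loss1 * phi(z1) + loss0 * phi(z0)) / (loss0 + loss1)


for n in (10, 40, 160):
    print(n, target_via_constants(n, 3.0, 1.0, 0.75, 0.85), target_direct(n, 3.0, 1.0, 0.75, 0.85))
```

    The two columns agree up to rounding error, so only the four constants need to be computed once before trial values of n are tried.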

    INCORPORATING A KNOWN BASE RATE

    If a priori knowledge is available, for instance, from previous testing programs or experiences with comparable groups of students, it may be prudent to incorporate this in the decision procedures as well.

    Ignoring the examinees in the indifference zone for a while, let $\mu$ denote the proportion of masters so that $1 - \mu$ equals the proportion of nonmasters. A possible approach is to use $\mu$ and $1 - \mu$ as weights in the target function and to replace (13) by

    $$\left[\mu\ell_1\sum_{x=0}^{c-1}\binom{n}{x}\pi_1^{x}(1-\pi_1)^{n-x} + (1-\mu)\ell_0\sum_{x=c}^{n}\binom{n}{x}\pi_0^{x}(1-\pi_0)^{n-x}\right]\left[\mu\ell_1 + (1-\mu)\ell_0\right]^{-1}. \tag{20}$$

    Following the same derivation as before, the optimal passing score $c'$ proves to be given by

    $$\frac{c'}{n} = \frac{\ln\dfrac{1-\pi_0}{1-\pi_1}}{\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}} + \frac{\ln\lambda}{n\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}} + \frac{\ln\dfrac{1-\mu}{\mu}}{n\ln\dfrac{\pi_1(1-\pi_0)}{\pi_0(1-\pi_1)}}. \tag{21}$$

    This result is equal to (14) extended with a term containing the base rate $(1-\mu)/\mu$. For base rate values larger than one, this term is positive, while it is negative for values smaller than one.

    The roles played by $(1-\mu)/\mu$ and $\lambda$ in (21) are fully identical. For a quantitative impression of the last term in (21), Tables 4.1 and 4.2 can be consulted with $(1-\mu)/\mu$ substituted for $\lambda$.

    To find the optimal test length in the present case with an explicit base rate, the same procedure as in the previous section can be followed. Even the same formulae (and Table 4.2) may be used. This stems from the fact that the last two terms in (21) can be reduced to the same denominator, whereupon (21) has the same structure as (14). The only modifications needed are the substitution of $\lambda(1-\mu)/\mu$ for $\lambda$ and the replacement of (16) by

    $$\max\left\{\frac{(1-\mu)\ell_0}{(1-\mu)\ell_0 + \mu\ell_1},\ \frac{\mu\ell_1}{(1-\mu)\ell_0 + \mu\ell_1}\right\}, \tag{22}$$

    which now is the minimum value of the upper bound $P'$ to (20).
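    The interchangeable roles of the loss ratio and the base rate in (21) can be illustrated with a short sketch. The function name and the numerical values are assumptions made for the example; the returned value is the unrounded passing score.

```python
from math import log


def passing_score_with_base_rate(n, lam, mu, pi0, pi1):
    """Unrounded c' from (21): the loss ratio lam and the ratio (1 - mu)/mu enter alike."""
    big_l = log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))
    slope = log((1 - pi0) / (1 - pi1)) / big_l
    return n * slope + (log(lam) + log((1 - mu) / mu)) / big_l


# A base rate of .25 masters with equal losses shifts the passing score
# exactly as a loss ratio of 3 does with equally many masters and nonmasters.
print(passing_score_with_base_rate(20, 1.0, 0.25, 0.75, 0.85))
print(passing_score_with_base_rate(20, 3.0, 0.50, 0.75, 0.85))
```

    In practice one can therefore fold both quantities into the single effective ratio $\lambda(1-\mu)/\mu$ and reuse the equal-structure formula (15).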

    Guessing

    As noted earlier, van den Brink and Koele (1980) proposed to use the knowledge or random guessing model to correct Fhaner's (1974) approach for the possibility of guessing on multiple-choice or true-false items. The same can be done in the approach given in this paper. We then first establish the indifference zone on the ability scale corrected for guessing, i.e., as $(\pi_{g0}, \pi_{g1})$, and next apply transformation (4) to obtain the values $(\pi_0, \pi_1)$ with which we enter the formulae given in this paper.

    It should be noticed, however, that experience with the knowledge or random guessing model in item response theory shows guessing parameter values somewhat less than the reciprocal of the number of alternatives, $q^{-1}$ (Lord, 1980). For example, items with four alternatives typically result in values of .22 or .23 rather than .25. It is recommended that this be taken into account when setting the guessing parameter value.
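    The correction amounts to one line of arithmetic per boundary. A small sketch, in which the function names, the corrected zone (.60, .80), and the guessing values are illustrative assumptions:

```python
from math import log


def with_guessing(pi_g: float, g: float) -> float:
    """Transformation (4): success probability when unknown items are guessed."""
    return pi_g + g * (1.0 - pi_g)


def relative_passing_score(pi0: float, pi1: float) -> float:
    """Constant in (7) for the transformed zone boundaries."""
    return log((1 - pi0) / (1 - pi1)) / log(pi1 * (1 - pi0) / (pi0 * (1 - pi1)))


# Zone set on the scale corrected for guessing, four-alternative items.
for g in (0.25, 0.22):
    pi0, pi1 = with_guessing(0.60, g), with_guessing(0.80, g)
    print(g, round(pi0, 3), round(pi1, 3), round(relative_passing_score(pi0, pi1), 3))
```

    With $g = .25$ the zone (.60, .80) becomes (.70, .85); a slightly lower guessing parameter moves both boundaries, and hence the relative passing score, down a little.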

    Discussion

    The results presented in this paper relate to results obtained in two other areas.

    The first is the latent class approach to mastery testing. In this approach it is assumed that mastery and nonmastery are two latent states underlying the test score, each entailing different probabilities of a successful reply to the items. In Emrick's latent class model (Davis, Hickman, & Novick, 1973, pp. 32-47; Emrick, 1971; Emrick & Adams, 1969; Fricke, 1974; Macready & Dayton, 1977), two latent success probabilities are assumed, one representing the mastery and the other the nonmastery state. Emrick and Adams (1969) give an optimal passing score which, although derived and presented in a different way, is quickly seen to be equivalent to $c'$ given in (21).

    This equivalence is only formal, however. In Emrick's model the latent success probabilities, which correspond with $\pi_0$ and $\pi_1$ in (21), must be estimated from the test data. (For estimation procedures for Emrick's model and constrained versions thereof, see van der Linden, 1981a, 1981b, 1981c.) In this paper, $\pi_0$ and $\pi_1$ represent no latent classes and need not be estimated; they are boundary values of an indifference zone on the domain score continuum which are set on educational grounds.

    Fricke (1974) has given proofs that the corrections of Emrick's passing score needed for loss ratios and base rates unequal to one are independent of the base rate and the loss ratio, respectively, and of the test length. However, this follows immediately from inspecting the structure of (21), which can be viewed as a linear decomposition of $c'$. Van der Linden (1978) has proposed a correction for guessing for Macready and Dayton's (1977) version of Emrick's model which corresponds with the correction for guessing proposed in the previous section.

    The formal correspondence between Emrick's passing score and (21) suggests the use for Emrick's model of the procedure for test length optimization developed in this paper. The only difference is then, of course, that the success parameters $\pi_0$ and $\pi_1$ in (20) and (21) must be estimated before the procedure can be applied and that, consequently, no exact but estimated results are obtained.

    The second area to which the results in this paper relate is Wald's sequential probability ratio test for binomial populations. Several expressions in Wald (1947) are reminiscent of the formulae given in this paper. For example, formula (15) is equivalent to the critical numbers in the test of $\pi \le \pi_0$ against $\pi \ge \pi_1$ (Wald, 1947, eqs. 5.1, 5.2). The only exception is that the loss ratio $\lambda$ is replaced by a ratio based on the probabilities of errors of type I and II. It must be borne in mind, however, that, just as in the previous case, this equivalence is only formal and that different interpretations are involved. In sequential testing test length, or, generally, the number of observations, is a random variable, and sampling is not stopped until one of the critical numbers is exceeded. The purpose of this paper was to find an optimal test length which is fixed prior to the test administration. It should be realized, however, that when sequential testing strategies are possible this is certainly worth considering, since substantial savings in the number of test items needed can be expected (Wald, 1947, section 3.6).

    REFERENCES

    de Gruijter, D.N.M. On the minimum number of items for pass/fail decisions in criterion-referenced testing (Memorandum 514-79). Leiden, The Netherlands: Rijksuniversiteit Leiden, Bureau Onderzoek van Onderwijs, 1979.

    Emrick, J.A. An evaluation model for mastery testing. Journal of Educational Measurement, 8, 321-326, 1971.

    Emrick, J.A., & Adams, E.N. An evaluation model for individualized instruction (Report RC 2674). Yorktown Hts, N.Y.: IBM Thomas J. Watson Research Center, October 1969.

    Ferguson, T.S. Mathematical statistics: A decision-theoretic approach. New York: Academic Press, 1967.

    Fhanér, S. Item sampling and decision-making in educational testing. British Journal of Mathematical and Statistical Psychology, 27, 172-175, 1974.

    Fricke, R. Zum Problem von Cut-off Formeln bei lehrzielorientierten Tests. Unterrichtswissenschaft, 3, 43-56, 1974.

    Klauer, K.J. Zur Theorie und Praxis des binomialen Modells lehrzielorientierter Tests. In K.J. Klauer, R. Fricke, M. Herbig, H. Rupprecht, & F. Schott, Lehrzielorientierte Tests. Düsseldorf: Schwann, 1972.

    Kriewall, T.E. Aspects of applications of criterion-referenced tests. Illinois School Research, 9, 5-18, 1972.

    Lord, F.M. Applications of item response theory to practical testing problems. Hillsdale, N.J.: Erlbaum, 1980.

    Macready, G.R., & Dayton, C.M. The use of probabilistic models in the assessment of mastery. Journal of Educational Statistics, 2, 99-120, 1977.

    Millman, J. Tables for determining number of items needed on domain-referenced tests and numbers of students to be tested (Technical Paper No. 6). Los Angeles, California: Instructional Objectives Exchange, April 1972.

    Millman, J. Passing scores and test lengths for domain-referenced measures. Review of Educational Research, 43, 205-216, 1973.

    Molenaar, W. Approximations to the Poisson, binomial and hypergeometric distribution functions (Mathematical Centre Tract No. 31). Amsterdam, The Netherlands: Mathematisch Centrum, 1973.

    Novick, M.R., & Lewis, C. Prescribing test length for criterion-referenced measurement. In C.W. Harris, M.C. Alkin, & W.J. Popham, Problems in criterion-referenced measurement (CSE Monograph Series in Education, No. 3). Los Angeles: Center for the Study of Evaluation, University of California, 1974.

    Ordnance Corps. Tables of the cumulative binomial probabilities, Document ORDP 20-1 of the Ordnance Corps of the U.S. Army, 1952.

    van den Brink, W.P., & Koele, P. Item sampling, guessing, and decision-making in achievement testing. British Journal of Mathematical and Statistical Psychology, 33, 104-108, 1980.

    van der Linden, W.J. Forgetting, guessing, and mastery: The Macready and Dayton models revisited and compared with a latent trait approach. Journal of Educational Statistics, 3, 305-318, 1978.

    van der Linden, W.J. Estimating the parameters of Emrick's mastery testing model. Applied Psychological Measurement, 5, 517-530, 1981 (a).

    van der Linden, W.J. The use of moment estimators for mixtures of two binomials with one known success parameter. Submitted for publication, 1981 (b).

    van der Linden, W.J. On the estimation of the proportion of masters in criterion-referenced testing. Submitted for publication, 1981 (c).

    Wald, A. Sequential analysis. New York: John Wiley & Sons, 1947.

    Wilcox, R.R. A note on the length and passing score of a mastery test. Journal of Educational Statistics, 1, 359-364, 1976.

    Wilcox, R.R. Determining the length of a criterion-referenced test. Applied Psychological Measurement, 4, 425-446, 1980.

