
Chapter 2(principles of language assessment)

Date post: 15-Jul-2015
Upload: kheang-sokheng
Page 1: Chapter 2(principles of language assessment)

Build Bright University

Language Testing and Assessment

Chapter 2: Principles of Language Assessment

Prepared by Kheang Sokheng, Ph.D. Candidate and MEd in TESOL

Page 2: Chapter 2(principles of language assessment)

Principles of Language Assessment

The five cardinal criteria for “testing a test” are as follows:

Practicality
Reliability
Validity
Authenticity
Washback

Page 3: Chapter 2(principles of language assessment)

Practicality

An effective test is practical. This means that it
is not excessively expensive,
stays within appropriate time constraints,
is relatively easy to administer, and
has a scoring/evaluation procedure that is specific and time-efficient.

Page 4: Chapter 2(principles of language assessment)

Examples of a Practicality Checklist

1. Are administrative details clearly established before the test?
2. Can students complete the test reasonably within the set time frame?
3. Is the cost of the test within budget limits?

Page 5: Chapter 2(principles of language assessment)

Reliability

Reliability means the degree to which an assessment tool produces stable and consistent results.

A reliable test is consistent and dependable.

As Brown (2004) puts it, if “you give the same test to the same student or matched students on two different occasions, the test should yield similar results.”
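
One common way to put a number on "similar results on two different occasions" is to correlate the two sets of scores. The sketch below is an illustration only: the slides do not prescribe a statistic, the choice of the Pearson correlation is an assumption, and the function name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five students on two occasions.
first_sitting = [72, 85, 60, 90, 78]
second_sitting = [70, 88, 63, 91, 75]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {test_retest_r:.2f}")
```

A value close to 1 suggests the test ranked and spaced the students in much the same way both times; a low value would point to one of the sources of unreliability discussed on the following slides.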

Page 6: Chapter 2(principles of language assessment)

Student-Related Reliability

The most common learner-related issue in reliability is caused by temporary illness, fatigue, a “bad day”, anxiety, and other physical or psychological factors.

Page 7: Chapter 2(principles of language assessment)

Rater Reliability

Inter-rater reliability: unreliability occurs when two or more scorers yield inconsistent scores on the same test. Factors: lack of attention to scoring criteria, inexperience, inattention, etc.

Intra-rater reliability: a single scorer can be inconsistent because of unclear scoring criteria, fatigue, bias toward particular “good” and “bad” students, or simple carelessness.
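
Agreement between two scorers can be summarized numerically. The sketch below uses Cohen's kappa, which corrects raw agreement for the agreement expected by chance. This statistic is not named in the slides, so treat it as one possible choice; the function name and the band scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's score frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single score")
    return (observed - expected) / (1 - expected)


# Hypothetical band scores (1-5) given by two raters to the same eight essays.
rater_a = [3, 4, 2, 5, 3, 4, 1, 2]
rater_b = [3, 4, 3, 5, 3, 3, 1, 2]

kappa = cohens_kappa(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

The same function could be applied to one rater's scores on two occasions as a rough check on intra-rater consistency.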

Page 8: Chapter 2(principles of language assessment)

Test Administration Reliability

This involves the conditions in which the test is administered.

Unreliability occurs due to outside interference such as noise, variations in photocopying, temperature variations, the amount of light in various parts of the room, and even the condition of desks and chairs.

Page 9: Chapter 2(principles of language assessment)

Test Administration Reliability

Brown (2010) stated that he once witnessed the administration of a test of aural comprehension in which an audio player was used to deliver items for comprehension, but due to street noise outside the building, test-takers sitting next to open windows could not hear the stimuli clearly.

Page 10: Chapter 2(principles of language assessment)

Test Reliability

Factors that cause unreliability:

If a test is too long, test-takers may become fatigued by the time they reach the later items and hastily respond incorrectly.

Ambiguous items

Page 11: Chapter 2(principles of language assessment)

Validity

Validity is “the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment” (Gronlund, 1998, p. 226).

“Measuring what should be measured”

Content-related evidence
Criterion-related evidence
Construct-related evidence
Consequential validity
Face validity

Page 12: Chapter 2(principles of language assessment)

Content-Related Evidence

A test can claim content-related evidence of validity if it samples the subject matter about which conclusions are to be drawn, and if it requires the test-taker to perform the behavior that is being measured.

Page 13: Chapter 2(principles of language assessment)

Criterion-Related Evidence

Criterion-related evidence is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.

For instance, imagine a hands-on driving test has been shown to be an accurate test of driving skills. By comparing the scores on the written driving test with the scores from the hands-on driving test, the written test can be validated by using a criterion-related strategy in which the hands-on driving test is compared to the written test.

Page 14: Chapter 2(principles of language assessment)

Criterion-Related Evidence

1. Concurrent validity (empirical validity): a test has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself; for example, the validity of a high score on the final exam of a foreign language course will be substantiated by actual proficiency in the language.

Page 15: Chapter 2(principles of language assessment)

Criterion-Related Evidence

2. Predictive validity is used to assess (and predict) a test-taker’s likelihood of future success. E.g., placement tests, admissions assessment batteries, language aptitude tests.
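
The comparison of a written driving test against a hands-on driving test can be sketched as a rank correlation between the two sets of scores. This is only an illustration: the slides do not say how the comparison is computed, Spearman's rho is an assumed choice, the scores are invented, and the simple formula used here assumes that no two candidates share a score.

```python
def ranks(scores):
    """Rank scores from 1 (lowest) upward; this sketch assumes no tied scores."""
    if len(set(scores)) != len(scores):
        raise ValueError("this sketch assumes no tied scores")
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]


def spearman_rho(xs, ys):
    """Spearman rank correlation, using the no-ties formula 1 - 6*sum(d^2)/(n(n^2-1))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical scores for six candidates on the test being validated
# (written) and on the criterion measure (hands-on).
written = [55, 70, 82, 64, 91, 47]
hands_on = [60, 58, 85, 72, 88, 50]

rho = spearman_rho(written, hands_on)
print(f"written vs. hands-on rank correlation: {rho:.2f}")
```

The same calculation could serve for predictive validity if the second list held a later measure of success, such as end-of-course results for students placed by a placement test.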

Page 16: Chapter 2(principles of language assessment)

Consequential Validity

It encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its impact on the preparation of the test-takers, its effect on the learner, and the (intended and unintended) social consequences of a test’s interpretation and use.

Page 17: Chapter 2(principles of language assessment)

Face Validity

Face validity refers to “the degree to which a test looks right, and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers” (Mousavi, 2002, p. 244).

Page 18: Chapter 2(principles of language assessment)

Face Validity

Sometimes students don’t know what is being tested when they tackle a test. They may feel, for a variety of reasons, that a test isn’t testing what it is “supposed” to test. Face validity means that the students perceive the test to be valid.

Face validity will likely be high if the learners encounter:

a well-constructed, expected format with familiar tasks,

Page 19: Chapter 2(principles of language assessment)

Face Validity

a test that is clearly doable within the allotted time limit,
items that are clear and uncomplicated,
directions that are crystal clear,
tasks that relate to their course work (content validity), and
a difficulty level that presents a reasonable challenge.

Page 20: Chapter 2(principles of language assessment)

Authenticity

Bachman and Palmer (1996, p. 23) define authenticity as “the degree of correspondence of the characteristics of a given language test task to the features of a target language task,” and then suggest an agenda for identifying those target language tasks and for transforming them into valid test items.

Authenticity of a test may be present in the following ways:

Page 21: Chapter 2(principles of language assessment)

Authenticity

The language in a test is as natural as possible.
Items are contextualized rather than isolated.
Topics are meaningful (relevant, interesting) for the learner.
Some thematic organization to items is provided, such as through a story line or episode.
Tasks represent, or closely approximate, real-world tasks.

Page 22: Chapter 2(principles of language assessment)

Washback

The term “washback” or “backwash” refers to “the effect of testing on teaching and learning” (Hughes, 2003, p. 1).

For instance, the extent to which assessment affects a student’s future language development.

A test that provides beneficial washback (Brown, 2010):

positively influences what and how teachers teach and students learn,

Page 23: Chapter 2(principles of language assessment)

Washback

offers learners a chance to adequately prepare,
gives learners feedback that enhances their language development,
is more formative in nature than summative, and
provides conditions for peak performance by learners.

In large-scale assessment, washback refers to the effects that tests have on instruction in terms of how students prepare for the test, e.g., cram courses and teaching to the test.

Page 24: Chapter 2(principles of language assessment)

Washback

Washback also includes the effects of an assessment on teaching and learning prior to the assessment itself, i.e., on preparation for the assessment.

The challenge to teachers is to create classroom tests that serve as learning devices through which washback is achieved.

Washback enhances a number of basic principles of language acquisition: intrinsic motivation, autonomy, self-confidence, language ego, and interlanguage.

Page 25: Chapter 2(principles of language assessment)

Washback

Ways to improve washback:

Comment generously and specifically on test performance.
Specify the numerical scores on the various subsections of the test.

Formative versus summative tests:

Formative tests provide washback in the form of information to the learner on progress towards goals.

Summative tests provide washback for learners to initiate further pursuits, more learning, more goals, and more challenges to face.

Page 26: Chapter 2(principles of language assessment)

Thank you!

