
ASSESSMENT

ADIBAH BINTI ABDUL LATIF

CENTRE OF QUALITY AND RISK MANAGEMENT (QRiM)

SCHOOL OF EDUCATION, FACULTY OF SOCIAL SCIENCES AND HUMANITIES

ACTIVITY 1

• Please write down one thing that you want to know about the process of assessment related to your job scope.

(5 Minutes)

“Assessment is at the heart of student experience” Brown and Knight (1994)

“If you want to change student learning then change the method of assessment”

Brown, Bull & Pendlebury (1997)


Give full measure and weight with justice


Give just measure and weight


Introduction

• Educational measurement and evaluation are essential to sound educational decision making.

• Decisions should be based on accurate, relevant information, and the responsibility of gathering and imparting that information belongs to educators.

• Decisions may affect many people (for example, parents) or only a single person.

Which comes first: assessment or evaluation?

TERMINOLOGY

Testing

Measurement

Evaluation

Assessment

TESTING

• A tool to determine students' ability to complete specific tasks or demonstrate mastery of a skill or knowledge of content.

• The most critical basis for ensuring the validity of the interpretation of students' scores.

• Examples: Q&A session in class, assignment, performance task, test, quiz and final exam.

MEASUREMENT

• A systematic process of assigning numerals (quantitative values) to the test administered.

• Results can be reported as raw scores, percentiles, standard scores, etc.

• Examples: assignment marks, total score in a final exam, mean of a PLO, KPI score, mean of e-LPPT, ranking score.

22 July 2019 · SPP2032: Educational Measurement and Evaluation

MEASUREMENT

[Diagram: observations O1, O2, …, On on a test are assigned scores x1, x2, …, xn, which are combined into an ability score.]

EVALUATION

• The process of describing, obtaining, and providing useful information for judging decision alternatives. This process allows one to make a judgment about the desirability or value of something.

• Examples: grades A, B, C, D; Pass and Fail; HL, TM, MM; descriptions of value (Cemerlang = Excellent, Baik = Good, Sederhana = Moderate); descriptions of P1 to P5 in e-LPPT.

ASSESSMENT

• The process of gathering information to monitor and reflect on progress in learning and teaching, and to make educational decisions where necessary.

• Examples: Dr A finds that her students are excellent in formative assessment but cannot perform well in their final exam. What should she do?

• Dr B realises that there are two ability/skill levels in his class. What can he do to make sure the learning environment supports students' learning?

ACTIVITY 2

• Provide three situations related to testing, measurement, evaluation and assessment. Discuss in groups; the other groups will guess the answer.

(20 minutes)

Differences between ASSESSMENT and EVALUATION

• Let us discuss together!!

Dimensions of Difference Between Assessment and Evaluation

• Timing

• Focus of Measurement

• Relationship Between Administrator & Recipient

• Findings, Uses thereof

• Ongoing Modifiability of Criteria, Measures thereof

• Standards of Measurement

• Relation Between Different Objects of A/E

Assessment and Evaluation (various sources, but especially Dan Apple 1998)

Summary of Differences

Dimension of Difference | Assessment | Evaluation
Timing | Formative: Ongoing to Improve Learning | Summative: Final to Gauge Quality
Focus of Measurement | Process-Oriented: How Learning Is Going | Product-Oriented: What's Been Learned
Relationship Between Administrator and Recipient | Reflective: Internally Defined Criteria/Goals | Prescriptive: Externally Imposed Standards
Findings, Uses Thereof | Diagnostic: Identify Areas for Improvement | Judgmental: Arrive at an Overall Grade/Score
Ongoing Modifiability of Criteria, Measures Thereof | Flexible: Adjust As Problems Are Clarified | Fixed: To Reward Success, Punish Failure
Standards of Measurement | Absolute: Strive for Ideal Outcomes | Comparative: Divide Better from Worse
Relation Between Objects of A/E | Cooperative: Learn from Each Other | Competitive: Beat Each Other Out

• Discuss some tests/tasks that would be called evaluation and some that would be called assessment.

• Can evaluation and assessment happen concurrently in one test/task?

• Are there any differences between summative assessment and evaluation?

PRINCIPLES OF ASSESSMENT

• Well aligned with the educational learning outcomes.

• Assessment should be valid and reliable.

• Formative assessments need to scaffold students towards the summative assessment.

• Students should receive feedback on their work in a timely manner.

• Assessment should be inclusive and equitable for all students.

• Assessment is not used to threaten or intimidate students.

• Assessment should help students achieve mastery learning.

ASSESSMENT IN OBE

Identify outcomes

Determine assessment

Learning activities

Curriculum

TYPES of ASSESSMENT

■Assessment Of Learning (AoL)

■Assessment For Learning (AfL)

■Assessment As Learning (AaL)

Purpose of the Test

Measure of maximum performance

To determine the students' ability.

Students are motivated to obtain as high a score as possible.

Examples: IQ test, subject/course test, aptitude/achievement test.

Purpose of the Test

Measure of typical performance

To measure students' interest, personality and attitude.

Responses are classified through preset criteria, not by the highest marks.

Examples: affective test, personality test, career test.

TYPES OF TEST

Placement: before the class begins.

Formative: ongoing during the process of learning.

Diagnostic: takes up where the formative leaves off.

Summative: at the end of the course.

Norm-Referenced Test and Criterion-Referenced Test

VALIDITY

RELIABILITY

OBJECTIVITY

USABILITY

ADMINISTRABILITY

INTERPRETABILITY

HOW DO YOU MEASURE THE VALIDITY AND RELIABILITY OF YOUR TEST?


HOW DO YOU CHOOSE THE ITEMS IN YOUR ITEM BANK?

CHARACTERISTICS OF A GOOD TEST

• VALIDITY

• Measuring what should be measured.

• The appropriateness of the interpretations made from test scores and other evaluation results with regard to a particular use.

VALIDITY

• CONTENT

• CONSTRUCT

• CRITERION RELATED

CONTENT VALIDITY

• Most relevant to achievement tests.

• The test represents the topics and cognitive processes of the syllabus.

• Does it measure the learning objectives? (cognitive / affective / psychomotor)

• Table of specification

• Subject matter expert

INSTRUMENT QUALITY

1. CONTENT VALIDITY

Can be established through expert review by specialists in the course and through the Table of Specifications (TOS). Content validity is qualitative in nature.

Table of Specification for the SPPP 2032 final-year examination

INSTRUMENT QUALITY

2. CONSTRUCT VALIDITY

Refers to the sufficiency and accuracy of the items in testing the variable/construct under study.

Example: Are the items sufficient to test students' knowledge in your course?

Done quantitatively, using the Raw Variance Explained by Measure.

Construct validity

This rubric can test 51.6% of students' communication skills.

Construct validity

This rubric can test 24.6% of students' communication skills.

INSTRUMENT QUALITY

3. RELIABILITY

Refers to the repeatability of the test when administered to other students homogeneous with those currently being tested.

Uses item reliability analysis.

RELIABILITY

Test-Retest Reliability

• The reliability coefficient is obtained by administering the same test twice and correlating the scores.

• An excellent measure of score consistency, as one directly measures consistency from administration to administration.
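As an illustration, test-retest reliability can be computed as the Pearson correlation between the two administrations. The sketch below uses invented scores for six students:

```python
# Test-retest reliability sketch: Pearson correlation between two
# administrations of the same test (scores invented for illustration).
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

first  = [12, 15, 9, 20, 17, 11]   # administration 1
second = [13, 14, 10, 19, 18, 12]  # administration 2
r = pearson(first, second)         # reliability coefficient, about 0.98
```

A coefficient this close to 1 would indicate highly consistent scores from one administration to the next.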

RELIABILITY

Split-Half Test

• The coefficient is obtained by dividing a test into halves, correlating the scores on each half, and then correcting for length (longer tests tend to be more reliable).

• The split can be based on: odd versus even numbered items, randomly selected items, or manually balanced content and difficulty.

• Advantage: requires only a single test administration.

• Weaknesses: the resulting coefficient will vary as a function of how the test was split; not appropriate for tests where speed is a factor.
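An odd/even split with the Spearman-Brown correction for length can be sketched as follows; the 0/1 response matrix is invented for illustration:

```python
# Split-half reliability sketch: correlate odd-item and even-item half
# scores, then apply the Spearman-Brown correction for full test length.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

responses = [            # rows = students, columns = 0/1 item scores
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
]
odd_half  = [sum(row[0::2]) for row in responses]  # items 1, 3, 5, 7
even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, 8
r_half = pearson(odd_half, even_half)
r_full = (2 * r_half) / (1 + r_half)  # Spearman-Brown length correction
```

Note that `r_full` is always at least `r_half`: the correction estimates what the correlation would be if each half were as long as the whole test.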

RELIABILITY: Internal Consistency

• Internal consistency focuses on the degree to which the individual items are correlated with each other, and is thus often called homogeneity.

• The coefficient is determined by: Cronbach's alpha, Kuder-Richardson Formula 20 (KR-20), or Kuder-Richardson Formula 21 (KR-21).

• Advantages: requires only one test administration and does not depend on a particular split of items.

• Disadvantage: most applicable when the test measures a single skill area.
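A minimal sketch of Cronbach's alpha, which for 0/1 (dichotomous) items coincides with KR-20; the response matrix is invented for illustration:

```python
# Cronbach's alpha sketch: alpha = k/(k-1) * (1 - sum(item vars)/total var).
# Population variance is used throughout.
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

responses = [     # rows = students, columns = 0/1 item scores
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
]
k = len(responses[0])  # number of items
item_vars = [variance([row[j] for row in responses]) for j in range(k)]
total_var = variance([sum(row) for row in responses])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)  # about 0.52
```

A low alpha like this one would suggest the items are not measuring a single skill area consistently, which is exactly the case this statistic is designed to flag.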

RELIABILITY: Alternate Form Reliability

Most standardized tests provide equivalent forms that can be used interchangeably.

These alternative forms are typically matched in terms of content and difficulty.

Scores on pairs of alternative forms for the same examinees are correlated to provide a measure of consistency or reliability.

RELIABILITY

A reliability coefficient is expressed as a correlation: CORRELATION = RELIABILITY

Item Reliability

98% of the results would be repeated if the test were administered to another homogeneous group of students.

Activity


WHAT CAN BE CONCLUDED FROM THE GIVEN DIAGRAM?

Validity and Reliability

TABLE OF SPECIFICATION (TOS)

INTRODUCTION

• The TOS is a table that graphically relates the course syllabus to the topical learning outcomes (TLO), in alignment with the course learning outcomes (CLO).

• The TOS also shows the number of items and the mark allocation for each item; the item type can also be specified.

• The TOS is planned by the lecturer as the basis for constructing tests, whether final-year examinations, mid-year examinations, tests, or quizzes.

Kubiszyn & Borich (2003) emphasised the following significance and components of the TOS:

1. A Table of Specifications consists of a two-way chart or grid relating instructional objectives to the instructional content. The columns of the chart list the objectives or "levels of skills" (Gredler, 1999) to be addressed; the rows list the key concepts or content the test is to measure.

ACTIVITY

When is the Table of Specifications constructed?

PURPOSE OF THE TOS

• Guarantees content validity.

• Ensures a fair, representative sample of items.

• Focuses the test on the important content.

• Determines the weighting / time to be allocated in lectures.

• The TOS also helps guide the lecturer in planning: deciding which topics are more important, how much time a given topic needs, and what assignments / projects can help students learn that topic more meaningfully.

According to Bloom, et al. (1971), "We have found it useful to represent the relation of content and behaviors in the form of a two dimensional table with the objectives on one axis, the content on the other".

2. A Table of Specifications identifies not only the content areas covered in class; it also identifies the performance objectives at each level of the cognitive domain of Bloom's Taxonomy. Teachers can be assured that they are measuring students' learning across a wide range of content and readings as well as cognitive processes requiring higher-order thinking.

3. A Table of Specifications is developed before the test is written. In fact, it should be constructed before the actual teaching begins.

The cornerstone of classroom assessment practices is the validity of the judgments about students' learning and knowledge. A TOS is one tool that teachers can use to support their professional judgment when creating or selecting tests for use with their students.

In order to understand how best to modify a TOS to meet your needs, it is important to understand the goal of this strategy: improving the validity of a teacher's evaluations based on a given assessment. Validity is the degree to which the evaluations or judgments we make as teachers about our students can be trusted, based on the quality of the evidence we gathered (Wolming & Wikström, 2010).

A Table of Specifications helps to ensure that there is a match between what is taught and what is tested. Classroom assessment should be driven by classroom teaching, which itself is driven by course goals and objectives. Tables of Specifications provide the link between teaching and testing. (University of Kansas, 2013)

ADVANTAGES OF THE TOS

• A valid and reliable test

• A fair and balanced test

• Student confidence in the test

• A representative test sample

• Appropriate weight for each topic

STEPS IN CONSTRUCTING THE TOS

• The six main steps are:

1. Analyse the learning objectives

2. Review the syllabus / topics

3. Prioritise the topics to be emphasised in the test

4. Determine the time spent on each topic

5. Determine the question types

6. Determine the number of questions

FORMULA

Formula A

Relative weight for the importance of content = (number of TLO / class periods for one topic ÷ total number of TLO / class periods) × 100%

Example: (3/10) × 100 = 30%

Content | TLO / hours spent | Relative weight
Topic 1 | 3 | 30%
Topic 2 | 1 | 10%
Topic 3 | 1 | 10%
Topic 4 | 2 | 20%
Topic 5 | 1 | 10%
Topic 6 | 2 | 20%
Total TLO / class periods for teaching the unit | 10 | 100%
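Formula A can be sketched in a few lines of Python, using the topic hours from the example table:

```python
# Formula A sketch: relative weight of each topic from its TLO / contact
# hours, matching the worked example (10 class periods in total).
hours = {"Topic 1": 3, "Topic 2": 1, "Topic 3": 1,
         "Topic 4": 2, "Topic 5": 1, "Topic 6": 2}
total = sum(hours.values())                         # 10
weights = {t: 100 * h / total for t, h in hours.items()}
# weights["Topic 1"] -> 30.0, weights["Topic 4"] -> 20.0
```

The weights always sum to 100%, which is a quick sanity check when building the TOS in a spreadsheet.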

FORMULA

Formula B

Relative weight for the items at each level = (% of weight in each Bloom level × total items in the test)

Example: 0.3 × 20 = 6

Objectives (totals 100%): Knowledge and Comprehension 30% | Application and Analysis 50% | Evaluation and Synthesis 20%

Topics: Topic 1 (30%), Topic 2 (10%), Topic 3 (10%), Topic 4 (20%), Topic 5 (10%), Topic 6 (20%)

Weight for items | 6 | 10 | 4 | Total 20

FORMULA

Formula C

Number of questions in each topic for each level of objectives = (total number of test items × relative weight of the topic × relative weight of the Bloom level)

Example: 20 × 0.3 × 0.3 = 1.8

Topics | Knowledge and Comprehension (30%) | Application and Analysis (50%) | Evaluation and Synthesis (20%) | Total
Topic 1 (30%) | 1.8 (2) | 3 (3) | 1.2 (1) | 6
Topic 2 (10%) | 0.6 (1) | 1 (1) | 0.4 (0) | 2
Topic 3 (10%) | 0.6 (1) | 1 (1) | 0.4 (0) | 2
Topic 4 (20%) | 1.2 (1) | 2 (2) | 0.8 (1) | 4
Topic 5 (10%) | 0.6 (0) | 1 (1) | 0.4 (1) | 2
Topic 6 (20%) | 1.2 (1) | 2 (2) | 0.8 (1) | 4
Number of questions | 6 | 10 | 4 | 20
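Formulas B and C can be combined in one short sketch. The topic and Bloom-level weights are those of the worked example; rounding the fractional counts to whole items is left to the lecturer's judgment, as in the table above:

```python
# Formula C sketch: raw item allocation for a 20-item test, crossing the
# topic weights (Formula A) with the Bloom-level weights (Formula B).
total_items = 20
topic_w = {"Topic 1": 0.3, "Topic 2": 0.1, "Topic 3": 0.1,
           "Topic 4": 0.2, "Topic 5": 0.1, "Topic 6": 0.2}
bloom_w = {"Knowledge/Comprehension": 0.3,
           "Application/Analysis": 0.5,
           "Evaluation/Synthesis": 0.2}

raw = {(t, b): total_items * tw * bw
       for t, tw in topic_w.items()
       for b, bw in bloom_w.items()}
# e.g. raw[("Topic 1", "Knowledge/Comprehension")] -> 1.8 (rounded to 2)

grand_total = sum(raw.values())  # always equals total_items (here 20)
```

Because the raw counts are fractional, the per-cell roundings must be balanced by hand (or with a largest-remainder rule) so that the row, column, and grand totals still match the TOS.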

CASE STUDY

The total credit hours specified are inaccurate: when summed, they come to only 38 hours.

RECOMMENDATION

• Use formulas for the calculation in Excel.

CASE STUDY

The CLOs are not aligned with the taxonomy levels.

RECOMMENDATION

• Group the quizzes, tests, and final examination under a single CLO.

• Let the other CLOs focus on practical assessment and generic skills.

CASE STUDY

How exactly is the number of items determined? Based on lecture hours? On CLO weights? The same number of items is used for different numbers of lecture hours.

RECOMMENDATION

• Apply the suggested formula.

• Include only the most significant topics, especially for subjective questions.

CASE STUDY

It is actually not wrong, at certificate / diploma level, to have CLOs / item levels above the Application level.

RECOMMENDATION

• Use only one CLO to measure PLO1 (Knowledge).

• Apply the concept of constructive alignment.

CASE STUDY

MCQ items are not meant to be built at the easy level only. Indeed, a summative test should satisfy the concept of a normal distribution.

RECOMMENDATION

• Skill in examination item construction.

• Observe the principles of item construction.

CASE STUDY

An example of a CLO that is not suitable to be measured with a written test.

RECOMMENDATION

• Not every CLO needs to be measured by examination.

• Create one dedicated CLO for knowledge (PLO1).

CASE STUDY

This item is not a structured item.

RECOMMENDATION

• Lecturers need to be able to distinguish short-answer items, restricted-response items, open-response items, and structured questions.

How are topics set and selected for essay tests? Should all topics be involved? What about the difficulty levels?

CASE STUDY

Instead of listing the number of items, list the question numbers, for example 1(a), 1(b), 2(a).

CASE STUDY

No alignment with the CLOs; no breakdown of item difficulty levels. Practical skills should not be assessed by examination.

TEST DURATION

Carey (1988) pointed out that the time available for testing depends not only on the length of the class period but also on students' attention spans.

Linn & Gronlund (2000):

1. A true-false item takes 15 seconds to answer, unless the student is asked to provide the correct answer for false questions; then the time increases to 30-45 seconds.

2. A seven-item matching exercise takes 60-90 seconds.

3. A four-option multiple choice item that asks about a term, fact, definition, rule or principle (knowledge-level item) takes 30 seconds. The same type of item at the application level may take 60 seconds.

4. Any item format that requires solving a problem, analysing, synthesising information or evaluating examples adds 30-60 seconds to a question.

5. Short-answer items take 30-45 seconds.

6. An essay test takes 60 seconds for each point to be compared and contrasted.
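As a rough illustration, these per-item timings can be turned into a test-duration estimate. The item counts below are hypothetical; the seconds follow the Linn & Gronlund guidelines above (taking the upper end of each range):

```python
# Rough test-duration estimate from per-item timings (Linn & Gronlund,
# 2000). Item counts are hypothetical; seconds are upper-end estimates.
SECONDS = {
    "true_false": 15,        # simple true-false item
    "matching_7_items": 90,  # seven-item matching exercise
    "mcq_knowledge": 30,     # four-option MCQ, knowledge level
    "mcq_application": 60,   # four-option MCQ, application level
    "short_answer": 45,      # short-answer item
}
counts = {"true_false": 10, "mcq_knowledge": 10,
          "mcq_application": 6, "short_answer": 4}

total_sec = sum(SECONDS[t] * n for t, n in counts.items())
# 150 + 300 + 360 + 180 = 990 seconds, i.e. 16.5 minutes
```

An estimate like this helps check that the TOS fits within the class period before the paper is written.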

TEST THEOREM

If an individual can perform the most difficult aspects of the objective, the instructor can "assume" the lower levels can be done.

However, if testing the lower levels, the instructor cannot "assume" the individual can perform the higher levels.

TEST DIFFICULTY LEVEL

Difficulty level

• Ensures that the items constructed suit the students' ability level.

• Verifies the item difficulty levels set in the TOS.

• Difficulty analysis is carried out to set the level of each item stored in the item bank.

• Used to review the difficulty level assigned to an item when the Table of Specifications was written.

• CTT analysis can be carried out as the basis for item analysis.

Item Difficulty Level: Definition

The percentage of students who answered the item correctly.

High (Difficult): ≤ 30% | Medium (Moderate): > 30% and < 80% | Low (Easy): ≥ 80%
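The definition and cut-offs above can be expressed as a small helper (a sketch; the function names are ours):

```python
# Item difficulty sketch: p = percentage answering correctly, classified
# with the cut-offs above (<= 30% difficult, >= 80% easy).
def difficulty(correct, total):
    return 100 * correct / total

def classify(p):
    if p <= 30:
        return "High (Difficult)"
    if p >= 80:
        return "Low (Easy)"
    return "Medium (Moderate)"

p = difficulty(18, 40)   # 18 of 40 students answered correctly -> 45.0
label = classify(p)      # "Medium (Moderate)"
```

Note that a high percentage correct means a *low* difficulty: the "difficulty index" runs opposite to how hard the item feels.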

CTT

• Determining the difficulty index for objective items: the calculation can be done in a table like the one below.

• Determining the difficulty index for subjective items: the calculation is made as in the table below.

Item Difficulty Level: Discussion

• Is a test that nobody failed too easy?

• Is a test on which nobody got 100% too difficult?

• Should items that are “too easy” or “too difficult” be thrown out?

INSTRUMENT QUALITY

Item Discrimination Index

Ensures that the items constructed function well. Can be analysed using CTT and IRT. A good item should be able to distinguish between high-achieving and low-achieving students. The discrimination index helps decide which items are discarded and which are kept in the item bank.

How do you analyse the item difficulty index?

What is a "good" value?

If the item has | Ratio of students who answered the item correctly
Positive discrimination | High achievers > low achievers
Negative discrimination | High achievers < low achievers
No discrimination | High achievers = low achievers

Discrimination Index | Item Evaluation
0.40 and above | Very good
0.30-0.39 | Good, can be improved
0.20-0.29 | Marginal, needs improvement
0.19 and below | Poor; cannot be accepted and needs proper checking

• Example: If there are 40 students in a class, divide them into two groups: 20 high achievers and 20 low achievers. Suppose that for item 8, 16 students from the high-achieving group answered correctly, while only 4 students from the low-achieving group answered it correctly.

Then:

K_t = 16/20 = 0.8 (80%)

K_r = 4/20 = 0.2 (20%)

D = K_t − K_r = 0.8 − 0.2 = 0.6

(Conclusion: item 8 is a good item.)
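The worked example can be checked with a few lines of Python (a sketch reproducing the item-8 numbers above):

```python
# Discrimination index sketch:
# D = (proportion correct, high group) - (proportion correct, low group).
def discrimination(high_correct, low_correct, n_high, n_low):
    kt = high_correct / n_high   # K_t: proportion correct, high achievers
    kr = low_correct / n_low     # K_r: proportion correct, low achievers
    return kt - kr

d = discrimination(16, 4, 20, 20)   # about 0.6: a "very good" item (>= 0.40)
```

A negative D would mean low achievers outperform high achievers on the item, which usually signals a flawed item or a miskeyed answer.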

GRADED ASSIGNMENT

• Develop one table of specification for your final exam, using the formulas given in this workshop.

• Submit a softcopy (Excel file).

• Individual or group assignment, according to your course.

GRADED ASSIGNMENT

• Analyse your final examination items using Classical Test Theory.

• Submit a softcopy (Excel file).

• Individual or group assignment, according to your course.

ALTERNATIVE ASSESSMENT

• Goes beyond traditional, psychometrically driven testing. Designed to assess learning through tasks that stimulate critical thinking skills and require students to produce or demonstrate knowledge, rather than simply recall information provided to them by others.


Characteristics of Alternative Assessment

• Human judgment in scoring

• Real-world applications

• Meaningful instructional tasks

• Higher levels of thinking

• Student performance

ASSESSMENT: Conventional vs Alternative (Authentic / Performance-Based)

Examples of authentic assessment

• Research project

• Debate

• Writing a speech / summary

• Studio work

• Portfolio

• Article review

• Writing a journal / proposal

• Case study

AUTHENTIC ASSESSMENT

New Academia Learning Innovation

• Not only performance-based, but happening in a real-world setting.

• Emphasising process rather than product.

• Soft skills development

• Holistic assessment

• Rubric

TEACHING PRACTICES

CAPSTONE PROJECT

SERVICE LEARNING

2U2I PROGRAMME

WORK BASED LEARNING

JOB CREATION

SCORING AUTHENTIC ASSESSMENT

METHOD

• Checklist

• Rating scale

• Rubric (holistic or analytic)

Distribution of Marks

Table 1: Subjects Without Practical Components

Percentage | Parts Assessed
10-20% | Soft skills (e.g. communication, teamwork, problem solving, responsibility)
40-60% | Academic coursework (tests, quizzes, assignments, papers)
30-40% | Final examination

Table 2: Subjects With Practical Components

Percentage | Parts Assessed
10-20% | Soft skills (e.g. discipline, teamwork, problem solving, ethics)
80-90% | Practical knowledge and skills

Look at the measure, not the score.

Emphasise the outcome, not the output.

THANK YOU

