Table of Contents
Pressing Issues in Current Methods of Assessments
What are assessments?
What Makes a Fair and Effective Assessment System?
Traditional Assessment Tools
Why traditional exams are failing
Multiple Choice Questions
True/false
Essays
Short-answer tests
Technology and Assessments
Alternative Method: e-assessments
Pressing Issues in Current Methods of Assessments
For many years, students in traditional school settings have been assessed only through
testing. With the changing conditions that higher education now faces, the need to review
and revise existing assessment strategies has become even more pressing.
While some believe that traditional assessment methods are sufficient and easier to manage,
others suggest that alternative methods of assessment produce better results. Many
propose that a variety of methods must be used to assess students’ skills and knowledge,
or the results will be inaccurate. Institutions of higher education have put forward many
arguments concerning the unreliability of conventional forms of assessment, particularly
paper-based examinations and multiple-choice questions (MCQs). Furthermore, they believe
that traditional assessments test only a very limited range of students’ skills, knowledge
and ability [1].
The emergence of this issue has provoked considerable debate within the educational
community (e.g. educators, researchers and leading-edge professionals) as to whether the
traditional methods of assessment in use today still serve their purpose. Are they
assessing what they are supposed to assess? Are the results of assessment valid and
reliable enough? Are the existing methods sufficient to give a whole view of a student’s
real potential, or do they overlook some of the student’s skills and knowledge? With
these questions in mind, institutions of higher education continue to weigh whether this
is the time to venture into innovations in assessment, which bring their own risks, or
whether they should focus on maintaining the existing “tried and tested” traditional forms
of assessment, which present major loopholes in their operation. According to Sally Brown
and Angela Glasner [1], innovations in assessment should be given much thought and
carefully delineated, not taken lightly, as they have the capability to jeopardize a
student’s learning practice and, from a wider perspective, probably his or her immediate
future. Furthermore, as a response to those who oppose innovations and prefer the use of
traditional methods of assessment over the alternative ones [1].
A key argument of this paper is that the methods used for assessment may not always give
a complete and authentic summary of a student’s capabilities, and sometimes may not match
the assessment objectives. There is a danger that the assessment methods used may be
limited to assessing a narrow area of cognitive skills, relying too heavily on one form
(such as written examinations only), and may be over-weighted towards a particular skill
or method [2].
Before arguing that there is indeed a need for innovation, it is important to establish
that the traditional methods are not serving the purposes for which they were designed.
This paper therefore investigates the traditional methods used in assessment and how they
fall short of their defined purpose.
What are assessments?
An extensive body of literature exists on educational assessment, and experts describe
assessment in various ways. Kulieke, Bakker et al. describe assessments as the everyday
activities that can be treated as “authentic demonstrations” of a person’s ability to
apply their learning of a certain discipline in a real-life context [3]. Deitel, Herman
and Knuth share the same view. They note that assessments can help educators, policy
makers, administrators and parents to gain a better understanding of a student’s current
knowledge [3]. Throughout the literature, the terms “assessment”, “test” and “evaluation”
are used interchangeably.
In his book ‘Assessment in Schools’, David Satterly [4] draws a significant distinction
between the two dominant approaches to the assessment and testing of a student, which are
referred to as ‘criterion-referenced’ and ‘norm-referenced’. He differentiates the two
approaches by their specific purposes, the type of score and its interpretation, and the
underlying educational philosophies they embody. As its name suggests, in a
criterion-referenced assessment certain criteria have to be met in order to be successful;
the only concern is whether a student ‘passes’ or ‘fails’. In a norm-referenced
assessment, by contrast, students are ranked based on the distribution of scores (e.g. ‘A’
for 97-100%) and receive feedback that has been defined before the assessment is
conducted. These two types of assessment are used for a wide range of purposes within
educational practice, from selection and certifying competence to accountability and
identifying special needs [1] [5]. Sally Brown and Angela Glasner [1] state that the
choices an assessor makes in deciding which method or instrument to use, and in designing
it appropriately, are governed by the particular purpose in each individual situation.
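The difference between judging against a fixed criterion and ranking students against one another can be illustrated with a minimal Python sketch. The pass mark of 50, the sample marks and the function names are assumptions made purely for illustration; they are not taken from Satterly.

```python
def criterion_referenced(scores, pass_mark=50):
    """Judge each student against a fixed criterion (hypothetical pass mark)."""
    return {name: ("pass" if score >= pass_mark else "fail")
            for name, score in scores.items()}


def norm_referenced(scores):
    """Rank students against one another; rank 1 is the highest score.

    Students with tied scores share the same rank.
    """
    ordered = sorted(set(scores.values()), reverse=True)
    rank_of = {score: position + 1 for position, score in enumerate(ordered)}
    return {name: rank_of[score] for name, score in scores.items()}


# Hypothetical marks for four students.
marks = {"Ana": 72, "Ben": 48, "Cai": 55, "Dee": 48}
print(criterion_referenced(marks))
print(norm_referenced(marks))
```

In the first function a student's result depends only on his or her own mark; in the second it depends on how everyone else performed, which is the essential contrast between the two approaches.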
George Brown, Joanna Bull and Malcolm Pendlebury [5] claim that the purpose for which
an assessment is used can be categorized as either summative or formative. If a task is
used to measure the extent of learning at the end of the course, the assessment is
described as summative (also known as ‘assessment of learning’). If, however, it is used
to give feedback to the student with the intention of improving performance, the
assessment is described as formative (also known as ‘assessment for learning’). The main
difference between the two functions is that summative assessment tends to come at the
end-point and is concerned primarily with the evaluation or judgment of actual
achievement, whereas formative assessment is continuous and emphasizes the potential of a
student. An achievement test is a typical example of summative assessment, while a
written essay with feedback exemplifies formative assessment [1].
Elsewhere, George Brown and his co-authors [5] claim that a common error in this area is
not having a clear grasp of the set of purposes of an assessment and the corresponding
uses of its results. Assessors tend to use an assessment task for one set of purposes
and assume that the results from it can be used for others.
To avoid this error, Sally Brown [1] proposes that the following factors be considered
when designing a system that is fit for purpose:
The reason for assessing
The skills or knowledge to be tested
The method to be used for assessment
The person to give the assessment
The time when the assessment should be conducted
According to Sally Brown [16], these factors provide assessors with a helpful agenda,
which in turn helps them devise an appropriate method of assessment.
What Makes a Fair and Effective Assessment System?
As assessments represent a student’s current knowledge and skills, it is reasonable to
propose that the way in which they are conducted must be fair. The standards that serve as
the basis for determining whether or not a student has achieved the intended learning
outcomes of a course or program must be clearly defined.
JISC states that a good assessment is both valid and reliable. It should be able to achieve
the intended learning outcomes for the course through conducting a series of actions, but
not at the expense of the skills to be assessed.
Validity
Lee Cronbach (1971) explains that validity is not a property of the test itself but of
the inferences made from it. He stresses that it is not the test that is validated but
rather the interpretation of the data. Ashcroft and Palacio [23] share this view, noting
that an assessment is said to be valid if the assessor is able to measure what they claim
to measure. Often this requires assessment conducted in real-life settings. Garett []
defines validity as a property of an assessment; Cronbach and Meehl, 1955 [] discredit
this claim on the grounds that validity cannot be a property of an assessment, since an
assessment may be valid for some students but not for others. For instance, a traditional
timed written examination in history requires a student to write legibly at high speed
for a period of time. For students who can do this, the scores will be accurate
representations of their capabilities in history. For those who cannot, a low score may
reflect a lack of historical knowledge, a lack of writing skill, or both. Validity
therefore cannot be considered a property of the assessment, but rather of the inferences
that can be made from it.
Reliability
George Brown et al. [5] describe two primary means of testing the reliability of an
assessment: first, measuring the extent to which one assessor agrees with another; and
second, measuring the variation in student performance.
An assessment result is considered reliable if, when the assessment is repeated, the
results remain consistent. A reliable assessment should allow different markers to reach
the same conclusion about the performance of a particular group [5].
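The first of these means, agreement between assessors, can be quantified in a simple way. The sketch below is only an illustration: the grades are invented, the function names are hypothetical, and Cohen’s kappa is a standard agreement statistic introduced here as an assumption, not a method attributed to George Brown et al.

```python
from collections import Counter


def percent_agreement(marks_a, marks_b):
    """Share of scripts on which two assessors awarded the same grade."""
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("both assessors must mark the same non-empty set of scripts")
    same = sum(a == b for a, b in zip(marks_a, marks_b))
    return same / len(marks_a)


def cohen_kappa(marks_a, marks_b):
    """Agreement between two assessors, corrected for chance agreement."""
    n = len(marks_a)
    observed = percent_agreement(marks_a, marks_b)
    freq_a, freq_b = Counter(marks_a), Counter(marks_b)
    # Probability that both assessors pick the same grade purely by chance.
    expected = sum(freq_a[grade] * freq_b[grade] for grade in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both assessors used one identical grade throughout
    return (observed - expected) / (1 - expected)


# Invented grades given by two assessors to the same six scripts.
assessor_1 = ["pass", "pass", "fail", "merit", "pass", "fail"]
assessor_2 = ["pass", "fail", "fail", "merit", "pass", "pass"]
print(round(percent_agreement(assessor_1, assessor_2), 3))
print(round(cohen_kappa(assessor_1, assessor_2), 3))
```

A kappa close to 1 indicates that the markers reach the same conclusions far more often than chance would predict, while a value near 0 suggests that their agreement is little better than chance.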
For the first means of testing reliability, much evidence of disagreement between
assessors has been reported in the literature. The best-known studies include those by
Edgeworth [6] and by Hartog and Rhodes [7], who marked on a scale from 40 to 100, and
those by Diedrich [8], Bell [9], and Newstead and Dennis [10], who used a broader
grading scale. Both Edgeworth and Hartog and Rhodes found remarkable inconsistency
between the assessors’ marks for the 28 candidates. In Hartog and Rhodes’ study, the
marks awarded by the assessors spanned fail, pass and merit. In the studies by Diedrich,
Bell, and Newstead and Dennis, one-third of the essays obtained all the grades, and no
essay was reported to have received less than the total of nine grades possible. Taking
all these findings into account, George Brown et al. conclude that when a broader
assessment grade is used, assessors are more likely to award the same marks [5].
Another means of determining the reliability of an assessment is through students’
performance. George Brown et al. [5] describe how the performance of one student may vary
from task to task. According to them, some students may perform better in one aspect or
in one context than in others. To this end, they propose increasing the range of
assessments to improve the reliability of results.
Traditional Assessment Methods
In literature and practice various traditional methods have been used in order to assess
the students’ multi-faceted base of skills and knowledge. According to Rowntree [11], such
techniques include conversations, observations, multiple choice tests, essays, examinations
and the like. We can run into side effects if we rely too much on one such method, overlooking
the fact that each has distinctive features of its own.
In his book ‘Assessing Students’, Rowntree [11] advances a reasonable argument: despite
one of the most common assumptions made by experts in the field of education, assessment
is not obtained only, or even mainly, through tests and examinations. He explains that a
variety of ways can be used to assess a student’s capability, across a spectrum of
assessment situations ranging from the very informal (almost casual) to the highly formal
(even ritualistic). The table below shows the range of assessment situations that can be
practiced without any kind of measurement that implies absolute standards, as suggested
by Rowntree.
Table 1 - Range of Assessment Situations
Informal: Continuous, unself-conscious assessment that takes place in casual
conversations, where each party is given the opportunity to respond constantly to what is
taken to be the attitudes and understandings of the other as he thinks of what to say
next in consequence. The skills portrayed by both parties in this situation can be
compared to the skills portrayed by a platform speaker who delivers a monologue and
steadfastly ignores all responses that might suggest he should depart from his ‘script’.
Formal: More self-conscious assessment that usually takes place inside the classroom,
including formal quizzes, interviews, practical tests and written examinations.
According to George Brown et al. [5], it is important to select the right methods of
assessment, for they are the tools that allow students to demonstrate their skills and
abilities in a particular subject area. Moreover, they stress the importance of choosing
the most appropriate methods in order to obtain valid results. The current problem,
however, is that all too often students are offered a very limited and restrictive range
of assessment methods. In addition, the methods used are deemed insufficient in the sense
that they do not allow students to show an ample range of intellectual skills. In an
Assessment Forum and Function, Thomas A. Angelo, director of the Assessment Forum, pointed
out “how an individual form of assessment can have as much influence on outcomes as the
construct being assessed” [12].
In practice, the most widely-used assessment methods include: multiple choice tests
(MCQ), true/false tests, short answer questions and essays [13]. In the following section, the
known limitations of the most common methods of assessments or exams are discussed.
Why traditional exams are failing
Much has been written about the weaknesses of traditional unseen examinations. Criticism
of traditional assessment is widespread across the educational community, among both
researchers and leading-edge practitioners. The argument has focused on the inadequacy of
traditional examinations and on the alternative modes of assessment that can be
implemented in their place. The alternatives include open-book exams, computer-delivered
assessments, oral examinations and assessee presentations (this paper draws attention to
computer-delivered assessments) [14].
George Brown et al. [5] propose two further flaws of the traditional forms of
assessment. The primary concern regarding the reliability and validity of traditional
unseen exams is that they are marked far too quickly. Most staff would agree that marking
often has to be completed in haste. Because of the speed with which scripts have to be
marked and the pressure to do the task well, staff are not functioning at their best
while undertaking it. This increases the danger of the assessment being unreliable.
The second and more significant shortcoming of unseen exams is the rising concern that
they do not measure the learning outcomes that are the intended purposes of higher
education. George Brown et al. [5] expand on this concern in the context of assessing
practice. According to them, exams tend to favor candidates who happen to be skilled at
doing exams rather than at more important things, i.e. knowledge of the subject itself.
Furthermore, when students are asked what they consider to be the core skills measured in
unseen examinations, they reach a surprising consensus: the techniques needed to do
unseen exams. This poses a serious threat to the validity of exams, i.e. what is being
measured may be less important than what should have been measured.
To further understand the limitations of traditional methods of assessment, criticisms of
the most popular methods are presented in the following section.
Multiple Choice Questions
Multiple choice questions are arguably the most common approach to testing [15]. An MCQ
consists of one question, called the stem, and several possible answers, called choices.
The choices must include one correct answer and several distractors. In this type of
exam, students select the correct answer by encircling their choice [16]. It tests
knowledge, rather than the ability to read and translate what is written. According to
Bailey [17], multiple choice questions are used by assessors for a variety of reasons,
including:
They can be used to assess a broad range of content, since students finish them quickly [16].
MCQs save staff time in marking, in the sense that they are fast, easy and economical to score. Sometimes they are scored by machines [14].
They are scored objectively and thus may appear fairer and more reliable than subjectively scored tests.
They reduce the chances of learners getting the right answers by guessing.
Although MCQ testing is widely used in higher education, many experts in the field of
education have criticized the method on the following grounds.
Firstly, Rowntree [11] argues that the results of MCQs are not entirely reliable, since
they may not test what students really know about a particular subject. MCQs require
students to select the correct answer from a set of alternatives, which stresses the
recognition of an answer more than the construction of a response.
Students taking the exam may get high marks without this indicating their knowledge of
the subject; the marks could be a result of guessing, which may have a considerable but
unknown effect on scores [17]. David Satterly [4] proposes that multiple choice questions
in examinations encourage students to develop styles of thought or intellectual ‘tricks’
that inhibit the development of other skills. According to him, multiple choice formats
can often be biased in favour of certain students and penalize others for reasons
unrelated to their knowledge of the subject: one student may obtain a better grade
through sheer luck in guessing. Rowley [11] supports this argument by examining the
relationship between willingness to take risks and students’ final marks in the
assessment. Rowley reported that students who are prepared to take risks in guessing may
sometimes score higher than those whose knowledge and ability are equivalent. Additional
work by Gronlund [15] suggests that a student’s technique of guessing the right answer
through the process of elimination could help them get higher marks. According to him,
where the choices ‘all of the above’ or ‘none of the above’ are used, there is every
likelihood that students can exploit them on the basis of partial knowledge: ‘all of the
above’, for example, can be recognized as the correct answer by identifying just two
correct alternatives, or ruled out by identifying a single incorrect one. He concludes
that to get the right answer, all students need is the ability to rule out incorrect
answers, which does not necessarily mean that they know the correct one.
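The effect of guessing and of elimination on MCQ scores can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that every item is answered by a blind and independent guess among equally likely choices; the function names, the 20-item test and the pass mark of 10 are hypothetical and do not come from the studies cited.

```python
from math import comb


def expected_guess_score(n_items, n_choices):
    """Expected number of correct answers from blind guessing alone."""
    return n_items / n_choices


def prob_at_least(n_items, n_choices, threshold):
    """Binomial probability of guessing at least `threshold` items correctly."""
    p = 1 / n_choices
    return sum(comb(n_items, k) * p ** k * (1 - p) ** (n_items - k)
               for k in range(threshold, n_items + 1))


# Hypothetical 20-item test with a pass mark of 10.
# Four choices per item: pure guessing among all alternatives.
print(expected_guess_score(20, 4), round(prob_at_least(20, 4, 10), 4))
# Two choices left per item after eliminating two distractors.
print(expected_guess_score(20, 2), round(prob_at_least(20, 2, 10), 4))
```

Under these assumptions, ruling out two of four alternatives on every item doubles the expected score from guessing, which is consistent with the point that eliminating incorrect answers can raise marks without the student knowing the correct one.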
Secondly, Airasian, Scouller and Bailey [18] propose that MCQs promote memorization and
factual recall. When students know that they are going to be tested using multiple choice
questions, they tend to scan the learning material for factual, testable items, gobbets
of detail or technical terms, rather than looking for the larger encompassing insights
that might be demanded of them in essays. Whitehead agrees with Rowntree’s statement and
adds that this kind of assessment discourages high-level cognitive processes because it
leaves out certain operations, for instance the organization of material as demanded in
essays. In addition, it influences students’ learning goals, so that they strive only for
a reproductive memory of the material acquired, or the production of what are called
‘inert’ ideas [11]. Furthermore, George Brown, Joanna Bull and Malcolm Pendlebury [5]
claim that, unlike open-ended questions, which test higher-level cognitive skills and
indicate a student’s independence and autonomy in the subject, MCQs promote only
reproductive styles of learning, a more superficial, low-level approach.
Thirdly, poorly constructed MCQs can give hints to students and help them get the right
answer with very little effort on their part. In a recent study on the design, format,
validity and reliability of MCQs in nursing and education, Julie Considine, Mari Botti
and Shane Thomas [19] point out how MCQs tend to suffer from ‘placement bias’ if not
constructed properly. Placement bias arises when the examiner’s positioning of correct
answers across items follows a pattern. They refer to the works of Gronlund and Linn.
Steven Burton, Richard Sudweeks, Paul Merrill and Bud Wood [19] further discuss the
limitations of MCQs by looking at the variety of learning outcomes and student abilities
that they can measure. The findings catalogued in their report give a more explicit
picture of the high-level cognitive skills that cannot be tested using MCQs. The skills
include the following:
Articulate reasoning
Display thought processes
Furnish information
Organize personal thoughts
Perform a specific task
Produce original ideas
Provide examples
Jacques Barzun [18] asserts that: A pupil’s knowledge on a particular area is never
really tested until he has organized it and explained it to someone else—an opportunity that is
not provided by MCQ.
Lastly, the feedback provided through MCQs is limited and is virtually predetermined
during test construction, e.g. a student gets a letter A if he or she gets 80-100
questions correct. There is little scope for personalized feedback that can help students
regulate their own learning [18]. Furthermore, in MCQs students play no role in the
assessment process. They are usually in no position to clarify the test questions or ask
about the purpose of the test. This directly contradicts current concerns in assessment
raised by Boud, 200 and Yorke, 2003, which include giving students a chance to participate
actively in assessment processes by identifying criteria and/or standards to apply to
their work and making judgments about the extent to which they have met these criteria
and standards [20].
The claims presented above imply that MCQs are mostly good for testing a student’s
low-level skills, such as the ability to memorize facts, terms, methods and principles,
and to perform logical deductions. Most experts agree that MCQs are not the best form to
use in every circumstance. They can be used only when the attainment of an educational
objective can be measured by having the student select his or her response from a list of
several alternatives.
Most of these objections and criticisms seem to conceive of MCQs narrowly and overlook
some of their potential. First, the claims do not consider the imaginative instruments
for wide-ranging assessment of all types of learning that are now available. Secondly,
the claim of Airasian, Scouller and Bailey seems questionable: although students may
acquire certain testing skills as described by Gronlund, there is no solid evidence, nor
any study, showing that experience of particular types of test prevents the acquisition
of several styles of learning. Since every student has a unique style of studying, it is
dangerous to say that students tend to limit what they study to what they think are
possible testable items in the exam. Teachers can also take a good many actions to
minimize the force of these objections.
Some researchers, including Cox, Johnstone and Ambusaidi, have challenged the claim of
George Brown, Joanna Bull and Malcolm Pendlebury that MCQs can only test low-level
cognitive skills. They maintain that it depends on how the tests are constructed. Steven
Burton, Richard Sudweeks, Paul Merrill and Bud Wood support this position and assert that
MCQs can also be used to test high-level objectives such as those based on comprehension,
application and analysis. The only problem they see in MCQs is that such items are more
difficult to produce [17].
MCQs compare favorably with other test item types in terms of validity, reliability and
efficiency. In terms of validity, MCQs take much less time to answer than essays, which
allows a broader sample of course content to be tested; this in turn is likely to be more
representative of the students’ overall achievement in the course. In terms of
reliability, they are less susceptible to guessing than true/false questions and are more
clear-cut than short-answer tests, because there are no misspelled words or partial
answers to deal with. They are also objectively scored, which means that, unlike essay
questions, they are unaffected by scorer inconsistencies. In terms of efficiency, they
are amenable to rapid scoring.
True/false
A specialized form of MCQ is the true/false question, in which there are only two
possible answers. Like multiple choice questions, true/false questions are often used to
test a broad range of content. They assess whether or not a student is familiar with a
particular course by checking whether he or she can tell if the statements or facts
presented are accurate. The results of true/false questions can be used to record
students’ popular misconceptions about a particular area of study. One widely recognized
limitation of this method of assessment is that it is susceptible to guessing: true/false
questions give students a 50% chance of guessing the right answer. In particular, if the
answer is false, it is quite hard to find out whether the student really knows the
correct response [15]. Simonson et al. [21] suggest a possible solution: asking students
to provide an explanation for every item that they regard as ‘false’ and then to rewrite
the statement in the way they deem correct. In the field of medicine, this method is not
advised, for it is believed to bias results considerably in favour of males. In a recent
study, the Medical Education Unit at the University of Nottingham [22] gathered the final
course grades (from the 359 available courses) of students who took exams with different
question formats. The question formats were categorized as course work, essay, in-class
assessment, lab studies, short answers, single phrase, spotter, single word answer,
true/false questions, or viva. After analyzing the final scores of both genders in
true/false exams, the authors came to the following conclusion: if the exam consisted of
at least some true/false questions, males were 16.7 times more likely to score higher,
whereas if the exam consisted of only true/false questions, males were 10.9 times more
likely to score higher than females.
Although this study used a reasonable sample size (two years’ worth of examination
papers) to maximize validity, the results may not be completely reliable, for two
reasons. First, the sample exams were not conducted in a controlled environment, so the
results cannot be relied on completely. Second, the study was conducted only within a
medical course. For a study to be truly reliable, it must produce the same results when
repeated. The research could have been extended by examining exams not only in medical
courses but in other courses as well.
Essays
Essays are deemed to be effective tools for conducting assessments, since the questions
are flexible and they assess a higher order of learning skills. George Brown, Joanna Bull
and Malcolm Pendlebury describe essays as methods that allow students to showcase their
understanding of a particular subject area and to integrate knowledge and skills in
creative but coherent ways. This is presumably the main reason why essays are the most
favored approach when it comes to testing higher levels of cognitive skills, which
include analyzing, synthesizing and evaluating a given topic. Furthermore, essays allow
students to develop their communication skills in a multitude of ways (e.g. when an
assessor asks the student to address an essay to several different audiences) [5]. In his
seminal work, Hounsell [23] identifies three distinct perceptions of essays:
A stance or argument, which is reinforced by evidence
A unique viewpoint on a particular issue at hand
Coherent arrangement of thoughts and facts
Hounsell relates the three perceptions to a student’s style of learning. The first is
akin to a deep approach to an issue: searching for evidence, i.e. relevant studies, and
then deriving conclusions. The second also encompasses deep learning but is more
concerned with the reproduction of knowledge. The last is solely about reproducing
someone else’s work and facts.
Despite the essay’s popularity, several weaknesses have been recognized in this method of
assessment. In terms of efficiency, Simonson et al. [21] claim that, unlike MCQs, essays
are relatively difficult and time-consuming to mark. In terms of reliability, Sally Brown
and Angela Glasner [1] state that since essay questions may be interpreted and answered
in different ways, an issue of subjectivity in marking may also arise.
Moreover, essays and other written offerings are limited to words and sentences, whilst
in real life an argument or viewpoint on a particular issue may be stated in various ways
(e.g. conversation, acting, debating, teamwork, etc.). Consequently, the use of essays
alone in assessment may devalue many invaluable skills and gifts a student has, simply
because it is impossible to assess them in written form. Sally Brown and Angela Glasner
[1] underline how educators tend to adhere to McNamara’s Fallacy by “taking the
measurable important when they should be better employed attempting to make the important
measurable (or at least discernible)”. Rowntree [11] seems to share this opinion, as he
suggests that in many cases it is much easier to assess the less important aspects of
learning.
Short-answer tests
Closely related to essays, short-answer tests require a written answer that may
vary from one or two words to a few sentences. They are primarily used to assess students’
basic knowledge of a particular topic. Despite their popularity in educational settings, this is
one method of testing that has received very little attention in the literature. According to Haynie
[24], short-answer questions have many advantages, which include the following:
Compared to multiple choice questions and essays, short-answer tests are easier to
construct
Compared to essays, short-answer questions are more structured, often easier to mark
and therefore can test a wider range of topics than essays can
Unlike true/false or multiple-choice tests, it is difficult to guess the answer in
short-answer tests
They allow students to explain their understanding in a flexible and creative manner
Based on the evidence presented above, we can conclude that many attributes cannot
be tested by traditional exams. These include teamwork, leadership, creativity and lateral
thinking, which are also very good indicators of students’ skills and knowledge.
Since each student has a unique set of skills and knowledge base, a variety of
approaches to presenting assessments should be used. It could be argued that, all too often, the
level of learning or understanding of one person differs from that of another. In the English
academic context, for instance, some people are good at speaking but poor at writing.
Others tend to perform better when materials are
presented aurally or visually. Therefore, all areas of learning should be tested using various
assessment techniques, as this will presumably give more accurate results than
focusing on one area using conventional techniques such as multiple-choice. By testing all
areas of learning, educators can pinpoint accurately where a person’s strengths and
weaknesses lie.
Technology and Assessments
Sally Brown, Joanna Bull and Phil Race [25] state that there are currently
three main areas in which computers can assist the assessment process: the design and
development of examinations, the actual assessment itself and the reporting of the results. This
review draws primarily on the second of these.
This view is supported by Jim Ridgway, Sean McCusker et al. [26], who state that
ICT links learning, teaching and assessment. According to Jim Ridgway and
Sean McCusker, in most schools ICT is used as a tool to support learning by providing
a better learning environment for students. Several ICT tools, including word processors and
graphic calculators, are used to aid students in their schoolwork.
In their study on ‘techniques of formative evaluation’, Judith George and John Cowan
[27] claim that the use of ICT in learning has been successful and can be justified by
the following benefits, which a paper-based method may not always
provide:
a larger volume of assessment examinations can be marked simultaneously and
accurately; more samples can be assessed in a smaller amount of time
feedback on assessments can be distributed quickly and easily
student responses to questions can be recorded; the most common
misconceptions shared by students on a particular subject may then be identified
assessments are provided in an “open-access” system
unique assessments can be made by randomly selecting a different paper for each
student
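Two of the benefits listed above, giving each student a different randomly selected paper and identifying common misconceptions from recorded responses, can be illustrated with a minimal sketch. It is not taken from George and Cowan [27]; the question bank, the function names and the seeding scheme are all hypothetical and serve only to make the idea concrete.

```python
import random
from collections import Counter

# Hypothetical question bank (an assumption): question id -> correct option.
QUESTION_BANK = {
    "q1": "b", "q2": "d", "q3": "a", "q4": "c", "q5": "a", "q6": "b",
}

def build_paper(student_id, n_questions=3, seed="exam"):
    """Pick a reproducible random subset of questions for one student."""
    # Seeding with the student id means the same paper can be regenerated later.
    rng = random.Random(f"{seed}:{student_id}")
    return sorted(rng.sample(sorted(QUESTION_BANK), n_questions))

def mark(responses):
    """Count correct answers in a {question id: chosen option} mapping."""
    return sum(QUESTION_BANK[q] == choice for q, choice in responses.items())

def common_misconceptions(all_responses, top=3):
    """Tally wrong (question, option) pairs across all students."""
    wrong = Counter(
        (q, choice)
        for responses in all_responses
        for q, choice in responses.items()
        if QUESTION_BANK[q] != choice
    )
    return wrong.most_common(top)
```

In this sketch, a wrong option chosen by many students is treated as a possible shared misconception; a real system would need a much richer response model than a single chosen option per question.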
Alternative Method: e-assessments
Over recent years, the repertoire of assessment methods used in higher education has
expanded. At the European Schoolnet’s Policy and Innovation Committee meeting held in
October 2006 [28], discussions focused on incorporating
ICT into educational assessment. The perceived problems and benefits of
shifting from the traditional paper-based method to innovative approaches such as
e-assessments were discussed.
In a recent white paper produced by JISC (Joint Information Systems Committee) [29],
the nature and functions of an alternative method of assessment known as e-assessment
were identified. JISC described e-assessment as the end-to-end assessment processes that
cover a range of activities in which digital technologies (ICT) are used in the presentation of
assessment. Such activities include the design and delivery of assessments, marking (by
computers, or by humans assisted by scanners and online tools) and all processes of reporting,
storing and transferring data associated with public and internal assessment.
Now that ICT is becoming more pervasive than ever in
educational practice, more traditional assessments are becoming obsolete and e-assessments
are starting to take over the whole assessment paradigm. According to Martin Ripley, formerly
of the QCA and a leading authority on e-assessment [30], several countries including the UK,
Ireland, the Netherlands and others in the EU have already made good progress with
e-assessments. The use of e-assessments in educational assessment can be justified in many
ways. Several studies have reported advantages of e-assessments over more
traditional methods of assessment.
Jim Ridgway, Sean McCusker and Daniel Pead [26], in their study of ‘How Assessments
Drive Education’, demonstrate how e-assessments are advantageous over traditional methods.
They claim that, apart from the obvious logistical advantage of not having to post and
track large volumes of paper, more significant advantages include:
It can avoid the total abolition of paper-based examinations by
improving their authenticity, allowing students to make use of technology to
improve their work.
It allows more effective assessment of valuable skills such as problem-solving,
process, understanding and representing, controlling variables, and
generating and testing hypotheses. This is done by presenting exams in
innovative ways, e.g. onscreen, with interactive displays that change variables over
time, simulations and interfaces presenting complex data.
It is more beneficial, particularly for students who have commitments other
than studying (e.g. working students), in that it can provide on-demand
tests (tests that can be taken whenever they are ready) with immediate feedback
that is not limited to ‘correct’ or ‘incorrect’ marks and can sometimes be
diagnostic.
It gives more accurate results via adaptive testing.
Lastly, it improves the quality of tests by improving the reliability of scoring.
In a recent project led by the Scottish Qualifications Authority (SQA) called Scottish
Online Assessment Resources (SOLAR) [31], it was identified how e-assessment could provide an
opportunity to facilitate a change in assessment practice within teaching and learning in
Scotland’s colleges. It was recognized that technology-based assessment
approaches provide more flexible assessment delivery as well as automated marking
or scoring [32]. One significant change in assessment practice is that, unlike
more traditional methods, which deliver tests at a specified date and time, e-assessments
allow assessments to be available on demand, when students are ready.
Additional work by JISC, ‘Effective practice with e-Assessment’, written by Denise
Whitelock with contributions from Martyn Road and Martin Ripley [29], reported that
e-assessments widen the range of skills and knowledge being assessed, providing unparalleled
diagnostic information and supporting personalization. The study suggests that technology, if
used with skill and imagination, can add value to assessment practice by
increasing the range of what is tested. It provides evidence of both cognitive and skills-based
achievements in ways that are durable and transferable. Furthermore, it enhances the validity
of assessment systems and encourages deeper learning.
At the ASCILITE (Australasian Society for Computers in Learning in Tertiary Education)
conference in Sydney [33], D. J. Nicol argued that well-designed and well-deployed diagnostic
and formative assessment fosters more effective learning for a wider diversity of learners. He
further argued that the effective use of technology in delivering assessments plays a pivotal role
in this regard, for it can encourage students to progress further with their studies provided that
it is linked to appropriate resources, good-quality, timely feedback, and challenging but
stimulating ways of demonstrating understanding and skills.
On the one hand, the studies presented above suggest
that the implementation of e-assessments addresses the current limitations of more traditional
and conventional methods of assessment. Jim Ridgway, Sean McCusker and Daniel Pead
share the view of Denise Whitelock, Martyn Road and Martin Ripley that
e-assessments promote testing of a wider range of intellectual skills. Both groups
argue that the proper use of innovative techniques in presenting examinations, such as
the ones mentioned above, can eliminate the danger posed by traditional methods of
assessment, namely giving importance to what is easy to assess rather than assessing what
is truly important. Furthermore, because e-assessment covers a wider range of skills, the
authors agree that the validity of examination results is improved as well.
On the other hand, the SQA project and the ASCILITE conference arrived at
the same conclusion: unlike more traditional assessments, which are usually delivered at a
specified date and time, e-assessments have the advantage of supporting on-demand
summative assessments, which allow learners to take exams when they are ready. This
may be particularly beneficial to learners who find it difficult to cope with the traditional
assessment regime due to factors including distance, disability, illness or work
commitments. It also increases participation in learning by allowing learners to progress in a
way appropriate to them.
References
[1] Sally Brown and Angela Glasner, Assessment Matters in Higher Education. Philadelphia, USA:
SRHE and Open University Press, 2003.
[2] Joan Garfield, "Beyond Testing and Grading: Using Assessment To Improve Student Learning,"
Journal of Statistics Education, vol. 2, no. 1, 1994.
[3] Semire Dikli, "Assessment at a Distance: Traditional vs. Alternative Assessments," Turkish Online
Journal of Educational Technology, vol. 2, no. 3, July 2003.
[4] David Satterly, Assessment in Schools. Oxford, England: Basil Blackwell Publisher Ltd, 1981.
[5] George Brown, Joanna Bull, and Malcolm Pendlebury, Assessing Student Learning in Higher
Education. London, UK: Routledge, 1997.
[6] F.Y. Edgeworth, "The Elements of Chance in Competitive Examinations," Journal of
the Royal Statistical Society, 1890.
[7] P. Hartog and E.C Rhodes, The Marks of Examiners. London: Macmillan, 1936.
[8] P Diedrich, The Improvement of Essay Examination. Princeton: Educational Testing Service,
1957.
[9] RC Bell, "Problems in Improving Reliability of Essay Marks," Assessment and Evaluation in Higher
Education, 1980.
[10] S Newstead and I Dennis, "Examiners Examined: The Reliability of Exam Marking in Psychology,"
The Psychologist, vol. 7, pp. 216-219.
[11] Derek Rowntree, Assessing Students How Shall We Know Them? London, UK: Harper & Row Ltd,
1977.
[12] Cambridge Assessment, "Computer-adaptive testing- the Way to Forward School Exams?," in
Cambridge Assessment, Cambridge, 2010.
[13] Kyong-Jee Kim and Curtis Bonk. (2011) The Future of online teaching and learning in Higher
Education: The survey says.
http://www.educause.edu/EDUCAUSE+Quarterly/EDUCAUSEQuarterlyMagazineVolum/TheFutureofOnlineTeachingandLe/157426.
[14] Brown S and Knight P, "Assessing learners in Higher Education," in London, 1994.
[15] Computer Assisted Assessment Centre. (1999, June) Designing Effective Objective Test
Questions: an Introductory Website. online.
[16] Centre For Teaching Excellence. (2011) Exam Questions: Types, Characteristics, and
Suggestions. Online.
[17] K.M. Bailey, Learning About Language Assessment: Dilemmas, Decisions and Directions. US:
Heinle & Heinle, 1998.
[18] David Nicol, "E-assessment by design: using multiple-choice tests to good effect," Journal of
Further and Higher Education, vol. 31, no. 1, pp. 53-64, February 2007.
[19] Steven Burton, Richard Sudweeks, Paul Merrill, and Bud Wood, "Multiple-choice test items:
Guidelines for university faculty," Brigham Young University Testing System, 1991.
[20] David Boud, Enhancing Learning through Self Assessment. Great Britain, UK: Biddles Ltd, 1997.
[21] M. Simonson, M Albright, and S Zvacek, Assessment for distance education. Upper Saddle River,
New Jersey: Prentice-Hall, 1991.
[22] Chris Palmer. Changing the assessment culture: from paper to online submission and
management of assignments OSMA.
[23] D Hounsell, Learning and Essay Writing, F Marton and N.J. Entwistle, Eds. Edinburgh, UK:
University Press, 1984.
[24] W.J. Haynie, "Effects of in-class tests and post-test reviews on delayed retention learning,"
Journal of Industrial Teacher Education.
[25] Sally Brown, Joanna Bull, and Phil Race, Computer-Assisted Assessment in Higher Education.
London, UK: Clays Ltd, St Ives Plc, 1999.
[26] Futurelab Series. (2004) Literature review of e-assessment.
http://hal.archives-ouvertes.fr/docs/00/19/04/40/PDF/ridgway-j-2004-r10.pdf.
[27] Judith George and John Cowan, A Handbook of Techniques for formative evaluation. London,
UK: Kogan Page Limited, 1999.
[28] Association for Learning Technology. (2011, November) The elements of e-assessment.
http://newsletter.alt.ac.uk/2011/11/the-elements-of-e-assessment/.
[29] Joint Information Systems Committee (JISC). (2007) Effective Practice with e-assessment.
http://www.jisc.ac.uk/media/documents/themes/elearning/effpraceassess.pdf.
[30] Xplora.org. (2011) e-assessment.
http://www.xplora.org/ww/en/pub/insight/thematic_dossiers/eassessment.htm.
[31] Scottish Qualifications Authority. (2008, May) SOLAR Innovating Assessment in Scotland.
http://www.sqa.org.uk/files_ccc/SOLARWhitePaperMay2008.pdf.
[32] Oxford Brookes University. (2002, June) Learning and Teaching Briefing paper.
http://www.brookes.ac.uk/services/ocsd/2_learntch/briefing_papers/p_p_assessment.pdf.
[33] Geoffrey Crisp, Di Thiele, Ingrid Scholten, Sandra Barker, and Judi Baron, "The Impact of On-line
Multiple Choice Questions on Undergraduate Student Nurses' Learning," in Interact Integrate
Impact:, Adelaide, 2003, pp. 237-242.