14/01/2004
How to Set Up a Trial
Dr. Konstantina Vassilopoulou
Structure of Presentation
- Usability: definition
- Usability evaluation: definition
- Criteria
- Usability evaluation techniques
Usability (ISO, 1995)
[Diagram: usability comprises effectiveness, efficiency and satisfaction, assessed for the overall system in its context of use - users, task goals, equipment, environment.]
Usability Definition
The environment and equipment: For each user group, which web browsers will be used, what type of machines, with what type of screens, and what speed of web access? What access will users have to assistance if they encounter problems?
The users: What are the skills, motivations and previous experience of each anticipated user group?
The tasks: For what identifiable purposes will each user group be visiting the web site? What will their tasks be? How motivated will they be to persist in achieving their task objectives?
ISO (1995) for traditional software user interfaces; Bevan (1997) for web user interfaces.
Effectiveness: measured by the extent (accuracy and completeness) to which the intended goals have been satisfied. How many of the goals of the intended users can the web site support (e.g. how much of the information required by a potential tourist is available on the site)? Will a typical user accessing the site easily locate all the information relevant to their goal?
Efficiency: measured by the resources that have to be expended to achieve the intended goals. How much time and effort will be required to locate the required information?
Satisfaction: measured by the extent to which the users find the overall system acceptable. How satisfied will the users be, and how much will they enjoy using the site?
Usability Evaluation
"…concerned with the collection of data about the usability of a design or product by a specified group of users for a particular activity within a specified environment or work context" (Preece et al., 1994: p. 602).
Usability Evaluation: dimensions along which techniques differ
- Expert vs. users
- Time in the development lifecycle
- Performance measures produced
Time
- Formative evaluation takes place before implementation, in order to influence the development of the product.
- Summative evaluation takes place after implementation, in order to test the proper functioning of the system.
Performance Measures
- Objective performance measures: objective measurements or observations of user behaviour, focused on task performance - that is, how well users can achieve a specific task.
- Subjective user preference measures: measure the users' opinions of working with the system - that is, how much they like using it (Nielsen and Levy, 1994; Lewis, 1995).
Heuristic Evaluation (1)
Heuristic evaluation is an inspection technique. Usability inspection: "the generic name for a set of methods based on having evaluators inspect or examine usability-related aspects of a user interface" (Mack and Nielsen, 1994: p. 1).
Requires selection of evaluators and principles.
Heuristic Evaluation (2)
A set of evaluators produces lists of usability problems in a user interface by going through it and noting deviations from accepted usability principles.
Prior to the evaluation the evaluators need to obtain:
- a description of the objectives, target audiences and expected usage patterns of the system being tested
- a list of heuristics
Sessions
- Briefing session: evaluators are told what to do
- Evaluation period: 1-2 hours independently inspecting the system
- Debriefing session: experts come together to discuss their findings
Heuristics (Evaluation Table)
For each heuristic the table records: degree of system conformance with the rule [1..5], severity rating [0..1], and a comment.
- Visibility of system status
- Match between system and the real world - consistency
- User control and freedom
- Error prevention
- Recognition rather than recall
- Flexibility and efficiency of use
- Aesthetic and minimalist design
Heuristics: example questions
- Aesthetic and minimalist design: Is colour used as a form of coding? Is colour used to make the screen bright?
- Match between system and the real world - consistency: Are the same buttons used across the system? Are they used in the same way?
- Visibility of system status: Does the system respond to the user's actions?
Degree of Conformance
Please rate the system's degree of conformance with each particular rule; the range of assigned values is [1..5]. A severity rating - an importance weight - is also given to each rule, indicating the relevance of the general principle to the system according to the experts' opinion; the range of assigned values is [0..1].
Degree of conformance: 1 = no conformance at all; 2 = …; 3 = …; 4 = …; 5 = absolute conformance with the rule.
Severity Rating [weight assignment]
Please assign the severity rating for each particular rule:
0 = not a usability problem
0.25 = cosmetic problem only - need not be fixed unless extra time is available on the project
0.50 = minor usability problem - fixing this should be given low priority
0.75 = major usability problem - important to fix, so should be given high priority
1 = usability catastrophe - imperative to fix before the product can be released
Formula
e = Σ_i w_i · r_i
r_i = the average score for rule i (heuristic) - the degree of conformance with that particular rule.
w_i = the relative weight of the rule according to ALL experts' opinions - the relevance of the general principle to the system.
Interpretation of Results
A high importance weight (w_i) combined with low conformance of the system with a particular rule (r_i) necessitates corrective action.
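The scoring and interpretation described above can be sketched in a few lines of Python. The heuristics, the per-evaluator conformance scores, and the severity weights below are hypothetical illustrations, not data from the presentation:

```python
# conformance[rule] = degree-of-conformance scores [1..5], one per evaluator
conformance = {
    "Visibility of system status": [4, 5, 4],
    "Error prevention": [2, 1, 2],
    "Aesthetic and minimalist design": [3, 4, 3],
}
# weights[rule] = severity rating [0..1], averaged over all experts' opinions
weights = {
    "Visibility of system status": 0.75,
    "Error prevention": 1.0,
    "Aesthetic and minimalist design": 0.25,
}

def evaluation_score(conformance, weights):
    """e = sum over rules i of w_i * r_i, where r_i is the mean conformance."""
    return sum(weights[rule] * (sum(scores) / len(scores))
               for rule, scores in conformance.items())

def corrective_priorities(conformance, weights, r_threshold=3.0, w_threshold=0.5):
    """Rules with a high weight but low conformance need corrective action."""
    return [rule for rule, scores in conformance.items()
            if weights[rule] >= w_threshold
            and sum(scores) / len(scores) < r_threshold]
```

With these illustrative numbers, "Error prevention" (weight 1.0, mean conformance well below the midpoint) is flagged for corrective action.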
Card Sorting
"The purpose of card sorting is to better understand the user's concept of how information on the web site should be organised" (Fuccella, 1997: p. 71).
Provides feedback on global questions regarding the organisation and structure of a web site (Levi and Conrad, 1997).
Card Sorting (2)
A group of participants (users of the system) and a stack of standard-sized index cards are required.
Two categories:
- Affinity clustering: participants are instructed to sort the index cards into groups that look similar to them (Fuccella, 1997) and then to provide a description of each group.
- Pre-defined categories: cards are sorted on the basis of a predefined set of categories or specific criteria, for example relative importance and expected frequency (Constantine and Lockwood, 1999).
When Can It Be Used?
This technique can be used throughout the design lifecycle.
During the prototype stage, the results of the technique can suggest structures from which to design menu trees.
At a later stage it can be used to compare the card sort results against a draft web structural design. The results will identify specific areas where the underlying hierarchy can be improved, so that users can more easily find the information they are looking for (Levi and Conrad, 1997).
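One common way to analyse affinity-clustering results is to count how often each pair of cards lands in the same group across participants; a minimal sketch follows, with hypothetical card names and sorts (the presentation does not prescribe any particular analysis):

```python
from itertools import combinations
from collections import Counter

# Each participant's grouping of the cards (hypothetical example data).
sorts = [
    [{"Hotels", "Restaurants"}, {"Museums", "Beaches"}],
    [{"Hotels", "Restaurants", "Beaches"}, {"Museums"}],
    [{"Hotels"}, {"Restaurants", "Museums", "Beaches"}],
]

def co_occurrence(sorts):
    """Count how often each pair of cards was placed in the same group."""
    counts = Counter()
    for groups in sorts:
        for group in groups:
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return counts

counts = co_occurrence(sorts)
# Pairs grouped together most often suggest candidate site sections.
```

Here "Hotels" and "Restaurants" co-occur for two of the three participants, hinting that they belong under one section of the site hierarchy.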
Questionnaires
The questionnaire and survey type of usability evaluation technique is one way of measuring the user's opinion and attitude.
Questionnaires that can be used to measure user satisfaction:
- the Questionnaire for User Satisfaction (QUIS)
- the Purdue Usability Testing Questionnaire (PUTQ)
- the Software Usability Measurement Inventory (SUMI), and
- the Website Analysis and MeasureMent Inventory (WAMMI).
Questionnaire for User Satisfaction (QUIS)
- a demographic questionnaire
- six scales that measure overall reaction ratings of the system
- four measures of specific interface factors: screen factors, terminology and system feedback, learning factors, system capabilities
- optional sections to evaluate specific components of the system: technical manuals, on-line help, on-line tutorials, multimedia, Internet access and software installation (Harper et al., 1997).
Purdue Usability Testing Questionnaire (PUTQ)
Measures: (i) compatibility, (ii) consistency, (iii) flexibility, (iv) learnability, (v) minimal action, (vi) minimal memory load, (vii) perceptual limitation and (viii) user guidance.
Software Usability Measurement Inventory (SUMI)
SUMI uses a five-point scale to rate 50 system attributes, measuring users' perceived usability along five dimensions:
- Affect: the degree to which the user feels that the system is enjoyable and stress-free to use;
- Learnability: the degree to which the user feels he can learn new operations with the system;
- Control: the degree to which the user feels in control of the system, rather than vice versa;
- Efficiency: the degree to which the user feels he gets his work done well with the system; and
- Helpfulness: the degree to which the user feels the system helps him along.
Website Analysis and MeasureMent Inventory (WAMMI)
WAMMI uses a five-point scale to rate 20 web site usability characteristics.
- Attractiveness: the degree to which users like the site, whether they find the site pleasant to use. Example items: "This web site is presented in an attractive way" and "You can learn a lot on this web site".
- Control: the degree to which users feel 'in charge', whether the site allows them to navigate through it with ease, and whether the site communicates with them about what it is doing. For example: "Going from one part to another is easy on this web site" and "I feel in control when I'm using this web site".
Website Analysis and MeasureMent Inventory (WAMMI) (continued)
- Learnability: the degree to which users feel they can get to use the site when they come to it for the first time, and the degree to which they feel they can learn to use other facilities or access other information once they have started using it. For example: "All the material is written in a way that is easy to understand" and "It will be easy to forget how to use this web site".
- Helpfulness: the degree to which users feel that the site enables them to solve their problems with finding information and navigating. For example: "This web site has not been designed to suit its users" and "All the parts of this web site are clearly labeled".
- Efficiency: the degree to which users feel that the site has the information they are looking for, whether it works at a reasonable speed and is adapted to their browser. For example: "You can find what you want on this web site right away" and "This web site works exactly how I would expect it to work".
Questionnaire Development Lifecycle (stages and objectives)
1. Forming the survey - define the questionnaire scope: (1) survey, (2) psychometric
2. Item sampling - select items from different sources in the literature: (1) guidelines, (2) heuristics, (3) questionnaires, (4) checklists
3. Pilot trial - test for: (1) validity, (2) reliability
4. Production version - the questionnaire can be used in research
5. Next version - revise the questionnaire on the basis of new evidence
Forming the Survey
Psychometric: include questions about attitudes, judgements and predispositions (Kirakowski and Corbett, 1990).
Survey: ask factual questions (Kirakowski and Corbett, 1990).
Item Sampling
Applies to both psychometric and survey types of questionnaire (Kirakowski, 1994). Set criteria that will allow for the selection of questions.
"A range of questions, adjectives, or other prompts is developed that the researcher hopes might have some bearing on the stated requirements of the study." (Kirakowski and Corbett, 1990: p. 212).
Sources: guidelines, heuristics, questionnaires, checklists, relevant surveys.
Development of Questionnaire
A questionnaire usually consists of several sections, based on the objectives.
One section: representation of socio-demographic variables.
Use of filtering questions: split the population into groups.
Sample Size and Scale
Sample size: five times the number of questions or variables - though this is NOT strictly necessary!
Rating scales should have between 5 and 7 categories (Kirakowski and Corbett, 1990; Tull and Hawkins, 1990; Lewis, 1995).
The larger the number of scale steps, the higher the reliability of the scale, but with rapidly diminishing returns: "As the number of scale steps is increased from 2 up through 20, the increase in reliability is very rapid at first. It tends to level off at about 7, and after about 11 steps, there is little gain in reliability from increasing the number of steps" (Nunnally, 1967: p. 521).
Reliability
To assess the reliability of the factors of the usability test, Cronbach’s Alpha (Cronbach, 1970) statistic is usually utilised.
When performing reliability analysis, some variables may be excluded from the factors or moved to other factors in order to refine: a) the interpretability and b) the internal consistency of that factor.
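For a sense of what the reliability analysis computes, Cronbach's alpha for one factor can be calculated directly from the respondent-by-item ratings; the three items and four respondents below are hypothetical, and only the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of totals) is used:

```python
def variance(xs):
    """Sample variance with an n-1 denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """items: one list of respondent scores per questionnaire item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

# Four respondents rating three items of one factor on a 5-point scale
# (hypothetical data):
items = [
    [4, 5, 3, 4],
    [4, 4, 3, 5],
    [5, 5, 2, 4],
]
alpha = cronbach_alpha(items)
```

Dropping an item and recomputing alpha is the mechanical counterpart of "excluding variables from a factor to refine its internal consistency".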
What Is Usability (User) Testing?
Controlled experimental operation - a classic software evaluation technique that provides quantitative measurements of system performance while users carry out predefined tasks (Αβούρης, 2000).
Users are asked to express their thoughts, opinions and feelings aloud while interacting with the system. The method requires relatively few resources and has proven particularly effective.
Usability testing is a technique for ensuring that the intended users of a system can carry out the intended tasks efficiently, effectively and satisfactorily.
Usability Testing: ways of recording sessions
- Evaluator's notes: the least expensive method.
- Audio recording of the subjects: useful for "thinking aloud" protocols; captures little information of other kinds.
- Video recording of the subjects: loses detail such as facial expressions, which are captured only with a closer shot; needs to be synchronised with the image from the screen.
- Computer logging: recording at the keystroke level produces material of great volume whose analysis is a particularly laborious process.
- User logging: subjective in character (Αβούρης, 2000).
Usability Testing Steps
1. Plan the usability test: users, tasks
2. Carry out the test
3. Analysis and interpretation
4. Report and follow-up
Plan the Usability Test
- Define the purpose and audience of the site.
- Set the usability goals.
- Define the tasks users will perform, in order to observe users' compliance with the system.
- Define criteria, metrics and a method of collecting users' responses.
- Design the questionnaire used to assess users' subjective preferences towards the system.
- Arrange test users, instructors, scheduling, payment and material.
- Do not allow the developer of the site to be in the same room, especially if s/he has a bad temper…
Carry Out the Test
- Introduce the participant users.
- Describe the list of tasks.
- Watch quietly.
- Record the participant's interaction - think-aloud, video.
- Provide help if needed.
- Interact.
Report and Follow-up
Tabulate the data according to the listed tasks and report the results:
- Time needed to complete the test tasks.
- Need for help during the test: how frequently did the instructor have to help the test persons solve a problem? What kinds of user problems needed to be solved?
- Types of error: pressing the wrong button on the interface; wrong numerical value entered; mode error (the correct name or number was entered in the wrong mode).
Provide a list of design recommendations and suggest specific actions for the designer(s).
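Tabulating the data per task, as recommended above, can be sketched as follows; the task names, timings and error labels are hypothetical placeholders for a real test log:

```python
from collections import Counter

# One observation per participant per task (hypothetical test log).
observations = [
    {"task": "Find opening hours", "seconds": 45, "helped": False, "errors": []},
    {"task": "Find opening hours", "seconds": 90, "helped": True,
     "errors": ["wrong button"]},
    {"task": "Book a ticket", "seconds": 130, "helped": True,
     "errors": ["mode error", "wrong value"]},
]

def tabulate(observations):
    """Per task: mean completion time, help requests, and error-type counts."""
    rows = {}
    for obs in observations:
        row = rows.setdefault(obs["task"],
                              {"times": [], "helped": 0, "errors": Counter()})
        row["times"].append(obs["seconds"])
        row["helped"] += obs["helped"]       # bools sum as 0/1
        row["errors"].update(obs["errors"])
    return {task: {"mean_time": sum(r["times"]) / len(r["times"]),
                   "help_requests": r["helped"],
                   "errors": dict(r["errors"])}
            for task, r in rows.items()}

report = tabulate(observations)
```

The resulting table answers the report questions directly: completion times, how often help was needed, and which error types occurred on each task.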
Advantages and Disadvantages
Advantages:
- The evaluator can draw conclusions about the user's mental model.
- If the user's sequence of actions differs from the one expected for performing the task, it can be concluded that the system is not clear enough.
- The terminology the user employs is recorded, so that it can be checked against the terminology used in the manuals and in the system's interface.
Disadvantages:
- Expressing thoughts aloud may disturb the user's concentration, e.g. a young pupil trying to solve a difficult problem.
- It is harder for experienced users to express all their thoughts, since they have automated many of their actions (Αβούρης, 2000).
References (1)
Bevan, N. (1997). Usability Testing of World Wide Web Sites. In Proceedings of the CHI'97 Workshop: Usability Testing of World Wide Web Sites, March 23-24, Atlanta, GA. Online archive available at: [http://www.acm.org/sigchi/webhci/chi97testing/bevan.htm].
Constantine, L. L., and Lockwood, L. A. D. (1999). Software for Use. Addison Wesley, ACM Press.
Constantine, L. L., and Lockwood, L. A. D. (1999). Web Usability Inspections. Presentation at User Interface '99. Online archive available at: [http://www.foruse.com/Presentations/WebInspectUI99/sld004.htm].
Cronbach, L. J. (1970). Essentials of Psychological Testing. New York: Harper and Row.
Fuccella, J. (1997). Using User Centred Design Methods to Create and Design Usable Web Sites. In Proceedings of SIGDOC '97: 15th Annual Conference on Computer Documentation. ACM Press/Addison Wesley, pp. 69-77.
Harper, B., Slaughter, L., and Norman, K. (1997). Questionnaire Administration via the WWW: A Validation and Reliability Study for a User Satisfaction Questionnaire. In Proceedings of WebNet '97, Association for the Advancement of Computing in Education. Online archive available at: [http://lap.umd.edu/quis/publications/harper1997.pdf].
ISO (1995). ISO/DIS 9241-11 Draft International Standard, Ergonomic Requirements for Office Work with Visual Display Terminals (VDTs). Part 11: Guidance on Usability. International Organisation for Standardisation, Genève, Switzerland.
References (2)
Kirakowski, J., and Corbett, M. (1990). Effective Methodology for the Study of HCI. Elsevier.
Kirakowski, J., and Corbett, M. (1993). SUMI: The Software Usability Measurement Inventory. British Journal of Educational Psychology, 24, (3), pp. 211-213.
Kirakowski, J., and Cierlik, B. (1998). Measuring the Usability of Web Sites. In Proceedings of the Human Factors and Ergonomics Society. Online archive available at: [http://www.ucc.ie/hfrg/questionnaires/wammi/research.html].
Kirakowski, J., Claridge, N., and Whitehand, R. (1998). Human Centred Measures of Success in Web Site Design. In Proceedings of the 4th Conference on Human Factors and the Web. Online archive available at: [http://www.research.att.com/conf/hfweb/proceedings/kirakowski/index.html].
Levi, M. D., and Conrad, F. G. (1996). A Heuristic Evaluation of a World Wide Web Prototype. Interactions, 3, (4), pp. 50-61.
Lewis, J. R. (1995). IBM Computer Usability Satisfaction Questionnaires: Psychometric Evaluation and Instructions for Use. International Journal of Human Computer Interaction, 7, (1), pp. 57-78.
Lin, H. X., Choong, Y., and Salvendy, G. (1997). A Proposed Index of Usability: A Method for Comparing the Relative Usability of Different Software Systems. Behaviour & Information Technology, 16, (4/5), pp. 267-278.
References (3)
Mack, R. L., and Nielsen, J. (1994). Usability Inspection Methods: Executive Summary. In J. Nielsen and R. L. Mack (eds.), Usability Inspection Methods, John Wiley & Sons, pp. 1-23.
Nielsen, J., and Levy, J. (1994). Measuring Usability: Preference vs. Performance. Communications of the ACM, 37, (4), pp. 66-76.
Nunnally, J. C. (1967). Psychometric Theory. McGraw-Hill Inc.
Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., and Carey, T. (1994). Human Computer Interaction. Addison Wesley.
Tabachnick, B. G., and Fidell, L. S. (1996). Using Multivariate Statistics. Harper Collins College.
Tull, D. S., and Hawkins, D. I. (1990). Marketing Research: Meaning, Measurement and Method. 5th Edition. Macmillan Publishing Co. Inc.