Date post: | 25-Dec-2015 |
Upload: | marybeth-morris |
The Problem…
It takes humans a long time to evaluate written text
Lack of teacher time to assess student writing samples
“As I have proven on the board, all of your essays are terrible and I’m tired of reading them. What
should I do?”
The Solution…
Automated essay evaluation!
Criterion Evaluation Service
– Critique: evaluates and provides feedback on grammatical, usage, and mechanical errors
– E-Rater 2.0: gives essays a holistic score
Vantage Learning
– IntelliMetric
“Thank God for Artificial
Intelligence!”
What is Automated Essay Evaluation?
Teachers assign essays to students
Students submit essays online
Students get feedback
Teachers get summary reports of students’ performance
“I love you e-Rater 2.0!”
Nuts and Bolts
Automated essay evaluation relies on four main areas of Artificial Intelligence
– Machine Learning
– Natural Language Processing
– Pattern Matching
– Heuristics Integration
Machine Learning
Teacher supplies training data
– Corpus of edited and graded essays
Uses statistical methods to evaluate essays
Ex: word sense disambiguation
– Looks at the 2 words to the left and right of a word to determine its context
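The two-word context window mentioned above can be sketched as follows. This is only an illustration of how such a window might be collected, not the actual e-Rater code; the function name `context_window` and the sample sentence are assumptions made for the example.

```python
def context_window(tokens, index, size=2):
    """Return the words up to `size` positions left and right of tokens[index].

    Hypothetical helper: a disambiguation model would use these
    neighbouring words as evidence for the sense of the target word.
    """
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left, right


# Illustrative sentence (not from the slides); "bank" is the ambiguous word.
tokens = "she sat on the bank of the river".split()
left, right = context_window(tokens, tokens.index("bank"))
print(left, right)  # ['on', 'the'] ['of', 'the']
```

Near the start or end of a sentence the window is simply shorter, which is why the left slice is clamped at 0.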
“I love Machine
Learning!”
Natural Language Processing
Parse trees used to analyze sentence structure
Compares linguistic style of student essays to training data to evaluate grammar, mechanics, and usage
“Bertha, do your hands hurt from
processing natural
languages all day?”
Pattern Matching
System contains examples of good vocabulary, sentence structure, etc.
Tries to match patterns in student essays and awards corresponding scores
Heuristics Integration
Searches students’ essays for phrases that occur more or less often than expected based on corpus frequencies
Example: repetitious words
– If a single word accounts for more than 5% of the word count in the essay, that word is repetitive
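A minimal sketch of the 5% repetition rule, assuming a simple lowercase tokenizer. The function name `repetitive_words` and the tokenizing regex are hypothetical. The slide does not say whether common function words such as "the" are excluded, so this sketch counts every word.

```python
import re
from collections import Counter


def repetitive_words(essay, threshold=0.05):
    """Return {word: share} for words whose share of the word count exceeds threshold."""
    # Assumed tokenization: runs of letters and apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", essay.lower())
    if not words:
        return {}
    total = len(words)
    counts = Counter(words)
    return {w: c / total for w, c in counts.items() if c / total > threshold}
```

Calling `repetitive_words(text)` on an essay returns only the words above the threshold, each with the fraction of the essay it accounts for.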
Criterion Diagnostic Analysis Tools
Grammar
– Fragments
– Run-ons
– Garbled Sentences
– S-V agreement
– Ill-formed verb
– Pronoun error
– Missing Possessive
– Wrong word
Usage
– Wrong article
– Missing article
– Nonstandard verb or word form
– Confused words
– Wrong word form
– Faulty comparisons
Mechanics
– Spelling
– Capitalization of proper nouns
– Initial capitalization in a sentence
– Missing apostrophe for contractions
– Missing end punctuation
– Comma error
Style
– Repetition
– Inappropriate words
– Sentences containing passive voice
– Long Sentences
– Essay statistics: # of words, # of sentences, average # of words per sentence
Org/Dev
– Transitional words and phrases
– Introductory material
– Thesis statement
– Topic sentences
– Supporting Ideas
– Conclusion
– Other
E-Rater Score Generation
12 features used when scoring an essay
– 11 features reflect essential characteristics in essay writing and are aligned with human scoring criteria
– 12th feature: word count, weighted less heavily so that longer essays do not automatically earn higher scores
Trained on a sample of 200-250 scored essays with scores between 1 and 6
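The slides do not give the scoring formula. As a rough illustration only, the sketch below assumes the score is a weighted sum of the 12 feature values, rounded and clamped to the 1 to 6 scale. The function name `holistic_score` and every weight and feature value are made up for the example; in a real system the weights would be learned from the scored training essays.

```python
def holistic_score(features, weights, low=1, high=6):
    """Weighted sum of feature values, rounded and clamped to [low, high].

    Assumption: a linear combination. The slides only say that the
    12th feature (word count) is weighted less heavily than the rest.
    """
    raw = sum(w * x for w, x in zip(weights, features))
    return max(low, min(high, round(raw)))


# Hypothetical weights: 11 equal weights plus a much smaller one for word count.
weights = [0.09] * 11 + [0.01]
# Hypothetical feature values, already scaled to the 1-6 range.
features = [4.0] * 12
score = holistic_score(features, weights)
```

Giving word count a small weight means that doubling an essay's length barely moves the total, which is the behaviour the slide describes.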
Implementation
For GMAT grading using automated essay evaluation…
– Both a human and e-Rater grade the essay on a six-point scale
– If scores agree, essay is assigned that score
– If scores differ by 1 point, essay is assigned score of human grader
– If scores differ by more than 1 point, automated score is discarded and another human grader evaluates the essay
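The resolution rules above can be written as a small function. This is a sketch of the rule as stated on the slide, assuming whole-number scores; the name `resolve_score` and the `(score, needs_second_human)` return shape are hypothetical.

```python
def resolve_score(human, machine):
    """Return (final_score, needs_second_human) for one essay."""
    diff = abs(human - machine)
    if diff == 0:
        # Scores agree: the essay gets that score.
        return human, False
    if diff == 1:
        # One point apart: the human grader's score is used.
        return human, False
    # More than one point apart: the automated score is discarded and
    # a second human grader must evaluate the essay.
    return None, True


print(resolve_score(4, 4))  # (4, False)
print(resolve_score(4, 5))  # (4, False)
print(resolve_score(2, 5))  # (None, True)
```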
“How am I supposed to get into Harvard when e-Rater gave
me a 0.1 on my essay?”
Benefits
Immediate feedback to students
Enables teachers to spend more time with students and less time grading
Provides students with more practice writing
“Thanks a lot e-Rater, now instead of playing soccer I get to stay inside and
practice my writing!”
Limitations
Not always as accurate as teacher feedback
– Would rather miss an error than tell a student that a well-formed construction is ill-formed
Machine cannot understand unique writing styles, humor, irony, etc.
Input sentence: “This presentation deserves an A”
E-Rater Output: “Well-constructed sentence. I concur!”
Limitations - Example
Which sentence do you think is better?
1. It is with the greatest esteem and confidence that I write to support Joey as a candidate for a faculty position. I have known Joey in a variety of capacities for more than five years, and I find him to be one of the most eloquent…
2. It is with chimpanzee greatest esteem and confidence that I write to support Joey as a candidate for a faculty position. I have known Joey in a variety of capacities for more than five years, and I find him to be one of chimpanzee most eloquent…