
AUTOMATED SCORING SYSTEM FOR ESSAYS

By

ARUNA P 2009103010
DHIVYA PRIYA R 2009103528
DIVYA HARSHINI R 2009103530

    A project report submitted to the

    FACULTY OF INFORMATION AND

    COMMUNICATION ENGINEERING

    in partial fulfillment of the requirements

    for the award of the degree of

    BACHELOR OF ENGINEERING

    in

    COMPUTER SCIENCE AND ENGINEERING

    DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

    ANNA UNIVERSITY


    CHENNAI - 600025

    April 2012

    CERTIFICATE

Certified that this project report titled "Automated Essay Scoring System" is the bonafide work of Aruna P (2009103010), Dhivya Priya R (2009103528) and Divya Harshini R (2009103530), who carried out the project work under my supervision, for the partial fulfillment of the requirements for the award of the degree of Bachelor of Engineering in Computer Science and Engineering. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other thesis or dissertation on the basis of which a degree or an award was conferred on an earlier occasion on this or any other candidate.

Place: Chennai
Date:

Prof. Dr. K. S. Easwarakumar
Professor and Head,
Department of Computer Science and Engineering,
Anna University, Chennai - 600025

    COUNTERSIGNED

    Head of the Department

    Department of Computer Science and Engineering

    Anna University

Chennai - 600025


    ACKNOWLEDGEMENTS

We express our deep gratitude to our guide, Prof. Dr. K. S. Easwarakumar, for guiding us through every phase of the project. We appreciate his thoroughness, tolerance and ability to share his knowledge with us. We thank him for being easily approachable and quite thoughtful. Apart from adding his own input, he has encouraged us to think on our own and give form to our thoughts. We owe him for harnessing our potential and bringing out the best in us. Without his immense support through every step of the way, we could never have come this far.

We are extremely grateful to Prof. Dr. K. S. Easwarakumar, Professor of Computer Science and Engineering, Anna University, Chennai - 600025, for extending the facilities of the Department towards our project and for his unstinting support.

We express our thanks to the panel of reviewers, Dr. Arul Siromoney, Dr. Madhan
Karky, Dr. A. P. Shanthi and Ms. Suganya, for their valuable suggestions and critical reviews

    throughout the course of our project.

    We thank our parents, family, and friends for bearing with us throughout the course of our

    project and for the opportunity they provided us in undergoing this course in such a prestigious

    institution.

    Aruna P Dhivya Priya R Divya Harshini R

    ABSTRACT


The objective of an automated essay scoring system is to assign scores to essays written in an educational setting. It is a method of educational assessment and an application of Natural Language Processing, using a word-based document vector construction method and adopting Content Vector Analysis (CVA). CVA can be used in this case because the distribution of words in corpus datasets can be expected to be random in nature. Our system uses a model-based approach in order to overcome huge space, storage and training requirements. Here, we calculate the deviation of the student's essay with respect to ideally scored essays.

The system evaluates the essays based on established rubrics, viz. surface features, spelling errors, grammar mistakes and correlation with the topic. The individual raw scores so determined are then assigned weights depending on their salience. The scores are then subjected to regression techniques, using which the final score is calculated. In this manner, we can ensure that essays are graded uniformly, with equity and less fatigue.

Contents

Certificate
Acknowledgements
Abstract (English)
Abstract (Tamil)
List of Figures
List of Tables

1 INTRODUCTION
1.1 Basic Cryptography
1.1.1 Symmetric Key Cryptography
1.1.2 Public Key Cryptography
1.1.2.1 Encryption
1.2 Other Flavours of Cryptography
1.2.1 Identity-Based Cryptography
1.3 Provable Security

2 PRELIMINARIES
2.1 Definitions
2.1.1 Bilinear Pairing
2.1.2 Hardness Assumptions
2.1.2.1 Discrete Logarithm Problem
2.1.2.2 Computational Diffie-Hellman Problem
2.1.2.3 Decisional Diffie-Hellman Problem
2.1.2.4 Bilinear Diffie-Hellman Problem
2.1.2.5 Decisional Bilinear Diffie-Hellman Problem

3 RELATED WORK
3.1 Encryption
3.1.1 Formal Model of Encryption
3.1.2 Security of Encryption Schemes
3.1.2.1 IND-CPA Game
3.1.2.2 IND-CCA Game
3.1.2.3 IND-CCA2 Game
3.2 Identity-Based Encryption
3.2.1 Boneh-Franklin IBE
3.2.2 Other ID-Based Schemes
3.3 Proxy Re-Encryption
3.4 Digital Signatures

4 REQUIREMENT ANALYSIS
4.1 Product Perspective
4.2 Product Functionality
4.3 User Characteristics
4.4 Class Diagram
4.5 Sequence Diagram

5 SYSTEM DESIGN
5.1 System Architecture
5.2 Encryption Scheme - A Variant of the Twin Boneh-Franklin Scheme
5.3 Signature Algorithm - Hess Signature Scheme
5.4 Proxy Re-encryption Scheme

6 SYSTEM DEVELOPMENT
6.1 Scheme Implementation
6.1.1 Tools Used for Implementation
6.2 Testing

7 RESULTS AND DISCUSSIONS
7.1 Screenshots

8 CONCLUSIONS
8.1 Contributions

References

List of Figures

4.1 Use Case Diagram
4.2 Class Diagram
4.3 Registration Process
4.4 Encrypting Questionnaire
4.5 Conducting Exam
4.6 Encrypting Answer Script
4.7 Re-encryption and Evaluation Phase
5.1 System Architecture
6.1 Accessing a Different Center's Question
6.2 Tampering Marks
6.3 Unsuccessful Decryption
7.1 User Interface
7.2 Login Page
7.3 Student Interface
7.4 Questions Display
7.5 Answer Script with DUMMY ID
7.6 Faculty Interface
7.7 View Result

List of Tables

3.1 Encryption Schemes
3.2 Signature Schemes
6.1 Test Cases

    CHAPTER 1


    INTRODUCTION

Writing tests are increasingly being included in large-scale assessment programs and high-stakes decisions. However, Automated Essay Scoring (AES) systems, developed to overcome issues of marker inconsistency, volume, speed, cost and so on, also raise issues of score validity. In order to fill a crucial gap identified in the current approaches used to evaluate AES systems, we propose a framework, drawing upon the current theory of validation, for assessing the validity of scores produced by AES systems in a systematic and comprehensive manner.

MOTIVATION:

With the advent of online examinations like the GRE, GMAT and CET4, there has been an increasing call for automation in the scoring process. Scoring of objective questions is comparatively simple and has existed for years, but the evaluation of essays is still carried out manually in most cases. This is because of the high complexity involved in programming a system that performs as well as a human in its cognition. With the evolution of advanced text database practices and Natural Language Processing (NLP) techniques, this has become possible of late.

Any automated essay grading system should offer several salient features, most importantly the following:

1. Speed: The score is generated in a matter of seconds, as against time-consuming manual correction.

2. Ease/Less fatigue: The process is made easy by automation, as against the laborious manual task.

3. Equity: No place for unjust favoring, unfair partiality or preference; all scores are generated without bias.

4. Uniformity: Overcomes the problem of different mindsets or attitudes of different evaluators; ensures all essays are graded with a similar outlook.

    SCOPE:


As automation has become the order of the day, with most jobs being automated, evaluating essays has been an open issue for a long time. Since the late 1960s, systems have been developed to evaluate essays automatically. To address the shortcomings of the earlier systems, we adopt a new approach and propose a versatile system. All phases of the system, namely content discovery, analysis and grading of content, shall operate in an unsupervised fashion, with no need for manual assessment.

LITERATURE REVIEW:

In [1], AUTOMATED ESSAY SCORING USING KNN ALGORITHM, Lin Bin, Lu Jun, Yao Jian-Min and Zhu Qiao-Ming propose a methodology that transforms essays into vectors: the training set of essays is converted into vectors of word frequencies, which are then transformed into word weights, and these weight vectors occupy the training space. To score a test essay, it too is converted into a weight vector, and a search is conducted to find the training vectors most similar to it, as measured by the cosine between the test and training vectors. The closest matches among the training set are used to assign a score to the test essay, along the lines of Burstein.

Feature Selection for KNN:

After eliminating the stop words, the features of the essays, viz. words, phrases and arguments, are chosen. The value of each vector element is expressed by the term frequency and inverse document frequency (TF-IDF) weight. Similarity of essays is calculated with the cosine measure in the KNN algorithm. Term frequency (TF) is used to select features by predetermined thresholds. To find the highest-information features, we need to calculate the information gain for each word. Information gain (IG) for classification is a measure of how common a feature is in a particular class compared to how common it is in all other classes (i.e., IG(w) = H(C) - H(C|w), the reduction in class entropy once the presence of w is known).

K-Nearest Neighbor Algorithm for Text Categorization:

We determine the k most similar training essays as nearest neighbors of a given test essay, and assign individual scores according to the distances of the neighbors, calculated with suitable measures such as the Euclidean distance and the cosine relation. The final score is the weighted sum, as in the sketch below.
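To make the scoring step concrete, here is a minimal Java sketch. The class and variable names are ours, not the paper's, and essays are assumed to be already reduced to sparse TF-IDF weight vectors keyed by term; it ranks the training essays by cosine similarity to the test essay and returns the similarity-weighted sum of the k nearest neighbors' manual scores.

    import java.util.*;

    public class KnnScorer {

        // Cosine between two sparse term-weight vectors.
        static double cosine(Map<String, Double> a, Map<String, Double> b) {
            double dot = 0, na = 0, nb = 0;
            for (Map.Entry<String, Double> e : a.entrySet()) {
                dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
                na  += e.getValue() * e.getValue();
            }
            for (double w : b.values()) nb += w * w;
            return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
        }

        // trainVectors: one weight vector per training essay; trainScores: its manual score.
        static double score(Map<String, Double> test, List<Map<String, Double>> trainVectors,
                            List<Double> trainScores, int k) {
            // Rank all training essays by similarity to the test essay.
            double[] sim = new double[trainVectors.size()];
            Integer[] order = new Integer[trainVectors.size()];
            for (int i = 0; i < sim.length; i++) {
                sim[i] = cosine(test, trainVectors.get(i));
                order[i] = i;
            }
            Arrays.sort(order, (x, y) -> Double.compare(sim[y], sim[x]));

            // Final score: similarity-weighted sum over the k nearest neighbors.
            double num = 0, den = 0;
            for (int i = 0; i < Math.min(k, order.length); i++) {
                num += sim[order[i]] * trainScores.get(order[i]);
                den += sim[order[i]];
            }
            return den == 0 ? 0 : num / den;
        }
    }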


The major issues arise from the following limitations of the KNN algorithm:

1. Memory based: A large amount of space is required to store the entire training dataset.

2. Unreliable neighborhood (lack of overlapping results): Since the dataset usually yields a sparse matrix, there are few overlapping values, whereas similarity measures require high overlap for reliability.

3. Unsuitable for corpus datasets (due to sparseness): By the above argument, since corpus datasets are usually sparse, KNN is less suitable for them.

In [2], AUTOMATED ESSAY SCORING SYSTEM FOR CET4, Yali Li and Yonghong Yan come up with a methodology involving the following score-determining components. The surface features involve the number of characters in the document (Chars), the number of words in the document (Words), the number of different words (Diffwds), the fourth root of the number of words in the document as suggested by Page (Rootwds), the number of sentences in the document (Sents), average word length (Wordlen = Chars/Words), average sentence length (Sentlen = Words/Sents), and the number of words longer than five characters (BW5). Grammar checking uses ALEK (Assessment of Lexical Knowledge), a tool in which bigrams and trigrams of part-of-speech tag sequences are used. For sentence error detection, part-of-speech tag analysis is used. To determine the relation to the topic, two approaches are used, viz. simple comparison of keywords and Content Vector Analysis. The final score is computed by linear regression, i.e., a linear weighted sum of the several components.

The major limitations stem from linear regression and are as follows:

Incomplete description of the relationship among variables:
1. Extremes are ignored.
2. Only the mean is considered.

Sensitivity to outliers.

The precision attained by this methodology is 70.125%.

In [3], AUTOMATED ESSAY SCORING USING GENERALIZED LATENT SEMANTIC ANALYSIS, Md. Monjurul Islam and A. S. M. Latiful Hoque achieve information retrieval by Latent Semantic Analysis using Singular Value Decomposition, following the process depicted in the block diagram. It uses an n-gram-by-document matrix.


The issues that arise in performance due to SVD are:

1. Very high algorithmic complexity: O(n^2 k^3).

2. Requires a normal distribution of terms: words are required to be normally distributed across the documents, but in corpus datasets the distribution is sparse.

Yali Li and Yonghong Yan's hypothesis in [4], AN EFFECTIVE AUTOMATED ESSAY SCORING SYSTEM USING SUPPORT VECTOR REGRESSION, follows dataset construction using character n-grams over words. The key idea is to use Content Vector Analysis (CVA) rather than Latent Semantic Analysis (LSA). It uses a Support Vector Machine (SVM), which is model-based and popular in text classification problems, where very high-dimensional spaces are the norm. Support Vector Regression is used for the final score calculation. Evaluation of rhetorical arguments is also possible by treating each argument as a mini-document. The process involves vector construction for each document by extraction of words and subsequent morphological analysis, followed by frequency vector construction and finally weight assignment based on salience (relative frequency and inverse relative frequency). In CVA, the cosine relation between the test vector and each document/class vector is computed, and the class with the highest correlation is selected.
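As a minimal sketch of that CVA selection step (names are ours; we also assume one aggregated weight vector per score class, e.g. the sum of the vectors of the essays holding that score, which is one simple aggregation choice rather than the paper's prescription):

    import java.util.*;

    public class ContentVectorAnalysis {

        // Cosine between two sparse term-weight vectors.
        static double cosine(Map<String, Double> a, Map<String, Double> b) {
            double dot = 0, na = 0, nb = 0;
            for (Map.Entry<String, Double> e : a.entrySet()) {
                dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
                na  += e.getValue() * e.getValue();
            }
            for (double w : b.values()) nb += w * w;
            return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
        }

        // Returns the index of the score class whose vector correlates best with the test essay.
        static int bestClass(Map<String, Double> test, List<Map<String, Double>> classVectors) {
            int best = 0;
            double bestSim = -1;
            for (int c = 0; c < classVectors.size(); c++) {
                double s = cosine(test, classVectors.get(c));
                if (s > bestSim) { bestSim = s; best = c; }
            }
            return best;
        }
    }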

PROBLEM DEFINITION:

From the survey of the related literature, it is apparent that the development of a simple system that grades user essays with an accuracy similar to manual correction remains a great challenge in the arena of educational data mining.

Manual evaluation has its own drawbacks. It is time-consuming and requires the arduous task of reading and evaluating when the corpus is very large. There is also the additional possibility of unduly favoring a preferred candidate. When an essay is graded by different evaluators, the problem of differing mindsets is likely to arise, so essays are not graded with a uniform outlook. The existing automated systems are more complex and require a large amount of training before they can be deployed. Hence, evaluation systems must be improved to support automated grading in a faster, simpler, more scalable and equitable manner.


CONTRIBUTIONS:

Using a model-based approach:

Our system uses a model-based approach because a memory-based approach requires a large training dataset. Model-based approaches are also popular in text classification problems, where very high-dimensional spaces are the norm. In comparison to the memory-based approach, with its huge space, storage and training requirements, our model-based approach of calculating the deviation of the essay under examination from the ideally scored essays is preferable.

Salience-based correlation method:

To assess the consistency of essays with the topic, we use Content Vector Analysis (CVA) in preference to Latent Semantic Analysis (LSA). This is because LSA involves the higher algorithmic complexity of O(n^2 k^3) in SVD, and words are necessarily required to exhibit a normal distribution for good performance. CVA can be used in this case, as the distribution of words in corpus datasets can be expected to be random in nature. Salience is determined by the relative frequency of a word in the document and its inverse relative frequency over the other documents. For example, the word 'the' may appear very frequently in a given document, but its salience is very low because it appears in all the documents. If the word 'metamorphosis' occurs even a few times, it will have a high salience, because relatively few documents contain this word.

Using Ridge Regression for final score consolidation:

A complete relationship among the different variables (the variables here correspond to the different scores resulting from the various predefined rubrics) is established by means of ridge regression. This method ensures that the mean is taken into consideration along with the extremes. Ridge regression is L2-regularized linear regression, in which the final prediction is the result of a wider variety of inputs, as against a single dominant input in the case of plain linear regression. This tends to make the system more robust for generalization.
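A compact, illustrative sketch of this consolidation step follows; the names and the tiny solver are ours, not the production code path. The rubric score components of the reference essays form the rows of X, their final manual scores form y, and the weights come from the closed-form ridge solution w = (X'X + lambda*I)^(-1) X'y.

    public class RidgeRegression {

        // Solve A x = b by Gaussian elimination with partial pivoting.
        static double[] solve(double[][] A, double[] b) {
            int n = b.length;
            for (int p = 0; p < n; p++) {
                int max = p;
                for (int i = p + 1; i < n; i++)
                    if (Math.abs(A[i][p]) > Math.abs(A[max][p])) max = i;
                double[] tr = A[p]; A[p] = A[max]; A[max] = tr;
                double tb = b[p];   b[p] = b[max]; b[max] = tb;
                for (int i = p + 1; i < n; i++) {
                    double f = A[i][p] / A[p][p];
                    b[i] -= f * b[p];
                    for (int j = p; j < n; j++) A[i][j] -= f * A[p][j];
                }
            }
            double[] x = new double[n];
            for (int i = n - 1; i >= 0; i--) {
                double s = b[i];
                for (int j = i + 1; j < n; j++) s -= A[i][j] * x[j];
                x[i] = s / A[i][i];
            }
            return x;
        }

        // X: one row per reference essay, one column per rubric component; y: final manual scores.
        static double[] fit(double[][] X, double[] y, double lambda) {
            int d = X[0].length;
            double[][] A = new double[d][d];   // accumulates X'X, then + lambda*I
            double[] b = new double[d];        // accumulates X'y
            for (int i = 0; i < X.length; i++)
                for (int j = 0; j < d; j++) {
                    b[j] += X[i][j] * y[i];
                    for (int k = 0; k < d; k++) A[j][k] += X[i][j] * X[i][k];
                }
            for (int j = 0; j < d; j++) A[j][j] += lambda;  // the L2 penalty term
            return solve(A, b);
        }

        // Final score of a test essay: weighted sum of its rubric score components.
        static double predict(double[] w, double[] components) {
            double s = 0;
            for (int j = 0; j < w.length; j++) s += w[j] * components[j];
            return s;
        }
    }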

    ORGANIZATION OF THIS THESIS

The remainder of this thesis is structured as follows. Chapter 2 gives a requirement analysis of the proposed system, covering all functional and non-functional requirements, following which the system use cases are presented. Chapter 3 gives an overview of the system design and its architecture. In Chapter 4, the algorithms and techniques employed are discussed. Chapter 5 discusses the performance evaluation of the proposed system in comparison with baseline methods. The thesis ends by summarizing the conclusions obtained, along with pointers to future research, in Chapter 6. The references for this research are presented in Chapter 7, followed by the snapshots in Appendix A.

REQUIREMENT ANALYSIS:

FUNCTIONAL REQUIREMENTS:


The proposed system is designed to have the following features:

It evaluates the essay along 4 dimensions, viz. grammar, spelling, correlation with the topic and surface features.

For grammar checking, we use jlinkgrammar, a grammatical system that classifies natural languages by designating links between sequences of words. Instead of using rule-based part-of-speech tags to parse sentences, it uses links to create a syntactic structure for a language.

Spelling mistakes are identified, and thereby automatically corrected, using Peter Norvig's spell correction method, which uses probabilistic and Bayesian reasoning in its implementation.
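As commonly presented, the Bayesian rule behind Norvig's method selects correction(w) = argmax over candidates c of P(c) * P(w | c): the chosen candidate must be both a plausible word (the language model P(c), estimated from corpus frequencies) and a plausible source of the observed misspelling w (the error model P(w | c), which Norvig's simple implementation approximates by preferring candidates at a smaller edit distance).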

NONFUNCTIONAL REQUIREMENTS:

User Interface Design:

For information retrieval, stemming and stop-word removal do not require any specialized interfaces. The system operates as a standalone application. The system provides interfaces for questions/answers to assess the learner's competency level. Interfaces are also provided for presenting the details of the score assignment process to the learner in the form of JFrames. In addition, the interface includes provision for specifying the target corpus.

    2.2.2 Documentation

The system is properly documented. All requirements (functional and non-functional), use case diagrams and their descriptions, the various packages used and the relevant tools employed form a part of the system documentation. The source code listing has also been documented, to serve as a reference for future developers and contributors.

    2.2.3 Hardware Considerations

    The following hardware considerations are identified:

Operating System: Ubuntu
Processor: Pentium 2.0 GHz or higher
RAM: 256 MB or more
Hard Drive Space: 10 GB or more

    2.2.4 Performance Requirements

The performance of the system is evaluated against the baseline method of manual grading of essays with respect to 4 parameters, viz. grammar, spelling, correlation with the topic and surface features.

Error Handling

The following errors are possible in each of the modules:

Null entry in the text area:

When the user enters nothing in the essay text area, a message box appears the first time, prompting him to key in the essay. This ensures that he does not submit an essay by clicking the Submit button unknowingly. If the action is repeated, the null essay is accepted and assigned a score of zero.

Content Vector Analysis:


In case there is a total absence of relation between the essay question and the candidate's answer, a score of zero is assigned, notwithstanding the performance in the grammar, spelling and surface feature aspects.

    CONSTRAINTS and ASSUMPTIONS:

Language Constraints: The language of consideration is English.

Assumption:

A manually pre-scored corpus of reference essays is assumed to be present before the candidate enters the essay of any particular question for a prompt score. The manually corrected essays are assumed to have been evaluated, in an error-free manner, on the basis of the 4 holistic rubrics mentioned above.

SYSTEM MODELS:

Use Case Model and Scenarios

The various use cases in Figure x.x are elaborated in this section.

Use case: Create Essay Question
ID: 001
TITLE: Create essay question
DESCRIPTION: The question that the candidate is required to answer is created by the administrator.
ACTORS: Admin
PRE CONDITIONS: The admin should have logged in.
POST CONDITIONS: The question flashes on the user interface.

Use case: Input Test Essay
ID: 002
TITLE: Input Test essay
DESCRIPTION: The student enters his response to the required question in the text area.
ACTORS: Student
PRE CONDITIONS: The student should have logged in and the question should have been prompted on the screen.
POST CONDITIONS: The essay, upon submission, gets stored in the required file.

    Use case : Store Essay

    ID: 003

    TITLE: Store Essay

DESCRIPTION: The essay that the student enters gets stored at the required file location.
ACTORS: Test essay db

    PRE CONDITIONS: The student should have entered the essay and clicked the Submit button.

    POST CONDITIONS: The essay contents get copied to the file in the desired location.

    Use case : Input Reference Essay

    ID: 004

    TITLE: Input Reference Essay


DESCRIPTION: The admin enters the name of the folder containing the pre-scored, manually graded essays.
ACTORS: Admin
PRE CONDITIONS: The reference essays must be available in the folder, having been graded manually.
POST CONDITIONS: The reference essays will be ready for comparison with the test essay.

    Use Case: Text Processing

    ID: 005

TITLE: Text Processing
DESCRIPTION: Stemming and stop-word removal are performed on the reference and test essays.
ACTORS: Reference Essay db, Test Essay db
PRE CONDITIONS: Simple feature extraction, spell check with auto-correction, and grammar check should have been performed.
POST CONDITIONS: The keywords of the reference and test essays are displayed.

Use Case: Generate Individual Score
ID: 006
TITLE: Generate Individual Score
DESCRIPTION: Depending on the rubrics, the essays are evaluated and suitable marks are awarded for each.
ACTORS: Reference Essay db, Test Essay db

    PRE CONDITIONS: The text from test and reference essays must have been processed.

    POST CONDITIONS: The individual score components are recorded.

    Use Case: Display Grade

    ID: 007

TITLE: Display Grade
DESCRIPTION: The individual scores are combined and the overall score range is estimated.
ACTORS: Score db
PRE CONDITIONS: The individual scores must have been generated.
POST CONDITIONS: The final score range is displayed on the screen.


    CHAPTER 3

    DESIGN

    This chapter gives the detailed design description of modules in the system.

3.1 Data Flow Diagram:

The Data Flow Diagram in Figure x.x lists the various stages involved in the implementation of the system. The relationships between them are also shown in Figure 3.1.

    FIGURE x.x.1: DFD Level-0


    FIGURE x.x.2: DFD Level-1

FIGURE x.x.3: DFD Level-2

3.2 USER INTERFACE DESIGN


The system uses JFrame for essay input. It also uses JFrames for displaying the different stages of the evaluation and for displaying the score. Text areas, text boxes, message boxes and a tabbed pane are used for user interactivity.

    3.3 OVERALL SYSTEM ARCHITECTURE

    The proposed system shown in Figure x.x is composed of the following modules:

    FIGURE 3.2: System Architecture

    3.4 MODULE DESCRIPTIONS

3.4.1 SIMPLE FEATURE EXTRACTION:

INPUT: Test Essay
OUTPUT: Text Complexity Feature Score (Component 1)

The system first evaluates text complexity features, such as the number of characters in the document (Chars), the number of words in the document (Words), the number of different words (Diffwds), the fourth root of the number of words in the document as suggested by Page (Rootwds), the number of sentences in the document (Sents), average word length (Wordlen = Chars/Words), average sentence length (Sentlen = Words/Sents) and the number of words longer than five characters (BW5). Each feature has its own use. For example, the number of words represents the length of the essay, since the length requirement is, say, 250-300 words. This feature can catch an empty essay, or an essay so short that it cannot be processed, and reject it immediately; otherwise a score can be assigned accordingly.
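A minimal sketch of this extraction step follows; the naive sentence split on period/question/exclamation marks is our simplification, and the real segmentation may differ.

    import java.util.*;

    public class SurfaceFeatures {

        public static Map<String, Double> extract(String essay) {
            if (essay == null || essay.trim().isEmpty())
                throw new IllegalArgumentException("empty essay: rejected immediately");
            String[] words = essay.trim().split("\\s+");
            String[] sents = essay.split("[.!?]+");
            Set<String> distinct = new HashSet<>();
            int longWords = 0;
            for (String w : words) {
                distinct.add(w.toLowerCase());
                if (w.length() > 5) longWords++;
            }
            Map<String, Double> f = new LinkedHashMap<>();
            f.put("Chars",   (double) essay.length());
            f.put("Words",   (double) words.length);
            f.put("Diffwds", (double) distinct.size());
            f.put("Rootwds", Math.pow(words.length, 0.25));           // Page's fourth root
            f.put("Sents",   (double) sents.length);
            f.put("Wordlen", (double) essay.length() / words.length); // Chars/Words
            f.put("Sentlen", (double) words.length / sents.length);   // Words/Sents
            f.put("BW5",     (double) longWords);
            return f;
        }
    }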


3.4.2 GRAMMAR/SPELL CHECK:

INPUT: Test Essay
OUTPUT: Spell Check Score (Component 2)

Once the essay passes the feature extraction process, the next step is to check it for spelling mistakes. The number of spelling mistakes is recorded, and the errors are auto-corrected.
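For illustration, a Norvig-style correction pass can be sketched as below. WORDS is assumed to be a word-to-corpus-frequency map loaded elsewhere (Norvig builds his from a large text file), and, as a simplification, only candidates one edit away are considered (Norvig's original also falls back to edit distance two).

    import java.util.*;

    public class SpellCorrector {

        static Map<String, Integer> WORDS = new HashMap<>(); // word -> corpus frequency (loaded elsewhere)

        // All strings one edit away: deletions, transpositions, replacements, insertions.
        static Set<String> edits1(String w) {
            Set<String> out = new HashSet<>();
            String letters = "abcdefghijklmnopqrstuvwxyz";
            for (int i = 0; i <= w.length(); i++) {
                String l = w.substring(0, i), r = w.substring(i);
                if (!r.isEmpty()) out.add(l + r.substring(1));                           // delete
                if (r.length() > 1) out.add(l + r.charAt(1) + r.charAt(0) + r.substring(2)); // transpose
                for (char c : letters.toCharArray()) {
                    if (!r.isEmpty()) out.add(l + c + r.substring(1));                   // replace
                    out.add(l + c + r);                                                  // insert
                }
            }
            return out;
        }

        static String correct(String word) {
            if (WORDS.containsKey(word)) return word;          // already spelled correctly
            String best = word;
            int bestFreq = 0;
            for (String cand : edits1(word))                   // among known words one edit away,
                if (WORDS.getOrDefault(cand, 0) > bestFreq) {  // prefer the most frequent candidate
                    best = cand;
                    bestFreq = WORDS.get(cand);
                }
            return best;                                       // unchanged if no known candidate
        }
    }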

INPUT: Auto-Corrected Test Essay
OUTPUT: Grammar Check Score (Component 3)

The essay is then checked for grammatical mistakes using jlinkgrammar, which works on the basic principle of linking. It uses probabilistic parsing and deduces the number of linkage errors, from which the potential grammar errors in the passage (sent through the batch file) are identified; based on this, a score component is assigned.


3.4.4 FINAL SCORE USING REGRESSION:

The individual raw scores, namely from the feature extraction process, the grammar/spell check process and the content vector analysis process, are taken and weights are assigned to each component. The scores are then subjected to ridge regression, using which the final score is calculated.

    IMPLEMENTATION


    This chapter explains the details of implementation of all modules of the proposed

    system.

    4.1 IMPLEMENTATION DETAILS

The system is implemented in Java.

Java JFrames were used for the user interface design of the system. Netbeans was the IDE of choice for Java. For grammar checking, the system was coded using the features of the jlinkgrammar tool. The Stanford Tagger is used to find the number of verbs for surface feature analysis.
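A sketch of that verb-counting step with the Stanford POS Tagger follows; the model file path is an assumption, tagString returns tokens in word_TAG form, and Penn Treebank verb tags all begin with VB.

    import edu.stanford.nlp.tagger.maxent.MaxentTagger;

    public class VerbCounter {

        public static int countVerbs(String text) {
            // Model path is illustrative; point it at the tagger model shipped with the library.
            MaxentTagger tagger = new MaxentTagger("models/english-left3words-distsim.tagger");
            int verbs = 0;
            for (String token : tagger.tagString(text).split("\\s+"))
                if (token.substring(token.lastIndexOf('_') + 1).startsWith("VB"))
                    verbs++;
            return verbs;
        }
    }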

Tools used in the implementation:

Packages: Stanford Parts-of-Speech Tagger
Dictionary: Peter Norvig's essay (spelling correction)
IDE for Java: Netbeans
Grammar check: jlinkgrammar

    TEXT PROCESSING DETAILS:

    3.1 Stop Word Removal

Many of the most frequently used words in English are useless in Information Retrieval (IR) and text mining. These words are called 'stop words'. Stop words, which are language-specific functional words, are frequent words that carry no information (i.e., pronouns, prepositions, conjunctions). In the English language, there are about 400-500 stop words; examples include 'the', 'of', 'and' and 'to'. The first step during preprocessing is to remove these stop words, which has proven to be very important.

The present work uses a stop-word list customized by us.
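A minimal sketch of this step, with a small illustrative stop list standing in for our customized 400-500 word list:

    import java.util.*;

    public class StopWordFilter {

        // Illustrative subset only; the real system loads the full customized list.
        static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "the", "of", "and", "to", "a", "in", "is", "it", "that", "for"));

        public static List<String> removeStopWords(String text) {
            List<String> kept = new ArrayList<>();
            for (String tok : text.toLowerCase().split("\\W+"))
                if (!tok.isEmpty() && !STOP_WORDS.contains(tok))
                    kept.add(tok);
            return kept;
        }
    }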

3.2 Stemming

Stemming techniques are used to find the root/stem of a word. Stemming converts words to their stems, which incorporates a great deal of language-dependent linguistic knowledge. The hypothesis behind stemming is that words with the same stem or word root mostly describe the same or closely related concepts in text, and so words can be conflated by using stems. For example, the words 'user', 'users', 'used' and 'using' can all be stemmed to the word 'USE'. In the present work, the stemmer algorithm is defined and used.

3.3 Document Indexing

The main objective of document indexing is to increase efficiency by extracting from each document a selected set of terms to be used for indexing it. Document indexing consists of choosing the appropriate set of keywords based on the whole corpus of documents, and assigning weights to those keywords for each particular document, thus transforming each document into a vector of keyword weights. The weight is normally related to the frequency of occurrence of the term in the document and the number of documents that use that term.

3.3.1 Term Weighting

In the vector space model, documents are represented as vectors. Term weighting is an important concept, which determines the success or failure of the classification system. Since


different terms have different levels of importance in a text, a term weight is associated with every term as an importance indicator.

The main components that affect the importance of a term in a document are the term frequency (TF) factor and the inverse document frequency (IDF) factor. The term frequency of a word in a document (TF) is a weight that depends on the distribution of the word within documents; it expresses the importance of the word in the document. The inverse document frequency of a word in the document database (IDF) is a weight that depends on the distribution of the word across the document database; it expresses the importance of the word in the document database. TF/IDF is a technique that uses both TF and IDF to determine the weight of a term. The TF/IDF scheme is very popular in the text classification field, and almost all other weighting schemes are variants of it.

Given a document collection D, a word w and an individual document d in D, the weight of w in d is calculated as the standard TF/IDF product: weight(w, d) = tf(w, d) x log(|D| / df(w)), where tf(w, d) is the frequency of w in d, |D| is the size of the collection and df(w) is the number of documents containing w.

The result of TF/IDF is a vector of terms with their term weights. The pseudocode for the calculation of TF/IDF is shown in Fig. 2.

for each word in the WordList:
    determine TF, calculate its corresponding weight and store it in the weight matrix (WM)
    determine IDF
    if IDF == 0 then
        remove the word from the WordList
        remove the corresponding TF entry from the WM
    else
        calculate TF/IDF and store the normalized TF/IDF value
        in the corresponding element of the weight matrix

Fig. 2 Algorithm TF/IDF
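As a runnable companion to the Fig. 2 pseudocode, the following sketch (names are ours; documents are assumed to arrive as token lists, i.e. after stop-word removal and stemming) computes one TF-IDF weight vector per document and drops terms whose IDF is zero, i.e. terms present in every document:

    import java.util.*;

    public class TfIdf {

        public static List<Map<String, Double>> weigh(List<List<String>> docs) {
            // Document frequency of every term.
            Map<String, Integer> df = new HashMap<>();
            for (List<String> doc : docs)
                for (String term : new HashSet<>(doc))
                    df.merge(term, 1, Integer::sum);

            List<Map<String, Double>> vectors = new ArrayList<>();
            for (List<String> doc : docs) {
                Map<String, Double> tf = new HashMap<>();
                for (String term : doc) tf.merge(term, 1.0, Double::sum);
                Map<String, Double> vec = new HashMap<>();
                for (Map.Entry<String, Double> e : tf.entrySet()) {
                    double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                    if (idf > 0)                        // IDF = 0: term occurs in every document
                        vec.put(e.getKey(), (e.getValue() / doc.size()) * idf);
                }
                vectors.add(vec);
            }
            return vectors;
        }
    }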

RESULTS AND DISCUSSION

TEST RESULTS AND ANALYSIS

    Test case Id: AES1

    Module being tested: Essay_entry UI

    Test case Description:

    This test case verifies if the essay is keyed in or not.

    Flow of Events:

1. A question pertaining to a topic appears in the interface.
2. In case the user's answer is null, the user is prompted again to make his entry.

    Expected Results:


    The essay, if null, gets invalidated.

    Exceptions:

If he continues doing the same a second time, the null entry is accepted and given a score of 0.

    Test Result: PASS

    Comments and Bugs(if any) identified: NIL

    Test case Id: AES2

    Module being tested: Response Recording

    Test case Description:

    This test case verifies if the essay entered by the user gets stored in a dedicated text file.

    Pre-condition:

    The user provides an answer to the essay question.

    Flow of Events:

1. The user submits the essay after entering it on the UI.
2. If it is non-null, the essay gets recorded in a text file.

    Expected Results:

    The contents of the essay entered in the UI are copied on to the text file at the desired location.

    Exceptions:

If the essay is null, processing terminates after assignment of a score of zero.

    Test Result: PASS

    Comments and Bugs(if any) identified: NIL

    Test case Id: AES4

    Module being tested: Grammar and spell check

    Test case Description:


This test case verifies that the essay entered by the user is evaluated according to the number of spelling and grammar errors.

    Pre-condition:

    The user provides an answer to the essay question.

    Flow of Events:

1. The essay is checked for spelling errors and auto-corrected using Peter Norvig's method.
2. The number of such spelling errors is recorded and contributes a percentage of negative score.
3. The test essay is then checked for grammar errors using jlinkgrammar.
4. The number of grammar mistakes is recorded and a negative score is allotted correspondingly.

    Expected Results:

    The score components for Grammar and spelling are recorded.

    Exceptions:

    NIL

    Test Result: PASS

    Comments and Bugs(if any) identified: NIL

    Test case Id: AES5

    Module being tested: CVA

    Test case Description:

    This test case verifies if the most similar essay is determined using CVA.

    Pre-condition:

    The reference corpus contains the prescored essays.

    Flow of Events:

    1. The documents are indexed along with the test essay document.


2. Keyword extraction occurs after stemming and stop-word removal.
3. The raw frequencies of the keywords in all the documents are determined.
4. The relative frequencies are found using TF-IDF computation, and the weighted term-document matrix (TDM) is formed.
5. The similarity level between the test essay and the reference essays is computed by cosine computation of vectors, in which each column of the weighted TDM is treated as a vector.
6. The most similar document is identified and the corresponding score is allotted.

    Expected Results:

    A specific component of the score is allotted based on the relation with the reference documents.

    Exceptions:

If the user's response is totally off-topic, it is given a score of 0.

    Test Result: PASS

    Comments and Bugs(if any) identified: NIL

CHAPTER 6
CONCLUSION

6.1 OVERALL CONCLUSION

In this work, we have presented a novel framework for the automatic evaluation of essays. For the grammatical checking component, we use linkages to detect errors, which is more reliable than the traditional method of defining rules for checking grammar: instead of using rule-based part-of-speech tags to parse sentences, it uses links to create a syntactic structure for a language. For the topic detection component, we use the CVA model, and find that it can effectively detect whether an essay is off-topic, especially over large numbers of essays. The final score, computed using ridge regression, is influenced by a number of factors rather than being overly affected by a single one.

6.2 FUTURE WORK

In our work we have restricted ourselves to the English language, but the system can be extended to cater to other languages using suitable dictionaries. Also, for descriptive science essays,


    provision to include and evaluate the equations and formulae can be made as an additional

    enhancement.

Also, in addition to checking the correlation between documents using exact words, similar words can be derived with the aid of a thesaurus, and the check can be performed for greater accuracy.

    REFERENCES

[1] Lin Bin, Lu Jun, Yao Jian-Min and Zhu Qiao-Ming, "Automated Essay Scoring Using KNN Algorithm", International Conference on Computer Science and Software Engineering, 2008.

[2] Yali Li and Yonghong Yan, "Automated Essay Scoring System for CET4", Second International Conference on Education Technology and Computer Science, 2010.

[3] Md. Monjurul Islam and A. S. M. Latiful Hoque, "Automated Essay Scoring Using Generalized Latent Semantic Analysis", 13th International Conference on Computer and Information Technology, 2010.

[4] Yali Li and Yonghong Yan, "An Effective Automated Essay Scoring System Using Support Vector Regression", Fifth International Conference on Intelligent Computation Technology and Automation, 2012.

[5] S. Dikli, "An Overview of Automated Scoring of Essays", Journal of Technology, Learning, and Assessment, 5(1), 2006. Retrieved from http://www.jtla.org.

[6] J. Burstein, K. Kukich, S. Wolff, C. Lu, M. Chodorow, L. Braden-Harder and M. Dee Harris, "Automated Scoring Using A Hybrid Feature Identification Technique", Proceedings of the Annual Meeting of the Association for Computational Linguistics, 1998.

    SNAPSHOTS:


