Computers in Industry 64 (2013) 598–608

Designing and evaluating a system of document recognition to support interoperability among collaborative enterprises

Usman Wajid *, Abdallah Namoun, Cesar A. Marín, Nikolay Mehandjiev

The University of Manchester, Booth Street West, M15 6PB, United Kingdom

ARTICLE INFO

Article history:

Received 29 August 2012

Received in revised form 27 November 2012

Accepted 5 March 2013

Available online 30 March 2013

Keywords:

Enterprise interoperability

Document recognition

ABSTRACT

A common understanding of business documents is important for realizing interoperability among collaborating enterprises. In this paper we report on the design and evaluation of a system that can help collaborating enterprises to efficiently recognise and align business documents (e.g. incoming emails and attachments) according to the enterprises’ local document repository. The system relies on the interplay between automatic recognition of business documents and human intervention by means of an assistive mapping tool. Our findings show that the balance between automatic recognition and human intervention ensures greater levels of accuracy in the system. We evaluate our system from both usability and accuracy perspectives and argue that some of the lessons learned and design decisions we took are applicable to the general mixed initiative tools for end user support.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

The dynamic nature of modern businesses often requires small and medium sized enterprises to participate in agile networks of collaborative enterprises, where each member of the network provides specialised products, technologies and services. To effectively take part in such networks, enterprises must operate in a manner that permits them to be seamlessly interoperable with partner enterprises. Interoperability is a loosely defined term in the business domain, generally associated with the use of IT systems and services that allow exchange and use of information without restrictions.

In this paper, we focus on semantic interoperability within collaborative enterprises, which relates to the ability of an enterprise to meaningfully interpret exchanged information in order to produce useful results. Information in collaborating enterprises is usually exchanged by means of business documents (documents hereafter), e.g. order, invoice, inquiry etc., sent as emails and their attachments, so handling such documents is an important function for any enterprise. However, manual processing of documents can be quite cumbersome, error prone, and time consuming. For example, imagine an employee of a medium-sized enterprise who receives around 90 documents daily in different formats such as emails and Word and Excel files attached to emails. The employee is asked to read, understand and act upon these documents by sending them to the appropriate departments within the enterprise for further processing. This laborious task of recognizing or determining the nature of incoming documents can be made more efficient by automating the document analysis and recognition process.

* Corresponding author. Tel.: +44 161 306 3331.

E-mail addresses: [email protected] (U. Wajid), [email protected] (A. Namoun), [email protected] (C.A. Marín), [email protected] (N. Mehandjiev).

0166-3615/$ – see front matter © 2013 Elsevier B.V. All rights reserved.

http://dx.doi.org/10.1016/j.compind.2013.03.003

Document recognition is often seen as a process of converting paper documents into digital images and performing certain operations on that data [13,14]. However, in this paper we use the term document recognition for a process of identifying the nature and rationale of documents (transacted through electronic means such as email) within the local context of an enterprise so that the document can effectively serve its purpose. For any enterprise, automation of such a process can lead to better performance and efficient communication with customers and suppliers.

In this context, we are interested in how mutual understanding can be established on the type of documents that are exchanged by collaborating enterprises. However, we acknowledge that the dynamic and open nature of the modern business domain, as well as the sensitivity involved in dealing with business information, renders a completely automated document recognition system unsuitable. Hence, we resolved to create a system that performs automatic document recognition while allowing a certain level of manual intervention in situations where the automatic recognition is not enough, can be improved, or when the localised knowledge used in the document analysis should be extended to accommodate new business services, documents or relationships. In this paper we report on the design and evaluation of our system, which preserves the benefits of both automation and manual intervention.

Fig. 1. Conceptual architecture of document recognition system.

The conceptual architecture of our system (shown in Fig. 1) is based on the interplay between an automatic mechanism and manual intervention. As shown in Fig. 1, any new document received by the enterprise passes through the automatic recognition mechanism, which determines the type of the incoming document before storing it in the local repository. At any time a user can use the visual data mapping tool (VDMT) to view and modify a document. After the manual intervention (facilitated by the VDMT) the user may either send the document through the automatic recognition mechanism or manually classify it as an instance of a particular type (e.g. order, invoice etc.) before storing it in the local repository.

Our automatic document recognition mechanism is developed based on the ideas presented in [8]. We chose this technique due to (a) its inherent applicability to a business domain, (b) its convenience in using any document’s own semantics, and (c) its ability to achieve promising results as shown in [8]. Basically, the automatic mechanism finds similarities among a set of documents used by an enterprise and automatically recognises each of them as an instance of a particular document type, thus reducing the complexity and effort required to manually process incoming or existing documents. Our automatic mechanism does not impose any centralised agreement on the use of pre-agreed document types or syntax by collaborating enterprises. Instead, it analyses the incoming documents for patterns of occurrence of significant information items (such as address, item number, bank account information, etc.) that can help in the recognition of a document, thus reasoning with the information content, not the document syntax. For example, suppose our partner enterprise sends an ‘order’ to our enterprise. We do not look at the document specification as referred to in any external document repository (as “order”), since this may not make sense in the context of our enterprise. Our mechanism recognises the document based on our internal document repository (e.g. as a “purchase request”), using the semantics of the information items contained in the document, such as item codes and descriptions, customer account, shipping address, etc. Further details about the techniques underlying the automatic document recognition mechanism can be found in [8], while here we focus on the interplay between the automatic mechanism and manual intervention and how user involvement informed the design of the VDMT.
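The idea of recognising a document from the information items it contains, rather than from its declared name or syntax, can be illustrated with a minimal sketch. The actual algorithm is the one described in [8] and is not reproduced here; the sketch below substitutes a plain Jaccard set overlap as an assumed similarity measure, and the document type names and item sets are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets of information items, in the range [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_document_types(
    items_in_document: Set[str],
    local_types: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Rank the enterprise's own document types by similarity to the document."""
    scores = [
        (name, jaccard(items_in_document, items))
        for name, items in local_types.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical local repository: names and item sets are illustrative only.
LOCAL_TYPES = {
    "purchase request": {
        "item code", "item description", "customer account", "shipping address",
    },
    "invoice": {"item code", "bank account", "customer account", "amount due"},
}

# Items found in a partner's incoming 'order'; its external name is ignored.
incoming = {"item code", "item description", "shipping address"}
ranking = rank_document_types(incoming, LOCAL_TYPES)
print(ranking[0][0])  # best matching local document type
```

In this toy setting the partner's 'order' is matched to the local "purchase request" type purely through shared information items, which mirrors the example in the text.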

The main function of the VDMT is to facilitate manual intervention and complement the automatic mechanism. In essence the VDMT is envisioned to enable individual enterprises to manually interpret and recognise documents as well as create and maintain a repository of document types adapted to their needs. The VDMT was created using a user-centric design process involving actual users from various enterprises.

In this paper, we evaluate our system from both perspectives, the accuracy of the automatic recognition mechanism as well as the user-based evaluations, and argue that some of the lessons learned and design decisions we took are applicable to the general type of mixed initiative tools for end user support in the business as well as industrial domain. Here mixed initiative tools broadly refer to a set of tools that explicitly support an efficient and natural interleaving of contributions by users and automated services to achieve fluid collaboration between users and computers that can solve difficult challenges [12].

2. Techniques and tools for facilitating interoperability through document recognition

Research efforts on semantic interoperability typically focus on fixing a unique meaning (semantics) through a commitment to a common ontology, which is used as metadata that is shared by all collaborating entities. In this respect, many proponents of the Semantic Web seek a universal medium for information exchange based upon XML syntax. This has given rise to standards like the Resource Description Framework (RDF) and its elaboration in the Web Ontology Language (OWL).

The predominant use of ontology to foster semantic interoperability is also reflected by the numerous research efforts and the development of software tools for ontology alignment (see [10]). For instance, [11] provides an overview of the multiple-view tool AlViz, which supports the visual alignment of two ontologies by making the type of similarity between entities explicit. However, developing domain specific ontologies requires technical skills and specialised knowledge (about local information structures and appropriate mapping rules) in order to align the centralised and local ontologies. We expect that such skills and specialised knowledge are difficult to find in our target users, who have a business background and little technical know-how. Moreover, the agreement on a common semantic model (ontology) is not well suited for distributed environments [1].

Furthermore, in collaborative enterprises establishing a common understanding on exchanged documents is often achieved by employing tools that are based on the ‘usage’ of a central canonical model of existing document types and data formats. Here, ‘usage’ refers to (a) adapting an SME’s internal document structures to the canonical model and (b) transforming documents between two collaborating SMEs. In order to address the first issue, Electronic Business using eXtensible Markup Language (ebXML) [4] has been formulated as a medium for interoperability in electronic businesses. In parallel, a canonical model that should be used by collaborating parties was also formulated. This canonical model is referred to as the Core Components Technical Specification (CCTS) [3].

Although the formulation of ebXML and CCTS was crucial, there is a huge gap regarding their adoption by SMEs. A variety of tools have been created in order to bridge this gap. For example, ebMail [4] is a platform-neutral GUI system that provides an email-client-like user interface to help users with minimal knowledge of ebXML to engage in B2B activities. The system allows users to communicate with other partner enterprises using ebXML by means of an easy to use GUI. However, as in the case of using a centralised ontology, centralised agreements on document types can impede operation in agile networks of SMEs, since a centralised set of agreed document types would make the introduction of a new document type a laborious and long-winded process of agreeing an extension to an already agreed standard.

An industry level solution for automated document exchange is NEXUSe2e [6]. NEXUSe2e provides XML based messaging using ebXML and other protocols to integrate business processes that span multiple companies. Its main use is for supply chain integration, e.g. for exchanging orders and other business documents in electronic form. However, the core idea behind NEXUSe2e is not to facilitate the recognition of different types of business documents exchanged within collaborative enterprises but to enable enterprises to exchange information, which can be difficult to understand by ordinary/non-technical users.

In the area of document recognition, early work in [14] presents a system for automatically reading (paper based) office documents using document segmentation, classification, character recognition and logo identification. However, the proposed system is not designed to classify or identify the types of documents, as it mainly focuses on interpreting paper documents and transforming them into digital objects.

Elsewhere, the Text Map Explorer tool discussed in [9] allows users to group documents and visually explore the relationships among different documents. However, the tool does not allow users to (a) alter or manipulate the documents according to local needs and (b) consider user preferences while determining the relationships between documents. In this respect, the Text Map Explorer tool lacks the flexibility required for dealing with different types of documents.

In other work, the Pan-European Public Procurement On-Line project (PEPPOL) aims to facilitate the automated procurement processes of European SMEs [17]. The project covers the entire procurement process including a document repository that carries the necessary documents (e.g. certificates and attestations) required for the qualitative selection of tenders. However, the project is designed to facilitate the procurement process specifically for the European Commission and does not focus on enabling SMEs to automatically interpret and recognise the different types of documents.

The OASIS Framework in [19] describes the adoption of a common European standardised interchange format to enable a minimum level of interoperability between civil protection agencies. However, the proposed framework has limited applicability in the enterprise domain since the OASIS framework is not designed to deal with business documents. Rather, the information to be exchanged in OASIS is encoded in XML Schema, which is then passed to different entities in the system.

Another related effort to document recognition focuses primarily on CASE tools [18]. The proposed approach enhances reusability of UML diagrams among differing UML modelling tools. However, the approach works with UML diagrams and does not apply to documents of different types.

The approaches described above were designed to work in specific domains (e.g. e-procurement, UML modelling, etc.), which renders them unsuitable for use in any other domain. More importantly, the key feature of these approaches is not to interpret and identify document types, which rules out a direct comparison with the system presented in this paper.

At the commercial level, several systems are available for automatic document recognition, e.g. [15,16]. However, we do not consider them direct competitors to our system since they are either designed for heavy volume users (such as corporations or government agencies), not flexible enough to address the specific needs of SMEs from different sectors, or come at a cost that is not viable for SMEs. In comparison to commercial solutions, our system is available free of cost as an open source resource, has been developed and tested based on requirements from collaborating SMEs and has the ability to adapt according to the specific needs of individual SMEs.

The overview of existing work reveals that existing approaches lack, in one way or another, the support needed by SMEs to deal with documents of different types. The main weaknesses or limitations of existing work/approaches are:

• Use of complex technologies which are not easy to understand and use by ordinary non-technical users.

• Cost that is not viable for SMEs.

• Suitability for large enterprises thus impeding their adoption by SMEs [2].

On the other hand, the system presented in this paper is designed to be used by users with limited technical know-how, is available free of cost and is suitable for SMEs, which can tune it to suit their specific needs and work practices.

In Table 1 we organise the existing work on semantic interoperability and document recognition as discussed above and more. We then highlight the salient advantages of the existing work as well as their underlying limitations.

3. Design of visual data mapping tool

To acquire users’ point of view about the desired features of the visual data mapping tool we consulted users from various European SMEs during the design phase of the tool. The process of gathering requirements and using them to design the tool is described below.

3.1. Requirement elicitation process

Since our document recognition system relies on the use of a shared set of information items that are seen as building blocks of documents, initially we intended to use the information items and structures defined in a widely used standard for inter-organisational interoperability, such as UN/CEFACT [3]. We therefore selected a generic set of the UN/CEFACT core components based on their suitability for the document samples and case studies acquired from our partner SMEs. These core components were to serve as the building blocks for documents. This was important since our users needed to comprehend and annotate the key information items in the documents using the VDMT; the annotations are considered by the automatic recognition mechanism when it tries to recognise the type of the document.

We thus set up a user study aiming to validate the degree of comprehension of our selected set of core components by target users. During the study we presented the set of core components to users and asked them to map individual core components to key information items in sample documents. However, during the pilot study it became clear that the core components standard uses an information-based structure, making it incomprehensible to our non-technical target users.

Therefore, we re-designed the study to elicit which units of information were sought by our end users when they analyse and recognise a document, and then translate these units into a user-centric classification of information items. This classification was then mapped to the standard core components set (as shown in Fig. 2). During this process it became evident that some user specified information items did not match any standard core components. These items were treated as new core components and were included in the core components set to be used by the VDMT. This procedure was also prescribed to be followed in future when new core components are identified or specified by the users during day to day document processing activities.
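The extension procedure described above can be sketched as follows. This is a minimal illustration under assumptions: the function name, the item names and the component names are hypothetical, and the matching itself (done with users in the study) is represented simply as a lookup table.

```python
from typing import Dict, Optional, Set


def extend_core_components(
    user_items: Dict[str, Optional[str]],
    standard_components: Set[str],
) -> Set[str]:
    """Return the core component set to be used by the tool.

    user_items maps each user-specified information item to the standard
    core component it was matched with, or None when no match was found.
    """
    components = set(standard_components)
    for item, matched in user_items.items():
        if matched is None or matched not in standard_components:
            # No standard counterpart: treat the user's item as a new core component.
            components.add(item)
    return components


# Hypothetical names, for illustration only.
STANDARD = {"Address", "Identifier", "Amount"}
USER_ITEMS = {
    "delivery address": "Address",
    "order number": "Identifier",
    "loyalty card number": None,  # the user study found no standard match
}

extended = extend_core_components(USER_ITEMS, STANDARD)
print(sorted(extended))
```

The same function could be re-run whenever users specify further items during day to day document processing, which is the procedure the text prescribes for the future.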

Based on the outcome of the user requirement gathering sessions, the desired features of the VDMT are organised in the following groups.

G1. Annotation and classification of documents: this group consists of the requirements dealing with the basic functionality, i.e. recognition of document instances already within the local document repository of the enterprise.

R1. To allow the retrieval of existing documents and the visualisation of their annotations and current recognition result, i.e. the type of the document.

R2. To permit the annotation, de-annotation, and re-annotation of documents.

R3. To allow the re-classification of documents.

G2. Document type management: this group includes the creation and updating of document types based on document instances.

R4. To allow the manipulation of document types, i.e. the creation, modification, deletion and updating of document types.

R5. To permit the creation of document types by uploading local documents and templates.

Table 1. Advantages and limitations of existing techniques and systems.

| Technique or system | Type or main purpose | Key features and advantages | Weaknesses or limitations |
|---|---|---|---|
| AlViz | Facilitating semantic interoperability | Visual alignment of two ontologies | Requires technical know-how for ontology alignment; agreement on a common semantic model is not feasible in target domain as in collaborative SMEs |
| ebMail | Document exchange | GUI system for information exchange | Requires centralised agreement on document types; difficult to get agreement on new document types |
| NEXUSe2e | Message (document) passing across different processes/companies | Allows exchange of documents in electronic form | No support for recognizing or classifying incoming documents |
| Text Map Explorer | Document matching | Allows grouping of documents based on their types and visual exploration of relationships between documents of different types | Does not provide automatic recognition of incoming documents; user cannot adapt documents and alter relationships based on local preferences |
| PEPPOL environment (VCD) | Facilitate EU-wide interoperable public e-Procurement online | EU SMEs communicate with EU governmental institutions for the entire procurement process; allows SMEs to submit certificates and attestations electronically | Not suitable for recognizing the incoming documents since it requires more semantic support to understand the different types of exchanged documents; user cannot annotate electronic documents for ease of understanding the meaning of certain terms or the purpose of the overall document |
| CASE tools | Evaluate interoperability of UML modelling tools | Enhance reusability of UML diagrams among differing UML modelling tools | Focuses on UML diagrams only |
| OASIS framework (using TSO standards) | Enable interoperability among civil protection agencies | Uses heterogeneous systems to share a comprehensive operating picture; compiles information from different teams | Does not apply to the enterprise domain; does not allow annotations of documents; no automatic classification of messages |

Based on the outcome of the user requirement gathering studies, in Table 2 we organise a set of desired features for the VDMT and compare them with the main characteristics of some existing tools (as discussed in Section 2).

Table 2. Overview of existing tools and their classification against a set of desired features. The four rightmost columns are the desired features.

| Existing technique/technological solution | Distinguishing key feature | Document annotation | Document manipulation (editing, formatting, etc.) | Document alignment (match with other documents) | Ease of use (visual interface, drag and drop) |
|---|---|---|---|---|---|
| AlViz | Facilitating semantic interoperability | No | No | No | Yes |
| ebMail | Document exchange | No | No | Yes | No |
| Text Map Explorer | Document matching | No | No | Yes | Yes |
| NEXUSe2e | Message (document) passing across different processes/companies | No | No | No | No |

As shown in Table 2, our set of desired features is not completely represented by any existing tool. This is partly because, in comparison to the existing work, which mainly focuses on centralised information sharing solutions, we presume that documents are composed from a shared set of information items or building blocks: core components with a specific semantic meaning that can appear in a number of document types. By ‘shared set of core components’ we mean that the semantic interoperability system within collaborating enterprises will encompass that shared set of core components. In particular, our automatic recognition mechanism classifies the documents based on the set of such information items or core components contained in the document.

Fig. 2. Mapping of user-centric classification of information items (left, blue and orange) with the UN/CEFACT core components (right, in white background). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

3.2. Overall design of VDMT

Fig. 3 depicts the overall functioning of our system, which is composed of automatic recognition and manual intervention modules. At the highest level the VDMT has two main components, namely the Core Components (CC) Browser and the Document Browser. As the names suggest, the purpose of the CC Browser is to present the core components hierarchy to the users, and the Document Browser is designed for viewing document instances (including incoming and existing documents). The use of the CC Browser and Document Browser leads to the Annotation of documents at the second level. At this level, the most critical design principle was derived from what emerged as the decomposition or fragmentation problem. More specifically, the tool should enable a user to graphically annotate any piece of presented/rendered information. Annotation was seen as a user-led process of highlighting important pieces of information within the document and associating each piece of information with relevant core component(s) from the CC Browser.

Fig. 3. Functionality offered in the system by the interplay of automatic mechanism and manual intervention (VDMT).

While structuring the core components representation in the CC Browser, the design guidelines (extracted from user studies) specified that the presentation of the core components should be hierarchical but with restricted levels (maximum three) in order to avoid time-consuming browsing through the tree-like hierarchy. Another critical design principle was to allow users to annotate the document by selecting only the basic elements of the core components and not the aggregate components from the tree-like hierarchy. Finally, a design parameter was defined to enable users to define their own document types even if certain key/required core components were not mapped in the document.

Fig. 4. Screenshot of advanced prototype of VDMT.

As shown in Fig. 3, Annotation is followed by the Classification process, which is a fundamental feature of the automatic recognition mechanism, although a user can also manually classify a document in the VDMT. In the first case, after Annotation the document is passed to the automated recognition mechanism, which tries to classify the document based on the set of annotated or user-associated core components. In the second case the user can classify the document manually within the VDMT by saving it as an instance of a certain document type. At the end of both classification procedures the resulting document type is validated, which helps prepare the automatic recognition mechanism to deal with similar documents in future.
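The two classification routes and the validation step can be sketched roughly as below. This is an assumption-laden illustration, not the system's implementation: `classify`, `toy_recogniser` and the `validated` store are hypothetical names, and the recogniser is a trivial stand-in for the mechanism of [8].

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Optional, Set

# Validated examples per document type; a stand-in for the local repository.
validated: DefaultDict[str, List[Set[str]]] = defaultdict(list)


def classify(
    annotated: Set[str],
    recognise: Callable[[Set[str]], str],
    manual_type: Optional[str] = None,
) -> str:
    """Classify manually if the user names a type, otherwise automatically."""
    doc_type = manual_type if manual_type is not None else recognise(annotated)
    # Record the validated result so that later recognition can draw on it.
    validated[doc_type].append(set(annotated))
    return doc_type


def toy_recogniser(annotated: Set[str]) -> str:
    """Hypothetical stand-in for the automatic mechanism described in [8]."""
    return "Invoice" if "bank account" in annotated else "Order"


auto = classify({"item code", "shipping address"}, toy_recogniser)
manual = classify({"item code"}, toy_recogniser, manual_type="Purchase-request")
print(auto, manual)
```

Both routes end in the same validation step, so manually classified documents also add to what the automatic mechanism can use later.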

3.3. Visual layout of VDMT

The visual interface of the VDMT is shown in Fig. 4. The core components repository is rendered in a tree-view hierarchy in the left pane. Only leaves can be used for annotation since they represent basic elements or basic core components. Users can load a document in the document browser, which is the centre/main pane of the tool. After loading a document the user can highlight a document area (containing specific information) and select among the tree leaves the core components that are representative of the highlighted area, as shown in Fig. 4. This functionality targets user requirements R1 and R2 in Section 3.1.
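The two structural rules on the core components tree (a hierarchy of at most three levels, with only leaves usable for annotation) can be expressed in a short sketch. The class and function names and the example tree are hypothetical, and counting the root as the first level is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MAX_LEVELS = 3  # limit stated in the design guidelines


@dataclass
class CoreComponent:
    name: str
    children: List["CoreComponent"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def depth(node: CoreComponent) -> int:
    """Number of levels in the subtree, counting the node itself as level 1."""
    if node.is_leaf():
        return 1
    return 1 + max(depth(child) for child in node.children)


def annotatable(node: CoreComponent) -> List[str]:
    """Names of the basic (leaf) components, the only ones usable in annotation."""
    if node.is_leaf():
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(annotatable(child))
    return names


# Hypothetical fragment of a core components hierarchy.
party = CoreComponent("Party", [
    CoreComponent("Name"),
    CoreComponent("Address", [CoreComponent("Street"), CoreComponent("City")]),
])

assert depth(party) <= MAX_LEVELS
print(annotatable(party))
```

Aggregates such as "Party" and "Address" are deliberately absent from the list of annotatable names, matching the rule that only basic elements can be selected.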

When the user finishes the annotation of a document, two possible pathways are available. The first one is followed when the user is not satisfied by the automatic recognition of the document instance. In this case, the user manually classifies the document (e.g. as Purchase-request, Invoice, etc.). The second pathway is followed when the user decides to consult the automated recognition mechanism, which is based on the correlations between existing DocumentTypes (e.g. purchase order, quotation) and annotated or user-associated core components. In this respect, every time a new annotation takes place the user can consult the automated mechanism, which recognises the document instance by employing the underlying algorithms [8]. The results of the classification process are instantly shown in the right pane of the tool, which shows the extent to which the currently open document resembles one of the existing local document types (e.g. order, invoice, etc.). This feature targets requirement R3. As shown in Fig. 3, after a classification cycle the right pane can show the user that, according to the current annotation, the document can be either an Order or a Quotation.

Even after the user is satisfied by the automated recognition, the user is allowed to enrich the document by associating additional annotations (core components) that do not correspond to any text in the document. For this, the user can drag and drop additional core components to the bottom middle pane. This pane enables the automatic document recognition to take into account information additional to what is already annotated in the opened document while recognizing the document. Such functionality targets requirements R4 and R5.

The VDMT is developed with Web 2.0 technologies in order to leverage responsiveness and end-user interactivity. Among many Web 2.0 frameworks (jQuery, Prototype) we selected YAHOO YUI¹, which allows development of richly interactive web applications using JavaScript and CSS.

3.4. Functionality offered by the tool

The current functionality offered by the tool has been directly affected by the design recommendations from the user studies. Based on these recommendations the annotation functionality is implemented in a way that allows users to annotate concepts using different colours. Users can also highlight information items with various concepts, achieving a more effective and easy to understand annotation. In regard to user friendliness, the document viewer has been supported with many features mimicking Microsoft Word to enable direct manipulation of the business documents, e.g. users can edit, delete and rearrange their documents (as shown in Fig. 4).

In an open document users can edit, remove or add new annotations in the central pane, e.g. by highlighting an information item and then clicking on the related core component in the hierarchy (left pane). At all times the right pane shows the current classification of the open document. Finally, Document Types (such as order, quotation, invoice, etc.) can be easily created using the bottom pane. In this respect, document type creation uses any existing Document as the “starting template” which, as explained above, already contains pre-associated core components, i.e. these are the components already included in the Document Type. If the user is happy with those components, then a simple click on the button “Save New Document Type” will show a small field for entering the name of the new Document Type. After saving, the new Document Type will appear in the right pane. However, if the user prefers to further enrich the Document Type with additional core components, then by using a simple Drag & Drop functionality the user can choose core components (from the left pane in Fig. 4) and drop them in the bottom pane. By clicking on the “Save New Document Type” button the Document Type will be saved as explained above.
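The document type creation steps can be summarised in a small sketch: a new type starts from the core components of a template document and may be enriched with components dropped in by the user. The function name, the dictionary layout and the component names are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Set, Union

DocumentType = Dict[str, Union[str, Set[str]]]


def new_document_type(
    name: str,
    template_components: Iterable[str],
    extra_components: Iterable[str] = (),
) -> DocumentType:
    """Build a document type from a template document's core components,
    optionally enriched with components added by drag and drop."""
    if not name.strip():
        raise ValueError("a document type needs a name before it can be saved")
    components = set(template_components) | set(extra_components)
    return {"name": name.strip(), "components": components}


# Components already annotated in the template document (illustrative).
template = {"item code", "item description", "customer account"}

plain = new_document_type("Order", template)
enriched = new_document_type("Order", template, extra_components={"shipping address"})
print(sorted(enriched["components"]))
```

Keeping the template's components and the user's additions in one set reflects that, once saved, the Document Type does not distinguish how a component was added.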

4. Usability evaluation of VDMT

At the end of the tool’s development process, we conducted a summative usability study to test the usability of various features of the VDMT and gauge the user acceptance level for the tool. The study was concluded by a general rating session where participants reported their views about the diverse features of the tool and rated several usability dimensions on a 7-point Likert scale.

4.1. User-based evaluation – data collection and method of analysis

Participants were given a background questionnaire to record their technical background and business knowledge. Following this they were trained to use the tool and instructed to complete a set of visual mapping tasks. During their interaction with the tool, they were encouraged to discuss their visual mapping strategies, problems arising, and positive impressions. Their solutions and vocal discourses were recorded using screen capturing and voice recording software for post-experiment analysis. On completion of the visual mapping tasks, participants scored the tool on a set of usability dimensions such as ease of use, ease of learning, and functionality, discussed the positive and negative aspects of the tool and proposed ways to alleviate the issues. The quantitative scores were used to calculate the mean scores and standard deviations for user ratings. User impressions and self-reported data were analysed. The organisers of the study used the content analysis method [7] to categorise user comments into themes and report the number of emerging themes.
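The quantitative part of this analysis amounts to a mean and standard deviation per usability dimension. The sketch below shows one way to compute them with Python's standard library; the ratings are invented for illustration and are not the study's data, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarise(ratings: List[int]) -> Tuple[float, float]:
    """Mean and sample standard deviation of 7-point Likert ratings."""
    if any(r < 1 or r > 7 for r in ratings):
        raise ValueError("ratings must lie on the 7-point scale (1 to 7)")
    return mean(ratings), stdev(ratings)


# Invented ratings, one list per usability dimension (not the study's data).
RATINGS: Dict[str, List[int]] = {
    "ease of use": [5, 6, 7, 6],
    "ease of learning": [6, 6, 7, 5],
    "functionality": [4, 5, 6, 5],
}

summary = {dimension: summarise(values) for dimension, values in RATINGS.items()}
for dimension, (m, sd) in summary.items():
    print(f"{dimension}: mean={m:.2f}, sd={sd:.2f}")
```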

4.2. User-based evaluation – visual mapping tasks

Prior to their interaction with the visual mapping tasks, participants were provided with a scenario alongside 7 differing visual mapping tasks to complete. These tasks cover the typical document type creation and annotation processes required to recognise incoming documents. A task starts by opening a business document, going through a list of core information items, choosing and associating the desired core components to relevant information items. The end product of this process is a business document type. During the completion of the tasks, participants were continuously encouraged to report their impressions, including any problems and suggestions.

Table 3. Visual mapping tasks, average completion time and response correctness.

Visual mapping task
1. Understanding of business document
2. Exploration of core concepts
3. Annotating three information items with appropriate core concepts
4. Associating three core concepts to appropriate information items within the business document
5. Check existing annotations
6. Adding general core concepts to the document type
7. Saving of business document

4.3. Results of usability evaluation

4.3.1. Participants background

A total of 8 participants (5 males and 3 females) took part in the usability study. The background questionnaire used in the study indicated that our participants come from a business-related background with an average business experience of 3.37/5 (where 1 = not expert, 5 = expert). Questions included: prior experience handling business documents (e.g. invoices) using Microsoft products (3.37/5), using specialised software applications (e.g. SAP ERP) (2.25/5), manual recognition of business documents (3.5/5), and frequency of dealing with business orders both manually (6 participants: monthly or weekly) and using a PC (4 participants: monthly or weekly).

4.3.2. User preferences on video

We inspected the video recordings and calculated the time spent by each participant to complete the designated tasks. We also noted down the correctness of user responses against the perfect expected responses we prepared in advance.

Both completion time and response correctness are important indicators of the effectiveness and efficiency of the visual mapping tool. Apart from the exceptions noted below, the tasks given to the users were all performed correctly (i.e. response correctness was 100%). Participants showed a good understanding of the purpose and constituent parts of the business document without major issues. However, in the tasks which involved selecting and associating appropriate core concepts to information items, 2 participants were unable to complete the tasks correctly. These 2 participants managed to make correct associations for 2 out of 3 concepts/information items but failed to choose the right concept for 1 information item. Time-wise, participants spent the longest time making connections between core concepts and information items (mean = 457.0 s) and exploring core concepts (mean = 382.66 s). The results signify that these two tasks were the most challenging activities for our participants, as reflected by the thoughts they voiced during the interaction process. Table 3 summarises the results of the visual mapping tasks, average completion time and response correctness.

4.3.3. User preferences on paper

We analysed the transcribed user statements and comments using the content analysis methodology, which examines intentions and trends of individuals towards a particular idea.

Self-reported data was categorised into three main categories: negative, positive, and suggestion. These categories were decomposed into themes whose occurrences were counted. A total of 77 comments emerged from the qualitative data, among which 37 were positive comments (mean = 4 comments), 25 were negative comments (mean = 3 comments), and 15 were suggestions (mean = 2 comments) to improve the visual mapping tool. In the next step we created a specific theme for each comment; for instance, for the comment “the interface is good looking” we created the theme “user interface”. Finally, we organised these specific themes into more general themes that we created, such as usability, core concepts, and user interface.

Table 3 (continued). Average completion time and correctness.

Average completion time (in seconds)   Correctness
182.66                                 Completed by all participants
292.75                                 Completed by all participants, apart from 1 who completed it partially
457.0                                  Completed by all participants, apart from 1 partial completion

Table 4. Average usability scores for the usability questions.

Mean  SD    Usability question
6.25  1.16  It was easy to learn to use this mapping tool
5.87  1.12  I feel comfortable using this mapping tool
5.75  1.67  It was simple to use this mapping tool
5.75  1.58  I believe I became productive quickly using this mapping tool
5.75  1.58  I like using the interface of this mapping tool
5.62  1.3   The mapping tool contains the necessary concepts I need
5.50  1.19  I can effectively complete my work using this mapping tool
5.50  1.07  I can efficiently complete my work using this mapping tool
5.25  1.38  I can complete my work quickly using this mapping tool
5.25  1.48  This mapping tool has all the functions and capabilities I expect to have
5.12  1.24  The information provided by the mapping tool is easy to understand
5.12  1.64  The information is effective in helping me complete the tasks and scenarios
4.75  1.28  The organisation of concepts on the mapping tool is clear
4.37  1.4   It is easy to find the information I need
4.37  0.91  I can find the relevant concepts easily
4.25  1.83  The mapping tool gives error messages that clearly tell me how to fix problems
4.12  1.8   The information (such as online help, on-page messages, and other documentation) provided with this mapping tool is clear
3.87  1.45  Whenever I make a mistake using this mapping tool, I recover easily and quickly
5.50  1.41  Overall, I am satisfied with this data mapping tool

Table 5. Types and number of real documents used as the document seeds for the experiments.

Document type              Seed documents
Authorisation of invoice   6
Invoice                    20
Job description            5
Order                      7
Delivery note              5
Chatter                    20

On the positive side, participants praised usability aspects of the tool (16 comments) in the form of “ease of use”, “ease of learning”, and “efficiency and effectiveness”. This was followed by comments on the user interface (10 comments), which was praised for its conciseness and clarity, visual appearance, and adjustable pane windows. The participants also valued the list of core concepts and the ability to annotate business documents (6 comments). In regard to the drawbacks, the most recurring conceptual problem revolved around finding, understanding and selecting the appropriate core concepts (12 comments). A related problem that participants also complained about was the lack of textual hints, which could have helped in their comprehension and selection of core concepts (3 comments). The remaining comments focused on the document viewer and the selection and de-selection of items.
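The counting step of the content analysis can be sketched as follows. The coded comments are hypothetical examples and the category and theme labels are used only for illustration; the coding itself (assigning a category and theme to each comment) remains a manual step.

```python
from collections import Counter

# Hypothetical coded comments as (category, general theme) pairs.
# Illustrative only; these are not the 77 comments from the study.
coded_comments = [
    ("positive", "usability"),
    ("positive", "usability"),
    ("positive", "user interface"),
    ("negative", "core concepts"),
    ("negative", "core concepts"),
    ("suggestion", "core concepts"),
]

# Occurrences per category and per (category, theme) pair.
category_counts = Counter(category for category, _ in coded_comments)
theme_counts = Counter(coded_comments)

for (category, theme), n in theme_counts.most_common():
    print(f"{category:10s} {theme:15s} {n}")
```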

To consolidate our understanding of user business document annotation we administered a usability questionnaire which collected and measured user impressions covering several dimensions, including ease of use, ease of learning, convenience using the tool, efficiency and effectiveness in completing work using the tool, and help and recovery from mistakes offered by the tool. All usability questions were rated on a 7-point Likert scale where 1 = Strongly Disagree, 4 = Neutral and 7 = Strongly Agree.

In general our representative users found the tool quite easy to use, felt comfortable and became more productive using the tool. They also agreed that the right amount of core concepts is included in the tool, along with the expected functions and capabilities. However, participants were neutral as to whether it is easy to find information and concepts (mean = 4.37) and whether sufficient help is provided in the tool (mean = 4.12). The overall impression of our participants was quite satisfactory (mean = 5.50, SD = 1.41). Table 4 summarises the average user ratings of various features and aspects of the visual mapping tool.

4.3.4. Recommendations of usability evaluation

The user evaluations demonstrated the success of our visual mapping tool in terms of usability and functionality. The most recurring theme discussed by our participants is “ease of use and learning”. However, participants highlighted some drawbacks that require attention in the future for better usability and increased user satisfaction. As evinced by the interaction problems and user statements, participants had difficulty understanding some core concepts, which led to an inability to select and associate the right core concept to information items within the business document.

The immediate solution to this issue, as suggested by our users, is to add text hints to core concepts to explain their meaning and potential ways of using them (8 comments out of 15 suggestion comments). These hints should be activated upon hovering over the desired core concepts. Another suggestion was to provide easily accessible summaries of the annotations added to the business document type. To empower business users to fully capitalise on the benefits offered by our visual mapping tool, we plan to add definitions to the existing core concepts and a help file serving as a troubleshooting guide.

The potential improvements for the future versions of this tool are described as follows:

• Document annotations: An ad hoc system-driven annotation modality can be developed with which the user will highlight a text and on the fly a pop-up window will prompt for the recommended core components.

• Synchronisation: More dynamism and immediate communication can be included between all four panes. For instance, when clicking on a Document Instance and showing it in the centre pane, the Document Type similarities can be loaded accordingly as well as the core components already contained in the Document Instance.

• Integration: The VDMT can be integrated with the enterprise email system to enable users to perform the classification tasks and Document Type manipulation without having to use different tools.

• Compliance: Further development can consist of making the VDMT completely compliant with the OSGi Framework [5] to make it usable within a wide variety of enterprise interoperability software.

• System connectors: Another future improvement can involve lifting information from legacy systems into document instances dynamically created by the VDMT.

5. Evaluating the automatic document recognition mechanism

To determine the level at which a balance can be achieved between the accuracy of the automated recognition mechanism and manual intervention, we included a learning module in the automated mechanism and set up some experiments. One important objective of the experiments was to analyse the level of susceptibility of the automated recognition mechanism when exposed to supervised learning by means of manual/human intervention using the VDMT.

In essence, the precision of automatic document recognition is a direct indicator of the mechanism’s reliability when working on its own. Since the VDMT allows a user to change document types and instances at will, the susceptibility level indicates how that reliability is affected when an emulated user overrides the mechanism.

5.1. Experimental setup

For the experiments we collected 63 electronic documents (which we call seed documents) from three companies from Italy and Spain, and manually annotated their contents with the relevant core components. In order to focus our efforts on the document recognition mechanism we assume that automated extraction of core components is feasible using the approach in [20]. Therefore, our experiment setup starts from those 63 seed documents of different types, annotated and represented by core components. Table 5 shows the number of documents and their types we use for the experiments:

• Authorisation of invoice: a document used to verify that all the necessary information for an invoice has been collected and any relevant process has been finished.

• Invoice: as its name says.

• Job description: refers to a job to carry out as part of a contracted service.

• Order: describes a collection of goods and services to be purchased.

• Delivery note: lists a set of items being delivered as part of a service.

• Chatter: parts of email conversations related to a business situation.

The experiments consist of documents passing through the automated recognition mechanism. The mechanism analyses each document and selects the document type that best represents it. Since we know the type of the new document in advance, we can determine whether the mechanism is right or wrong. Either way the mechanism stores that document (with a type assigned to it) in the internal knowledge base and uses it for future recognitions as part of its learning capability.
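The recognise-then-store cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions: documents are modelled as sets of core components, and Jaccard similarity is used as a stand-in scoring function, not the Circle of Interest technique used by the actual mechanism. All function names, document types and core components are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of core components."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def score_types(document, knowledge_base):
    """Score each document type by its best-matching stored document.

    Jaccard similarity is an assumed stand-in for the real scoring.
    Every type is assumed to hold at least one stored document.
    """
    return {
        doc_type: max(jaccard(document, stored) for stored in stored_docs)
        for doc_type, stored_docs in knowledge_base.items()
    }

def recognise_and_learn(document, knowledge_base):
    """Pick the best-scoring type, then store the document under that type."""
    scores = score_types(document, knowledge_base)
    best_type = max(scores, key=scores.get)
    # The document is stored whether or not the recognition was right.
    knowledge_base[best_type].append(document)
    return best_type

# Hypothetical knowledge base seeded with one document per type.
kb = {
    "Invoice": [{"InvoiceNumber", "Amount", "DueDate"}],
    "Order": [{"OrderNumber", "Item", "Quantity"}],
}
incoming = {"InvoiceNumber", "Amount", "Currency"}
print(recognise_and_learn(incoming, kb))
```

Because every recognised document is stored under the type assigned to it, an early misclassification can influence later recognitions, which is what the supervision experiments probe.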

Initially we randomly select one seed document per document type and use it to define that particular type. Then the rest of the seed documents are used to generate new documents that can then be passed through the recognition mechanism. This is done by randomly selecting a seed document and introducing noise to it. Noise corresponds to inducing a 10% variation in that seed document by (a) randomly adding core components, (b) randomly removing core components, and (c) randomly replacing core components. This variation, according to the companies from which the documents were collected, represents the typical difference found between documents of the same type coming from different sources. Thus, by adding 10% noise, we are emulating a real business scenario where documents come from different companies. Moreover, chatter documents (e.g. email conversations not corresponding to any particular document type) are generated with a 0.10 probability which, according to the same companies, represents the probability of receiving such emails on a daily basis. The rest of the document types share the remaining 0.90 probability evenly for the purpose of the experiments.
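The document generation step can be sketched as below. The sketch rests on several assumptions not fixed by the text: documents are sets of core components, the 10% variation is interpreted as a number of random edit operations equal to 10% of the seed size (at least one), and all names and component identifiers are hypothetical.

```python
import random

def add_noise(seed_doc, vocabulary, level=0.10, rng=random):
    """Return a copy of seed_doc with about `level` of its components varied.

    Each variation is a random add, remove, or replace of a core component.
    """
    doc = set(seed_doc)
    n_changes = max(1, round(level * len(seed_doc)))  # assumed interpretation
    for _ in range(n_changes):
        operation = rng.choice(["add", "remove", "replace"])
        unused = sorted(vocabulary - doc)  # candidates not yet in the document
        if operation in ("remove", "replace") and doc:
            doc.remove(rng.choice(sorted(doc)))
        if operation in ("add", "replace") and unused:
            doc.add(rng.choice(unused))
    return doc

def pick_type(business_types, chatter_probability=0.10, rng=random):
    """Chatter with probability 0.10; other types share the rest evenly."""
    if rng.random() < chatter_probability:
        return "Chatter"
    return rng.choice(business_types)

rng = random.Random(42)
vocabulary = {f"CC{i}" for i in range(30)}  # hypothetical core components
seed = {f"CC{i}" for i in range(10)}
noisy = add_noise(seed, vocabulary, rng=rng)
types = ["Authorisation of invoice", "Invoice", "Job description",
         "Order", "Delivery note"]
sample = [pick_type(types, rng=rng) for _ in range(1000)]
print(len(seed ^ noisy), sample.count("Chatter"))
```

With five business types sharing the remaining 0.90 evenly, each is drawn with probability 0.18.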

The Circle of Interest technique scores and ranks the document among all locally available document types. Using this technique, the following results can be expected:

1. Perfect match, when the highest ranked document type is the correct one.

2. Tied match, when the highest rank is shared by two or more document types and one of them is the correct one.

3. Wrong match, when none of the other possibilities occur.
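The three outcomes listed above can be expressed as a small classification function. This is a sketch only: the scores are assumed to be given as a mapping from document type to a numeric score, ties are detected by exact equality of scores, and the function name is hypothetical.

```python
def classify_outcome(scores, correct_type):
    """Classify a ranking as 'perfect', 'tied', or 'wrong'.

    `scores` maps each locally available document type to its score.
    """
    top_score = max(scores.values())
    top_types = [t for t, s in scores.items() if s == top_score]
    if top_types == [correct_type]:
        return "perfect"  # the correct type alone holds the highest rank
    if correct_type in top_types:
        return "tied"     # the correct type shares the highest rank
    return "wrong"        # none of the other possibilities occur

print(classify_outcome({"Invoice": 0.8, "Order": 0.5}, "Invoice"))
print(classify_outcome({"Invoice": 0.8, "Order": 0.8}, "Invoice"))
print(classify_outcome({"Invoice": 0.4, "Order": 0.8}, "Invoice"))
```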

For the experiments we emulate a user correcting the recognition results in the VDMT, e.g. a supervisor, who can specify the correct document type when automatic recognition results in a tied or wrong match. However, we vary the level at which such a user supervises the tool according to an intervention probability, i.e. the probability at which the user will override the automatic recognition results in the VDMT. For the sake of simplicity, yet without losing representativeness, we use five levels of supervision, namely 0%, 25%, 50%, 75% and 100%. Once the tool recognises a document, the result is stored in the VDMT’s internal knowledge base, even when no user corrects it, and is used for future document recognitions. The experiments were run to process 1000 documents.
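The emulated supervisor can be sketched as a single function that decides which document type ends up in the knowledge base. The function and parameter names are hypothetical, and the sketch assumes that for a tied match the mechanism still proposes one of the tied types as its prediction.

```python
import random

def supervise(predicted_type, outcome, correct_type,
              intervention_probability, rng=random):
    """Return the type to store in the knowledge base.

    With the given probability, the emulated user overrides a tied or
    wrong result with the correct type; otherwise the prediction stands.
    """
    needs_correction = outcome in ("tied", "wrong")
    if needs_correction and rng.random() < intervention_probability:
        return correct_type
    return predicted_type

rng = random.Random(0)
for level in (0.0, 0.25, 0.50, 0.75, 1.0):
    stored = [supervise("Order", "wrong", "Invoice", level, rng)
              for _ in range(1000)]
    print(f"supervision {level:.0%}: corrected {stored.count('Invoice')} of 1000")
```

At 0% supervision every uncorrected result, including wrong ones, is learned as-is; at 100% every tied or wrong result is replaced by the correct type before it is stored.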

5.2. Analysis of experimental results

To measure the document recognition precision we calculate the success rate of document recognition with and without tied matches per supervision level. This provides an overview of how the learning process of the system is affected while recognising and learning consecutive documents.

To measure the susceptibility to (emulated) user intervention we consider the rate of tied matches when recognising the same documents. Since tied matches indicate indecision of the automatic recognition mechanism, this rate indicates to what extent user supervision affects the precision with which new documents are recognised.
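The three measures (success rate including tied matches, perfect-match rate, and tied-match rate) can be computed from a sequence of recognition outcomes as in the sketch below. It assumes the rates are cumulative over the documents recognised so far; the text does not state whether a cumulative or a windowed rate was plotted, and the function name is hypothetical.

```python
from itertools import accumulate

def running_rates(outcomes):
    """Cumulative rates after each recognised document.

    Returns three lists: success rate counting tied matches as correct,
    perfect-match-only rate, and tied-match rate.
    """
    n = range(1, len(outcomes) + 1)
    perfect = list(accumulate(o == "perfect" for o in outcomes))
    tied = list(accumulate(o == "tied" for o in outcomes))
    with_ties = [(p + t) / k for p, t, k in zip(perfect, tied, n)]
    perfect_only = [p / k for p, k in zip(perfect, n)]
    tied_rate = [t / k for t, k in zip(tied, n)]
    return with_ties, perfect_only, tied_rate

outcomes = ["wrong", "tied", "perfect", "perfect"]  # hypothetical sequence
with_ties, perfect_only, tied_rate = running_rates(outcomes)
print(with_ties[-1], perfect_only[-1], tied_rate[-1])  # 0.75 0.5 0.25
```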

Fig. 5 depicts the success rate of document recognition using the five supervision levels. Notice that when the result of automatic recognition is a tied match, it is still considered a correct case because the mechanism was undecided, not wrong. The figure shows that all supervision levels have a learning period at the beginning which stabilises after ~120 documents have been recognised. The lowest rates are produced by the supervision levels of 0% and 25%, with 0.66 and 0.87 respectively. However, the supervision levels of 50%, 75% and 100% stabilise at 0.94, 0.95 and 0.95 respectively. This indicates that even with 50% supervision the precision with which new documents are recognised is highly acceptable. Yet this figure contains both the perfect and tied matches.

Fig. 5. Precision of document recognition using different supervision levels.

Fig. 7. Tied matches rate showing the susceptibility to user supervision. Notice X-axis scale.


Fig. 6 shows the perfect match rate of document recognition using the five supervision levels. Notice that the learning period lasts up to ~200 documents, yet the tendency is to continue improving slightly. Also notice that all the supervision levels have their precision reduced because the undecided (tied) cases are no longer counted. In particular, supervision levels 0% and 25% stabilise at 0.63 and 0.82 respectively. The most noticeable difference with the previous graph is that supervision levels 75% and 100% are now both at 0.89, whereas supervision level 50% is at 0.91, the only one above 0.90. This suggests that when supervision occurs 75% or 100% of the time, the automatic mechanism gets confused (i.e. the number of tied cases is higher) more than when supervision happens 50% of the time.

Finally, Fig. 7 presents the rate of tied matches when recognising documents using the five different supervision levels. In this graph it is clearer that the learning period takes approximately 200 documents to complete before the occurrence of tied matches starts to diminish. In the five cases the tied matches rate is low, under 0.11. What is interesting to see is that the lowest rate is achieved when there is 0% supervision (under 0.03), as opposed to supervision levels of 75% and 100%, with ~0.10 after 200 documents have been recognised and gradually diminishing to 0.06 and 0.05, respectively, after 1000 documents.

Furthermore, the supervision level of 25% diminishes faster than the 75% and 100% supervision levels, reaching 0.04 at the end of the experiments. Finally, supervision level 50% remains fairly stable at 0.03 from 100 documents onwards. These results support Fig. 6 in suggesting that at 50% supervision the automatic recognition is the least confused and provides a better precision (i.e. a higher rate of perfect matches) while being the least susceptible to emulated user intervention.

Fig. 6. Precision of document recognition with perfect matches only.

This supports our claim that the manual support or human intervention provided by the VDMT is suitable for a business domain, as it allows higher precision levels of document recognition while balancing this with the mechanism’s ability to learn on its own. The VDMT is also useful for enriching the user’s experience in dealing with business documents.

6. Discussion and implications

This paper presents the case for the design and evaluation of a VDMT to facilitate semantic interoperability between collaborating enterprises. The tool aims to enable users (or SMEs) to map business documents to elements of a core ontology of business concepts, thus providing support for the recognition of business documents based on relevant information items.

In this paper we presented a novel user view on the information items that make up the basis of business documents and their relationships in the information repository to be used in the mapping tool. We then described how the user view provides valuable input to the design of a VDMT, which aims to facilitate and simplify the user interactions with the documents or information repository.

The analysis of the end user studies shows that the developed tool has the potential to simplify the process of mapping information items contained in different business documents, and to make this process accessible to end users who are not IT specialists, thus fitting the profile of users in SMEs. We were able to tune the design of our visual data mapping tool according to the users’ understanding, views and needs. Unfortunately, the vast majority of mixed initiative tools built using state of the art technologies fail to include the preferences of target end users in the design process. This not only widens the gap between the design model of the tool and the mental models of end users but also makes the tools extremely difficult to use by the target users.

Our user studies demonstrated that associating core concepts to the items of information within business documents is the most challenging task in the manual annotation process. In this respect, we have taken first important steps towards reversing the direction of information identification. That is, instead of presenting users with standards-informed information items, we elicited their perception of natural information items and relationships between them, and designed a tool to facilitate the mapping between the natural representations and the standards-based core components, which is then used for annotations and information processing.

Several lessons can be learned from applying the user-centric design methodology to the design of the tool. We summarise below the lessons that can be useful in developing mixed initiative tools:

1. Provide a concise and easy to understand set of information items; by this we mean to use user terminology and stay away from specialised jargon.

2. Provide definitions for each domain specific concept along with examples of use; users can refer to this when it is difficult to understand the required concept.

3. When presenting information items to users, use a hierarchical structure with a limited number of child nodes to facilitate the browsing and selection of information items.

4. Suggest help or extra description following selection of a particular information item using (intelligent) semantic recognition techniques.

In the end, the results of our experiments show the importance of complementing automatic recognition with manual intervention as realised by the VDMT. The interplay of automatic and manual mechanisms helps in achieving higher precision levels of document recognition.

References

[1] V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta, F. Ciravegna, Semantic annotation for knowledge management: requirements and a survey of the state of the art, Journal of Web Semantics: Science, Services and Agents on the WWW 4 (1) (2005) 14–28.

[2] SME interoperability in the global economy: A discussion paper. Colin Piddington, on behalf of the INTEROP Network of Excellence Workpackages 11 & 22, IFAC (2005).

[3] UN/CEFACT Core Components Technical Specification: http://www.unece.org/cefact/ebxml/CCTS_V2-01_Final.pdf.

[4] ebMail: http://sourceforge.net/projects/ebmail/.

[5] OSGI: http://www.osgi.org/About/Technology#Framework.

[6] NEXUSe2e Business Messaging Server: http://www.nexuse2e.org/NEXUSe2e/Product.html.

[7] K. Krippendorff, Content Analysis: An Introduction to its Methodology, Sage, Thousand Oaks, CA, 2004.

[8] D. Joseph, C.A. Marín, A study on aligning documents using the circle of interest technique, in: Proceedings of the Fifth International Conference on Software and Data Technologies (ICSOFT 2010), Vol. 2, 2010, pp. 374–383.

[9] F.V. Paulovich, R. Minghim, Text map explorer: a tool to create and explore document maps, in: Proceedings of IEEE 10th Information Visualization Conference, 2006.

[10] C. Namyoun, I.-Y. Song, H. Han, A survey on ontology mapping, in: ACM SIGMOD Record, Vol. 35, September, 2006, pp. 34–41.

[11] M. Lazerberger, J. Sampson, AlVix–a tool for visual ontology alignment, in: Proceedings of IEEE Visualization Conference, 2006.

[12] E. Horvitz, Principles of mixed initiative user interfaces, in: Proceedings of CHI, 1999, pp. 159–166.

[13] L. Cinque, S. Levialdi, A. Malizia, A system for the automatic layout segmentation and classification of digital documents, in: Proceedings of 12th International Conference on Image Analysis and Processing, 2003.

[14] F. Shih, et al., A document segmentation, classification and recognition system, in: Proceedings of 2nd International Conference on Systems Integration, 1992.

[15] Intelligent Document Recognition with OCR: http://www.cvisiontech.com/document-automation/forms-processing/intelligent-document-recognition-with-ocr-3.html?lang=eng.

[16] Document Recognition: http://www.readsoft.com/invoice-processing-automation-terminology/document-automation-software/document-recognition/index.aspx.

[17] A. Mondorf, M. Wimmer, Interoperability in e-tendering: the case of the virtual company dossier, in: T. Janowski, T.A. Pardo (Eds.), Proceedings of the 2nd International Conference on Theory and Practice of Electronic Governance (ICEGOV ’08), ACM, New York, NY, USA, 2008, pp. 110–116.

[18] S. Huang, V. Gohel, S. Hsu, Towards interoperability of UML tools for exchanging high-fidelity diagrams, in: Proceedings of the 25th Annual ACM International Conference on Design of Communication (SIGDOC ’07), ACM, USA, 2007, pp. 134–141.

[19] F. Henriques, D. Rego, OASIS Tactical Situation Object: a route to interoperability, in: Proceedings of the 26th Annual ACM International Conference on Design of Communication (SIGDOC ’08), ACM, New York, NY, USA, 2008, pp. 269–270.

[20] M. Laclavik, S. Dlugolinsky, M. Seleng, M. Kvassay, E. Gatial, Z. Balogh, L. Hluchy, Email analysis and information extraction for enterprise benefit, Computing and Informatics 30 (1) (2011) 57–87.

Dr. Usman Wajid is a Research Associate in the School of Computer Science at the University of Manchester. He holds a PhD in Informatics (University of Manchester) and an MSc in E-Commerce (Middlesex University London). His research primarily seeks to address challenges in automated interactions involving multiple autonomous and heterogeneous entities, and spans flexible interaction protocols, design and development of service-based systems as well as optimization of energy consumption and resource utilization in distributed computational environments or Cloud.

Dr. Abdallah Namoun is a Research Associate in the Centre for Service Research at the University of Manchester. He obtained his PhD in Informatics from the same University, focusing on the link between website design, visual attention, and user performance. He investigates user needs and interaction with technology, design of visual interfaces, methods for testing usability of human interfaces, and end user development activities. Until now he has worked on the following HCI aspects: models for re-usable interfaces, software re-use, cognitive modelling, usability engineering, end user development, and requirements engineering.

Dr. Cesar A. Marin is a Research Associate at the Centre for Service Research at the University of Manchester. His research is mainly about the applications of adaptive and self-organising approaches to service systems engineering. He completed a PhD in Informatics at The University of Manchester, focused on adaptation to unexpected changes in ecosystem domains. He obtained both an MSc and a Diploma in Intelligent Systems, and a BSc in Computer Systems Engineering from the Monterrey Institute of Technology, Mexico.

Dr. Nikolay Mehandjiev is a Professor of Enterprise Information Systems at the Centre for Service Research at the University of Manchester, UK, where he researches the design of flexible service systems, using knowledge-based techniques and user-centric design to open service applications to users who are not professional programmers. He has initiated and managed projects worth €5.5m, and organised a number of international workshops on Interdisciplinary Software Engineering Research and End User Development. He has co-authored two books and more than 100 refereed papers, and has guest-edited four special issues of international journals, including the Communications of the ACM.
