
An AI Framework for the Automatic Assessment of e-Government Forms


Articles

52 AI MAGAZINE

Government agencies around the world process thousands to millions of forms yearly; some may even process a million forms a day (VertMarkets IT Group 2002). To improve efficiencies and reduce cost, many government agencies use some type of document imaging solution for unstructured information or form-processing solutions to capture structured data (Captiva Software Corporation 2002, VertMarkets IT Group 2003). Some government agencies also offer e-forms as an alternative to paper forms to their citizens for even more efficient and accurate data collection.

Once forms are digitized and indexed and the data extracted, it is just a natural progression to consider using artificial intelligence (AI) to further enhance efficiencies with intelligent decision support. This article describes use of an XML-based AI framework to create an AI module for an immigration agency to support its extensive form-processing needs.

Immigration agencies play vital roles in maintaining the security and prosperity of a place. They control the entry and departure of people at its borders and safeguard it against threats. They may also be responsible for enforcing immigration control within the boundaries of the place. Besides immigration control, the immigration agency for which this work was performed is also responsible for providing a wide variety of document-related services to its citizens and visitors. These services include issuing various types of travel documents, identity cards, nationality documents, visas or permits, right of abode, and birth, death, and marriage registrations. In fact, the agency routinely handles more than a hundred different types of document requests. In 2004, the agency processed close to 4 million application forms at its headquarters, which has a tight workforce of roughly a couple of thousand.

■ This article describes the architecture and AI technology behind an XML-based AI framework designed to streamline e-government form processing. The framework performs several crucial assessment and decision support functions, including workflow case assignment, automatic assessment, follow-up action generation, precedent case retrieval, and learning of current practices. To implement these services, several AI techniques were used, including rule-based processing, schema-based reasoning, AI clustering, case-based reasoning, data mining, and machine learning. The primary objective of using AI for e-government form processing is of course to provide faster and higher quality service as well as ensure that all forms are processed fairly and accurately. With AI, all relevant laws and regulations as well as current practices are guaranteed to be considered and followed. An AI framework has been used to implement an AI module for one of the busiest immigration agencies in the world.

Andy Hon Wai Chun

Copyright © 2008, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

AI Magazine Volume 29 Number 1 (2008) (© AAAI)

To overcome rapidly increasing workloads, the agency looks toward IT to improve efficiency and productivity (Hong Kong Trade Development Council 2006, Hong Kong Government 2005). The AI project described in this article is part of a new IT strategy to streamline the entire immigration form-processing workflow with advanced document management and forms processing software (Questex Media 2006, Hong Kong Government 2004). The new system provides the agency with virtually a paperless environment where all documents are digitized and indexed with automatic data extraction from forms. The AI module further streamlines processes and workflows with decision support capabilities to help the agency cope with continued growth.

With the new system, the public will be able to obtain services by submitting either paper forms or e-forms (for some services). With AI decision support, visits to the agency will be minimized, and processing time will be significantly shortened. One-stop, on-the-spot service will be provided for some application types. Application status and progress can be checked through the web. Overall, the new IT strategy greatly improves the level of convenience to citizens and visitors.

Current Manual Approach

The workflow for each type of form may be slightly different. Here, I will describe a typical workflow for the manual approach (figure 1 is a simplified process flow diagram). This workflow is probably similar to those of other government agencies around the world.

The process starts with the applicant appearing in person to submit paper forms together with relevant documents and papers. The frontline staff at the counter does a preliminary check to see if all the necessary documents are attached, and the applicant leaves.

A case folder is created and eventually passed to an authorization officer who will do a preliminary assessment of the case and then assign a suitable case officer to process the case to completion. The case officer is assigned according to his or her experience and familiarity with handling that type of application. After a thorough and detailed review and analysis of the application form, the case officer may request additional supplementary documents from the applicant. Several rounds of visits may be needed depending on the content of the documents provided and the nature of the application. For example, the applicant might not qualify for the given application type but may qualify under a different scenario. If so, different sets of documentation may be required. The case officer may need to consider multiple approval scenarios at the same time.

When all the supporting documents have been submitted and verified, the case officer will make a final assessment, which will then be reviewed and endorsed by the authorization officer. Finally, the applicant will be notified of the result and return to collect the requested documents or permits if the application was successful. The entire process may require several visits by the applicant to the immigration agency and many days, weeks, or even months to complete depending on complexity.

In order for a case officer to adequately process an application, he or she must possess thorough knowledge of all the applicable laws and regulations as well as immigration guidelines, which might change from time to time. In addition, the case officer must also be able to use his or her experience in processing other similar cases to draw on precedent cases for reference if discretionary decision making is needed. Historical case documents are available in microfiche, but searching for related cases will take time. The case officer may need to consult with other more senior or experienced case officers before a decision can be made. As you can see, assessing a complex case, such as an application for right of abode, can be very time consuming and knowledge intensive.

AI Project Objectives

Very challenging goals were defined for the AI system: to streamline the entire assessment workflow with automated decision support wherever possible. The key objectives for the AI module are (1) to automatically assess straightforward cases, (2) to provide decision support for nonstandard cases, and (3) to learn “current practices” from humans.

Cases are divided into straightforward cases and nonstandard cases. Straightforward cases are those for which a determination as to whether they satisfy applicable laws and regulations can be made immediately and require very little processing. Nonstandard cases are those that may require additional information or documentation or may involve discretionary decision making by the case officer. Discretionary decision making must follow current practices and guidelines. Since practices and guidelines change from time to time to reflect changing needs, the AI module will need to automatically adapt itself through learning.

New AI Approach

Based on these AI goals and objectives, several new AI processes were designed to streamline the forms processing workflow (see figure 2, processes A1 to A6). With the new AI system, application forms will either be submitted online or as hard copy and then scanned and processed by optical character recognition. Associated supporting documents will also be stored digitally in a secured document management system. For simple forms, online submission represents substantial savings in community cost since the applicant need not go to the agency in person (Hong Kong Government 2004).

After submission, the AI module assists with case assignment (process A1 in figure 2) by automatically categorizing the case into defined categories. At the same time, it performs an initial case assessment (A2) that is used by the assigned case officer to determine whether the case is a straightforward or a nonstandard case. Case assessment is done by evaluating the case against all applicable laws and regulations as well as current practices and guidelines for each application type. For certain types of application, the assessment may be done in a “one-stop” fashion and the applicant can collect the permits or letters during the same visit. Since a majority of the cases are straightforward for many application types, the agency expects substantial efficiency savings with AI.

For nonstandard cases, the case officer will use the AI module to (A3) generate follow-up actions. These are suggested steps to take in order to get the application to a final state that can be assessed. For example, the AI module may recommend that the case officer request additional supplementary documents or clarifications from the applicant.

In the manual approach, follow-up actions may be an iterative process. The applicant may need to visit the agency more than once before all the necessary documents are collected and information clarified. With AI, all possible approval scenarios are considered at the same time before follow-up actions are generated, thus reducing the number of visits needed.

Figure 1. Process Flow Diagram for Manual Approach.
(Swimlanes: Applicant, Frontline Staff, Case Officers, Other Case Officers, Approval Officer.)

Once information is complete, the case officer will request the AI module to perform (A4) final case assessment. This is similar to (A2) initial case assessment except that now information is complete.

For complex borderline cases, some form of discretionary decision making may be required from the case officer. The AI module assists this process by (A5) retrieving a set of “similar” cases from historical records together with assessment results and justifications for the case officer to use as reference. This is done within seconds compared with hours or possibly days to search through microfiche to find reference cases.

Figure 2. Process Flow Diagram for AI Approach.
(Swimlanes: Applicant, Frontline Staff, Case Officer, Approval Officers, AI Module. AI Module processes: A1. Case assignment; A2. Initial case assessment; A3. Generate follow-up actions; A4. Final case assessment; A5. Retrieve similar historical cases; A6. Learn new case.)

Once a final decision has been made and the case is closed, the AI module will (A6) perform learning on the closed case. This involves indexing it into the case base (for future case retrieval) as well as decision trees (to learn current practices).

AI Architecture

To support the AI processing performed by the AI module, the AI framework makes use of several AI techniques, represented by five AI engines (figure 3).

Functionalities required by processes A1 to A4 are provided by rule-based technology. Case assignment (A1) is performed by a workflow rule engine. Initial (A2) and final case assessment (A4) are performed by an assessment rule engine. Follow-up actions (A3) are generated by a schema-based reasoning engine. A case-based reasoning (CBR) engine is used to retrieve similar historical cases (A5), and a self-learning engine is used to index and learn new cases (A6). Results from the self-learning engine feed back to the assessment rule engine as self-learned rules that represent current practices in discretionary decision making. Learning results also feed back to the CBR engine as newly indexed cases. To support these AI engines, the knowledge base consists of rules, schemas, cases, and decision trees.
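The assignment of processes A1 to A6 to the five engines, and the two feedback paths from the self-learning engine, can be summarized as a small lookup table. The following is only an illustrative sketch in Python (the actual system is Java-based); the dictionary and function names are hypothetical and do not come from the framework.

```python
# Hypothetical lookup table: which engine serves each AI process (A1 to A6),
# following the description in the text.
ENGINE_FOR_PROCESS = {
    "A1": "workflow rule engine",
    "A2": "assessment rule engine",
    "A3": "schema-based reasoning engine",
    "A4": "assessment rule engine",
    "A5": "case-based reasoning engine",
    "A6": "self-learning engine",
}

# Feedback paths described in the text: learning results flow back as
# self-learned rules and as newly indexed cases.
FEEDBACK = {
    "self-learning engine": [
        "assessment rule engine",
        "case-based reasoning engine",
    ],
}


def engines_in_use():
    """Return the distinct engines in order of first use by A1..A6."""
    seen = []
    for process in sorted(ENGINE_FOR_PROCESS):
        engine = ENGINE_FOR_PROCESS[process]
        if engine not in seen:
            seen.append(engine)
    return seen
```

Six processes map onto five engines because initial (A2) and final (A4) assessment share the assessment rule engine.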

Keeping It Manageable

Although there are over a hundred different types of forms, or application types, they are organized into only a few different categories or subsystems, such as the “right of abode,” “certificate of entitlement,” “birth, death, and marriage” (which includes adoption), “permits and visas,” “travel pass system,” “investigation,” “nationality,” “assistance to residents,” and “electronic passport” subsystems.

To keep the AI development and deployment manageable, each subsystem has its own customized version of the AI module. The AI architecture (figure 3) is replicated for each subsystem. For example, the electronic passport subsystem has its own set of rules, schemas, cases, and decision trees.

Figure 3. The Engines within the AI Framework.
(Labels: AI Module, AI Framework, Rule Engine, CBR Engine, Self-Learning Engine, workflow rule engine, assessment rule engine, schema-based reasoning engine, case-based reasoning, machine learning, and processes A1 to A6.)


Related Work

The AI work is built upon several different AI representations and reasoning algorithms: rules, schema-based reasoning, clustering, case-based reasoning, and decision trees.

Rules

The rule engine is similar to rules in traditional expert systems (Forgy and McDermott 1977, Buchanan and Shortliffe 1984). However, instead of heuristics or rules of thumb, the rules encode legislative knowledge (Gardner 1987). Each subsystem has its own rule base. The structure of the rule base was designed to facilitate ease of encoding expert knowledge on immigration-related legislation. A subsystem may have many different types of application forms. Each type of application has its rule agenda that defines which combination of rules or rule sets is applicable for a particular application type. The rule agenda is similar to other rule agendas (notes 1, 2, 3, 4) except that its main purpose is to encode relationships among rules rather than just sequence. Besides the rule agenda, rules are also organized into rule sets (Quintus Prolog 2007, Jess 2007, CLIPS 2007). Each rule set represents one assessment criterion. Rules in a rule set represent how that criterion can be satisfied. Rules in the system operate in a forward-chaining manner.
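Forward chaining, as mentioned above, can be illustrated with a naive loop that keeps firing rules until no new fact is derived. This is a generic textbook sketch in Python, not the framework's engine; the fact names and the two rules are invented for illustration and are not actual legal criteria.

```python
def forward_chain(facts, rules):
    """Fire rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are known facts.
            if conclusion not in derived and conditions <= derived:
                derived.add(conclusion)
                changed = True
    return derived


# Hypothetical rules: (set of required facts, fact to conclude).
RULES = [
    ({"born_locally", "parent_is_resident"}, "recognized_citizen"),
    ({"recognized_citizen", "identity_verified"}, "criterion_citizen_satisfied"),
]

facts = {"born_locally", "parent_is_resident", "identity_verified"}
result = forward_chain(facts, RULES)
```

The second rule can fire only after the first has added its conclusion, which is the data-driven behavior that distinguishes forward chaining from goal-driven (backward) chaining.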

Many government agencies around the world use rule engines to assist with decision making (Administrative Review Council 2004). For example, the Australian Department of Agriculture, Fisheries, and Forestry uses rule-based systems to make decisions on whether to permit or reject an import, whether to perform import inspections, and to determine what kind of tests to apply. The Australian Taxation Office also uses a number of rule-based systems to assist in determining which methods should be used in calculating taxes, benefits, and penalties. Customs uses expert systems to valuate imports, calculate customs taxes, and profile and select high-risk import or export transactions for scrutiny. The Department of Defence uses rule-based systems to calculate workers’ compensation. The Department of Health and Ageing uses a rule-based system to check approved providers’ compliance. The Department of Veterans’ Affairs uses a rule-based system to support decision makers in determining veterans’ entitlements.

In the United States, the Customs and Border Protection agency uses an expert system called the Automated Targeting System (ATS) (U.S. House of Representatives 1997, U.S. General Accounting Office 2004, U.S. Bureau of Customs and Border Protection 2005) to find suspicious cargo transactions and for antiterror work. ATS has more than 300 rules provided by field personnel, inspectors, and analysts in order to separate high-risk shipments from legitimate ones.

Schemas

Besides rules, the AI module uses schema-based reasoning (Turner and Turner 1991, Turner 1994) to represent procedural knowledge of actions and tasks that the case officers may take in the course of handling a case, for example, requests for verification of certain documents, letters of reference, and so on. Actions and tasks are triggered by rules. The schema encodes procedural knowledge of typical steps or actions taken by case officers in handling different kinds of cases.

Schema-based reasoning was also used in SAIRE (Odubiyi et al. 1997), a multiagent AI search engine to search Earth and space science data over the Internet. Chen and Lee (1992) explored how schema-based reasoning can identify fraud potentials exposed by an internal accounting control system.

Cases and Clustering

To provide decision support and precedent case retrieval, the AI module makes use of incremental (Ester et al. 1998) AI clustering (Fisher et al. 1993) with multivalued attributes (Ryu and Eick 1998) using the k-means clustering algorithm (note 5). AI clustering has been used successfully for many similar applications, such as QCS (query, cluster, summarize) (Dunlavy et al. 2006), an information retrieval system that allows users to retrieve relevant documents separated into topic clusters with a single summary for each cluster. IBM Research (Campbell et al. 2006) developed a clustering system for indexing, analysis, and retrieval of videos.

The case-based reasoning (Kolodner 1993) engine makes use of AI clustering results to retrieve similar relevant cases to create recommendations and summaries. CBR is a popular approach to reuse previous experience to handle new situations. For example, PlayMaker (Allendoerfer and Weber 2004) is a CBR prototype that models how air traffic controllers handle traffic flow under severe weather or congestion. Xu (1996) used CBR to identify people who are “AIDS risky” to provide intervention and prevention. Esmaili et al. (1996) used CBR for computer intrusion detection.
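To make the link between clustering and precedent retrieval concrete, the sketch below shows one plausible retrieval step: assign the new case to its nearest cluster centroid, then rank the historical cases in that cluster by distance. It assumes clusters have already been computed, uses plain Euclidean distance on numeric vectors, and omits the incremental and multivalued-attribute aspects cited above. All names and data are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to the given vector."""
    return min(range(len(centroids)), key=lambda i: distance(vector, centroids[i]))


def retrieve_similar(new_case, centroids, clusters, top_n=2):
    """Return ids of the top_n closest historical cases in the nearest cluster."""
    cluster = clusters[nearest_centroid(new_case, centroids)]
    ranked = sorted(cluster, key=lambda item: distance(new_case, item[1]))
    return [case_id for case_id, _ in ranked[:top_n]]


# Hypothetical precomputed clusters of (case id, attribute vector) pairs.
centroids = [(1.0, 1.0), (8.0, 8.0)]
clusters = [
    [("case-101", (1.0, 2.0)), ("case-102", (0.0, 1.0))],
    [("case-201", (8.0, 9.0)), ("case-202", (7.0, 7.0)), ("case-203", (9.5, 9.5))],
]
similar = retrieve_similar((7.5, 7.5), centroids, clusters)
```

Restricting the search to one cluster trades a little recall for speed, since only the cases in the nearest cluster are compared with the new case.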

Decision Trees

Finally, I use incremental decision trees (Mitchell 1997, Winston 1992, Utgoff 1989) to perform machine learning and rule generation (Quinlan 1987) to capture how case officers handle nonstandard cases. To enable decision trees to integrate back to the rule engines, each decision tree represents one assessment criterion, which is represented by a rule set in the rule engine.

In Australia, the Department of Family and Community Services has an “edge expert system” that uses decision trees to determine a citizen’s likely entitlement to payments and services (Administrative Review Council 2004).
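Rule generation from a decision tree, as used for the self-learned rules, amounts to reading off one rule per root-to-leaf path. The following Python sketch shows that idea on a hand-written tree; it does not perform incremental tree induction, and the attributes and outcomes are invented for illustration.

```python
def tree_to_rules(node, path=()):
    """Walk a decision tree and return one (conditions, outcome) rule per leaf."""
    if "outcome" in node:
        return [(list(path), node["outcome"])]
    rules = []
    for value, child in node["branches"].items():
        # Extend the path with the test taken on this branch.
        rules.extend(tree_to_rules(child, path + ((node["attribute"], value),)))
    return rules


# Hypothetical tree for a single assessment criterion.
TREE = {
    "attribute": "documents_complete",
    "branches": {
        True: {
            "attribute": "prior_refusal",
            "branches": {
                "no": {"outcome": "approve"},
                "yes": {"outcome": "refer"},
            },
        },
        False: {"outcome": "request_documents"},
    },
}
rules = tree_to_rules(TREE)
```

Because each tree stands for one criterion, the extracted rules could be grouped into the rule set for that criterion, which is how the text describes learned practice flowing back into the rule engine.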

Application Description

The platform for the form-processing system is Java 2 Platform, Enterprise Edition. The AI module is therefore also Java-based and packaged and deployed as Java Enterprise Archive (EAR) files. For scalability, AI services are provided in a stateless manner and can be deployed on as many application servers as needed.

The front end to the AI software is a web-based thin client operated by immigration case officers. The layout and design of the web client is typical of other form-based systems. Each case officer has an inbox containing all the applications he or she has been assigned to handle. For each application, there are several screens to display personal information on the applicant, the details of the current application, related documents provided by the applicant and those sent to the applicant by the immigration agency, historical record on this applicant such as prior applications, other related information, and follow-up actions. Basically, everything related to an applicant, including all his or her current and past applications and documents, is consolidated in a conveniently accessible dashboard for the case officer to review.

The various AI features used by case officers operate in near real time. The other AI tasks that are not performance critical, such as case learning and rule generation, are done behind the scenes as background processes.

AI processing results are displayed on two key screens: an assessment screen and a decision support screen. The assessment screen displays results of the assessment rule engine as a list of violated rules and details of those rules, such as attributes and parameter values, as well as links to legal references related to that rule. Rules may include hard rules, soft rules, or self-learned rules. The assessment screen also contains a set of recommended follow-up actions to take (generated by the schema-based reasoning engine), such as requesting additional documents or verifying the validity of certain information provided by the applicant. Actions already taken are also displayed.

Figure 4. The Structure of each AI Engine.
(Labels: AI Core, basic AI class libraries; App Core, application core libraries; AI Engine, auto-generated; AI Engine Compiler; AI coding and configurations; subsystem coding. Legend: Java, XML.)

The decision support screen is used to handle more complex, nonstandard cases. It displays a list of related precedent cases and their key attributes, generated by the CBR engine through AI clustering. In addition, the case officer can request the CBR engine to search for similar cases based on a selected subset of those attributes. The assessment results are also shown with reasons for approval or rejection.

Scalable AI Development

One of the key AI design objectives is that the development approach must be scalable so that it will be easy and convenient to extend the AI capabilities to cover the more than one hundred different types of forms. Ease of maintenance is another important design objective to ensure that knowledge can be updated easily without any impact on other deployed components.

The AI development approach created for this application is a nonintrusive XML-based approach in which all knowledge and configurations are coded using only Resource Description Framework (RDF) or XML documents (notes 6, 7). Automatic code-generation techniques were then used to dynamically generate the actual AI engines as Java binaries, associated object-relational mappings, and database tables (see figure 4). This greatly shortens development time and minimizes potential coding errors (Tabet, Bhogaraju, and Ash 2000).

Other rule engines also use XML to encode rules (notes 8, 9, 10). Here, this approach is used extensively for all the AI engines, not just the rule engines. Furthermore, the interface code to the relevant domain objects that represent the application details is also autogenerated from RDF or XML and requires no Java coding. In the past, interfacing an AI engine to an application has usually been very time consuming and error prone. To further simplify interfacing, results from AI processing are simply returned as an encapsulated result object.

Figure 4 shows the structure of an AI engine. Each of the five AI engines (figure 3) within each AI module of each subsystem follows a similar structure and development approach. At the bottom is the AI core, the basic AI Java class libraries that represent AI algorithms and routines. On top of that are the subsystem-specific AI engine, object and relational mappings (note 11), and database that are generated automatically from RDF or XML using the AI engine compiler.

For a rule engine, the RDF and XML documents describe the domain objects, rules, agendas, and rule sets. For CBR, the RDF and XML documents define the attribute vectors used for clustering. For decision tree learning, the RDF and XML documents define the decision trees and attributes within each tree. The crux of the AI development effort is in creating these RDF/XML documents. AI development is greatly simplified because, first, RDF and XML documents are easier to create (either directly or through a graphical user interface) than Java source code, and second, since Java binaries are created automatically, debugging time is eliminated. Nevertheless, there are still several thousand rules that need to be encoded, more than a hundred clustering vectors, and hundreds of decision trees.
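The idea of keeping rule knowledge in XML and producing executable logic from it can be illustrated on a very small scale. The article does not show the framework's actual document format, so the element names, attribute names, and operators below are invented. Note also that this Python sketch interprets the XML at run time, whereas the framework described above generates Java binaries ahead of time.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule-set document; the real RDF/XML vocabulary is not shown
# in the article.
RULE_XML = """
<ruleset criterion="citizen">
  <rule id="R1" attribute="age" op="ge" value="18"/>
  <rule id="R2" attribute="visits" op="lt" value="3"/>
</ruleset>
"""

OPS = {"ge": lambda a, b: a >= b, "lt": lambda a, b: a < b}


def compile_ruleset(xml_text):
    """Turn an XML rule set into (criterion, list of (rule id, predicate))."""
    root = ET.fromstring(xml_text)
    compiled = []
    for el in root.findall("rule"):
        attribute = el.get("attribute")
        op = OPS[el.get("op")]
        value = float(el.get("value"))
        # Bind the loop values as defaults so each predicate keeps its own copy.
        predicate = lambda case, a=attribute, o=op, v=value: o(case[a], v)
        compiled.append((el.get("id"), predicate))
    return root.get("criterion"), compiled


criterion, predicates = compile_ruleset(RULE_XML)
case = {"age": 30, "visits": 5}
outcome = {rule_id: check(case) for rule_id, check in predicates}
```

Either way, the knowledge engineer edits only the declarative document, and no hand-written code changes when a rule is added.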

The autogenerated AI engine requires a controller to provide application-level APIs to control how the AI engine should operate and how results should be returned. The controller is provided by the app core, a set of application-specific class libraries. These Java libraries are shared by all subsystems. The only custom Java coding that is needed for each subsystem is the subsystem coding that defines behaviors and process flows specific to a particular subsystem. The amount of Java coding is very small and is needed only if the subsystem does not follow standard processing defined in the app core.

Although there are many subsystems and over a hundred types of application forms, the unique nonintrusive XML-based development approach makes the AI module very easy to customize and maintain. The generated AI engines represent AI services that are decoupled from other components within the application.

Uses of AI Technology

This section provides further details on how the different AI engines are used to streamline e-government forms processing.

The Assessment Rule Engine

The assessment rule engine is probably the most important as it encodes immigration-related legislative knowledge and guarantees that all applicable laws, regulations, and guidelines have been considered for each and every case. The key functions are (1) perform initial preliminary assessment to assist the workflow engine in case assignment as well as to guide information collection, and (2) perform final assessment to determine the application result.

There is one rule engine and hence one rule base per subsystem. Each subsystem may have many different types of application forms. To organize this large body of legislative knowledge, the rule base for each subsystem contains a separate rule agenda for each application type. The agenda determines which combination of rule sets is applicable for a particular application type. Rule sets contain all the rules related to determining the status of particular assessment criteria. For example, whether a person is a recognized citizen or not is one criterion in determining his or her right of abode. For this criterion, there are more than 30 different rules to help determine whether that criterion is satisfied or not. All those rules are stored in the “citizen” rule set and controlled by agendas that require this criterion in assessment of their associated application type.
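The organization just described (an agenda per application type, selecting rule sets that each stand for one criterion) can be sketched as two lookup tables. This Python sketch makes two simplifying assumptions that the article does not confirm: a criterion holds if any rule in its set holds, and an application passes only if every criterion on its agenda holds. The real agenda encodes richer relationships among rules. Rule contents and names are hypothetical.

```python
# Hypothetical rule sets: one list of rules per assessment criterion.
RULE_SETS = {
    "citizen": [
        lambda c: c.get("citizen_by_birth", False),
        lambda c: c.get("citizen_by_descent", False),
    ],
    "identity": [
        lambda c: c.get("id_document_verified", False),
    ],
}

# Hypothetical agendas: which criteria apply to each application type.
AGENDAS = {"right_of_abode": ["citizen", "identity"]}


def assess(application_type, case):
    """Evaluate every criterion on the agenda for this application type.

    Assumption: a criterion is satisfied if any rule in its rule set holds,
    and the application passes only if all criteria are satisfied.
    """
    results = {
        name: any(rule(case) for rule in RULE_SETS[name])
        for name in AGENDAS[application_type]
    }
    return all(results.values()), results


ok, detail = assess("right_of_abode", {"citizen_by_descent": True})
```

The per-criterion results correspond loosely to what the assessment screen reports: which criteria failed, so the officer knows what remains to be established.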

Although most laws and regulations regarding immigration are relatively static, some of the guidelines do change from time to time. To facilitate user maintenance of rules without having to regenerate and republish a new rule engine each time, the rules are designed with parameter-driven capabilities. Parameter values can be edited by users with appropriate authority; the effect on related rules is instantaneous.
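A parameter-driven rule can be pictured as a rule that looks up its threshold at evaluation time instead of hard-coding it. In this minimal Python sketch, the parameter name, its values, and the rule itself are made up for illustration; the authority check on who may edit parameters is omitted.

```python
# Hypothetical parameter store; in practice it would be edited through an
# administration screen by an authorized user.
PARAMETERS = {"min_validity_months": 6}


def passport_validity_ok(case, params=PARAMETERS):
    """The rule reads its threshold on every call, so edits apply immediately."""
    return case["passport_validity_months"] >= params["min_validity_months"]


case = {"passport_validity_months": 4}
before = passport_validity_ok(case)        # evaluated against threshold 6
PARAMETERS["min_validity_months"] = 3      # guideline changes, value is edited
after = passport_validity_ok(case)         # same rule, new threshold
```

The rule logic is untouched between the two calls; only the parameter value changed, which is why no engine regeneration is needed.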

Besides established legislative knowledge, the rule engine also uses knowledge on discretionary decision making. This knowledge is taken from self-learned rules that are generated by the self-learning engine from observing how case officers handle nonstandard cases.

Before AI, assessments were done by sorting through and reviewing paper documents submitted by the applicant and using the case officer’s own personal experience and knowledge of laws and guidelines. The time needed ranged from a few minutes for very simple cases to much longer for complicated ones. Some cases might take several days, as the case officer might need to seek help and advice from other officers or superiors. With AI, assessment is done in less than 10 seconds for all cases regardless of complexity, while guaranteeing that all relevant legislation and guidelines, as well as all possible approval scenarios, are considered.

The Schema-Based Reasoning Engine

The schemas stored in the schema-based reasoning engine (Turner and Turner 1991) represent procedural knowledge in processing applications. The engine is used to generate tasks, checklists, and follow-up actions for the case officer to perform.

It guides the case officer in collecting all necessary information and supplementary documents as well as printing documents and instructions. The engine is itself rule driven. This allows different sets of steps and actions to be proposed depending on the particulars of the application at hand and previous actions already taken.
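One way to picture a rule-driven schema is as a list of candidate steps, each guarded by a condition on the application, filtered against the actions already taken. The following is a minimal sketch under that assumption; the step names and conditions are invented and the real schema format is not described in this article.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Step:
    action: str
    applies: Callable[[dict], bool]  # rule deciding whether the step is needed


# Hypothetical schema for one application type.
SCHEMA: List[Step] = [
    Step("verify identity document", lambda a: True),
    Step("request employer letter", lambda a: a.get("purpose") == "employment"),
    Step("request sponsor declaration", lambda a: a.get("age", 0) < 18),
]


def propose_checklist(application: dict, done: Set[str]) -> List[str]:
    """Return the outstanding steps for this application, in schema order."""
    return [
        step.action
        for step in SCHEMA
        if step.applies(application) and step.action not in done
    ]
```

For an adult applying for employment whose identity has already been verified, `propose_checklist` returns only the employer-letter step, which mirrors how the proposed actions depend on both the application and the work already done.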

Before AI, if there were unclear points in the application or if certain information needed to be verified, the case officer sent a letter to the applicant for additional supporting documents. After receiving the documents, the case officer analyzed the case once again and possibly requested more information from the applicant if needed. This cycle was time-consuming and stressful for the applicant as he or she may have needed to visit the agency several times before his or her application could be assessed. With AI, different scenarios are analyzed automatically at the same time, and a consolidated list is generated, thus minimizing the number of visits an applicant needs to make to the agency. Furthermore, letters to applicants are generated automatically. A task checklist is also provided to keep track of tasks so that nothing is overlooked.

The Case-Based Reasoning Engine

Straightforward cases are handled automatically by the assessment rule engine. But in the real world, there are many nonstandard cases that require more detailed analysis. This complicates and lengthens the assessment process. The CBR engine helps alleviate this situation by retrieving relevant closed cases from the case base to act as precedents or reference and indexing newly closed cases into the case base.

There is one CBR engine per application type since the attributes considered by different application types will be different. Each case is represented by a prioritized attribute vector that contains either data from the application form or results from the assessment rule engine. The objective of the CBR is to retrieve similar cases and use statistics to generate recommendations. The CBR engine also supports advanced features, such as multivalued attributes and incremental AI clustering (Ester et al. 1998).

The case officer may fine-tune the way the CBR engine retrieves relevant cases by selecting the assessment criteria that he or she feels are most important for the case at hand.
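Retrieval over a prioritized attribute vector can be approximated with a weighted similarity score. The sketch below is an assumption-laden illustration: it uses exact match for single-valued attributes, Jaccard overlap for multivalued ones, a plain top-k ranking instead of the incremental clustering the engine uses, and the share of the most common outcome among the retrieved cases as the "statistic". The weights stand in for the criteria a case officer marks as most important; all attribute names and cases are made up.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def attr_similarity(a, b) -> float:
    # Multivalued attributes (sets) use Jaccard overlap; others use exact match.
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        union = a | b
        return len(a & b) / len(union) if union else 1.0
    return 1.0 if a == b else 0.0


def similarity(query: dict, case: dict, weights: Dict[str, float]) -> float:
    # Weighted average of per-attribute similarities; weights encode priority.
    total = sum(weights.values())
    score = sum(w * attr_similarity(query.get(k), case.get(k))
                for k, w in weights.items())
    return score / total


def recommend(query: dict, case_base: List[dict], weights: Dict[str, float],
              k: int = 3) -> Optional[Tuple[str, float]]:
    """Return the most common outcome among the k nearest cases and its share."""
    ranked = sorted(case_base,
                    key=lambda c: similarity(query, c["attrs"], weights),
                    reverse=True)[:k]
    if not ranked:
        return None
    outcome, count = Counter(c["outcome"] for c in ranked).most_common(1)[0]
    return outcome, count / len(ranked)


# Hypothetical closed cases and a new application.
CASE_BASE = [
    {"attrs": {"citizen": True, "languages": {"en", "zh"}}, "outcome": "approve"},
    {"attrs": {"citizen": True, "languages": {"en"}}, "outcome": "approve"},
    {"attrs": {"citizen": False, "languages": {"fr"}}, "outcome": "refuse"},
]
QUERY = {"citizen": True, "languages": {"en"}}
WEIGHTS = {"citizen": 2.0, "languages": 1.0}
```

Raising the weight of an attribute in `WEIGHTS` makes cases that agree on it rank higher, which is one simple way to model the officer's fine-tuning.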

Before AI, it was very difficult if not impossible to locate past cases to use as reference or establish precedents. There was no easy way to search through microfiche or case folders, which were stored in physical archives. Case officers had to rely on their own memory or personal notes, or ask other case officers if they remembered handling similar cases before. Even if cases could be recalled, trying to retrieve them from microfiche or the archives took time. With AI, a precise list of all similar cases within a given time period can be retrieved within seconds, with all the details of the cases and the analysis results and comments from the case officers in charge.

The Self-Learning Engine

The self-learning engine captures discretionary decision-making knowledge that represents case officers’ experience in handling special cases as well as their knowledge of assessment best practices and guidelines that change with the changing needs of society. The key functions are (1) incrementally learn and index new cases into decision trees, and (2) generate self-learned rules from the decision trees and integrate them into the assessment rule engine.

For the same reasons as the CBR engine, there is one self-learning engine per application type. However, each engine may contain many decision trees. Each decision tree represents knowledge related to one assessment criterion. These are the same assessment criteria used by the rule engines as well as the CBR engines. Each decision tree is constructed from either prioritized data from the application form or results retrieved from the assessment rule engine. This engine also supports advanced features such as incremental learning.

The self-learned rules generated from each decision tree are used to determine whether an assessment criterion was fulfilled or not. Hence rules generated by the self-learning engine are well integrated with the assessment rule engine and directly contribute to the assessment result.
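Turning a decision tree into rules can be sketched as enumerating its root-to-leaf paths, with each path becoming one rule whose conclusion says whether the criterion is fulfilled. The example below shows only this basic path enumeration; it omits the rule simplification steps discussed in the literature and the incremental tree learning itself, and the tree and attribute names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    satisfied: bool  # whether the assessment criterion is fulfilled


@dataclass
class Node:
    attribute: str
    branches: Dict[object, Union["Node", Leaf]]


Condition = Tuple[str, object]          # (attribute, required value)
LearnedRule = Tuple[List[Condition], bool]


def extract_rules(tree: Union[Node, Leaf],
                  path: Tuple[Condition, ...] = ()) -> List[LearnedRule]:
    """Each root-to-leaf path becomes one rule: conditions -> conclusion."""
    if isinstance(tree, Leaf):
        return [(list(path), tree.satisfied)]
    rules: List[LearnedRule] = []
    for value, child in tree.branches.items():
        rules.extend(extract_rules(child, path + ((tree.attribute, value),)))
    return rules


# Hypothetical tree for one discretionary criterion.
TREE = Node("has_local_family", {
    True: Leaf(True),
    False: Node("long_term_resident", {True: Leaf(True), False: Leaf(False)}),
})
RULES = extract_rules(TREE)
```

Each entry in `RULES` reads as "if these attribute values hold, the criterion is (or is not) fulfilled", which is a form a rule engine can consume alongside hand-written rules.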

Before AI, discretionary decision making was based on practices and guidelines that were discussed and shared verbally among case officers. Each case officer kept a personal notebook of these guidelines and practices as reference. The performance of a case officer depended greatly on his or her knowledge of these practices and guidelines and his or her personal experience in handling similar cases. There was no easy way to share this type of knowledge efficiently before. With AI learning, patterns in discretionary decision making are extracted and codified as rules so that the current practices can be shared and used regardless of the experiences of the case officers.

Application Use and Payoff

The AI module was deployed to production in December 2006. Starting in early February 2007 it began to process each and every electronic passport application. Rollout for the remaining subsystems is scheduled throughout 2007 and 2008. So far, several hundred immigration case officers have been trained on the system.

Evaluation Results

Prior to deployment, extensive unit, integration, and stress testing was performed. After that, the system went through 2 months of user testing and 2 months in a production environment before the official launch in February 2007. Obviously, this type of AI system must return correct results 100 percent of the time and be fast (within seconds) and stable. General feedback and results from user evaluation included a number of conclusions. First, for subsystems with a large volume of applications, automatic assessment with AI rules is the only way to improve efficiency. Second, the ability to automatically find precedent cases is very important and highly useful for decision support. This is too hard to do with the old microfiche system. Third, self-learning is also very important for certain application types because rules can be too complex to code manually. Fourth, automatically consolidating all information related to a particular case into a dashboard was found to be very useful. The old approach of manually sorting through paper documents and records was too time consuming and error prone. Finally, the ability to automatically propose follow-up actions and to automatically generate notification letters and minutes was also found to be very useful and a major time savings.

Key payoffs include improved quality of service and assessment, increased productivity, improved agility, increased capacity for growth, and economic savings.

Quality of service is the number one priority for this immigration agency. Year after year, it receives numerous awards and recognitions for outstanding quality of service to citizens and visitors (Hong Kong Government 2007a). First, the use of AI to streamline processing workflow enhances the quality of service by reducing turnaround time (Hong Kong Government 2007b). For example, time to get an entry permit for employment can be shortened by 3 to 5 days, whereas a search of birth, death, or marriage records can be reduced to several minutes. One-stop service can be possible for some applications. Second, the use of AI provides a more comprehensive and thorough assessment of each case so that follow-up tasks are consolidated, minimizing the number of documents the applicant must provide and the number of visits to the agency.

In the past, assessment quality depended on the experience and knowledge of case officers. Time was needed to think through numerous intricate and complex laws and regulations for each type of application. Manual assessment was time consuming and error prone. With AI, all relevant laws, regulations, and guidelines are considered at all times within seconds, guaranteeing that nothing is overlooked and eliminating any potential for errors.

Increased productivity was another key payoff. For complicated cases, case officers need time to sort out all the information provided by the applicant as well as run through different approval scenarios. This can be time consuming and may require discussions with other case officers to clarify fine details of legislation. With AI, applications are assessed under all possible scenarios at the same time and within seconds. In addition, locating historical cases from microfiche or folders from physical archives was previously very time consuming. Using AI, relevant cases are automatically retrieved without any effort from the case officer. Case officers can then focus on using their expertise more effectively for decision making.

The system also offered improved agility. Because the AI module is parameter driven, any urgent change in guidelines and policies can be made instantly without any change to software. With self-learning capabilities, the AI module automatically adapts itself to changing practices and guidelines. Hence the agency becomes more agile in terms of its knowledge management capabilities.

Another payoff was in capacity for growth. In the long term, the AI module will allow the agency to cope with continuously increasing workloads to support the city’s economic growth.

A final key payoff was economic savings. The agency estimates that the application processing system will save the government more than US$16 million annually (Hong Kong Government 2004). Efficiencies provided by the AI module represent not only cost savings for the government but also substantial savings in community cost in reduced waiting and turnaround time for citizens and visitors.

Application Development and Deployment

The design and development of the application processing system began in early 2005 with the AI work starting in mid-2005. The project prime contractor is NCSI (see note 12), a wholly owned subsidiary of NCS, a leading IT solutions provider headquartered in Singapore with several thousand IT professionals worldwide. AI technology for the project was provided by CityU Professional Services Limited, a nonprofit subsidiary of the City University of Hong Kong.

The total IT team for the entire project consists of roughly 200 programmers, system analysts, and consultants from several IT vendors and system integrators from around the world. In addition, roughly another 60 officers and managers from the user side are dedicated to this project.

The AI design and development team consists of roughly 10 knowledge engineers and AI developers. AI development was simplified with extensive support from the user side in providing knowledge in a form that was readily convertible into rules for the rule engine.

For a system as complex as this, integration, robustness, and scalability were major concerns in the design of the AI module. To minimize integration issues, the AI module was designed to be decoupled from other components using well-defined interfaces. Robustness is handled by designing the AI module to be deployable per subsystem or even per application type if needed. Any fault in one subsystem or application type will not affect others. In addition, all internal databases used by the AI module have redundancy to improve robustness and performance. Scalability is handled by designing the AI module to provide AI services in a stateless manner. If workload increases, all that is needed is simply to add more application servers. This distributed design also allows the application to switch over to another AI server if one fails.
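Because a stateless service keeps nothing between calls, any server can answer any request, which is what makes both scaling out and switching over straightforward. The sketch below illustrates that idea only; the servers are plain Python functions, and the `ServerDown` exception, the request shape, and the try-in-order policy are assumptions rather than details of the deployed system.

```python
from typing import Callable, List


class ServerDown(Exception):
    """Raised by a (simulated) AI server that cannot be reached."""


# A stateless AI service: the response depends only on the request.
AIServer = Callable[[dict], dict]


def call_with_failover(servers: List[AIServer], request: dict) -> dict:
    """Try each server in turn; any of them can handle any request."""
    last_error = None
    for server in servers:
        try:
            return server(request)
        except ServerDown as err:
            last_error = err  # remember the failure and try the next server
    raise RuntimeError("no AI server available") from last_error


def healthy_server(request: dict) -> dict:
    return {"application_id": request["application_id"], "result": "assessed"}


def failed_server(request: dict) -> dict:
    raise ServerDown("unreachable")
```

Adding capacity in this picture means appending another entry to the `servers` list; no session data has to be migrated because none is held.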

Deployment

AI deployment is prioritized based on subsystems and application types with the “electronic passport” (Xinhua News Agency 2007, Hong Kong Government 2007c) and “birth, death, and marriage” subsystems to be the first to be deployed.

The first version of the assessment rule engine and schema-based reasoning engine was released in mid-January 2006. This was followed by the CBR engine in mid-February 2006 and the self-learning engine at the end of March 2006. Since then, the systems have been undergoing extensive testing. In parallel, the engines were customized for different subsystems and application types, and their features and performance were fine-tuned.

User testing began in September 2006 with the first rollout to production in December 2006. Subsequent subsystems are scheduled to be deployed throughout 2007 and 2008.

Maintenance

Just as with any other mission-critical software, there will inevitably be changes and upgrades to the AI module after deployment to reflect legislative or operational changes for the agency. The design of the AI architecture is such that these types of changes are very easy to do.

First, all knowledge-related changes can be done without any Java coding simply by updating the RDF and XML documents and configuration files. Binaries and databases are then generated automatically by the engine compilers. Second, the behavior of the rule engines is parameter driven and under user control to reduce the need for code change. Packaging the AI module as a decoupled component from the other parts of the system helps further reduce maintenance and integration needs.
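The idea of changing knowledge by editing a document rather than code can be illustrated with a toy XML rule file that is parsed into executable checks. The element and attribute names below are invented for the example and are not the project's RDF/XML vocabulary, and the sketch interprets the rules at load time instead of compiling them into binaries as the real engine compilers do.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical rule document; editing it changes behavior without code changes.
RULES_XML = """
<ruleset criterion="residence">
  <rule attribute="years_resident" op="ge" value="7"/>
</ruleset>
"""

# Comparison operators the toy format supports.
OPS = {"ge": operator.ge, "le": operator.le, "eq": operator.eq}


def load_rules(xml_text: str):
    """Parse the document into (criterion, [(attribute, operator, value)])."""
    root = ET.fromstring(xml_text)
    rules = [
        (el.get("attribute"), OPS[el.get("op")], int(el.get("value")))
        for el in root.findall("rule")
    ]
    return root.get("criterion"), rules


def satisfied(application: dict, rules) -> bool:
    # Assumption: all rules in the document must hold for the criterion.
    return all(op(application[attr], value) for attr, op, value in rules)


CRITERION, RULES = load_rules(RULES_XML)
```

Changing `value="7"` in the document and reloading is enough to alter the check, which is the maintenance property the paragraph above claims for the XML-based approach.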

For support, the IT team of the prime contractor, NCSI, provides front-line technical and end-user support while CityU provides additional assistance on the AI technologies when needed.


Conclusion

This article provided an overview of how various AI techniques were used to provide highly intelligent and accurate case assessment capabilities to an e-government form-processing system. AI streamlines processes and results in higher quality and faster service to citizens and visitors. In addition, valuable domain knowledge and expertise related to immigration laws, regulations, and guidelines are now quantified, coded, and preserved within the agency for use in this and other systems. The AI work makes use of several innovative techniques, such as nonintrusive RDF and XML coding, and integrated rule, schema-based, case-based, and self-learning engines as well as incremental clustering and learning. This may be the first time any immigration agency in the world has used AI for automated assessment on such a large and broad scale of deployment.

Notes

1. See the CLIPS website (www.ghg.net/clips/CLIPS.html).

2. See the Quintus Prolog website (www.sics.se/isl/quintuswww/site/flex.html).

3. JBoss Rules (www.jboss.com/products/rules).

4. See Jess—the Rule Engine for Java Platform (herzberg.ca.sandia.gov/jess/).

5. See the Wolfram MathWorld website (mathworld.wolfram.com/K-MeansClusteringAlgorithm.html).

6. The resource description framework (RDF) is available at www.w3.org/RDF.

7. The RDF/XML Syntax Specification (Revised), W3C Recommendation 10 February 2004 is available at www.w3.org/TR/rdf-syntax-grammar.

8. See the Drools website (drools.org/).

9. See A Java Deductive Reasoning Engine for the Web (jDrew) (www.jdrew.org).

10. See the Mandarax Project (mandarax.sourceforge.net).

11. See the Hibernate website (www.hibernate.org).

12. See the NCS website (www.ncs.com.sg).

References

Allendoerfer, K. R., and Weber, R. 2004. PlayMaker: An Application of Case-Based Reasoning to Air Traffic Control Plays. In Proceedings of the Seventh European Conference on Case-based Reasoning (ECCBR), 476–488. Berlin: Springer.

Administrative Review Council, Government of Australia. 2004. Automated Assistance in Administrative Decision Making, Administrative Review Council, Report to the Attorney-General, Report No. 46, November 2004. Canberra, ACT: Government of Australia (www.ag.gov.au/agd/www/archome.nsf/Page/RWPD0BD2AED121A15A8CA256F5E00022956).

Buchanan, B. G., and Shortliffe, E. H., eds. 1984. Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Reading, MA: Addison-Wesley.

Campbell, M.; Haubold, A.; Ebadollahi, S.; Naphade, M. R.; Natsev, A.; Seidl, J.; Smith, J. R.; Tei, J.; and Xie, L. 2006. IBM Research TRECVID-2006 Video Retrieval System. In Proceedings of the NIST TRECVID-2006 Workshop. Gaithersburg, MD: National Institute of Standards and Technology.

Captiva Software Corporation. 2002. Swiss Tax Office Processing More Returns for Less. Hopkinton, MA: EMC Corporation (www.captivasoftware.com/products/casestudies/casestudies_view.asp?wcs_id=55).

Chen, K. T., and Lee, R. M. 1992. Schematic Evaluation of Internal Accounting Control Systems. EURIDIS Research Monograph, RM-1992-08-1. Erasmus University, Rotterdam, The Netherlands, August 1992.

Dunlavy, D. M.; O’Leary, D. P.; Conroy, J. M.; and Schlesinger, J. D. 2006. QCS: A System for Querying, Clustering, and Summarizing Documents, Sandia Technical Report, Sandia National Laboratories, Albuquerque, NM, July.

Esmaili, M.; Balachandran, B.; Safavi-Naini, R.; and Pieprzyk, J. 1996. Case-Based Reasoning for Intrusion Detection. Paper presented at the Computer Security Applications Conference, San Diego, CA, 9–13 December.

Ester, M.; Kriegel, H. P.; Sander, J.; Wimmer, M.; and Xu, X. 1998. Incremental Clustering for Mining in a Data Warehousing Environment. In Proceedings of the 24th International Conference on Very Large Data Bases (VLDB). San Francisco: Morgan Kaufmann Publishers.

Fisher, D.; Xu, L.; Carnes, J.; Reich, Y.; Fenves, S.; Chen, J.; Shiavi, R.; Biswas, G.; and Weinberg, J. 1993. Applying AI Clustering to Engineering Tasks. IEEE Expert 8(2): 51–60.

Forgy, C., and McDermott, J. P. 1977. OPS, A Domain-Independent Production System Language. In Proceedings of the Fifth International Joint Conference on Artificial Intelligence (IJCAI-1977), 933–939. Los Altos, CA: William Kaufmann, Inc.

Gardner, A. 1987. An Artificial Intelligence Approach to Legal Reasoning. Cambridge, MA: The MIT Press.

Hong Kong Government. Special Administrative Region of the People’s Republic of China. 2004. Implementation of Phase III of the Updated Information Systems Strategy for the Immigration Department, FCR(2004–05)10, 14 May (www.legco.gov.hk/yr03-04/english/fc/fc/papers/f04-10e.pdf).

Hong Kong Government. Special Administrative Region of the People’s Republic of China. 2005. Immigration Department Annual Report 2004–2005. Hong Kong: Government of the People’s Republic of China (www.immd.gov.hk/a_report_04-05/htm_en/ch05/url/frame_020.htm).

Hong Kong Government. Special Administrative Region of the People’s Republic of China. 2007a. Recognition of the Quality Service of Immigration Department, Immigration Department. Hong Kong: Government of the People’s Republic of China (www.immd.gov.hk/ehtml/rqs.htm).

Hong Kong Government. Special Administrative Region of the People’s Republic of China. 2007b. Performance Pledge, Immigration Department. Hong Kong: Government of the People’s Republic of China (www.immd.gov.hk/ehtml/pledge.htm).

Hong Kong Government. Special Administrative Region of the People’s Republic of China. 2007c. HKSAR Electronic Passport (e-Passport) and Electronic Document of Identity for Visa Purposes, Immigration Department. Hong Kong: Government of the People’s Republic of China (www.immd.gov.hk/ehtml/eppt_edi.htm).

Hong Kong Trade Development Council. 2006. HKSAR Immigration Department: Aiming at Security and Quality Service, May (ict.tdctrade.com/suc-e513.htm).

Kolodner, J. 1993. Case-Based Reasoning. San Mateo, CA: Morgan Kaufmann Publishers.

Mitchell, T. 1997. Decision Tree Learning, Machine Learning, 52–78. New York: McGraw-Hill.

Odubiyi, J. B.; Kocur, D. J.; Weinstein, S. M.; Wakim, N.; Srivastava, S.; Gokey, C.; and Graham, J. 1997. SAIRE—A Scalable Agent-Based Information Retrieval Engine. In Proceedings of the First International Conference on Autonomous Agents, 292–299. New York: Association for Computing Machinery.

Questex Media. 2006. The HK Immigration Department’s Continuing Drive for Automation. Enterprise Innovation 15 August (www.enterpriseinnovation.net/perspectives.php?cat1=4&id=816).

Quinlan, J. R. 1987. Generating Production Rules from Decision Trees. In Proceedings of the Tenth International Joint Conference on Artificial Intelligence (IJCAI 1987), 304–307. San Francisco: Morgan Kaufmann Publishers.

Ryu, T. W., and Eick, C. F. 1998. Similarity Measures for Multivalued Attributes for Database Clustering. In Proceedings of the Conference on Smart Engineering System Design: Neural Networks, Fuzzy Logic, Evolutionary Programming, Data Mining and Rough Sets. New York: ASME Press.

Tabet, S.; Bhogaraju, P.; and Ash, D. 2000. Using XML as a Language Interface for AI Applications. In Advances in Artificial Intelligence, PRICAI 2000 Workshop Reader, Four Workshops held at PRICAI 2000, Lecture Notes in Computer Science 2112, 103–110. Berlin: Springer-Verlag.

Turner, E. H., and Turner, R. M. 1991. A Schema-based Approach to Cooperative Problem Solving with Autonomous Underwater Vehicles. In Proceedings of the IEEE Oceanic Engineering Society OCEANS’91. Piscataway, NJ: Institute of Electrical and Electronic Engineers.

Turner, R. M. 1994. Adaptive Reasoning for Real-World Problems: A Schema-Based Approach. Hillsdale, NJ: Lawrence Erlbaum Associates.

U.S. Bureau of Customs and Border Protection. 2005. Targeting Center the Brains behind Anti-Terror Efforts. Customs and Border Protection Today, (November/December) (www.cbp.gov/xp/CustomsToday/2005/nov_dec/targeting.xml).

U.S. General Accounting Office. 2004. Homeland Security: Summary of Challenges Faced in Targeting Oceangoing Cargo Containers for Inspection, March 2004, GAO-04-557T. Washington, D.C.: U.S. Government Printing Office (www.gao.gov/htext/d04557t.html).

U.S. House of Representatives. Subcommittee on Immigration and Claims, Committee on the Judiciary. 1997. Border Security and Deterring Illegal Entry into the United States, April 1997. Washington, D.C.: U.S. Government Printing Office (commdocs.house.gov/committees/judiciary/hju43664.000/hju43664_0f.htm).

Utgoff, P. E. 1989. Incremental Induction of Decision Tree. Machine Learning 4: 161–186.

VertMarkets IT Group. 2003. Australian Government Agency Improves Immigration Records Keeping, Lowers Operating Costs. Enterprise Content Management News and Solutions, 11 December (www.ecmconnection.com/Content/news/article.asp?Bucket=Article&DocID={91F02017-2FF4-4DC6-BEAD-BEA3B9B44620).

VertMarkets IT Group. 2002. U.S. Census Demands Reliable Imaging Solution. Enterprise Content Management News and Solutions, 9 April (www.ecmconnection.com/Content/news/article.asp?Bucket=Article&DocID=%7B954C8253-9B23-4BF3-B3BF-A3CD94FF45C6%7D).

Winston, P. 1992. Learning by Building Identification Trees, Artificial Intelligence, 423–442. Reading, MA: Addison-Wesley.

Xinhua News Agency. 2007. Hong Kong to Issue e-Passport, 1 January 2007. Hong Kong: Government of the People’s Republic of China (english.gov.cn/2007-01/01/content_485516.htm).

Xu, L. D. 1996. An Integrated Rule- and Case-Based Reasoning to AIDS Initial Assessment. International Journal of Bio-medical Computing 40(3): 197–207.

Andy Chun is an associate professor in the Department of Computer Science at the City University of Hong Kong. His research interests include web technologies; search engine optimization; scheduling, rostering, optimization; business intelligence and data mining; and distributed architectures. He received a B.S. from the Illinois Institute of Technology and an M.S. and a Ph.D. in electrical engineering from the University of Illinois at Urbana-Champaign. His e-mail address is [email protected].
