Date post: 22-Mar-2020
AUTOMATIC RECOGNITION OF HANDWRITTEN MEDICAL FORMS FOR SEARCH ENGINES Robert Milewski Venu Govindaraju
Transcript
Page 1: Robert Milewski Venu Govindarajugovind/MILEWSKI-DEFENSE-P4.pdfRobert Milewski Venu Govindaraju. TOC Introduction Prior Work Binarization ... Envisioned Journey of the PCR • Ambulance

AUTOMATIC RECOGNITION OF HANDWRITTEN MEDICAL FORMS FOR SEARCH ENGINES

Robert Milewski, Venu Govindaraju

Page 2: TOC

• Introduction
• Prior Work
• Binarization
• Topic Categorization: Training
• Topic Categorization: Testing
• Results
• Applications

Page 3: Problem Statement

Address unconstrained recognition and retrieval of handwritten forms by focusing on the NYS PCR medical form.

Page 4: Objective

We will explore:
• linguistic models
• topic categorization
• lexicon reduction

Page 5: Envisioned Journey of the PCR

• Ambulance rescues patient
• Emergency environment complex
• Patient information documented
• Form passed to emergency room
• Information extraction
• Data disseminated
• Analysis performed

Page 7: Evolution of Form Analysis

(Kim and Govindaraju, 1996) (Madhvanath et al., 1996) (Kim and Govindaraju, 1997) (Tomai, Zhang and Govindaraju, 2002)

Census Form

Postal Mail

Historic Documents

Bank Check

Page 8: Survey of Lexicon Reduction
(Koerich, Sabourin, Suen, 2003)

• Bank Applications: dollar amount constrains the lexicon
• Postal Applications: City/State/Street databases constrain the lexicon
• Word Length / Word Shape: affected by the rescue environment
• Performance-only Reduction: some models significantly improve speed but drop recognition by as much as 30%

Page 9: Holistic Lexicon Reduction
(Madhvanath, Krpasundar, 2001)

Used for lexicon reduction

Features:
• ascenders
• descenders
• word shape

Medical environment complicates these features

STERNAL

Page 10: Call-Routing Problem
(Lucent Technologies Bell Laboratories, Chu-Carroll, et al., 1999)

Used R-SVD approach.

Call-Routing:
• Input: Voice Recognition
• Output: Destination

Medical Forms:
• Input: Handwriting Recognition
• Output: Topic Category

Page 11: Medical Text Categorization
(Yang and Chute, 1994)

• Used SVD approach
• Applied on known text
• Learned word-category associations on medical text
• Used MEDLINE and MAYO

Page 13: Histogram Classification

Original Image

Thresholded to keep only form

Thresholded to keep only carbon

Page 14: Sinusoidal Coverage

• Only 4 of the 8 paths are shown.
• The algorithm is computed on a larger surface: the surface of an entire form block.
• It is desirable to stay close to the origin, since paper fluctuations occur further out.

Page 15: Visual Comparison

Columns: Binarization Only | Binarization + Processing

(a) Original image
(b) Original image with form drop out
(c) Wu/Manmatha Binarization
(d) Kamel/Zhao Binarization
(e) Niblack Binarization
(f) Sauvola Binarization
(g) Otsu Binarization
(h) Gatos
(i) Sine Wave Binarization

Page 16: Performance

Legend:
(SW) Sine Wave Binarization
(O) Otsu Binarization
(N) Niblack Binarization
(G) Gatos Binarization
(S) Sauvola Binarization
(W) Wu/Manmatha Binarization
(K) Kamel/Zhao Binarization

Improvement:
• Binarization: +11-31%
• Binarization/PP: +4.5-7.25%

Environment:
• PCRs: 62
• Words: 3,000

Page 18: Road Map

Page 19: Hypothesis

A sequence of confidently recognized characters, extracted from an image of handwritten medical text, can be used to represent a topic category.

Page 20: PCR Truthing

10 Body Systems:
• Circulatory/Cardiovascular
• Digestive
• Endocrine
• Excretory
• Immune
• Integumentary
• Musculoskeletal
• Nervous
• Reproductive
• Respiratory

4 Extremities/Joint Locations:
• Arms/Shoulders/Elbows
• Feet/Ankles/Toes
• Hands/Wrists/Fingers
• Legs/Knees

4 General Locations:
• Fluid/Chemical Imbalance
• Full Body
• Hospital Transfer/Transport
• Senses

6 Body Range Locations:
• Abdomen
• Head
• Back/Thoracic/Lumbar
• Neck/Cervical
• Chest
• Pelvic/Sacrum/Coccyx

Page 21: PCR-Category Relationship

Page 22: Modeled Category Examples

Example 1: A patient treated for an emergency pregnancy would be considered the Reproductive System category.

Example 2: A conscious and breathing patient treated for gun shot wounds to the abdominal region would be considered Circulatory/Cardiovascular System, due to potential loss of blood.

Page 23: Phrase Extraction using Cohesion

DIGESTIVE-SYSTEM
FQ     CHSN   PHRASE
30     0.72   PAIN INCIDENT
5      0.31   PAIN TRANSPORTED
42     0.54   PAIN CHEST
52     0.81   STOMACH PAIN
9      0.25   HOME PAIN
6      0.43   VOMITING ILLNESS

PELVIS
FQ     CHSN   PHRASE
1860   2.49   PAIN HIP
144    0.34   HIP JVD
112    0.39   PAIN CHANGE
275    0.81   HIP FX
110    0.37   HIP CHANGE
163    0.40   JVD PAIN
106    0.40   CAOX3 PAIN
202    0.50   PAIN JVD
213    0.55   PAIN LEG
205    0.42   CHEST HIP

\[ \mathrm{cohesion}(w_a, w_b) = z \cdot \frac{f(w_a, w_b)}{\sqrt{f(w_a)\, f(w_b)}} \]

• Phrases are determined for each category (word order preserved)
• f(wa, wb): frequency of the two words co-occurring
• f(wa): independent word frequency of wa
• z: a constant (z = 2 in this research)
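The cohesion score can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the denominator is taken to be the square root of the product of the two word frequencies, "co-occurring" is taken to mean adjacent words in order, and the function name `cohesion_scores` and the toy data are hypothetical, not from the slides.

```python
from collections import Counter
from math import sqrt


def cohesion_scores(documents, z=2.0):
    """Score ordered word pairs as z * f(wa, wb) / sqrt(f(wa) * f(wb)).

    Assumption: a pair "co-occurs" when wb immediately follows wa.
    """
    word_freq = Counter()
    pair_freq = Counter()
    for words in documents:
        word_freq.update(words)
        # zip(words, words[1:]) yields adjacent pairs, preserving word order.
        pair_freq.update(zip(words, words[1:]))
    return {
        (wa, wb): z * f_ab / sqrt(word_freq[wa] * word_freq[wb])
        for (wa, wb), f_ab in pair_freq.items()
    }


# Toy data (hypothetical), one token list per document.
docs = [["STOMACH", "PAIN"], ["STOMACH", "PAIN"], ["CHEST", "PAIN"]]
scores = cohesion_scores(docs)
```

Pairs whose score exceeds a chosen threshold would then be kept as the cohesive phrases of a category; the threshold itself is not given on the slide.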

Page 24: Term Construction

Page 25: Term Encoding Strategies

Terms can be constructed in different ways:

No Spatial Information (NSI)
• Pattern: *C1*C2*
• BLOOD -> *L*D*

Exact Spatial Information (ESI)
• Pattern: xC1yC2z
• BLOOD -> 1L2D0

Approximate Spatial Information (ASI)
• Pattern: x_{a-b} C1 y_{c-d} C2 z_{e-f}
• Encode a range of lengths to a code:
  Code 0: indicates no characters
  Code 1: indicates 1 or 2 characters
  Code 2: indicates at least 3 characters
• BLOOD -> 1L1D0
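The three encodings differ only in how the gaps around the recognized characters are written, which a short Python sketch can make concrete. The function name `encode_term` and its interface (a word plus the sorted indices of the confidently recognized characters) are hypothetical choices for illustration, not the authors' implementation.

```python
def encode_term(word, positions, strategy):
    """Encode the characters of `word` at sorted index `positions` as a term.

    strategy: "NSI" (wildcards), "ESI" (exact gap lengths) or
    "ASI" (gap lengths bucketed into codes 0, 1, 2).
    """
    def gap_code(length):
        if strategy == "NSI":
            return "*"
        if strategy == "ESI":
            return str(length)
        if strategy == "ASI":
            # Code 0: no characters, code 1: 1 or 2, code 2: at least 3.
            return "0" if length == 0 else "1" if length <= 2 else "2"
        raise ValueError(f"unknown strategy: {strategy}")

    parts = []
    previous = -1
    for pos in positions:
        parts.append(gap_code(pos - previous - 1))  # gap before this character
        parts.append(word[pos])
        previous = pos
    parts.append(gap_code(len(word) - previous - 1))  # trailing gap
    return "".join(parts)


# The BLOOD example, with L (index 1) and D (index 4) recognized.
nsi = encode_term("BLOOD", [1, 4], "NSI")
esi = encode_term("BLOOD", [1, 4], "ESI")
asi = encode_term("BLOOD", [1, 4], "ASI")
```

With these inputs the sketch yields `*L*D*`, `1L2D0` and `1L1D0`, matching the three BLOOD examples on the slide.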

Page 26: Combinatorial Analysis (1/2): Equation

\[ C(n) = \left( \sum_{i=1}^{n-1} (n - i) \right) + n = \left( \frac{n}{2} \right)(n - 1) + n \]

Input: Word of length n s.t. n is at least 1

Output: Quantity of terms to be generated

\[ P(a, b) = C(a) \cdot C(b) \]

Input: Two word phrase a and b

Output: Quantity of terms to be generated

Page 27: Combinatorial Analysis (2/2): Example

• Let the phrase to evaluate uni/bi-gram combinations be “PULMONARY DISEASE”
• Let n = length(“PULMONARY”) = 9
• Let m = length(“DISEASE”) = 7
• C(n) = 45 uni-gram + bi-gram combinations for “PULMONARY”
• C(m) = 28 uni-gram + bi-gram combinations for “DISEASE”
• P(n, m) = 1,260 uni-gram + bi-gram phrase combinations for “PULMONARY DISEASE”
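One way to check these counts is to enumerate the selections directly. The sketch below assumes a uni-gram is any single character position and a bi-gram is any ordered pair of distinct positions (left to right); the function names are hypothetical.

```python
from itertools import combinations


def unigram_bigram_selections(word):
    """List every 1-character and 2-character position selection of `word`."""
    indices = range(len(word))
    unigrams = [(i,) for i in indices]
    bigrams = list(combinations(indices, 2))  # pairs i < j keep character order
    return unigrams + bigrams


def term_count(n):
    """Closed form C(n) = (n / 2)(n - 1) + n for a word of length n >= 1."""
    return n * (n - 1) // 2 + n


c_pulmonary = len(unigram_bigram_selections("PULMONARY"))
c_disease = len(unigram_bigram_selections("DISEASE"))
phrase_total = c_pulmonary * c_disease  # P(a, b) = C(a) * C(b)
```

The enumeration agrees with the closed form: 45 selections for the 9-letter word, 28 for the 7-letter word, and 1,260 for the phrase.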

Page 28: LDWR Character Extraction: Chest Pain/Pressure

• Phrases extracted from Circulatory/Cardiovascular-System category.

• Blue indicates successfully extracted top-choice character.

• Red indicates unsuccessfully extracted top-choice character.

Page 29: Term-Category Matrix Construction

• Terms on rows.

• Categories on columns.

• Terms are generated from the highly cohesive phrases under each category.

• The Term-Category matrix is large.

(Chu-Carroll, et al., 1999)

Page 30: Matrix Normalization

\[ B_{t,c} = \frac{A_{t,c}}{\sqrt{\sum_{e=1}^{n} A_{t,e}^{2}}} \]

• A is the input matrix containing raw frequencies.

• B is the output matrix with normalized frequencies.

• (t,c) is a (term, category) coordinate within a matrix.

(Chu-Carroll, et al., 1999)
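As a minimal sketch of this step, assuming the normalization divides each entry by the Euclidean length of its term row (so that every non-empty row of B has unit length), the matrix can be processed row by row. The function name `normalize_rows` and the list-of-lists representation are illustrative assumptions.

```python
from math import sqrt


def normalize_rows(A):
    """Return B with B[t][c] = A[t][c] / sqrt(sum over e of A[t][e] ** 2)."""
    B = []
    for row in A:
        norm = sqrt(sum(value * value for value in row))
        # An all-zero row has no direction; keep it as zeros to avoid dividing by 0.
        B.append([value / norm if norm else 0.0 for value in row])
    return B


# Toy raw-frequency matrix (hypothetical): 2 terms x 2 categories.
B = normalize_rows([[3.0, 4.0], [0.0, 0.0]])
```

Row normalization keeps very frequent terms from dominating the matrix purely because of their raw counts.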

Page 31: Matrix Weighting

\[ \mathrm{IDF}(t) = \log_2 \frac{n}{c(t)} \]

\[ X_{t,c} = \mathrm{IDF}(t) \cdot B_{t,c} \]

(Sparck Jones, 1972)(Chu-Carroll, et al., 1999)

The IDF is used to improve term discrimination ability.

IDF(t) computes IDF on term t

c(t) = number of categories containing term t

B is previous normalized/weighted matrix

Pair (t,c) represents a matrix coordinate

e.g. (bl, mt) occurred in 3 categories.
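The weighting can be sketched as follows, assuming n is the total number of categories (the number of matrix columns) and that a term "occurs" in a category when its entry is non-zero. The function name `idf_weight` and the toy matrix are hypothetical.

```python
from math import log2


def idf_weight(B):
    """Return X with X[t][c] = IDF(t) * B[t][c], where IDF(t) = log2(n / c(t))."""
    n = len(B[0])  # assumption: n is the number of categories (columns)
    X = []
    for row in B:
        c_t = sum(1 for value in row if value != 0)  # categories containing term t
        idf = log2(n / c_t) if c_t else 0.0
        X.append([idf * value for value in row])
    return X


# Toy normalized matrix (hypothetical): 2 terms x 4 categories.
X = idf_weight([[0.5, 0.5, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]])
```

A term that appears in every category gets IDF 0 and so contributes nothing, while a term confined to a few categories is amplified. Under the same assumption, a term occurring in 3 of the 24 truthed categories would receive a weight of log2(24 / 3) = 3.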

Page 33: Text Image Extraction

Binarization and post-processing

(R. Milewski, V. Govindaraju 2006)

Recognizer Confusion:
• abd: “a ? q”
• snt: “t”
• stable: “t”, “a”, “b”, “e ? c”
• pelvis: “e ? c”, “l”, “v ? u”

Page 34: Pseudo-Category Vector

• A vector Q of size (|T| x 1) is constructed.
• Each cell in the vector represents the frequency of the uni/bi-gram sequence generated from the LDWR.

(Chu-Carroll, et al., 1999)
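Building Q amounts to counting, for every term in the training vocabulary T, how often it was generated from the recognizer output. A minimal sketch, assuming terms are plain strings and that terms outside the vocabulary are simply ignored (the function name and sample terms are hypothetical):

```python
from collections import Counter


def pseudo_category_vector(vocabulary, observed_terms):
    """Return Q as a list of length |T|: frequency of each vocabulary term."""
    counts = Counter(observed_terms)
    # Terms not in the vocabulary have no row in the matrix and are dropped.
    return [counts[term] for term in vocabulary]


vocabulary = ["*B*L*", "*L*D*", "*O*"]          # rows of the term-category matrix
observed = ["*L*D*", "*O*", "*O*", "*X*"]       # terms generated from LDWR output
Q = pseudo_category_vector(vocabulary, observed)
```

The ordering of `vocabulary` must match the row ordering of the term-category matrix so that Q lines up with it.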

Page 35: Vector Merger

• Q is merged with the original matrix X.

• The purpose is to produce a vector Vq which represents Q in k-dimensional space.

(Chu-Carroll, et al., 1999)

Page 36: SVD
(Deerwester et al., 1990) (Chu-Carroll, et al., 1999)

\[ X = U \cdot S \cdot V^{T} \]

• X is a matrix which is decomposed into 3 matrices
• U is a T x k matrix representing term vectors
• S is a k x k matrix
• V^T is a k x C matrix representing category vectors
• The correctly scaled pseudo-category vector is now produced by computing Vq • Sr

Page 37: Category Querying

• Vector x: correctly scaled pseudo-category
• Vector y: a single category vector in Vr • Sr
• The value z is the score between x and each y.
• z is the ‘closeness’ of the LDWR data to a category

\[ z = \cos(x, y) = \frac{x \cdot y^{T}}{\sqrt{\sum_{i=1}^{n} x_i^{2} \cdot \sum_{i=1}^{n} y_i^{2}}} \]

(Chu-Carroll, et al., 1999)

(Simplified Dimensional Space)
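The query step is a cosine comparison between the scaled pseudo-category vector and each category vector. The sketch below uses plain Python lists; the function names and the toy category vectors are hypothetical, and returning 0 for a zero-length vector is an added convention rather than something stated on the slide.

```python
from math import sqrt


def cosine(x, y):
    """z = (x . y) / sqrt(sum(x_i^2) * sum(y_i^2))."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return dot / denom if denom else 0.0


def rank_categories(x, category_vectors):
    """Return (category, z) pairs sorted from closest to farthest."""
    scored = [(name, cosine(x, y)) for name, y in category_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional example (hypothetical values).
categories = {"Digestive": [1.0, 0.0], "Pelvis": [0.0, 1.0]}
ranking = rank_categories([2.0, 0.5], categories)
```

Because cosine ignores vector length, a form with many recognized characters and a form with few are compared on direction alone.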

Page 38: Sigmoid Fitting
(Chu-Carroll, et al., 1999)

• Lucent Technologies showed that the use of a sigmoid function instead of the cosine score can reduce error rate.
• The cosine score is mapped to a sigmoid function.
• Least Squares Regression line is computed as follows:

\[ a = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^{2} - \left( \sum_{i=1}^{n} x_i \right)^{2}} \qquad b = \frac{1}{n} \left( \sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i \right) \]

• The final confidence score is produced by mapping the regression line to the sigmoid function:

\[ \mathrm{confidence}(a, b, z) = \frac{1}{1 + e^{-(az + b)}} \]
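The two steps, fitting the least squares line and passing a cosine score through the sigmoid, can be sketched as below. The slide does not say what the training pairs (x_i, y_i) are, so the sample points here are arbitrary and the function names are hypothetical.

```python
from math import exp


def fit_line(xs, ys):
    """Least squares slope a and intercept b for the points (x_i, y_i)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b


def confidence(a, b, z):
    """Map the cosine score z through the sigmoid 1 / (1 + e^-(a z + b))."""
    return 1.0 / (1.0 + exp(-(a * z + b)))


# Arbitrary points lying on y = 2x + 1, to make the fit easy to verify.
a, b = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
score = confidence(a, b, -0.5)  # a*z + b = 0, the midpoint of the sigmoid
```

Note that `fit_line` needs at least two distinct x values; otherwise the denominator of a is zero.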

Page 40: Lexicon Organization

• UNIQUE LEXICON SIZE FROM TRAINING SET: 5,628
• TOTAL WORDS ACROSS LEXICONS (OVERLAPPING): 8,156
• AVERAGE CATEGORIES PER FORM: 1.4
• MAXIMUM CATEGORIES PER FORM: 5
• TOTAL TRAINING SET PCRs: 750
• TOTAL TESTING SET PCRs: 62
• TOTAL TRAINING SET WORDS: 42,226
• TOTAL TEST SET WORDS: 3,089

Page 41: Handwriting Recognition (HR) Rates

      CL      CLT     AL      ALT     SL      SLT     RL      RLT
ACC   76.34%  76.92%  63.52%  66.59%  70.51%  71.51%  70.70%  71.06%
ERR   71.93%  69.95%  57.24%  47.12%  62.26%  59.44%  62.04%  59.45%
RAW   23.31%  25.32%  32.31%  41.73%  30.30%  32.73%  30.62%  32.63%
LS    5,628   8,156   1,193   1,246   2,514   2,620   2,401   2,463
!L    n/a     n/a     23.89%  8.02%   16.06%  10.46%  16.61%  12.23%
!HL   n/a     n/a     33.33%  97.98%  48.19%  73.99%  46.59%  62.96%

ACC: accept rate

ERR: error rate

RAW: raw recognition rate

LS: lexicon size for experiment

!L: percentage of truther word not in reduced lexicon

!HL: percentage from !L set in which a human could not decipher the word

CL: complete lexicon

AL: category oracle machine

SL: synthetic term generation

RL: reduced lexicon approach

“-T” indicates test deck lexicon inserted into reduced lexicon.

Page 42: HR Interpretation

HR Performance based on Lexicon Reduction

             CLT to RLT   CL to RL   CLT to ALT   CLT to SLT
HR           ↑7.48%       ↑7.42%     ↑17.58%      ↑7.42%
Error Rate   ↓10.78%      ↓10.88%    ↓24.53%      ↓10.21%

Lexicon Reduction Performance

METRIC                           VALUE         NOTE
Reduction Accuracy (α)           0.33          Word present after reduction
Reduction Degree (ρ)             0.83          Words removed from lexicon
Reduction Efficacy (η)           0.06          Effectiveness rating
Lexicon Density (words)          1.07 → 0.87   Lexicon becomes less confusing
Lexicon Density (uni/bi-grams)   0.50 → 0.78   N-grams are more common

\[ \eta = \Delta_{LDWR} \times \alpha^{(1 - \rho)} \]
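As a rough arithmetic check of the efficacy value, the sketch below reads the formula as η = ΔLDWR × α raised to the power (1 − ρ) and takes ΔLDWR to be the CL to RL improvement in HR (7.42%, written as 0.0742). Both the exponent reading and that choice of ΔLDWR are assumptions made for this illustration; the slide does not state them explicitly.

```python
def reduction_efficacy(delta_ldwr, alpha, rho):
    """Assumed form: eta = delta_LDWR * alpha ** (1 - rho)."""
    return delta_ldwr * alpha ** (1 - rho)


# alpha and rho are from the metric table; delta_ldwr is an assumed reading
# of the CL to RL improvement in HR.
eta = reduction_efficacy(delta_ldwr=0.0742, alpha=0.33, rho=0.83)
eta_rounded = round(eta, 2)
```

Under these assumptions the result rounds to 0.06, which agrees with the Reduction Efficacy listed in the metric table.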

Page 43: Retrieval Performance: Cohesive Phrase Search

Querying 800 forms with 1,250 cohesive phrases from isolated deck of 200 forms:
• CL returns 0 forms 73% of the time
• RL returns 0 forms 23% of the time

Page 44: Retrieval Performance: Query Expansion Search

• Cohesive phrase is decomposed into uni/bi-gram components
• Search performed by matching only uni/bi-gram components
• Poor performance

Page 46: Health Surveillance

Page 47: Contributions

• The first binarization and post-processing strategy on carbon forms.
• The first application of recognition of handwritten medical forms.
• The first search engine using handwritten forms.
• Construction of the first data set of actual handwritten emergency medical documents for use in document analysis research.
• A paradigm showing a mapping from character encodings to a topic categorization used for lexicon reduction. This strategy is reusable for other lexicon-driven handwriting recognizers that are based on character segmentation.
• New metrics for measuring the performance of lexicon reduction systems.
• Compatibility with standard information protocols used by Health Level 7 (HL7) and the Centers for Disease Control and Prevention (CDC).
• A framework for an automated, centralized and secure health surveillance network.
• An advanced software system with diverse visual interfaces and command-line execution modes.

