    G-141/3.36.06

    Environmental Systems Research Institute, Inc. (909) 793-2853

    380 New York Street, Redlands, CA 92373 Fax (909) 793-5953

    Telex 910 332 1317

    Scanning Data Entry Solutions for ARC/INFO GIS

    An ESRI White Paper

    Contents

    Executive Summary

    Evaluating Scanning Data Entry

    ESRI Scanning Data Entry Solutions: A GIS Focus

    Glossary

    Copyright 1995 Environmental Systems Research Institute, Inc. All rights reserved. Printed in the United States of America.

    The information contained in this document is the exclusive property of Environmental Systems Research Institute, Inc. This work is protected under United States copyright law and other international copyright treaties and conventions. ESRI grants the recipient of the ESRI information contained herein the right to freely reproduce, redistribute, rebroadcast, and/or retransmit this information for personal, noncommercial purposes including, teaching, classroom use, scholarship, and/or research, subject to the fair use rights enumerated in Section 107 and 108 of the Copyright Act (Title 17 of the United States Code). No part of this work may be reproduced or transmitted for commercial purposes in any form or by any means, electronic or mechanical, including photocopying and recording, or by any information storage or retrieval system, except as expressly permitted in writing by Environmental Systems Research Institute, Inc. All requests should be sent to Environmental Systems Research Institute, Inc., 380 New York Street, Redlands, CA 92373 USA, Attention: Contracts Manager.

    The information contained in this document is subject to change without notice.

    RESTRICTED RIGHTS LEGEND

    Use, duplication, and disclosure by the government are subject to restrictions as set forth in FAR 52.227-14 Alternate III (g)(3) (JUN 1987), FAR 52.227-19 (JUN 1987), or DFARS 252.227-7013 (c)(1)(ii) (OCT 1988), as applicable. Contractor/Manufacturer is Environmental Systems Research Institute, Inc., 380 New York Street, Redlands, CA 92373 USA.

    ESRI, ARC/INFO, PC ARC/INFO, ArcView, and ArcCAD are registered trademarks; ARC COGO, ARC NETWORK, ARC TIN, ARC GRID, ARC/INFO LIBRARIAN, ARCPLOT, ARCEDIT, TABLES, Application Development Framework (ADF), ARC Macro Language (AML), Avenue, FormEdit, ArcSdl, ArcBrowser, ArcDoc, ARCLine, ARCSHELL, IMAGE INTEGRATOR, DATABASE INTEGRATOR, DBI Kit, WorkStation ARC/INFO, ArcTools, ArcStorm, ArcScan, ArcExpress, ArcPress, Mapplets, SPATIAL DATABASE ENGINE (SDE), PC ARCEDIT, PC ARCPLOT, PC DATA CONVERSION, PC NETWORK, PC OVERLAY, PC STARTER KIT, PC ARCSHELL, Simple Macro Language (SML), ArcUSA, ArcWorld, ArcScene, ArcCensus, ArcCity, the ESRI corporate logo, the ESRI globe logo, the ARC/INFO logo, the ARC COGO logo, the ARC NETWORK logo, the ARC TIN logo, the ARC GRID logo, the ARCPLOT logo, the ARCEDIT logo, the Avenue logo, the ArcTools logo, the ArcStorm logo, the ArcScan logo, the ArcExpress logo, the PC ARC/INFO logo, the ArcView logo, the ArcCAD logo, the ArcData logo, ARCware, ARC News, ArcSchool, ESRI-Team GIS, ESRI-The GIS People, GIS by ESRI, ARC/INFO-The World's GIS, Geographic User Interface (GUI), Geographic User System (GUS), Your Personal Geographic Information System, and Geographic Table of Contents (GTC) are trademarks; and the ArcData Publishing Program, ARCMAIL, ArcQuest, ArcWeb, and Rent-a-Tech are service marks of Environmental Systems Research Institute, Inc.

    The names of other companies and products herein are trademarks or registered trademarks of their respective trademark owners.

    Executive Summary

    March 1994

    [Figure: Overview of Scanning Data Entry Pathways. Air photos and maps are scanned into a raster database (georeference, raster edit, tiling, raster-to-raster conversion). Raster data are vectorized into an ARC/INFO coverage database by raster-to-vector conversion or heads-up digitizing, and the database can be merged with existing vector data, COGO data, and CAD data.]

    Costs and Benefits of Scanned Data Entry

    Cost analysis of GIS projects shows that database automation often

    accounts for more than 75 percent of the total project expense.

    Scanning data entry is a viable cost-reduction alternative for this most

    expensive GIS component: data automation.

    The cost of scanning data entry has decreased dramatically since 1990.

    Even agencies with limited budgets and relatively small data automation

    projects are discovering that a scanning data entry strategy makes

    good economic sense. Database automation strategies based on

    scanning are cost-competitive with, and in some cases can be

    significantly less expensive than, other methods.

    Scanning technology is no longer the data capture solution of the

    distant future, but is quickly becoming the preferred method of data

    capture for GIS. Data automation methods within ESRI's Database

    Development Group have shifted dramatically in recent years to

    scanning data entry as the preferred solution.

    Is Scanning Data Entry Appropriate for Your Project?

    Scanning data entry provides many advantages to GIS users.

    Scanning is generally faster and a great deal more accurate than table

    digitizing. Scanned data that are subsequently vectorized have more

    consistent coordinate placement than data entered through manual

    digitizing. Scanning data entry methods can be easily learned and used by existing staff. Scanning data entry can be used to generate

    application-specific data that are not commercially available. Recent

    technical advances in hardware and software have tailored scanning

    data entry capabilities specifically for GIS requirements.

    GIS users should carefully evaluate scanning data entry in the context

    of project requirements, which can vary greatly. Factors to consider

    include data sources and their availability, map quality, update

    frequency, data volume, accuracy requirements, and system capacity.

    As you consider various data automation options, the specific requirements of your project will guide your analysis of the alternatives. For

    example, if your GIS application uses street centerline data with

    address ranges, you may find standard "off-the-shelf" data from a

    commercial data vendor a good solution. But commercial data may

    not exist for all of your organization's needs. For example, a data

    layer such as property boundaries (i.e., land parcels) may not be

    commercially available.

    When you prefer to do data automation in-house, scanning data entry

    takes no more time than table digitizing and offers better coordinate

    accuracy and consistency without the random errors often associated

    with manual methods. If you have an existing digital database and

    need to perform only a low volume of intermittent updates, table

    digitizing these updates may be appropriate. Even so, incremental

    updates from scanned documents are a feasible alternative.

    The available data sources will also influence the feasibility of

    scanning data entry. The scanning data entry alternative is most

    appropriate when the data do not already exist in digital form but do

    exist in document form. Scanning data entry does require a source

    document of some kind. If these documents are of poor quality,

    scanning data entry can be effective but will require more operator data

    cleanup. Scanning data entry is most useful and cost-effective when a

    high-quality data source (e.g., maps or air photos) is available.

    Feature layers on separate documents reduce processing requirements.

    Some applications may be able to use raster data obtained without

    document scanning (for example, satellite or airborne scanner data provided on digital media). Scanning data entry technology offers

    software tools that take advantage of commercially available raster

    data.

    When planning scanning data entry projects, it is very important to

    spend time assessing your needs before implementing a solution.

    Accuracy requirements and the characteristics of your source

    documents will determine the most appropriate hardware and software

    package. For example, the need for raster integration may require

    additional disk storage or a more powerful CPU. If you work mainly

    with air photos and do not have stringent accuracy constraints, you should consider heads-up digitizing as a scanning data entry option.

    If you have good digital geodetic control and a complete series of

    plats, raster-to-vector conversion using that control can be an effective

    strategy. Used for appropriate applications and implemented

    correctly, scanning can be the most cost-effective and efficient method

    for capturing your data.

    ESRI Solutions for Scanning Data Entry

    ESRI has used scanning data entry technology for many years. The

    ESRI Database Automation Group has adopted scanning data entry as

    a preferred methodology. ESRI software has supported raster data

    sets for many years, and a new ESRI software product called

    ArcScan focuses specifically on providing scanning data entry

    software tools. ArcScan is closely integrated with the rest of

    ARC/INFO in a single software environment. Thus, all the

    functionality of ARC/INFO can be combined with specialized

    software for scanning data entry. ESRI can provide turnkey scanning

    data entry systems through reseller agreements with industry-leading

    hardware vendors. Many other companies have joined ESRI's open

    systems approach and offer complementary capabilities that work with

    the ARC/INFO data structures and user interface. ESRI scanning

    data entry solutions are affordable, easy to use, and integrated with

    ESRI's advanced GIS data management technology. ESRI scanning

    data entry solutions provide a clear and effective alternative for data

    automation.

    About This White Paper

    Scanning data entry technology offers a variety of tools for manipulating raster data and for converting raster data to vector data. A thorough evaluation of project needs and available data sources will

    determine the tool or tools that work best for you. The next section

    presents information to help you evaluate scanning data entry. ESRI's

    rich toolset for scanning data entry and integrated data editing/data

    management technology are described in the last section. A glossary

    provides definitions for many of the specialized terms used in

    scanning data entry.

    Evaluating Scanning Data Entry

    This section is designed to help you evaluate scanning data entry. Careful evaluation is a key factor in ensuring your

    success. If you take the time to think it through and

    evaluate the options and trade-offs carefully, your project will benefit greatly. Evaluation considerations should

    include the data available, the hardware and software, and

    the methods or procedures.

    Evaluating Data Sources for Scanning Data Entry

    Understanding data sources is crucial to understanding how scanning

    data entry can be used in your GIS. Scanning solutions are as diverse

    as the data used in the GIS applications they support, because

    scanning and vectorization requirements are determined by the data.

    Two main categories of data are used with scanned data entry projects.

    The first is paper or Mylar maps, containing line art, that are scanned

    into bi-tonal (black-and-white) raster data sets, or images. Scanned

    maps in raster format are usually converted to vectors using

    raster-to-vector conversion programs. The second category of scanned

    documents in wide use is aerial photographs, typically black and

    white, that are scanned into a grayscale image. Scanned photos are

    not conducive to automated raster-to-vector conversion techniques and

    are often used in raster format. Scanned photos are often used in

    vector conversions as visual background for heads-up digitizing.

    While both types of documents lend themselves to scanning data entry, each has different processing requirements. Scanned data entry

    can build a GIS database using any or all of the following data

    sources:

    Line Work Maps

    Low-quality 36 x 44 inch (E format) maps. Ranging from antique linen maps, to CAD plotter output, to blue lines, to as-builts,

    the overwhelming majority of hard-copy maps are low

    quality. Quality, in the scanning sense, refers to the quality of the

    media itself and the problems the media presents to the scanning

    and vectorization process, not to the informational quality of the

    data on the document. This is the most common data source

    category for scanned data entry applications.

    High-quality E format maps. Typically, these are Mylar

    maps with multiple data layers. They are more frequently found in

    larger, rather than smaller, organizations and public agencies.

    Media quality is high; for example, the lines on the media are

    clear and crisp and the media will not have extraneous marks or

    "noise." This type of data can have clutter, in the form of

    annotation or unwanted data layers, which can complicate the

    vectorization process.

    High-quality, single-layer E format maps. Mylar

    separates, such as topographic contours, are often available in this

    form. Soil maps, separates from a production map series, and

    specially prepared Mylars can fall in this category. Cadastral maps

    on Mylar fit in this category if they have only one data layer (i.e., only parcels). This type of bi-tonal, line art map data source is the

    most amenable to scanning data entry because its high quality

    requires the least pre- or post-processing. Hallmarks of

    high-quality documents for scanning are a single data layer, absence of

    clutter, presence of registration tics (georeference marks), and

    clean, high-contrast media.

    11 x 17 (B format) plat maps. Plat maps in this format are

    found throughout the United States. Map quality will vary.

    [Figure: Line work can contain gaps. In this case, the gaps are caused by the line symbology used to represent intermittent streams. Gaps can also be caused by low-quality data, resulting in noisy scanned output. The ArcScan tracing tool can "jump" gaps to create continuous vector output. Gap jumping parameters can be modified to suit data requirements.]
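
The idea of jumping gaps under a tunable parameter can be illustrated with a minimal sketch. This is not ArcScan's tracing algorithm, which the paper does not describe; the function name `jump_gaps` and the parameter `max_gap` are hypothetical, and the example works on a single row of bi-tonal pixels only.

```python
def jump_gaps(row, max_gap):
    """Return a copy of a bi-tonal pixel row with short gaps filled.

    row is a list of 0 (background) and 1 (line) values. A run of
    background pixels between two line pixels is filled only when it
    is no longer than max_gap (a hypothetical tuning parameter).
    """
    filled = list(row)
    last_on = None  # index of the most recent line pixel seen
    for i, pixel in enumerate(row):
        if pixel:
            if last_on is not None:
                gap = i - last_on - 1
                if 0 < gap <= max_gap:
                    for j in range(last_on + 1, i):
                        filled[j] = 1
            last_on = i
    return filled


# A dashed line: a 2-pixel gap, then a 5-pixel gap.
dashed = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1]
print(jump_gaps(dashed, max_gap=2))
print(jump_gaps(dashed, max_gap=5))
```

With a small `max_gap` only the short break is bridged; raising it also bridges the long break, which is the kind of trade-off a gap-jumping parameter would control.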

    Photographs and Digital Imagery

    9 x 9 aerial photos. This source data category is widely used

    with scanning data entry. Aerial photos are most often black and

    white, with some in color when budget allows. The most common

    use of scanned photography is to serve as a visual backdrop or

    "reality check" to other data. Heads-up digitizing uses scanned photographic images to guide operator coordinate capture by

    "tracing" features from a display screen. Air photos are often the

    most up-to-date data source. You can usually scan air photos at a

    much lower resolution than line art. This can help compensate for

    the greater storage requirements of grayscale images. Visual

    display has less stringent resolution requirements than

    raster-to-vector conversion.

    Much photography being scanned today is uncorrected.

    Uncorrected photography, while relatively inexpensive and useful in

    many applications, should be used with care. Even though vector data derived from uncorrected photography will overlay properly on

    its source image, it is only accurate relative to its source data.

    However, vector data converted from uncorrected photo images are

    likely to misregister with data from other sources. In addition,

    uncorrected photographic images may not merge well with adjacent

    images, and measurements made on uncorrected images will be

    incorrect.

    Large format photography. Large format photos (e.g.,

    24 x 30 or 30 x 30) are available in a variety of scales. 1:24,000

    orthophoto quads, usually black and white, have been produced

    for large areas (e.g., statewide coverage). Orthocorrection can be

    applied photographically or digitally. That is, orthophotos can be

    scanned directly and used without need of coordinate correction,

    or uncorrected stereophotos can be scanned and then

    orthocorrected in digital form. The output of digital orthophotos

    can be at any scale appropriate for the resolution of the image.

    Other image data from airborne scanners or satellites.

    LANDSAT and SPOT images are commercially available (e.g.,

    through the ArcData(SM) program) and typically offer ten- to

    thirty-meter resolution. Airborne scanners linked with GPS receivers can

    produce images with much greater resolution (pixel resolution of

    less than one foot, for example).

    Digital Data Concepts

    The raster data format is a cellular data format well suited for storing images or maps. A raster data set is like a carpet of cells overlaying the map where each cell has a value representing the corresponding

    value beneath it in the map. For example, a raster data set of a

    scanned map will have pixel values that correspond to the brightness

    of the light reflected from the map.

    [Figure: How a polygon, line, and point feature would be stored in an x,y coordinate system (vector, with x-axis and y-axis) and a row, column system (raster, with rows and columns).]
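
The difference between the two storage models can be sketched in a few lines. The following is an illustrative example only: the function name `rasterize_points`, the cell size, and the grid dimensions are assumptions, and the sketch tags only the cells that contain a vertex rather than every cell crossed by a line segment.

```python
def rasterize_points(points, cell_size, n_rows, n_cols):
    """Tag the raster cells that contain the given x,y coordinates.

    Row 0 is the top of the map, so y values are flipped when they
    are converted to row numbers.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int(x // cell_size)
        row = n_rows - 1 - int(y // cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = 1
    return grid


# Vector form: the feature is a list of x,y coordinates along the line.
line_vertices = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (3.5, 3.5)]

# Raster form: the same feature as tagged cells in a row, column grid.
grid = rasterize_points(line_vertices, cell_size=1.0, n_rows=4, n_cols=4)
for grid_row in grid:
    print(grid_row)
```

The vector list keeps exact coordinates, while the grid records only which cells the feature touches at the chosen cell size.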

    A raster data set can be bi-tonal, grayscale, or color (often satellite

    images are displayed as false color images). Raster data can be directly useful in a GIS. For example, a scanned air photo can be

    used as a backdrop to other infrastructure data, such as roads or

    sewers, using ARC/INFO IMAGE INTEGRATOR capabilities.

    The ARC/INFO GRID extension uses the raster data format for

    complex spatial analysis. Raster data sets tend to be large (in the

    multi-megabyte range) and have special data processing needs.

    [Figure: Raster data file layout. A header record is followed by the pixel data, which run to the end of the file.]

    Raster data can be organized in a number of ways depending upon the

    particular raster format. Typically, the raster data file contains a

    header record that stores information about the data such as the

    number of rows and columns, the number of bits per pixel, the color requirements, and the georeferencing information. Following the

    raster header is the actual pixel data for the image. The internal

    organization of the raster data is dependent upon the raster format.

    Some formats contain only a single band of data, while others contain

    multiple bands.
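
The header-then-pixels layout described above can be sketched with a deliberately simplified, hypothetical file format. The 10-byte header below (rows, columns, bits per pixel) is an assumption made for illustration; it is not the layout of RLC, TIFF, or any other real format, and it omits color and georeferencing fields.

```python
import struct

# Hypothetical header: rows (uint32), columns (uint32), bits per pixel (uint16).
HEADER_FORMAT = "<IIH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def write_raster(rows, cols, bits_per_pixel, pixels):
    """Pack a header record followed by single-band, 8-bit pixel data."""
    header = struct.pack(HEADER_FORMAT, rows, cols, bits_per_pixel)
    return header + bytes(pixels)


def read_raster(data):
    """Read the header first, then use it to interpret the pixel data."""
    rows, cols, bpp = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    pixels = list(data[HEADER_SIZE:])
    grid = [pixels[r * cols:(r + 1) * cols] for r in range(rows)]
    return {"rows": rows, "cols": cols, "bits_per_pixel": bpp}, grid


data = write_raster(2, 3, 8, [0, 255, 0, 255, 0, 255])
header, grid = read_raster(data)
print(header)
print(grid)
```

The reader cannot interpret the pixel bytes without the header, which is why the header record comes first in the file.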

    When planning your scanning data entry project, you should give

    special attention to input resolution. Input resolution is the number of

    pixels per inch in both x and y dimensions of the digital snapshot.

    Most scanners allow some control over input resolution. In general,

    you should try to reduce data storage requirements by choosing the lowest resolution that will cleanly capture your data. Some

    experimentation will be required as you "fine tune" your scanning data

    entry methods.

    The input resolutions shown in Table 1 are typical, but not absolute

    resolutions. Resolution is expressed in dots per inch (dpi). Doubling

    resolution (e.g., from 400 dpi to 800 dpi) can have the effect of

    quadrupling data set size (compressed formats will show less increase

    in size). Scanning data at a resolution greater than that required by the

    source document will only increase data storage requirements with no

    appreciable improvement in data quality. Unneeded input resolution can even create processing problems by exaggerating errors in

    poor-quality data (e.g., additional white noise).
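
The quadrupling effect follows from simple arithmetic: pixel count is the product of the x and y pixel dimensions, so doubling the resolution doubles both. A minimal sketch, assuming uncompressed storage with no header or row padding (the function name `raster_size_bytes` is illustrative):

```python
def raster_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed raster size for a document scanned at a given dpi."""
    cols = int(width_in * dpi)
    rows = int(height_in * dpi)
    return rows * cols * bits_per_pixel // 8


# A 30 x 30 inch photo at 200 dpi, 8-bit grayscale.
print(raster_size_bytes(30, 30, 200, 8))

# A 36 x 44 inch bi-tonal map at 400 dpi versus 800 dpi.
size_400 = raster_size_bytes(36, 44, 400, 1)
size_800 = raster_size_bytes(36, 44, 800, 1)
print(size_400, size_800, size_800 // size_400)
```

The first result, 36,000,000 bytes, is consistent with the 36 megabytes listed in Table 1 for a large format aerial photo; the bi-tonal map sizes are uncompressed and therefore larger than the compressed figures in the table.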

    TABLE 1

    Raster Data Sources

    | Type of Source Data | Typical Data Example | Typical Use in Scanning Data Entry | Typical Scanner (Input) Resolution | Raster Data Format | Typical Raster Data Size |
    |---|---|---|---|---|---|
    | Low quality 36 x 44 inch (E format) map | As-builts, blue lines | Raster-to-vector conversion using interactive techniques such as raster cleanup and line following | 400 dpi | RLC or GRID bi-tonal (compressed) | 6 megabytes |
    | High quality E format map | Contours on Mylar separates | Raster-to-vector conversion using interactive (multiple data layers or clutter) or batch (single data layer) | 400 to 800 dpi (depending on data type and quality) | RLC or GRID bi-tonal (compressed) | 6 to 30 megabytes |
    | 11 x 17 inch (B format) map | Plat map | Raster-to-vector conversion | 400 to 500 dpi; resolution varies by source data quality | RLC or GRID bi-tonal (compressed) | 1 to 4 megabytes |
    | 9 x 9 inch aerial photo (black and white) | Standard air photo | Visual backdrop for other data entry methods such as COGO or heads-up digitizing. Orthophoto production | 200 dpi | TIFF (uncompressed) | 4 megabytes |
    | 9 x 9 inch aerial photo (color) | Standard air photo (color) | Visual backdrop for other data entry methods such as COGO or heads-up digitizing. Orthophoto production | 200 dpi | TIFF (uncompressed) | 4 megabytes (8 bit color), 12 megabytes (24 bit color) |
    | Large format aerial photo (30 x 30 inch) | Orthophoto | Visual backdrop for other data entry methods such as COGO or heads-up digitizing. Orthophoto production | 200 dpi | TIFF (uncompressed) | 36 megabytes |
    | Three band satellite image | EOSAT or SPOT Image data | Visual backdrop for heads-up digitizing of major roads | Purchased in raster format | Band interleaved, etc. (uncompressed) | Varies with type of data and area of coverage, 2 to 600 megabytes |

    Raster data can be compressed. That is, you can use a data storage

    scheme to reduce the amount of disk space required to store the data.

    Bi-tonal raster data can be compressed to a greater degree than

    grayscale or color data because the cell values of bi-tonal data can be

    represented with a single bit: either black or white, on or off, data or

    no data. Grayscale and color raster data can also be compressed, but

    with lesser compression ratios and at a higher processing cost to

    support decompression. When raster data are compressed for storage,

    they must be decompressed for display and other operations.
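
Why bi-tonal data compress so well can be seen with a generic run-length scheme. The sketch below is an illustration of the general technique under simple assumptions; it is not the RLC file format, whose internal layout this paper does not specify, and the function names are hypothetical.

```python
def rle_encode(row):
    """Collapse a pixel row into (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([pixel, 1])  # start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into a pixel row."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


# A scan line crossing one thin map line: mostly background.
scan_line = [0] * 10 + [1] * 3 + [0] * 7
encoded = rle_encode(scan_line)
print(encoded)
print(rle_decode(encoded) == scan_line)
```

Twenty pixels reduce to three runs here. Grayscale and color rows have many more value changes, so the same scheme yields shorter runs and a lower compression ratio, and the decode step is the processing cost paid before display.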

    Vector data are in a format that represents map features with the x,y

    coordinates of the features. Where a raster data set would represent a

    feature by tagging all the cells that overlay the feature, a vector data set

    would represent the feature by listing the coordinates of points along

    it. Many GIS applications, such as parcel maintenance, demographic

    analysis, or vehicle routing, require data in a vector format.

    Raster data are unsuitable for these applications because, although the

    raster and vector data may look the same displayed on a screen, raster

    data have very different characteristics. Scanned raster data are simply

    a "digital snapshot" of the source document: a scanned map has

    pictorial information and limited connectivity to other data. ARC/INFO

    georelational vector data, on the other hand, maintain the internal

    spatial relationships of the features they represent and have far more information than a simple picture. In addition, georelational vector data

    have strong connections to other related data, such as tables stored in a

    relational database management system (RDBMS).

    [Figure: The Georelational Model]

    Raster data can also be used to yield a vector representation of the

    same map data. This process is called raster-to-vector conversion and

    is a main topic of this white paper. Only recently, however, has

    raster-to-vector conversion technology become affordable, reliable,

    and widely available. These advances in both hardware and software

    have made scanning data entry a feasible alternative.

    Evaluating Computer Hardware for Scanning Data Entry

    Scanning data entry has special hardware requirements. These

    hardware capabilities support raster data characteristics such as large

    data set size and cellular format. A scanning data entry system can

    include several hardware components:

    The Scanner

    Scanners to input hard copy paper maps or photos. Scanners take a "digital snapshot" of the source material and store this raster data on

    disk. Advanced scanners offer on-screen graphical user interface

    (GUI) control software to enhance ease of use. Scanners are available

    at a variety of output resolutions, support a variety of media sizes, and

    can output black-and-white, grayscale, and color images. Scanners

    can output scanned images directly to workstation secondary storage

    via a high bandwidth interface (e.g., SCSI). Scanners output data in standard raster data formats such as RLC (for bi-tonal data) or TIFF

    (for grayscale data).

    A scanner is a device with a mechanical document feed that is set

    above a row of cameras. The document feed can either be continuous,

    as in a drum scanner, or direct feed, where the document feeds in the

    front and comes out the back of the device. A light source and a glass

    window are between the cameras and the document. The light source

    is angled in such a way as to shine through the window and reflect off

    the document. There is usually a white background behind the

    document in the event that the media is transparent. The reflected light

    enters the cameras that focus the image onto a charge-coupled device

    (CCD). The CCD is a ceramic board with an imbedded array that

    translates the presence or absence of light into digital form. The

    output from the CCD is a value (usually from 0 to 255) that indicates a

    level of gray. Usually, a grayscale value of zero indicates total

    absence of light, while a value of 255 indicates complete light

    saturation.

    [Figure: Scanner Basics. Scanners sense reflected light values and store the reflectance values as a digital image. Labeled components: media feed mechanism, direction of media movement, map or photo, white background, light sources, glass window, camera, CCD, and digital image output.]

    Scanner resolution is important. Optical resolution is the ability of

    cameras in the scanner to discern data. As a rule of thumb, the optical

    resolution of a scanner is expressed in this formula: optical resolution

    in dpi = (number of cameras + 1) * 100. Thus, a scanner with three

    cameras can offer an optical resolution of 400 dpi. Scanners can also

    offer interpolated resolution, in which the data from the CCD are

    resampled into smaller pixels. Thus, a scanner with optical resolution

    of 400 dpi can also offer interpolated resolution of 800 dpi. This method can produce an output image with higher resolution, but not

    necessarily with greater accuracy. In general, you should evaluate

    scanners for GIS applications using optical resolution because GIS

    data are too complicated to be adequately captured with interpolated

    resolution data.
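
The rule of thumb and the limits of interpolation can both be shown in a short sketch. The camera formula is taken directly from the text; the resampling function is an assumption for illustration, using simple pixel replication, whereas real scanners may apply smoother interpolation.

```python
def optical_resolution_dpi(num_cameras):
    """Rule of thumb from the text: (number of cameras + 1) * 100."""
    return (num_cameras + 1) * 100


def interpolate_row(row, factor):
    """Resample a row of CCD values into smaller pixels by replication."""
    return [value for value in row for _ in range(factor)]


print(optical_resolution_dpi(3))

ccd_row = [10, 200, 200, 35]
resampled = interpolate_row(ccd_row, 2)
print(resampled)
print(len(resampled), sorted(set(resampled)) == sorted(set(ccd_row)))
```

The resampled row has twice as many pixels but contains no measurement the CCD did not already make, which is why interpolated resolution does not necessarily bring greater accuracy.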

    Positional accuracy is also an important consideration in GIS

    applications that require accurate spatial data placement. Positional

    inaccuracy can result from media slippage in the scanner document

    feed mechanism or from miscalibrated cameras.

    Decide on the minimum resolution and accuracy required by your

    application. Then find the least expensive scanner that will meet those

    needs. A less expensive scanner that provides 200 dpi optical

    resolution is adequate for scanning photographs used as visual

    backdrops. On the other hand, if your application must scan closely

    drawn "tight" contours, you will need a scanner capable of at least 800

    dpi optical resolution. And, as when buying any piece of equipment,

    you should consider reliability, repair costs, maintenance agreements,

    user support, and so on.

    The Computer

    High-powered computer workstations capable of displaying and manipulating raster data. Raster data require a bitmapped graphics

    monitor for display. Even with bi-tonal data, a color monitor is useful

    to support color overlay of vector data. If you are working with

    grayscale data, you need at least an 8-bit color display in order to

    support a color space that includes both the grayscale image and the rest of the graphical user interface.

    The workstation needs to have the CPU power required to manipulate

    and display large amounts of data rapidly. This consideration should

    not be underestimated, as slow response in handling large data sets

    can adversely affect project productivity. One workstation can be

    dedicated as a scanner and/or plotter server if warranted by high

    usage, or the workstation can perform other functions.

    A Plotter

    Plotters capable of creating hard-copy output of raster data. Plotter technologies that support output of raster data include electrostatic, ink

    jet, and thermal transfer. Plotters can be color or black and white

    (black-and-white plotters usually support grayscale output). Pen

    plotters are not appropriate for raster data output. For optimum utility,

    plotter software should be capable of combining raster and vector

    data.

    Data Storage

    Secondary storage devices, such as high-speed magnetic disk drives, or high-volume optical disk drives capable of storing large raster data

    files. If you envision an on-line raster database, you should

    assess data storage needs carefully. To estimate your raster data

    storage requirements, multiply the number of documents you need to have available on your system at any one time by the raster data set

    size for that document type in Table 1. Many applications need to

    have only a limited amount of raster data on-line; other applications

    want to develop a library of raster data for ad hoc access. Estimation

    of your total storage requirements should include the storage needs of

    system software, application software, and other types of data (e.g.,

    vector data and editing copies of raster data).
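    The estimate described above can be sketched as follows. Table 1 is not reproduced in this part of the paper, so the example derives an uncompressed size from document dimensions and scan resolution instead; the document sizes, counts, and function names are hypothetical.

```python
import math


def raster_size_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Uncompressed size of one scanned document.

    bits_per_pixel is 1 for bi-tonal data and 8 for 256-level grayscale.
    """
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return math.ceil(pixels * bits_per_pixel / 8)


def total_storage_bytes(document_classes):
    """Sum count * size over (count, size_in_bytes) pairs, one per document type."""
    return sum(count * size for count, size in document_classes)


if __name__ == "__main__":
    # Hypothetical library: 500 E-size maps (34 x 44 in) at 400 dpi, bi-tonal,
    # plus 200 air photos (9 x 9 in) at 400 dpi, 8-bit grayscale.
    e_size_map = raster_size_bytes(34, 44, 400, bits_per_pixel=1)
    air_photo = raster_size_bytes(9, 9, 400, bits_per_pixel=8)
    total = total_storage_bytes([(500, e_size_map), (200, air_photo)])
    print(f"{total / 1e9:.1f} GB before compression")
```

    The result covers raster data only; system software, application software, vector data, and editing copies still have to be added, as noted above.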

    A variety of technologies are available to ease the data storage burden.

    First, tape archival can be used to simply take unneeded data off the

    system. Some sites use a two-tiered approach to data storage in which more frequently accessed raster data are kept on high-speed magnetic

    disks, and less frequently used data are migrated to optical media.

    Optical systems can be purchased with software to perform data

    migration automatically at off-peak times. Optical systems and

    magnetic systems should be transparently usable as mountable file

    systems, usually accessible through an open network access standard

    such as the Network File System (NFS).

    Networks

    High-speed local area networks (LANs) capable of transferring large amounts of data at high rates of data throughput. LAN configurations

    for distributed processing support scanning data entry by connecting

    specialized computing machinery on a high-speed data pathway.

    Today's LAN configurations can isolate high bandwidth raster data

    traffic from other functions by creating subnets using network

    bridges. LAN-based systems are modular and scalable.

    Evaluating Software for Scanning

    Data Entry

    Scanning data entry has special software requirements. Software

    tools are used for managing, manipulating, and displaying raster data,

    converting raster data to vector data, and providing a graphical user

    interface to make scanning data entry easy to do. Some of these tools operate in batch mode or "behind the scenes" with little user

    interaction. Their benefit is increased savings of people's time through

    increased automation. Others require a higher level of user

    interaction; their benefit is the intelligent combination of machine and

    human capabilities to attain higher productivity. Software functions

    needed to support scanned data entry projects include the following:

    Raster data management. The data produced by scanning

    data entry must be organized in an orderly way. Georeferenced

    data should be organized geographically. Data management

    software can optimize storage and retrieval of raster data even when data volumes are large. For integration with other systems

    and scanners, the software should provide raster-to-raster data

    conversion. Raster data management software can extract, edit,

    and merge a raster data set from a raster database.

    Software data compression. Software data compression can

    minimize raster data storage requirements. A variety of industry-

    standard data compression formats are available. Industry-

    standard data compression formats include RLC and CCITT

    Group 3 and Group 4. The CCITT compression standards are

    implemented in the TIFF raster data standard. An eight- to tenfold reduction in data size can be achieved with bi-tonal data. The amount

    of actual data reduction will depend on the compression algorithm

    used and the complexity of the data. Typically, denser more

    Figure: Example Scanning Data Entry Configurations (a simpler and a more complex configuration). Labeled components: Scanner, Raster Plotter, UNIX Workstation (multipurpose and dedicated), UNIX Server, Magnetic Disk, Optical Storage Device ("Jukebox"), SCSI connections, and Local Area Network.

    Figure: ArcScan georeferencing menu supports multiwindow visual interaction.

    Raster data display. Raster data can be displayed with full

    control over display symbology and graphic overlay of vector

    data. Background values in bi-tonal raster data can be displayed

    transparently, thus allowing concurrent display of multiple raster

    data sets. Software can alter the display characteristics of

    grayscale and color images to suit the needs of the application.

    The software provides direct output of raster data to raster-capable

    plotters, thus enhancing output speed and reducing hard-copy

    processing requirements. The software can merge raster and

    vector data in the same hard-copy plot.

    Figure: ArcScan raster editing menu.

    Raster data editing. The software provides the capability to clean

    up raster data with tools that work directly on the raster data

    format. Raster editing is a common pre-processing step to raster-

    to-vector conversion. For example, raster editing software can

    remove speckling from raster data, and cleaner raster data are

    converted to vector data with less post-processing.

    Raster-to-vector data conversion. Software can provide an

    array of tools for converting raster data to vector data. The tools

    offer the flexibility to adapt to a wide variety of raster data. For

    high-quality media, batch vector conversion software may be a

    good choice. When source documents are lower quality or have

    much clutter, interactive line-following software is often

    preferable. When photos are scanned, heads-up digitizing can be

    appropriate. Maps that are scanned to capture coordinate

    information often have a wealth of feature attribute information as

    well. This attribute data can be interactively captured during

    scanning data entry procedures if the scanning software tools are

    well integrated with other editing functions.

    The reason that raster-to-vector conversion requires special

    algorithms to overcome problems posed by noise and clutter is that

    the raster data format is simply a pictorial representation of the

    data. That is, when the conversion software examines the raster

    data, it can see only pixels of black and white; it cannot see what the raster "picture" is supposed to represent. Thus, conversion

    software will attempt to make vector lines out of all data it

    encounters, even if the "lines" are really the pen strokes that make

    up annotation. Conversion software deals with this kind of

    problem in many different ways, but human intervention is often

    necessary.

    GUI and software integration. Ease of use is

    greatly enhanced through a Graphical User Interface (GUI) that

    provides point-and-click control of software functionality. For

    highest efficiency, the conversion software should be integrated with other raster and vector editing functions in a common

    software environment. By integrating scanning data entry

    technology in a common software environment, more software

    functions are available and access to more types of data is possible

    at each processing step. It is easier to learn an integrated software

    system that has a consistent and attractive "look and feel."

    Evaluating Scanning Data Entry

    Methodologies

    Scanning data entry methods vary as widely as the applications in

    which they are used. Some of these methods are unique to scanning

    data entry; others take advantage of data and software integration to

    bring additional functionality to scanning data entry projects. Noise

    removal and clutter removal are pre-processing methods. Pre-

    processing prepares the raster data for the raster-to-vector conversion

    step.

    Noise removal. Scanned maps will have a certain amount of

    noise. The lower the map quality, the higher the noise content.

    Noise is data that do not have informational content. For example,

    a common type of noise is tiny spots called speckles that are an

    artifact of the scanning process. The speckles are unwanted in the

    final vector output. Various methods can be used to remove noise

    from the image so that vectorization can proceed. These methods

    are collectively called noise removal.

    Clutter removal. Even a high-quality map may have unwanted

    data on it, such as annotation. Often maps show more than one

    data type or layer. For example, a map might show parcel boundaries, easements, and street names. When only the parcels

    are to be vectorized, the easements and street names are clutter.

    Scanning data entry has developed methods for clutter removal,

    such as raster editing tools that "white out" indicated areas.

    Software filters that remove data below a given threshold size can

    be useful for removing the relatively small lines that make up

    lettering (annotation).
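    A size-threshold filter of the kind mentioned above can be sketched as follows. This is a minimal, generic connected-component filter written for illustration; it is an assumption about how such a filter might work, not the algorithm used by any particular product. The grid is a list of rows in which 1 is a black pixel and 0 is background.

```python
from collections import deque


def remove_small_components(grid, min_size):
    """Return a copy of a bi-tonal grid with small black blobs erased.

    Every 8-connected group of black pixels (value 1) containing fewer
    than `min_size` pixels is set to white (value 0).
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) < min_size:
                for y, x in component:
                    out[y][x] = 0
    return out


if __name__ == "__main__":
    scan = [
        [1, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1],  # isolated speckle
    ]
    for row in remove_small_components(scan, min_size=2):
        print(row)
```

    A small threshold removes speckles; a larger one begins to remove lettering strokes, and also any short line segments that are wanted, so the threshold has to be tested against the data.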

    Post-processing. Once the raster-to-vector conversion has

    been performed, the output vector data can be post-processed. Post-processing is usually accomplished with vector editing

    software to correct output vector data. The more closely the post-

    processing software is integrated with the vectorization software,

    the more efficient the entire vectorization process becomes.

    uncluttered data. Batch vectorization will require post-processing.

    Since batch vectorization proceeds without operator intervention

    and vectorizes all data input, low-quality, cluttered data can reduce

    its efficiency by requiring high levels of post-processing.

    Typically, the user tests a variety of vectorization parameters to

    find the combination of parameters best suited to the data, and then

    initiates batch vectorization using those parameters. The major

    advantage of batch vectorization is that once parameters are set,

    vectorization can proceed unattended and produce an output vector

    data set in much less time than other methods. Since all the maps

    of a single map series tend to be alike and can use the same

    parameters, batch vectorization is well suited to projects that scan a

    large number of similar maps.

    Both interactive vectorizers and batch vectorizers can deal with a

    variety of cartographic problems such as dotted line symbolism.

    The choice of which vectorizer to use is mostly data dependent.

    Having both methods available broadens the array of available

    tools. Vectorizers of both kinds work only with bi-tonal raster

    data; they cannot presently be used with grayscale or color raster

    data.

    Feature attribute capture. Line followers and batch vectorizers both output lines in vector format. These methods are

    good for automating the line symbology on maps. Maps can also

    contain other data such as point symbols and annotation. The

    symbols and annotation are often attribute data associated with a

    point, line, or area feature on the map and may be of value to the

    GIS database. The informational content of the point symbology

    and annotation can be captured by methods that combine heads-up

    digitizing with data editing capabilities. Advanced scanning data

    entry software will allow the user to point at symbology or

    annotation and attach its information to a feature. In a single

    integrated software environment, the user can take advantage of imaging capabilities, vector editing capabilities, and forms entry

    capabilities, all within an application tailored to capturing data

    from a specific map series. Operator intervention and some key

    entry are required, but this method can be an efficient way to

    capture attribute information along with coordinate information in

    one handling of the source document.

    Heads-up digitizing. Here, the user captures coordinate

    information by tracing features directly from data displayed on the

    screen. A digitizing table is not used at all. Hence the term

    "heads-up," an allusion to the heads-up instrument display

    technology developed for aircraft pilots.

    Heads-up digitizing is a scanning data entry method that is

    commonly used when coordinate accuracy need not be at

    engineering levels. As the operator digitizes from the screen,

    output accuracy is determined by the accuracy of the source

    document, the resolution at which it was scanned, the resolution at

    which it is displayed, the resolution of the screen itself, and the

    skill of the operator. Usually greater accuracy can be attained by

    other methods. Heads-up digitizing is usually performed on

    georeferenced raster data sets.
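    One of the accuracy factors listed above, the resolution at which the document was scanned, can be put into ground units with simple arithmetic. The sketch below is illustrative only; the function name and the example scale are assumptions, and it ignores source document error, display resolution, and operator skill.

```python
METERS_PER_INCH = 0.0254


def ground_pixel_size_m(scale_denominator, scan_dpi):
    """Ground distance covered by one scanned pixel, in meters.

    One pixel spans 1/scan_dpi inches on the paper; multiplying by the
    map scale denominator converts paper distance to ground distance.
    """
    return scale_denominator * METERS_PER_INCH / scan_dpi


if __name__ == "__main__":
    # Hypothetical 1:24,000 map scanned at 200 and at 400 dpi.
    for dpi in (200, 400):
        print(dpi, "dpi:", round(ground_pixel_size_m(24000, dpi), 3), "m per pixel")
```

    A digitized point can be no more precise than the pixel it was picked from, so this figure gives a lower bound on positional uncertainty before the other factors are considered.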

    Heads-up digitizing is most commonly performed using scanned

    images and thus can be a way to quickly capture the most current

    information. Vectorizing methods are the most common way to

    capture information from scanned maps, although heads-up

    digitizing can be used with scanned maps.

    Orthophoto production. Scanned stereopair images can be

    digitally orthocorrected to produce a digital orthophoto. Optically

    corrected orthophotos can be scanned directly into digital

    orthophotos. A digital orthophoto can be plotted at virtually any

    scale. For example, a series of 30 x 30 orthophotos can be

    scanned, merged into a common database, and reproduced on a

    plotter without reference to the original size, coverage, and format

    of the hard-copy photos. While output scale can be changed

    freely, care should be taken not to "blow up" the image too much

    or the output image will appear blocky. Orthophotos can also be plotted with overlays of vector data.
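    The caution about enlarging an image can be made concrete by computing how many scanned pixels land on each inch of the plot. This is a simple sketch under the assumption that no resampling is applied; the function name and the example scales are hypothetical.

```python
def effective_output_dpi(scan_dpi, source_scale_denominator, output_scale_denominator):
    """Scanned pixels per inch of plotted output when the scale is changed.

    Plotting at a larger scale (smaller denominator) spreads the same
    pixels over more paper, lowering the effective resolution.
    """
    enlargement = source_scale_denominator / output_scale_denominator
    return scan_dpi / enlargement


if __name__ == "__main__":
    # Photo at 1:12,000 scanned at 400 dpi, plotted at 1:3,000 (4x enlargement).
    print(effective_output_dpi(400, 12000, 3000))
```

    When the effective resolution falls well below what the plotter and the viewer's eye can resolve, individual pixels become visible as blocks.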

    Combining scanned data with other data. The data

    produced through scanning data entry may not be the only data of

    interest. For example, scanned and vectorized data can be fit into

    a geodetic control network created by coordinate geometry

    (COGO) data entry. This technique can be used to localize, or

    bound, error in scanned plats. Vectorized data can be combined

    with purchased or table-digitized vector data. Accurate

    georeferencing is very important when using scanned data with

    other data.

    Scanned data also have attributes. Adding attributes to scanned

    data is usually necessary. For example, scanned contours need to

    be tagged with their elevation values for use in digital terrain

    projects. If scanning data entry software is integrated with other

    GIS software tools, any or all of those tools can be made part of

    the scanning data entry and data automation process.

    Incremental data automation. Some organizations have

    adopted an incremental approach to database creation. Incremental

    methods include in-house staff digitizing on a time-available basis,

    addition of digital data from other sources, such as CAD data from

    land developers, or conversion of other digital data such as legal

    descriptions. Incremental database generation approaches can

    work. But, since any GIS must have data in order to be effective,

    it can be wise to input some data immediately so as to demonstrate

    immediate GIS benefits.

    Scanning data entry supports incremental database development.

    With scanning data entry, a raster database can be produced

    quickly by scanning maps or air photos and georeferencing the

    scanned images to real-world coordinates. The georeferenced

    raster data can provide immediate benefit as a visual backdrop to

    other data and as a data source for vector conversion. The raster

    database can provide complete seamless coverage for an agency's

    entire area of responsibility (e.g., a city or a county), and vector

    conversion can proceed incrementally, on an as-needed or highest-

    need basis.

    ESRI Scanning Data

    Entry Procedures

    ESRI's experience as a user of scanning data entry technology enables

    us to share a very practical viewpoint. The ESRI Database Development

    Group has used scanning data entry for many projects. The

    Digital Chart of the World project, performed by ESRI as the prime

    contractor to the Defense Mapping Agency, used scanning data entry

    to develop a 2.8 gigabyte vector database from over 2,800 source

    documents. The ESRI Database Development Group has wide

    experience with scanning data entry, and much of this experience is

    reflected in this chapter.

    The group adapts scanning data entry technology and methods for

    each project, as determined by the project goals and available source

    documents. Even so, a standard processing sequence has evolved. This processing sequence is shown in the following flowchart. Note

    that some sort of pre- and/or post-processing is an assumed

    requirement. Scanning data entry can reduce data automation time

    requirements, but it will not eliminate them.

    Figure: Scanning Data Entry Procedures (flowchart). Boxes shown: Evaluate Project Goals; Evaluate Source Documents; Determine scan resolution; Determine proposed processing flow; Find or draft georeference marks (tics) on source documents; Hand prep source documents; Test and set scanner parameters; Scan Documents; for maps, Pre-process (raster edit) raster data, Vectorize, and Post process (vector cleanup) coverage data; for air photos, Heads-up digitize.

    ESRI Scanning Data Entry Solutions: A GIS Focus

    ESRI's experience as a manufacturer and user of GIS

    technology enables us to view scanning data entry from the special perspective of GIS needs and requirements.

    GIS requirements are different from CAD and engineering

    requirements; ESRI solutions are based on GIS

    requirements. For example, GIS data typically depict

    irregular features from the real world, not the right

    angles and straight lines of CAD drawings.

    ESRI provides a wide range of products and services that can support

    GIS scanning data entry projects. ESRI brings its experience in GIS

    and scanning data entry together in a software package called ArcScan. ArcScan is specifically designed to support GIS scanning

    data entry projects. ArcScan is a fully integrated extension to

    ARC/INFO, and takes advantage of ARC/INFO's complete GIS

    functionality.

    ARC/INFO itself provides much ancillary capability to scanning data

    entry projects, including vector data editing and management.

    ARC/INFO provides additional raster data support through its bundled

    IMAGE INTEGRATOR capabilities. ARC/INFO software extensions

    for surface modeling, raster data modeling, network modeling, and

    coordinate geometry can all play a part in scanning data entry projects

    because these capabilities are available in the common ARC/INFO software environment.

    ArcScan, ARC/INFO, ArcView, ArcCAD, third-party software

    integrated with ESRI software, the ArcData program, ESRI services,

    and the ESRI hardware reseller program can be flexibly matched to the

    exact requirements of your scanning data entry project and your GIS

    implementation.

    ArcScan

    ArcScan is a set of software tools that support data automation using scan digitizing. These tools permit GIS applications to automate

    vector databases using scanned raster data sets as input. ArcScan is

    an extension to ARC/INFO and is fully integrated with the ARC/INFO

    software environment. The ArcTools ArcScan menu system

    provides an easy-to-use interface for raster editing and interactive

    vectorization. ArcScan includes a Users Guide and Command

    Reference that describe ArcScan capabilities to users.

    ArcScan Capabilities

    ArcScan users can create ARC/INFO coverages by extracting line features from scanned monochrome document images. This is done

    using interactive, automated line-following software within

    ARCEDIT. Examples of linear features that can be extracted and

    added to a coverage include street lines, utility lines, contour lines,

    parcel boundaries, and soil polygon boundaries. This technique for

    digitizing features is simpler, more accurate, and often faster than

    traditional manual and heads-up digitizing.

    ArcScan provides tools for editing monochrome, grayscale, and

    pseudo-color single-band imagery within ARCEDIT.

    ArcScan provides a powerful and efficient set of tools in ARC/INFO

    for importing, correcting, editing, plotting, and exporting scanned

    raster images. ArcScan supports industry-standard raster data formats

    and can accept data from many types of scanners.

    ArcScan Components

    ArcScan functional components include raster database construction tools, raster pre-processing tools, integrated raster-to-vector editing

    tools, and an interactive raster-to-vector conversion tool. ArcScan

    soft-copy and hard-copy raster display is provided using standard

    ARC/INFO display functionality. ArcScan tools work with the highly

    efficient ARC/INFO grid raster data format.

    Raster Database Construction Tools

    Scanned raster images in a variety of standard formats (e.g., TIFF,

    RLC, SunRASTER) and compressions (e.g., Run Length

    Compressed, CCITT Group III, CCITT Group IV) can be converted

    to and from compressed grids using the IMAGEGRID and

    GRIDIMAGE conversion tools. The GRIDMERGE tool can be used

    to build a raster database from georeferenced input grid raster data

    sets. The ArcScan conversion tools transfer runs of data directly,

    rapidly converting large scanned monochrome documents.
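    The idea of storing and transferring "runs" of data can be illustrated with a generic run-length encoder for one bi-tonal scan line. This is a teaching sketch only: it does not reproduce the RLC file format, the CCITT Group 3 or Group 4 codes, or the internal ARC/INFO grid structure, and the function names are made up for the example.

```python
def rle_encode(row):
    """Encode a row of 0/1 pixels as a list of [value, run_length] pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([pixel, 1])  # start a new run
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into a row of pixels."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


if __name__ == "__main__":
    # A mostly white scan line crossed by one three-pixel-wide black line.
    scan_line = [0] * 40 + [1] * 3 + [0] * 57
    runs = rle_encode(scan_line)
    print(runs)
    print(len(scan_line), "pixels stored as", len(runs), "runs")
```

    Line maps are mostly background, so each scan line collapses into a handful of runs. That is why bi-tonal map data compress well, and why a tool that copies runs directly can avoid expanding the image to individual pixels.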

    Geometric Correction and Noise

    Removal Tools

    The ArcScan raster pre-processing tools prepare raster data sets for

    additional processing. A set of geometric correction tools can be used

    to correct orientation errors introduced during scanning and distortions in the source

    document, and to georeference the data. These tools perform the following

    operations:

    Rotate a grid by a multiple of 90 degrees. Commonly used when

    documents are scanned sideways.

    Flip the contents of a grid from top to bottom. Commonly used

    when documents are scanned upside down.

    Mirror the contents of a grid from left to right. Commonly used

    with translucent documents that are scanned wrong side up.

    Correct a skewed document by converting a user-specified

    parallelogram on the input document to a rectangle on the output

    document. A common distortion encountered in scan digitizing is

    a skew caused by paper feed that is not perfectly aligned.

    Apply a warping transformation to georeference a grid to real-world coordinates.
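    The first three operations in the list above are simple rearrangements of the pixel array. The sketch below shows them on a small grid stored as a list of rows; it is a generic illustration, and the function names are not ArcScan command names.

```python
def rotate_90_clockwise(grid):
    """Rotate a grid a quarter turn clockwise (for sideways scans)."""
    return [list(column) for column in zip(*grid[::-1])]


def flip_top_bottom(grid):
    """Reverse the order of the rows (for upside-down scans)."""
    return [row[:] for row in grid[::-1]]


def mirror_left_right(grid):
    """Reverse each row (for translucent sheets scanned wrong side up)."""
    return [row[::-1] for row in grid]


if __name__ == "__main__":
    grid = [
        [1, 2],
        [3, 4],
    ]
    print(rotate_90_clockwise(grid))
    print(flip_top_bottom(grid))
    print(mirror_left_right(grid))
```

    Rotation by other multiples of 90 degrees follows from applying the quarter turn repeatedly. Deskewing and warping are different in kind because they resample pixel values instead of only moving them.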

    The following noise cleanup tools can be applied to either the entire

    image or a selected image area:

    Remove specks of black noise from a scanned image. Most

    scanned documents will show speckling to varying degrees.

    Apply a majority rule filter to a scanned image. Commonly used

    to correct dropout in noisy scanned lines.
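    A majority rule filter can be sketched as follows. This minimal version uses a 3 x 3 window and leaves a pixel unchanged on a tie; the window size, the edge handling, and the tie rule are assumptions made for the example and are not taken from the ArcScan documentation.

```python
def majority_filter(grid):
    """Apply a 3x3 majority filter to a bi-tonal grid (1 = black, 0 = white).

    Each pixel takes the value held by the strict majority of the pixels
    in its 3x3 neighborhood (including itself). Neighbors that fall
    outside the grid are ignored, and ties keep the original value.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            ones = total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = r + dy, c + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        total += 1
                        ones += grid[ny][nx]
            if 2 * ones > total:
                out[r][c] = 1
            elif 2 * ones < total:
                out[r][c] = 0
    return out


if __name__ == "__main__":
    # A three-pixel-thick line with a single-pixel dropout in the middle.
    line = [
        [1, 1, 1, 1, 1],
        [1, 1, 0, 1, 1],
        [1, 1, 1, 1, 1],
    ]
    for row in majority_filter(line):
        print(row)
```

    The same rule that fills a dropout in a thick line will also erase a line that is only one pixel wide, so a filter of this kind suits heavier line work or a selected image area.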

    Figure: ArcScan raster data management menu.

    Integrated Raster-Vector Editing Tools

    Grids can be edited in conjunction with coverages. The ARC/INFO

    software environment supports editing monochrome, grayscale, and

    pseudo-color raster data. Users can edit multiple grids during an

    ARCEDIT session. ARCEDIT supports full pan and zoom display in

    map coordinate space of both raster and vector data. A multilevel

    undo capability provides the user with a "safety net" during raster

    editing. The user can concurrently edit both raster and vector data.

    The editing tools include

    Filling Tools: Fill the interiors of user-defined boxes, circles,

    and polygons. Fill a connected entity of pixels by pointing to any

    pixel in the entity.

    Drawing Tools: Rasterize the boundaries of user-defined

    boxes, circles, and polygons.

    Pixel Editing Tools: Set and query the value of individual

    pixels by pointing to the screen.

    Brush Tool: Change the value of pixels in the grid by dragging

    a brush on the screen.

    Rasterization Tools: Rasterize the selected set of arcs into the edit grid.

    Selection Tools: Select all cells within a box, circle, or

    polygon.

    Geometric Operations: Move, rotate, flip, and mirror the

    selected region.

    Filtering Operations: Despeckle, smooth, or enhance the

    selected region.

    Georeferencing Tools: These tools allow a user to

    interactively position and rescale the edit grid in map coordinate

    space, deskew the edit grid, and warp the edit grid using a link

    coverage created in ARCEDIT. Georeferencing is accomplished

    by identifying common points in the raster data set and in real-

    world coordinates.
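    Georeferencing from common points can be illustrated with the simplest case: an affine transformation fitted exactly to three links between pixel positions and real-world coordinates. This is a simplified stand-in chosen for the example and is not the warping method used by ArcScan; with more than three links, a least-squares fit would normally be used. All names and coordinates are hypothetical.

```python
def affine_from_links(src, dst):
    """Fit x' = a*x + b*y + c and y' = d*x + e*y + f to three links.

    src: three (column, row) pixel positions.
    dst: the three matching (x, y) real-world coordinates.
    Returns the coefficients (a, b, c, d, e, f), solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("control points are collinear")

    def solve(v1, v2, v3):
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        r = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    a, b, c = solve(*(p[0] for p in dst))
    d, e, f = solve(*(p[1] for p in dst))
    return a, b, c, d, e, f


def apply_affine(coeffs, point):
    """Transform one (column, row) position to real-world (x, y)."""
    a, b, c, d, e, f = coeffs
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


if __name__ == "__main__":
    # Three tics: pixel positions and their (made-up) map coordinates.
    pixels = [(0, 0), (100, 0), (0, 100)]
    world = [(1000.0, 5000.0), (1200.0, 5000.0), (1000.0, 4800.0)]
    coeffs = affine_from_links(pixels, world)
    print(apply_affine(coeffs, (50, 50)))
```

    In this made-up case each pixel covers 2 map units, and the y coefficient is negative because image rows are counted downward while map y values increase upward.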

    Figure: The mouse is used to register the image to real-world coordinates.

    Interactive Vectorization Tools

    With ArcScan you can extract the centerlines of linear features from a

    raster document with optimized user intervention. Interactive raster-

    to-vector conversion using an automated line-following, or line-

    tracing, tool is especially useful for selective raster-to-vector conversion from a raster data set with multiple data layers. The

    ArcScan line tracer, because of its high degree of user control, can

    also vectorize complex and difficult data. With the line tracer tool you

    can efficiently produce high-quality vector output. The trace tool

    performs automatic intersection straightening and automatic line

    generalization based on user parameters.
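    The paper does not say which generalization algorithm the trace tool applies. As an illustration of what line generalization with a user-set tolerance can look like, the sketch below uses the widely known Douglas-Peucker method, which is an assumption made for this example only. Vertices closer to the simplified line than the tolerance are dropped.

```python
import math


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def generalize(points, tolerance):
    """Douglas-Peucker simplification of a list of (x, y) vertices."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the end-to-end segment.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    # Keep that vertex and simplify the two halves independently.
    left = generalize(points[:index + 1], tolerance)
    right = generalize(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    traced = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)]
    print(generalize(traced, tolerance=0.5))
```

    Traced centerlines follow the pixel grid and therefore carry many small jogs. A tolerance on the order of one pixel removes those jogs while keeping real bends in the feature.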

    Figure: The ArcScan Tracing Tool works in the ARCEDIT environment.

    The trace tool snaps to the center of a raster line. Tracing begins at that point, stopping at junctions to obtain user input. The user

    interacts with the tracer using the mouse or keyboard, and controls the

    direction taken by the tracer at the junction. Tracer features include the

    ability to jump gaps, the ability to snap to the center of a heavy raster

    line, and smart retrace. Built-in junction memory prevents retrace of

    explored paths, offering improved line tracing efficiency. Line

    following can be done with user interaction; alternatively, in fully automatic

    mode, all connected line work within a defined area can be vectorized

    without operator intervention. The line tracer can work from bi-tonal

    ARC/INFO grid or RLC raster data.

    Figure: ArcScan provides automatic cleanup of line intersections guided by user-set parameters.

    Because the line tracer is built into ARCEDIT, you can interleave

    manual digitizing with line following, enabling more productive

    processing of noisy data. You can use a menu interface for heads-up

    digitizing to tag features with attribute data shown in the raster

    document. The line trace tool works with the multiple windowing

    capability of ARCEDIT. ArcScan automatically moves a close-up

    view of the tracing activity, following the tracing activity even as it moves out of view.

    ARC/INFO

    ARC/INFO is a full-feature GIS capable of meeting the complex requirements of a wide variety of GIS applications. All the

    capabilities of ARC/INFO can be applied to scanning data entry

    projects. One of the most important features of ARC/INFO for

    scanning data entry projects is ARC/INFO software's user interface

    environment, ArcTools. ArcTools organizes ARC/INFO software's

    thousands of GIS tools and provides a single look and feel to

    ARC/INFO functionality, including the ArcScan extension. ARC/INFO capabilities can be fully customized using the ARC Macro

    Language; processing procedures for scanning data entry can be

    tailored to the specific needs of the GIS application.

    The integrated ARC/INFO software environment makes all ARC/INFO functionality immediately available. This offers a

    scanning data entry project the ability to take advantage of ARCEDIT

    vector and attribute editing capability. This ARC/INFO data

    automation functionality can be added to the techniques specific to

    scanning data entry. The final result of scanning data entry is a

    topologically correct ARC/INFO georelational database that can

    support GIS analysis and advanced display.

    An important feature of ARC/INFO software's integrated environment

    is that all editing and data entry functions can use the ArcStorm

    database manager. ArcStorm (ARC/INFO STORage Manager) provides feature-level access to seamless spatial databases. ArcStorm

    supports production-level data entry by multiple users.

    Because scanning data entry is implemented in the ARCEDIT

    environment, advanced spatial editing features can be utilized. These

    advanced features include editing of complex and user-defined

    features and interactive topology creation. In short, once the edit

    session is over, no further processing is required.

    The IMAGE INTEGRATOR functionality included with ARC/INFO

    provides image conversion, management, and display capabilities for a wide variety of image data (see Table 2), including industry-standard

    formats used by most scanner vendors. IMAGE INTEGRATOR

    brings image handling capability to scanning data entry projects and

    provides such benefits as concurrent raster and vector display for

    heads-up digitizing and raster plotter support for hard-copy

    production. These images can be easily converted into the user's

    preferred format, georeferenced, and kept in an

    ARC/INFO Image Catalog as a seamless raster database.

    Importantly, the IMAGE INTEGRATOR can also display images that

    do not have the inherent geographic component that maps and satellite

    images do. This type of image can also be scanned from input

    sources such as photographs, textual documents, and video input.

    This type of image cannot be georeferenced and is commonly used as

    a pictorial attribute of a coverage feature. A DBMS capable of storing

    Binary Large Objects (BLOBs) can be used to manage this type of

    image. ARC/INFO can access BLOB data in a DBMS.

    Video image attribute, stored as a BLOB in an external DBMS table, can be displayed with the IMAGE INTEGRATOR command IMAGEVIEW.

    Scanned document information can also be stored as a BLOB and displayed using the IMAGEVIEW capability.

    grabbers can also be displayed during an ArcCAD session. Scanning,

    raster-to-vector conversion, and raster editing are also available.

    ArcView

    ArcView, ESRI's desktop GIS display and query software, is capable of viewing all the image formats supported by ARC/INFO. ArcView

    can play a role in a scanning data entry project as a "quick look" tool

    to view and verify scanned data. ArcView can display raster and

    vector data simultaneously and can provide quick output of raster

    graphics in industry-standard graphic formats such as PostScript.

    ESRI Services

    ESRI provides a full range of services including training, database automation, application development, and on-site technology transfer.

    Since 1969, ESRI has supported hundreds of organizations

    throughout the world in the design, development, and implementation

    of GIS. ESRI's services support the complete GIS life cycle,

    including implementation planning, system integration, database

    development, application development, and system operation. ESRI

    is unique within the GIS industry in its ability to provide such

    comprehensive services in combination with a complete set of leading

    GIS software.

    Working with the leading hardware vendors, ESRI has successfully

    provided turnkey geographic information systems to hundreds of

    clients.

    ESRI offers new ArcScan users an on-site ArcScan Start-up Support

    Package. This package is two days of consulting and technical

    training support by an ESRI technical analyst to help the new ArcScan

    user implement this technology. The goal of the support package is to

    transfer the skills necessary to begin using the ArcScan software in a

    production environment. The subjects covered are document

    preparation, system initialization, ArcScan software usage, quality

    assurance techniques, and data structure considerations for scanning.

    The ArcScan Start-up Support Package is provided at your site using

    your equipment.

    ArcData

    ArcData is ESRI's program for providing published spatial and geographically related digital data in ARC/INFO supported formats.

    Existing off-the-shelf vector and raster databases can complement

    scanning projects. The ArcData program includes satellite and other

    imagery. Through the ArcData program, leading data vendors such as

    EOSAT, Spot Image, and Hughes STX can provide imagery for

    locations worldwide that can be used in scanning data entry projects.

    Supported Devices

    As mentioned previously, ARC/INFO supports industry-standard raster data formats. For a scanner to be usable with ARC/INFO, it

    must output raster data in one of these standard formats, preferably

    TIFF or RLC. Other factors, such as direct output to the UNIX file

    system and UNIX-based controller software, also bear on scanner ease of use. ArcScan has been tested with scanners from leading

    manufacturers. The section on evaluating scanning data entry

    provides more information on scanner features that are important.

    ESRI's machine-independent philosophy allows ARC/INFO users to

    take advantage of new hardware developments as they occur.

    ESRI defines the level of support for peripheral devices, such as scanners

    and plotters, using a numerical classification system. The ARC/INFO

    Users Guide, Supported Devices, UNIX Workstations (Rev. 7.0)

    provides classifications for specific scanners, plotters, and other

    devices. The classification categories are outlined below.

    Classes of Supported Devices

    There are five classes of support for supported devices:

    Class 1: Fully Supported, In-House at ESRI

    Class 1 devices have been tested at ESRI, run successfully with

    ARC/INFO, and have an interface (driver, interface file, etc.)

    provided with ARC/INFO Rev. 6.1.2. Any problems that occur with

    these devices can be tested on-site because the device must remain on

    ESRI premises for it to remain a Class 1 supported device. This is the

    highest level of support.

    Class 5: Unknown

    The status or ability to interface this device is unknown at the time of

    publication. New devices, as well as devices not included in this

    document that have not yet been tested, fall into this category.

    Class 6: Not Supported

    These devices will not work with ARC/INFO and therefore cannot be

    supported. In most cases, unsuccessful attempts have been made to

    interface these devices.

    Citations

    Litton, Adrien L. "Automated Data Capture: The Scanning Solution," Proceedings, 1993 ARC/INFO User Conference

    ARC/INFO: GIS Today and Tomorrow, ESRI White Paper Series, September 1992

    ArcCAD, The Integration of CAD and GIS, ESRI White Paper Series, April 1993

    Supported Devices Guide, UNIX Workstations (ARC/INFO Rev. 7.0)

    WorkStation ARC/INFO Technical Guide to Hardware Options

    ARC/INFO Users Guide, Cell-based Modeling with GRID

    A Wide Range of High-Quality Support for All Your GIS Needs, ESRI Brochure

    Glossary

    Definitions of key terms that will help you to understand the concepts discussed in this document.

    ARCEDIT

    ARCEDIT is the ARC/INFO environment for editing coverage coordinate data and descriptive data. Its sophisticated graphic and editing capabilities provide the tools necessary for accurate data entry

    and manipulation. These capabilities are important for creating and

    maintaining geographic databases.

    ARCPLOT

    ARCPLOT is the ARC/INFO environment that provides cartographic tools for all of your ARC/INFO mapping needs, including full

    cartographic design, display, and production capabilities.

    ArcScan

    ArcScan is the ARC/INFO extension that provides capabilities to support scanning data entry. ArcScan is closely integrated with other

    ARC/INFO functionality, particularly IMAGE INTEGRATOR and

    ARCEDIT.

    ArcTools

    ArcTools is a graphical user interface (GUI) environment provided with ARC/INFO. ArcTools provides a consistent menu interface to

    ARC/INFO and its subsystems and extensions.

    attribute

    An attribute is a characteristic of a map feature described by numbers or characters, typically stored in tabular format, and linked to the

    feature by a user-assigned identifier. For example, attributes of a

    well, represented by a point, might include depth, pump type, and

    owner. Feature attribute information is often present in source

    documents as symbology or annotation. For example, a plat map may

    show Parcel Identification Numbers (PINs).

    attribute table

    Attribute tables are tabular, flat, or relational files directly associated with the spatial data; they form the "relational" half of the georelational data

    structure. ARC/INFO functions maintain the integration of spatial and

    attribute data in feature attribute tables. Additional attribute

    information may be kept in external attribute tables, perhaps

    maintained as tables in an RDBMS.

    bandwidth

    Bandwidth is a way of expressing how much data can be transmitted across a communications medium at any one time. The higher the

    bandwidth, the more activity the communications channel can support.

    For example, local area networks have higher bandwidth than serial

    communications links. Bandwidth is often measured in megabytes

    per second.

    bit

    The smallest unit of information that can be stored and processed in a computer. A bit has two possible values, 0 or 1, which can be

    interpreted as BLACK/WHITE or ON/OFF. Bi-tonal data can be

    compressed into images that represent cell values with a single bit.
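    As a rough illustration of one bit per cell, the sketch below packs a row of bi-tonal cell values into bytes, eight cells to a byte, and unpacks them again. The function names are hypothetical, and real formats such as TIFF or RLC add their own headers and compression on top of this idea.

```python
def pack_bits(cells):
    """Pack a sequence of 0/1 cell values into bytes, first cell in the high bit."""
    packed = bytearray()
    for start in range(0, len(cells), 8):
        chunk = cells[start:start + 8]
        byte = 0
        for value in chunk:
            byte = (byte << 1) | (1 if value else 0)
        byte <<= 8 - len(chunk)   # pad a short final chunk with zero bits
        packed.append(byte)
    return bytes(packed)


def unpack_bits(packed, count):
    """Recover the first `count` cell values from packed bytes."""
    cells = []
    for byte in packed:
        for shift in range(7, -1, -1):
            cells.append((byte >> shift) & 1)
    return cells[:count]


row = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1]   # ten cells
packed = pack_bits(row)                 # two bytes: b'\xb1\xc0'
restored = unpack_bits(packed, len(row))
```

    Ten cells stored one per byte would take ten bytes; packed this way they take two, before any further compression.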

    bi-tonal

    Bi-tonal, as applied to raster data sets, means the raster data have only two possible values. Tonal quality is the brightness value; therefore,

    bi-tonal data have only two values, black and white.

    black noise

    Black noise, or addition of data.

    Black noise is pixels with black values where the original information

    content had white values. This often has the appearance of speckling,

    tiny black spots on a white background. Noise is data in a

    communication channel that is random or has no informational content.

    Noise is usually caused by low data quality and is unwanted because

    extra pre- or post-processing can be required to remove it. Black noise is the addition of data where none should exist. See white

    noise.
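    One simple form of the pre-processing mentioned above is to delete black cells that have no black neighbors. The sketch below is a generic speckle filter, not ArcScan's cleanup routine; the function name is hypothetical and the raster is assumed to be a list of rows in which 1 means black and 0 means white.

```python
def remove_isolated_black(raster):
    """Return a copy of the raster with isolated black cells set to white.

    A black cell is treated as noise when none of its eight neighbors is black.
    """
    rows, cols = len(raster), len(raster[0])
    cleaned = [row[:] for row in raster]
    for r in range(rows):
        for c in range(cols):
            if raster[r][c] != 1:
                continue
            has_black_neighbor = any(
                raster[r + dr][c + dc] == 1
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols
            )
            if not has_black_neighbor:
                cleaned[r][c] = 0
    return cleaned


speckled = [
    [1, 0, 0, 0],   # the lone black cell at (0, 0) is speckle
    [0, 0, 0, 0],
    [0, 0, 1, 1],   # these two cells belong to a line and are kept
    [0, 0, 0, 0],
]
cleaned = remove_isolated_black(speckled)
```

    A filter this aggressive would also delete legitimate single-cell features, such as dots in a dotted line, so the neighborhood rule would need tuning for real source documents.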

    BLOB

    Binary Large Object. A term often used with database management systems. Any large data set handled as binary data; often BLOBs are

    raster image data.
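    To make the idea concrete, the sketch below stores a few bytes of image data as a BLOB and reads them back. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in DBMS; the table and column names are hypothetical, and this is not how ARC/INFO itself accesses BLOB data.

```python
import sqlite3


def store_and_fetch(image_bytes, feature_id=1):
    """Store image bytes as a BLOB keyed by a feature identifier, then read them back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feature_image (feature_id INTEGER PRIMARY KEY, image BLOB)"
    )
    conn.execute(
        "INSERT INTO feature_image (feature_id, image) VALUES (?, ?)",
        (feature_id, image_bytes),
    )
    row = conn.execute(
        "SELECT image FROM feature_image WHERE feature_id = ?", (feature_id,)
    ).fetchone()
    conn.close()
    return bytes(row[0])


# Any binary payload works; a real application would read a scanned image file.
payload = bytes([0, 1, 255, 16, 32])
fetched = store_and_fetch(payload)
```

    The database treats the payload as opaque bytes. Interpreting it as an image is left to the application that retrieves it.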

    categorical data

    Categorical data consist of values representing discrete categories, such as soil or vegetation type. Also referred to as nominal data.

    CCD

    Charge-coupled device. A CCD is the electronic instrument used in scanners to sense brightness values. CCDs are usually capable of

    distinguishing and outputting grayscale data that have a maximum of

    256 levels of gray.
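    Grayscale output with 256 levels is commonly reduced to bi-tonal data by thresholding. The sketch below assumes brightness values from 0 (black) to 255 (white) and a hypothetical cutoff of 128; the right cutoff depends on the source document and scanner.

```python
def to_bitonal(gray_rows, cutoff=128):
    """Convert rows of 0-255 brightness values to 1 (black) or 0 (white).

    Cells darker than the cutoff become black; all others become white.
    """
    return [[1 if value < cutoff else 0 for value in row] for row in gray_rows]


gray = [
    [250, 240, 30, 245],
    [248, 20, 25, 251],
]
bitonal = to_bitonal(gray)   # [[0, 0, 1, 0], [0, 1, 1, 0]]
```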

    cell

    The basic element of spatial information in a grid data set. Cells are always square. A group of cells forms a grid.

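    [Figure: a grid of square cells arranged in rows and columns, with labels for the x-axis, the y-axis, the origin (0,0), the upper left corner, and the cell size. An accompanying table lists each cell Value (1 through 7) with its Count and Cover-Type: W Pine, D Fir, Mixed, Grass, Water, Paved, and Agriculture.]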

    cell based

    See raster.

    clutter

    In this example, the annotation "129" is clutter. It overlays the line and will interfere with vectorizing.

    Unwanted data on a scanned map. Clutter, unlike noise, may have

    informational content, but not the information sought by the data

    entry process. Clutter, like noise, can require extra pre- or post-

    processing in order to remove it. Line-following raster-to-vector

    converters are efficient at dealing with clutter because they utilize

    human capabilities to discern clutter from desired data. Annotation

    that overlays line work is a common type of clutter.

    COGO

    1. Coordinate geometry. Software that uses legal descriptions and survey information to create spatial vector data.

    2. An ARC/INFO software extension.

    continuous data

    Continuous data consist of values representing samples from a continuous surface, such as elevation values. Also referred to as

    interval or ratio data.

    control point

    A control point is a location on the image or map having known real-world coordinates. Control points are also called registration marks,

    or tics.

    coordinates

    An expression of location in space by the provision of pairs of numbers that indicate offset from a known starting point. X,y

    coordinates are an expression of position in Cartesian space. A

    common coordinate, or georeferencing system, is a requirement for the concurrent use of different types of data.

    corrected photo

    See orthophoto.

    coverage

    A digital analog of a single map sheet forming the basic unit of vector data storage in ARC/INFO software. In a coverage, map features are

    stored as primary features, such as arcs, nodes, polygons, and label

    points; and secondary features, such as tics, extent, links, and

    annotation. Map feature attributes are described and stored

    independently in feature attribute tables.

    data automation

    The process of converting analog data, such as maps, to a digital representation of the same information.

    data model

    A data model is a formal method for arranging data to represent the behavior of real-world entities. Fully developed data models describe

    data types, integrity rules for the data types, and operations on the data

    types. ARC/INFO software uses a georelational data model, a hybrid

    data model that combines spatial data (in coverages and grids) and

    attribute data (in tables). ARC/INFO's integrated data model allows

    easy conversion between, and concurrent use of, raster and vector

    data.

    data quality

    In the context of scanning data entry, data quality refers to the quality of the source document, that is, the media itself. Data quality does not refer to the informational veracity, accuracy, or precision of the data on

    the media. Thus, a well-used, folded, wrinkled, and stained third-generation

    blue-line map has lower data quality than a new Mylar

    overlay map having crisp, high-contrast line work.

    DBMS

    Database management system; often a relational database management system. A DBMS is the collection of software required for using and

    manipulating a tabular database, and presenting multiple, different

    views of the data. A DBMS can also manage Binary Large Objects. See BLOB.

    dpi

    Dots per inch. Dpi is a common measure of resolution in scanners. The more dots per inch (sampling rate) a scanner has, the greater the

    resolution.
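    The practical meaning of a dpi figure depends on map scale. The sketch below works out the ground distance covered by one scanned pixel and the pixel dimensions of a scanned sheet. The function names and the example values (400 dpi, a 1:24,000 map, a 34 by 22 inch sheet) are illustrative assumptions, not figures from this paper.

```python
METERS_PER_INCH = 0.0254


def ground_cell_size_m(dpi, scale_denominator):
    """Ground distance in meters covered by one pixel of a scanned map.

    One pixel spans 1/dpi inches on paper, which represents
    scale_denominator/dpi inches on the ground.
    """
    return scale_denominator / dpi * METERS_PER_INCH


def sheet_pixels(width_in, height_in, dpi):
    """Pixel dimensions (columns, rows) of a sheet scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


cell = ground_cell_size_m(400, 24000)   # about 1.524 m per pixel
cols, rows = sheet_pixels(34, 22, 400)  # 13600 by 8800 pixels
```

    Doubling the dpi halves the ground size of each pixel but quadruples the number of pixels, so scanning resolution is a trade-off between detail and file size.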

    dropout

    Dropout is an artifact of the scanning process that results in the loss of data where they should exist, such as pixel thinning in line work. See

    white noise.

    georeference

    To georeference is to establish the relationship between an image (row, column) coordinate system and a map (x,y) coordinate system.

    Georeferencing is accomplished by establishing control points that can be identified in both coordinate systems, then creating the displacement

    vectors, or links, between the control points. For example, once a

    raster data set is georeferenced to a vector coverage, the raster and

    vector data should overlay or register.
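    A common way to express the relationship described above is a six-parameter affine transformation fitted to the control points. The sketch below assumes exactly three non-collinear links and solves for the parameters with Cramer's rule; the function names and coordinates are hypothetical, and production software would typically use more links and a least-squares fit so that residual error can be reported.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def solve3(m, v):
    """Solve the 3x3 linear system m * p = v using Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for i in range(3):
        mi = [row[:] for row in m]
        for r in range(3):
            mi[r][i] = v[r]
        solution.append(det3(mi) / d)
    return tuple(solution)


def fit_affine(links):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f.

    links is a list of three ((col, row), (x, y)) control-point pairs.
    """
    m = [[float(col), float(row), 1.0] for (col, row), _ in links]
    a, b, c = solve3(m, [x for _, (x, _) in links])
    d, e, f = solve3(m, [y for _, (_, y) in links])
    return (a, b, c, d, e, f)


def image_to_map(coeffs, col, row):
    """Apply the fitted transformation to an image location."""
    a, b, c, d, e, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)


# Three links for a north-up image with 2-unit cells; row numbers increase downward.
links = [
    ((0, 0), (1000.0, 2000.0)),
    ((100, 0), (1200.0, 2000.0)),
    ((0, 100), (1000.0, 1800.0)),
]
coeffs = fit_affine(links)
x, y = image_to_map(coeffs, 50, 50)   # close to (1100.0, 1900.0)
```

    The negative row coefficient in the fitted result reflects the usual convention that image rows count downward while map y values increase upward.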

    georegister

    See georeference.

    georelational data model

    A hybrid data model used to represent spatial features. The

    georelational data model encompasses coordinate, topological (geo), and feature attribute (relational) information.

    GIS

    A geographic information system (GIS) is an organized collection of computer hardware, software, geographic data, personnel, and

    procedures designed to efficiently capture, store, update, manipulate,

    analyze, and display all forms of geographically referenced

    information. Complex spatial analysis is possible with a GIS that

    would be difficult, time-consuming, or impracticable otherwise.

    GPS

    Global positioning system. A system of orbiting satellites, ground receivers, and associated software that provides an

    electronically instrumented means of determining position on the

    earth.

    grid

    1. A raster geographic data set for use with ARC/INFO software. Each grid cell is referenced by its geographic x,y location. Cells

    store values. ArcScan functionality operates on the ARC/INFO

    grid data structure for most operations.

    2. One of many data structures commonly used to represent map

    features. A raster-based data structure composed of cells of equal

    size arranged in columns and rows. The value of each cell, or

    group of cells, represents the feature value. (Also called

    "Raster.")

    [Figure: point, line, and area features, each shown in coordinate form and as cell values in a grid.]
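    Because each cell is referenced by its x,y location, finding the cell that contains a given map coordinate is simple arithmetic. The sketch below assumes a grid stored as a list of rows with row 0 at the top, described by the x,y of its upper left corner and a square cell size. The function name and the 6 by 6 example values are illustrative, and this is not the ARC/INFO GRID storage format.

```python
import math


def cell_index(x, y, xmin, ymax, cell_size, nrows, ncols):
    """Return the (row, col) of the cell containing map point (x, y), or None.

    xmin and ymax are the map coordinates of the grid's upper left corner.
    """
    col = math.floor((x - xmin) / cell_size)
    row = math.floor((ymax - y) / cell_size)
    if 0 <= row < nrows and 0 <= col < ncols:
        return row, col
    return None


# An area grid with three zones, 10 map units per cell, upper left corner at (0, 60).
grid = [
    [1, 1, 2, 2, 2, 2],
    [1, 1, 2, 2, 2, 2],
    [1, 1, 1, 2, 2, 2],
    [1, 1, 3, 3, 3, 3],
    [1, 1, 3, 3, 3, 3],
    [1, 1, 3, 3, 3, 3],
]
row, col = cell_index(35.0, 25.0, xmin=0.0, ymax=60.0, cell_size=10.0, nrows=6, ncols=6)
value = grid[row][col]   # zone 3
```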

    GRID

    An ARC/INFO software product that provides a fully integrated raster- or cell-based geoprocessing system for use with ARC/INFO.

    GRID supports a map-algebra spatial language allowing sophisticated

    spatial modeling and analysis.

    grid cell

    A discretely uniform unit that represents a portion of the earth, such as a square meter or square mile. Each grid cell has a value that

    corresponds to the feature or characteristic at that site, such as a soil

    type, census tract, or vegetation class. See pixel.

    GUI

    Graphical user interface. A highly visual and interactive method for supporting human-computer interaction.

    heads-up digitizing

    The process of using a high-resolution, bit-mapped display and mouse to automate vector data by tracing features shown as an image on the

    screen.

    image

    A graphic representation or description of an object that is typically produced by an optical or electronic device. Common examples

    include remotely sensed data such as satellite data, scanned data, and

    photographs. An image is stored as a raster data set of binary or

    integer values representing the intensity of reflected light, heat, or

    another range of values on the electromagnetic spectrum. See raster.

    image catalog

    An image catalog is an organized set of spatially referenced, possibly overlapping, images that can be accessed as one logical image.

    ARC/INFO IMAGE INTEGRATOR can use image catalogs for raster

    data in formats such as TIFF or RLC. The ARC/INFO GRID data

    structure does not use image catalogs.

    IMAGE INTEGRATOR

    A collection of image management and display tools in ARC/INFO

    that allows vector and raster data to be displayed concurrently. IMAGE

    INTEGRATOR commands are used to georeference and rectify images to

    real-world coordinates, display images, and manage image catalogs.

    image-to-world transformation

    Image-to-world transformation is the transformation between image

    locations and real-world or map coordinates.

    interpolated resolution

    A method employed by scanner vendors to increase output resolution

    by use of software; that is, each input pixel is interpolated to produce

    more output pixels. Interpolating pixel values will not improve the

    informational content of the original scanned data and is usually not an effective method for GIS applications. See optical resolution.
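    The point about informational content can be seen with the simplest form of interpolation, pixel replication. In the sketch below, a raster is doubled in each direction, yet sampling every second pixel recovers the original exactly, which shows that nothing new was captured. The function name is hypothetical, and scanner vendors may use smoother interpolation methods than this one.

```python
def upsample_nearest(raster, factor):
    """Enlarge a raster by repeating each pixel `factor` times in both directions."""
    enlarged = []
    for row in raster:
        wide_row = [value for value in row for _ in range(factor)]
        enlarged.extend([wide_row[:] for _ in range(factor)])
    return enlarged


scanned = [
    [0, 1],
    [1, 0],
]
doubled = upsample_nearest(scanned, 2)             # 4 x 4 pixels, same content
recovered = [row[::2] for row in doubled[::2]]     # identical to `scanned`
```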

    LAN

    1. Local area network. Computer data communications technology that connects computers at the same site. When computers are on

    a LAN, they can share data and other computer resources, such as

    printers and plotters. LANs are composed of cabling and special

    data communications hardware and software.

    2. An ERDAS image processing system file type.

