
IN LINE WITH GIS

A common mistake among GIS folks who do not have a surveying background is to rely completely on the GPS user’s manual to tell them what the accuracy of their data will be if they follow certain procedures. For instance, the user’s manual may state that in order to achieve sub-meter accuracy you must have a PDOP of four, an SNR of four, track at least five satellites, and perform a differential correction. Many GIS folks read the reverse in that statement and assume that if the conditions are met, then the data will be sub-meter. That is similar to thinking that if you used a steel tape that is graduated to hundredths of a foot, all of your measurements with that tape will be accurate to a hundredth of a foot regardless of how you measure. Of course this is not necessarily true. There are many factors that affect the quality and accuracy of field measurements. Some of those factors are acknowledged and quantifiable, some can be controlled, and some factors are beyond human control or knowledge.

Another issue, with respect to GPS coordinates, is the difference between relative accuracy and absolute accuracy. Most resource grade GPS software will provide an error report for a data set. Yet, typically, what it reports is the relative error of the measurements, not the absolute error. That is, a report stating an accuracy of ±1 meter is reporting that the measurements in that set are within a meter of each other. That does not necessarily mean that those coordinates are within a meter of the true (absolute) coordinate. In order to test the absolute error, a point with known higher accuracy coordinates must be observed.
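The distinction between relative and absolute error can be illustrated with a small calculation. The following is only a sketch: the function name, the repeated fixes, and the control point coordinates are all hypothetical, and the coordinates are assumed to be in a projected system measured in meters.

```python
import math
from statistics import mean

def relative_and_absolute_error(observations, known_point):
    """Return (relative_rms, absolute_rms) in the units of the coordinates.

    observations: list of (x, y) repeated fixes on the same point.
    known_point:  (x, y) from a higher accuracy source (hypothetical control).
    """
    cx = mean(x for x, _ in observations)
    cy = mean(y for _, y in observations)
    # Relative error: spread of the fixes around their own average position.
    relative = math.sqrt(mean((x - cx) ** 2 + (y - cy) ** 2 for x, y in observations))
    kx, ky = known_point
    # Absolute error: distance of the fixes from the known position.
    absolute = math.sqrt(mean((x - kx) ** 2 + (y - ky) ** 2 for x, y in observations))
    return relative, absolute

# Made-up fixes that agree closely with each other but sit about 2 m off control.
fixes = [(100.2, 200.1), (100.4, 199.9), (100.3, 200.2), (100.1, 200.0)]
control = (98.0, 200.0)
rel, ab = relative_and_absolute_error(fixes, control)
print(f"relative: {rel:.2f} m, absolute: {ab:.2f} m")
```

In this made-up case the fixes are well within a meter of each other, yet more than two meters from the control point, which is the situation an error report based only on relative error would not reveal.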

GIS data may be created from GPS, by digitizing source documents, performing field surveys, aerial mapping, address matching, and other methods. Regardless of the way in which a GIS layer is obtained or acquired, there will be positional errors in the data set. Determining the magnitude of the positional (or location) error is important because the usability of the data set may be dependent upon its spatial accuracy. Metadata, often referred to as the data about the data, is essential for providing potential users with the information needed to determine a GIS data set’s usability for an intended purpose. One of the metadata content components is a statement of spatial accuracy. Spatial accuracy is probably the issue that surveyors focus on most when criticizing GIS. However, although spatial accuracy is a big concern to surveyors, it’s important to keep in mind that it is not necessarily the most important issue to all GIS users.

Level of Accuracy

For instance, when emergency responders need information on house locations to aid in developing evacuation plans, their priority is focused on getting the information as quickly as possible. They do not care whether the houses were mapped to an accuracy of ±10 meters, 30 meters, etc. Additionally, wildlife biologists studying land use patterns of a watershed may use small scale mapping as low as 1:100,000 which, according to National Map Accuracy Standards, would require accuracy of ±50 meters. Nevertheless, spatial accuracy reporting is important because it does provide the potential data user with the information needed to determine whether or not a data set will work for his or her intended use. Although guidelines for spatial and/or mapping accuracy do exist (see sidebar), the data creator may or may not choose to follow those guidelines. In any case, when data is mapped or converted there is usually some type of mapping or spatial accuracy goal that the project must achieve. This article suggests some methods for testing and validating the accuracy of GIS data sets.

Testing the Spatial Accuracy of GIS Data

Rj Zimmer, LS

Figure 1: Example of GPS road centerline vs. orthophotography

DISPLAYED WITH PERMISSION • PROFESSIONAL SURVEYOR • January 2002 • ALL RIGHTS RESERVED

www.profsurv.com

Types of GIS Data

GIS data can be described as points, lines, areas, and raster data. Each of these data types has its unique requirements for spatial accuracy testing. However, for all types of data there are some common considerations to observe. In order to determine the spatial accuracy of a GIS data set the following tasks must be considered: determining what to test, deciding how to test it (procedure, sample size, sample method), analyzing the sample data, and reporting the results.

How to Test

There are a variety of ways to perform the spatial accuracy testing. Typically the procedure is to use some sort of measurement or test that is extrinsic to the data set, such as an independent data source or computation. Independent sources should be of a higher accuracy than the data set to be tested. Some examples include existing digital or hard copy map data, GPS, or terrestrial survey data. In lieu of extrinsic data, estimates can be computed from intrinsic sources such as knowledge of the accuracy of the source document, map registration and digitizing accuracy (based on scale and methods used), and so forth. Creating independent measurements assures the highest reliability of the accuracy determination.
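When only intrinsic information is available, one common way to combine the individual error sources is a root-sum-square error budget. The article does not prescribe this method; the sketch below assumes the error components are independent, and the 1:24,000 scale and the map-millimeter values are invented purely for illustration.

```python
import math

def combined_error(*components):
    """Root-sum-square of independent error components (all in the same units)."""
    return math.sqrt(sum(c ** 2 for c in components))

def map_to_ground(map_error_mm, scale_denominator):
    """Convert an error measured on the map (millimeters) to ground meters."""
    return map_error_mm / 1000.0 * scale_denominator

# Hypothetical budget for a layer digitized from a 1:24,000 source map.
scale = 24_000
source = map_to_ground(0.5, scale)          # assumed source document accuracy
registration = map_to_ground(0.25, scale)   # assumed map registration error
digitizing = map_to_ground(0.25, scale)     # assumed digitizing error
estimate = combined_error(source, registration, digitizing)
print(f"estimated positional error: {estimate:.1f} m")
```

An estimate of this kind is only as good as the assumed component values, which is why independent measurements remain the more reliable test.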

The methods selected should depend on the objectives and the availability of existing data. If the GIS objective is to fit the data set into other existing (higher accuracy) data, then the new set may be tested against the existing data. For example, the State of Texas created a state-wide GIS layer of public roads with the requirement that it correlate with existing digital orthorectified photography (see Figure 1). In that case, the accuracy of GPS data can be tested by overlaying the road network on the photography, then measuring, on-screen, the difference between the image of a road segment and the GPS road segment. For instance, a road intersection on the photography and the GPS centerline would be the test. The distance between them would be a single sample. Natural resources data, such as vegetation coverages, are more difficult to test because such data sets do not describe well-defined points.

Another way to perform testing is to identify points in the data set that can be physically measured in the field, then measure those samples with higher accuracy methods. For example, if the accuracy of a manhole inventory was being checked, then a sample set of manholes would be re-measured with higher accuracy equipment and methods. If the GIS requirement was to map those manholes to a one-meter level of accuracy, then a sample set of manholes should be tested using methods that yield better than one-meter accuracy. The differences between the original measurements and the test measurements indicate the accuracy of the manhole data set.
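The comparison of original and check measurements can be sketched as a per-point distance calculation. The manhole IDs and coordinates below are hypothetical, and the coordinates are assumed to be projected and in meters.

```python
import math

def horizontal_errors(original, check):
    """Distance between each original point and its check measurement, keyed by ID.

    Only IDs present in both dictionaries are compared.
    """
    return {
        pid: math.hypot(original[pid][0] - check[pid][0],
                        original[pid][1] - check[pid][1])
        for pid in check if pid in original
    }

# Hypothetical inventory coordinates and higher accuracy re-measurements.
original = {"MH-1": (500.0, 1000.0), "MH-2": (650.0, 1200.0), "MH-3": (800.0, 900.0)}
check = {"MH-1": (500.3, 1000.4), "MH-2": (649.1, 1201.2), "MH-3": (800.0, 900.2)}

errors = horizontal_errors(original, check)
# Flag any manhole that misses the one-meter mapping requirement.
failures = [pid for pid, e in errors.items() if e > 1.0]
print(errors, failures)
```

Each distance is one sample; the full set of distances is what gets summarized when the results are reported.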

Some Sample Methods

Points are the simplest GIS features to test. Points have only a location (coordinates), so the method would be to obtain the coordinates of the test point, then compare that to the coordinates of the same point as defined by a higher accuracy source (such as GPS). Lines and areas, however, are more complex features to test, because their geometry is more complex. The consideration for the more complex geometry data sets is to test the geometry as well as the coordinates of discrete points.


Figure 2: Geometry error



Testing lines, for example, requires testing the accuracy of the end points of the lines, such as at road intersections, and the geometry of the line between the end points (see Figure 2, geometry error). The geometry is how the line behaves between the end points. This raises such questions as: Does it go in a straight line? Does it curve left or right? And so forth. Testing the end points of the lines is the same as testing a simple point. However, testing the accuracy of the geometry of the line requires obtaining a coordinate for some point or points along that geometry. Since a road centerline created from parametric modeling (such as typing in the curve data) would create a more mathematically correct geometry than a GPS representation of that same curve, the difference between the two can be difficult to assess. A sharp curve would require more intermediate GPS points than a flatter curve.
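One way to put a number on geometry error is to measure how far higher accuracy check points fall from the line being tested. This is a sketch of one possible approach, not a method the article specifies; the centerline vertices and check points are hypothetical, in projected meters.

```python
import math

def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both vertices coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_to_polyline(p, vertices):
    """Shortest distance from p to any segment of the polyline."""
    return min(point_to_segment(p, a, b) for a, b in zip(vertices, vertices[1:]))

# Hypothetical GIS centerline and check points taken along the real road.
centerline = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
check_points = [(50.0, 3.0), (104.0, 50.0), (103.0, 104.0)]
offsets = [point_to_polyline(p, centerline) for p in check_points]
print(offsets)
```

On a curve, the offsets between vertices will grow if too few intermediate points were collected, which is consistent with the need for more points on sharper curves.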

Testing the more complex geometry of a polygon requires finding discrete points on the edges of the polygon, such as vertices, which can be correlated to similar points on the data set of higher accuracy. If the polygon were a parcel boundary, for instance, one of the checks for spatial accuracy would be to test a corner of the parcel for accuracy.

Sample Size

An important consideration when testing spatial accuracy is selecting an appropriate size sample set. There must be a large enough sample size to produce a statistically valid result. On the other hand, the sample size is constrained by the cost and the time required to perform the sampling. Surveyors well understand the value of redundant measurements, but there is a point after which the value of more measurements diminishes. The balance between the number of measurements and the value of the measurements is unique for each project and should be dealt with individually. Some of the factors to consider are the size of the data set (10 points or 10,000 points), the distribution of the data, and the importance of the spatial accuracy. If the spatial accuracy is of high importance, then that may be incentive to test a larger sample set than the minimum required. There are statistical programs available for calculating sampling sizes for known and unknown data set sizes.
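The article does not name a particular sample size formula. As an illustration only, the sketch below uses Cochran's formula for estimating a proportion (for example, the share of features that meet a tolerance), with a finite population correction when the data set size is known. The default confidence level, margin, and proportion are assumptions chosen for the example.

```python
import math

def sample_size(confidence_z=1.96, margin=0.05, proportion=0.5, population=None):
    """Cochran sample size for a proportion.

    confidence_z: z-score for the confidence level (1.96 is about 95 percent).
    margin:       acceptable margin of error on the proportion.
    proportion:   expected proportion (0.5 is the most conservative choice).
    population:   data set size if known; None treats it as unknown/very large.
    """
    n0 = (confidence_z ** 2) * proportion * (1 - proportion) / margin ** 2
    if population is not None:
        # Finite population correction for a known data set size.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(sample_size())                   # unknown data set size
print(sample_size(population=10_000))  # 10,000 points
print(sample_size(population=10))      # 10 points
```

For a very small data set the formula effectively calls for testing every feature, while for larger data sets the required sample grows slowly, which mirrors the diminishing value of additional measurements.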

For photography, a common sample size is 20 points per image, distributed randomly throughout the image. Line data, such as river or road networks, could be tested several different ways. A certain portion of the total length of the data set could be tested, or a percentage of characteristics, such as junctions or angle points, could be tested. Additionally, a random set (such as 1 percent) of the points along the network could be tested, or measurements could be taken at predetermined intervals (X distance or Y percent along the network). For example, if a linear network were 100 kilometers long, then the sample set may consist of points every 1/2 km, or 1 km along the route for a standard interval. Alternatively, samples may be taken every 5 percent, or 20 percent along the way.
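The interval and random sampling schemes described above can be sketched as follows. The function names are hypothetical, and stations are expressed simply as distances in kilometers from the start of the network.

```python
import random

def stations_by_distance(total_km, interval_km):
    """Stations from 0 up to total_km at a fixed spacing (start included)."""
    count = int(total_km / interval_km + 1e-9)
    return [round(i * interval_km, 6) for i in range(count + 1)]

def stations_by_percent(total_km, percent):
    """Stations at every `percent` of the total length."""
    return stations_by_distance(total_km, total_km * percent / 100.0)

def random_subset(points, fraction, seed=0):
    """Random sample of a fraction of the points (at least one point)."""
    k = max(1, round(len(points) * fraction))
    return random.Random(seed).sample(points, k)

every_half_km = stations_by_distance(100.0, 0.5)
every_20_pct = stations_by_percent(100.0, 20)
one_percent = random_subset(list(range(1000)), 0.01)
print(len(every_half_km), every_20_pct, len(one_percent))
```

A fixed seed is used here only so the random selection can be reproduced and documented alongside the test results.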

Reporting the Results

One method of reporting the spatial accuracy is the FGDC Metadata Content Standard for Spatial Data Accuracy. The standard requires a quantitative and qualitative statement of accuracy. Additionally, most survey adjustment software and GPS software provide reports in a variety of proprietary formats (some are customizable). The important things that are generally helpful to the end user are a quantitative statement of accuracy and some information about how that was determined.

An abridged example is shown below, taken from the North Texas GIS Consortium Metadata for Pavement (www.ntgisc.org/warehouse/metadata/roadedge.html).

Positional Accuracy
  Horizontal Positional Accuracy
    Horizontal Positional Accuracy Report: Horizontal positional accuracy is 1.0 meter defined by the root mean square error (RMSE) method. This requires that two-thirds of all photo-identifiable arc features fall within the stated accuracy of 1.0 meter and that 90 percent of all arcs must fall within twice the distance specified (i.e., 2.0 meters). A final inspection and acceptance process was completed by several North Texas Consortium members.
    Quantitative Horizontal Positional Accuracy Assessment
      Horizontal Positional Accuracy Value: 3.2 ft
      Horizontal Positional Accuracy Explanation: Resolution as reported.
  Vertical Positional Accuracy
    Vertical Positional Accuracy Report: Vertical positional accuracy of 1 meter RMSE.
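A statement like the one in this example can be checked against a set of test samples. The sketch below computes the RMSE of the sample errors and applies the two-thirds and 90 percent criteria quoted in the report; the function names and the error values are hypothetical.

```python
import math

def rmse(errors):
    """Root mean square error of a list of positional errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def fraction_within(errors, limit):
    """Share of the errors that are no larger than the limit."""
    return sum(1 for e in errors if e <= limit) / len(errors)

def meets_report(errors, stated=1.0):
    """True if two-thirds fall within `stated` and 90 percent within twice that."""
    return (fraction_within(errors, stated) >= 2 / 3
            and fraction_within(errors, 2 * stated) >= 0.90)

# Hypothetical sample errors in meters.
errors_m = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.2, 1.5, 1.8]
print(f"RMSE: {rmse(errors_m):.2f} m, meets stated accuracy: {meets_report(errors_m)}")
```

Reporting both the computed value and the criteria it was tested against gives the end user the quantitative statement and the explanation of how it was determined.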

As the push to share geographic information increases, the need to verify and document the spatial accuracy of data sets becomes ever more important. Spatial accuracy assessment of GIS data is a task that surveyors are well suited for; therefore, they should lend their expertise to those who can benefit from it.

RJ ZIMMER is the GIS Coordinator for Lewis & Clark County and the city of Helena, Montana, and a Contributing Writer for the magazine.


FGDC Accuracy Standards: www.fgdc.gov/standards/status/sub1_3.html

“The NSSDA is intended to replace the 1947 National Map Accuracy Standard (NMAS). The applicability of NMAS is limited to graphic maps, as accuracy is defined by map scale. The NSSDA was developed to report accuracy of digital geospatial data that is not constrained by scale.”
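For readers who want to report in NSSDA terms, the standard expresses horizontal accuracy at the 95 percent confidence level as 1.7308 times the radial RMSE, a factor that applies when the x and y error components are of similar size. The sketch below is illustrative only; the function name and the coordinate differences are hypothetical.

```python
import math

def nssda_horizontal(dx, dy):
    """NSSDA horizontal accuracy at 95 percent confidence.

    dx, dy: coordinate differences (tested minus check) for each test point.
    Uses the 1.7308 factor, which assumes RMSE_x and RMSE_y are about equal.
    """
    n = len(dx)
    rmse_x = math.sqrt(sum(d * d for d in dx) / n)
    rmse_y = math.sqrt(sum(d * d for d in dy) / n)
    rmse_r = math.sqrt(rmse_x ** 2 + rmse_y ** 2)
    return 1.7308 * rmse_r

# Hypothetical differences in meters at four check points.
dx = [0.5, -0.5, 0.5, -0.5]
dy = [0.5, -0.5, -0.5, 0.5]
accuracy_95 = nssda_horizontal(dx, dy)
print(f"Tested {accuracy_95:.2f} meters horizontal accuracy at 95% confidence level")
```

Four points are used here only to keep the example short; a real test would use a larger sample, such as the 20 points mentioned earlier for photography.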

