
Geog 458: Map Sources and Errors

Date post: 14-Jan-2016
Upload: meena
Description:
Geog 458: Map Sources and Errors. Uncertainty, January 23, 2006. Outline: defining uncertainty; how to calculate uncertainty (nominal case: confusion matrix; interval/ratio case: RMSE); how to validate uncertainty (internal validation: MAUP; external validation: conflation).
Transcript
Page 1: Geog 458: Map Sources and Errors

Geog 458: Map Sources and Errors

Uncertainty

January 23, 2006

Page 2: Geog 458: Map Sources and Errors

Outline

1. Defining uncertainty
2. How to calculate uncertainty?
   1) Nominal case: Confusion matrix
   2) Interval/ratio case: RMSE
3. How to validate uncertainty?
   1) Internal validation: MAUP
   2) External validation: Conflation

Page 3: Geog 458: Map Sources and Errors

1. Defining uncertainty

• Definition of uncertainty
  • Discrepancy between reality and its representation
• Different kinds of uncertainty
  • Vagueness: the representation does not accommodate the essence of reality well (e.g. representing cities as a point layer, soil with a crisp boundary); better human conceptualization needed
  • Ambiguity: the representation is not uniformly agreed upon by users (e.g. placenames, occupation classification, indicator of environmental health); standardization needed
• Accuracy vs. precision
  • Accuracy: difference between true values and those in the database
  • Precision: amount of detail present in the data

Page 4: Geog 458: Map Sources and Errors

Questions

• What is your diagnosis among {uncertainty, precision, positional accuracy, attribute accuracy, vagueness, ambiguity}, and what is your prescription?
  • Longitude values in decimal degrees are stored as integers
  • Contour lines derived from a DEM do not line up well with the DRG
  • The map indicates this road is bidirectional, but it turns out to be one-way
  • Implementing an intelligent geocoding system based on English prepositions (e.g. across, at, over) for international users
  • Is the boundary of Mt. Everest well delineated? Is this polygon boundary a good representation of Mt. Everest?
• Which is broadest? How would you communicate these errors in your data quality report?
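For the first case above, a rough sketch of how much detail is lost when a decimal-degree longitude is stored as an integer. The coordinate is a made-up example, and the factor of about 111.32 km per degree of longitude at the equator is an approximation that assumes a spherical Earth.

```python
import math

# Approximate length of one degree of longitude at the equator (spherical Earth assumption).
KM_PER_DEGREE_AT_EQUATOR = 111.32

def east_west_error_km(lon_deg, lat_deg):
    """Return the integer-stored longitude and the east-west offset it introduces, in km."""
    stored = round(lon_deg)                      # what an integer field would keep
    lost_degrees = abs(lon_deg - stored)         # detail dropped by the integer type
    km_per_degree = KM_PER_DEGREE_AT_EQUATOR * math.cos(math.radians(lat_deg))
    return stored, lost_degrees * km_per_degree

# Hypothetical point, not taken from the slides.
stored, error_km = east_west_error_km(-122.3321, 47.6)
print(stored, round(error_km, 1))
```

The offset can reach half a degree, which is tens of kilometers at mid latitudes, so the issue is the amount of detail the field can hold rather than a measurement error.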

Page 5: Geog 458: Map Sources and Errors

2. Calculating accuracy

• Nominal case
  • Confusion matrix (a.k.a. misclassification matrix)
• Interval/ratio case
  • Root Mean Square Error (RMSE)

• Confusion matrix is widely used to report on attribute accuracy when measured at a nominal scale

• RMSE is widely used to report on positional accuracy when measured at a numeric scale (e.g. x, y coordinates are metric)

Page 6: Geog 458: Map Sources and Errors

Confusion Matrix

• Table 6.2 (p. 138): evaluating the classification of land parcels; there are five land use codes, A to E
• Rows and columns in the misclassification matrix
  • Each row corresponds to the class as recorded in the database
  • Each column corresponds to the class as recorded in the field
• Correctly classified vs. incorrectly classified
  • Diagonal entries represent agreement between database and field
  • Off-diagonal entries represent disagreement between database and field
• So how accurate would you say this data is?
  • Since 206 parcels (the sum of the diagonal entries) out of 304 are correctly classified, the overall accuracy is 206/304 ≈ 67.8%
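The overall accuracy calculation can be sketched in a few lines. Table 6.2 itself is not reproduced in this transcript, so the 3-class matrix below is a made-up illustration; only the totals 206 and 304 come from the slide.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal (database class equals field class)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical matrix: rows are database classes, columns are field classes.
example = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
print(f"{overall_accuracy(example):.1%}")   # 23 of 30 samples agree

# The slide's totals: 206 agreeing parcels out of 304.
print(f"{206 / 304:.1%}")
```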

Page 7: Geog 458: Map Sources and Errors

Confusion matrix: exercise

• Let’s say you decide to write a test report on the attribute accuracy of a land use map
• 100 reference points are selected to represent three classes: 49 points from natural, 28 points from agricultural, and 23 points from urban land use in your data
• Field checks resulted in 41 points confirmed to be natural, 21 points confirmed to be agricultural, and 19 points confirmed to be urban
• What is the overall accuracy of your data?
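One way to work the exercise, assuming "confirmed" means the field check agreed with the class recorded in the data. The per-class figures are an extra beyond what the slide asks for, and the variable names are only illustrative.

```python
# Counts from the exercise: points sampled per class and points confirmed in the field.
sampled = {"natural": 49, "agricultural": 28, "urban": 23}
confirmed = {"natural": 41, "agricultural": 21, "urban": 19}

# Overall accuracy: all confirmed points over all sampled points.
overall = sum(confirmed.values()) / sum(sampled.values())

# Per-class share of sampled points that the field check confirmed.
per_class = {name: confirmed[name] / sampled[name] for name in sampled}

print(f"overall: {overall:.0%}")
for name, share in per_class.items():
    print(f"{name}: {share:.1%}")
```

Under that reading, 81 of the 100 reference points agree, and the agricultural class is the weakest of the three.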

Page 8: Geog 458: Map Sources and Errors

Root Mean Square Error

• $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(c_i - a_i)^2}$, where $c_i$ is the observed value, $a_i$ is the true value, and $n$ is the number of observations
• RMSE is the square root of the mean of the squared differences between each observed value ($c_i$) and its corresponding true value ($a_i$)
• Indicates how much the observed values deviate from the true values
• In the case of positional accuracy, $a_i$ will be derived from a data source of higher accuracy
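A minimal sketch of the RMSE definition, using the slide's notation (c for observed values, a for true values). The sample numbers are invented for illustration.

```python
import math

def rmse(observed, true):
    """Square root of the mean squared difference between observed and true values."""
    if len(observed) != len(true) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(c - a) ** 2 for c, a in zip(observed, true)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical values: differences are 0, 2, -1, 1, so the mean squared difference is 1.5.
observed = [10.0, 12.0, 9.0, 11.0]
true = [10.0, 10.0, 10.0, 10.0]
print(round(rmse(observed, true), 3))
```

Squaring before averaging keeps positive and negative differences from cancelling and gives larger deviations more weight.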

Page 9: Geog 458: Map Sources and Errors

RMSE: exercise

• Let’s say you decide to write a test report on the positional accuracy of NHPN data
• You obtain data from sources with a higher positional accuracy, such as geodetic points
• 7 points (intersections) are selected to be compared to 7 corresponding control points
• Distances for the 7 pairs are calculated as follows
• What is the RMSE?
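The distances for the 7 pairs are not included in this transcript, so the exercise cannot be worked here. As a sketch of the procedure only, the code below uses three invented coordinate pairs in an assumed planar (projected) coordinate system: compute the distance between each test point and its control point, then take the root of the mean squared distance.

```python
import math

def positional_rmse(test_points, control_points):
    """RMSE of the distances between paired test and control points (planar coordinates)."""
    if len(test_points) != len(control_points) or not test_points:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [
        math.hypot(tx - cx, ty - cy)
        for (tx, ty), (cx, cy) in zip(test_points, control_points)
    ]
    return math.sqrt(sum(d ** 2 for d in distances) / len(distances))

# Hypothetical pairs, not the slide's data; the distances are 5, 0, and 5 units.
test_points = [(3.0, 4.0), (1.0, 1.0), (0.0, 5.0)]
control_points = [(0.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
print(round(positional_rmse(test_points, control_points), 3))
```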

Page 10: Geog 458: Map Sources and Errors

3. Validating accuracy

• Internal validation
  • Examines likely impacts of uncertainty upon operation results within GIS
  • What would be the effects of different data aggregation schemes on operation results?: MAUP
• External validation
  • Validates accuracy of test data in reference to external data sources
  • How accurate is this data set relative to reference data?: Conflation

Page 11: Geog 458: Map Sources and Errors

Modifiable Areal Unit Problem

• Quite simply, different aggregations yield different results
  • From Openshaw
• Because sometimes geography does not have a natural unit of analysis
  • Population, vegetation
• Remember that a census unit is an artificial boundary drawn for the purpose of enumeration
  • Space is used as a sampling scheme
• Question of the optimal unit of analysis
  • Urban center boundary for analyzing urban activities
  • Metropolitan area for analyzing the spatial labor market
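A toy illustration of "different aggregations yield different results". The six small areas, their populations and case counts, and the two zoning schemes are all invented for this sketch; the point is only that the same underlying data give different zone-level rates depending on how the areas are grouped.

```python
# Hypothetical small areas laid out in a row.
population = [100, 100, 100, 100, 100, 100]
cases = [10, 30, 10, 30, 50, 10]

def zone_rates(zones):
    """Rate (cases / population) for each zone, where a zone is a list of area indexes."""
    return [
        sum(cases[i] for i in zone) / sum(population[i] for i in zone)
        for zone in zones
    ]

scheme_a = [[0, 1], [2, 3], [4, 5]]   # three zones of two areas each
scheme_b = [[0, 1, 2], [3, 4, 5]]     # two zones of three areas each

rates_a = zone_rates(scheme_a)
rates_b = zone_rates(scheme_b)
print([round(r, 3) for r in rates_a])
print([round(r, 3) for r in rates_b])
```

With these numbers the lowest zone rate is 0.2 under the first scheme and about 0.167 under the second, even though no underlying value changed.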

Page 12: Geog 458: Map Sources and Errors
Page 13: Geog 458: Map Sources and Errors
Page 14: Geog 458: Map Sources and Errors
Page 15: Geog 458: Map Sources and Errors

Conflation

• Describes the range of functions that attempt to overcome differences between datasets or merge their contents, as with rubber-sheeting
• Visual inspection of a spatial overlay of a TIGER file over GPS measurements
• Lab 2: working with data from different sources, conflating test data with data from an independent source (higher accuracy), visual inspection of positional accuracy, summarizing positional accuracy of test data with RMSE

