iirsiirs
Data Models &
Conceptual Model of Spatial Information
Dr. Sameer Saran
Geoinformatics Division
What is a GIS?
“A GIS is a computer-based system that provides the following four sets of capabilities to handle geo-referenced data:
1. Input
2. Data management (storage and retrieval)
3. Manipulation and analysis
4. Output.” (Aronoff, 1989)
What does a GIS do?
A GIS works with objects, their attributes, and the relationships among the objects.
The objects are stored in a database using geometric primitives (volumes, areas, lines, points), their attributes, and the relationships between them (topology).
Characteristics of Geographic Data
• Spatial data: features' orientation, shape, size & structure
• Non-spatial data: information about various attributes like area, length & population
Characteristics of Spatial Data
• spatial reference: where?
• attributes: what?
• spatial relationships: how?
• temporal component: when?
Spatial data models
Spatial data models are high-level data structures that focus on “formalization of the concepts humans use to conceptualize space”
Spatial Data Model
It represents the linkages between the real-world domain of geographic data and the computer or GIS representation of these features (Marble, 1982). It helps:
• To organize a systematic file structure
• To abstract the real world into properties which are perceived by a specific application
GIS structures as representations of reality
Two approaches have been widely adopted for representing the spatial & attribute information within a GIS:
• A composite model (raster)
• Geo-relational model (vector)
Spatial data models
• Two fundamental approaches:
– raster model
– vector model
Spatial data types
• Regular tessellations
• Irregular tessellations
• Point data
• Line data
• Area data
Vector Data Concept
Vector Data Structure
Point (node): 0-dimension
• single x,y coordinate pair
• zero area
• tree, oil well, label location
Line (arc): 1-dimension
• two (or more) connected x,y coordinates
• road, stream
Polygon: 2-dimensions
• four or more ordered and connected x,y coordinates
• first and last x,y pairs are the same
• encloses an area
• census tracts, county, lake
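The three primitives above can be sketched in a few lines of Python. This is only an illustration: the helper names (`is_closed_ring`, `line_length`, `ring_area`) and the sample coordinates are assumptions made for the example, not part of any GIS package.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float]  # one x,y coordinate pair


def is_closed_ring(coords: List[Coord]) -> bool:
    """Polygon rule from the slide: four or more pairs, first == last."""
    return len(coords) >= 4 and coords[0] == coords[-1]


def line_length(coords: List[Coord]) -> float:
    """Sum of the straight segments joining consecutive x,y pairs."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )


def ring_area(coords: List[Coord]) -> float:
    """Area enclosed by a closed ring (shoelace formula)."""
    if not is_closed_ring(coords):
        raise ValueError("ring must have >= 4 pairs and be closed")
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )
    return abs(twice_area) / 2.0


# Hypothetical sample features
well: Coord = (3.0, 4.0)                                  # point: zero area
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]              # line: open
lake = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]   # polygon: closed
```

The closure test is what separates a polygon from a line here: `road` fails it, `lake` passes it and therefore has an area.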
Vector model
• In a vector-based GIS data are handled as:
– Points: X,Y coordinate pair + label
– Lines: series of points
– Areas: line(s) forming their boundary (series of polygons)
[Figure labels: point feature, line feature, area feature]
Vector model
Line Types
• Line segment: with two end points
• Line string: a sequence of line segments
• LinearRing (Ring): a sequence of segments with closure
• Curves (Arc)
* Sometimes, “Arcs” refer to lines (as in ArcGIS/ArcView)
Vector Structures
How to organize vectors in a computer?
Spaghetti Structure
Whole Polygon Structure
Points and Polygons Structure
Topological Structure
Spaghetti Vector Structure
Spaghetti structure is usually derived from manual digitizing
• Crossing lines (no crossing nodes)
• The common boundary between adjacent polygons is recorded twice
• No neighbourhood information
• Unlinked data require a large amount of storage memory
Whole Polygon Structure
(A Kind of Spaghetti)
Whole Polygon (boundary structure): polygons described by listing coordinates of points in order as you ‘walk around’ the outside boundary of the polygon
• coordinates/borders for adjacent polygons stored twice
– may not be same, resulting in slivers (gaps), or overlap
• all lines are ‘double’ (except for those on the outside periphery)
• no topological information about polygons
– which are adjacent and have common boundary?
– how to relate different geographies? e.g. zip codes and tracts?
Whole Polygon: illustration
Points & Polygons Structure
Points and Polygons: polygons described by listing ID numbers of points in order as you ‘walk around the outside boundary’; a second file lists all points and their coordinates
• solves the duplicate coordinate/double border problem
• lines can be handled similar to polygons (list of IDs) ?
• still no topological information
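A rough sketch of the difference between the two structures: the same two adjacent squares are first held whole-polygon style, then converted to a point file plus lists of point IDs. The polygon names, coordinates and the function `to_points_and_polygons` are made up for illustration.

```python
# Whole-polygon storage: each polygon repeats its own coordinates,
# so the shared border (2,0)-(2,2) is stored twice.
whole_polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)],
}


def to_points_and_polygons(polygons):
    """Split whole polygons into a point file and per-polygon ID lists."""
    point_ids = {}  # coordinate -> point ID (lookup while building)
    points = {}     # point ID -> coordinate (the "second file")
    rings = {}      # polygon name -> point IDs in walk-around order
    for name, ring in polygons.items():
        ids = []
        for xy in ring:
            if xy not in point_ids:
                pid = len(point_ids) + 1
                point_ids[xy] = pid
                points[pid] = xy
            ids.append(point_ids[xy])
        rings[name] = ids
    return points, rings


points, rings = to_points_and_polygons(whole_polygons)

# Coordinate pairs stored under each scheme
stored_whole = sum(len(ring) for ring in whole_polygons.values())
stored_points = len(points)
```

Each shared corner now has one coordinate record, so the two copies of a border cannot drift apart into slivers. The structure still does not say which polygons are neighbours; that has to be worked out by comparing ID lists.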
Points and Polygons: illustration
Topology
Topology is a branch of mathematics that deals with properties of space that remain invariant under certain transformations.
Properties: three spatial relationships
• Area: polygons can be defined by the set of lines that enclose them
• Contiguity: identification of polygons which touch each other or connect; identifies contiguous polygons (left or right)
• Connectivity: identification of interconnected arcs, starting point & end point of network analysis
Rubber Sheet Transformation
[Figure: two drawings with labels A-E and 1-7]
Topology-1
• Connections & relationships between objects are independent of their coordinates
• Topological properties of an object are preserved when the object is stretched, distorted and bent
• Overcomes major weakness of spaghetti model, allowing for GIS analysis (Overlaying, Network)
• Requires all lines be connected, polygons closed, loose ends removed
Topology-2
It describes spatial relationships
• Connectivity: relationships between the arcs in the network
• Contiguity (adjacency): relationships between the polygons
For example, with respect to line 1, left and right polygons are A and B respectively
• Containment: this refers to what is within a polygon
For example, Polygon B is within Polygon A
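A minimal sketch of how an arc table can carry connectivity and contiguity. Only "line 1 has A on its left and B on its right" comes from the slide; the table layout (from-node, to-node, left polygon, right polygon), the node names `n1`/`n2`, arcs 2 and 3, and the outside label `"O"` are illustrative assumptions. Containment is not covered here.

```python
from collections import defaultdict

OUTSIDE = "O"  # assumed label for the area outside all polygons

# arc ID -> (from_node, to_node, left_polygon, right_polygon)
arcs = {
    1: ("n1", "n2", "A", "B"),      # shared border: A on the left, B on the right
    2: ("n2", "n1", "A", OUTSIDE),  # rest of A's boundary
    3: ("n1", "n2", "B", OUTSIDE),  # rest of B's boundary
}


def contiguity(arc_table):
    """Polygons are adjacent if they sit on the two sides of one arc."""
    neighbours = defaultdict(set)
    for _from, _to, left, right in arc_table.values():
        if OUTSIDE not in (left, right) and left != right:
            neighbours[left].add(right)
            neighbours[right].add(left)
    return dict(neighbours)


def connectivity(arc_table):
    """Arcs are connected if they meet at a common node."""
    at_node = defaultdict(list)
    for arc_id, (from_node, to_node, _l, _r) in sorted(arc_table.items()):
        at_node[from_node].append(arc_id)
        if to_node != from_node:
            at_node[to_node].append(arc_id)
    return dict(at_node)


def boundary_arcs(arc_table, polygon):
    """Area definition: the arcs that have the polygon on either side."""
    return sorted(
        arc_id
        for arc_id, (_f, _t, left, right) in arc_table.items()
        if polygon in (left, right)
    )
```

None of the three functions reads a coordinate, which is the point of the slide: these relationships survive any stretching of the map.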
Building topology
Topological data model
Popular File Formats
DIME – Dual Independent Map Encoding
TIGER – Topologically Integrated Geographic Encoding and Reference
DLG – Digital Line Graph
Shape File, ESRI
• Software or data specific
Geodatabases
Vector Data Structures
Advantages
• Good modeling of objects (object-view)
• Compact data structure
• Topology can be described explicitly, therefore good for analysis
• Coordinate transformation & rubber sheeting is easy
• Accurate graphic representation at all scales
• Retrieval, updating and generalization of graphics & attributes are possible
Vector Data Structures
Disadvantages
• Complex data structures
• Combining several polygon networks by intersection & overlay is difficult; uses considerable computer power
• Display & plotting often time consuming and expensive; especially high quality drawings, coloring, and shading
Raster Data Concept
Raster Data Structure
• Area is covered by grid with (usually) equal-sized cells
• Cells often called pixels (picture elements); raster data often called image data
• Attributes are recorded by assigning each cell a single value based on the majority feature (attribute) in the cell, such as land use type
• Typically 8 bits assigned to values, therefore 256 possible values (0-255)
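The majority rule and the 8-bit limit can be sketched as follows. The fine grid, its land-use codes and the function names are hypothetical; ties are broken by whichever value is seen first, which is an arbitrary choice for this example.

```python
from collections import Counter


def majority_value(values):
    """Most frequent value in a cell (first seen wins a tie)."""
    return Counter(values).most_common(1)[0][0]


def aggregate(grid, factor):
    """Merge factor x factor blocks of a fine grid into single cells."""
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows, factor):
        coarse_row = []
        for c in range(0, n_cols, factor):
            block = [
                grid[rr][cc]
                for rr in range(r, min(r + factor, n_rows))
                for cc in range(c, min(c + factor, n_cols))
            ]
            value = majority_value(block)
            if not 0 <= value <= 255:
                raise ValueError("value does not fit in 8 bits")
            coarse_row.append(value)
        coarse.append(coarse_row)
    return coarse


# Hypothetical land-use codes on a fine 4 x 4 grid
fine = [
    [1, 1, 2, 2],
    [1, 3, 2, 2],
    [3, 3, 3, 2],
    [3, 3, 1, 1],
]
coarse = aggregate(fine, 2)

# One byte per cell, row by row
packed = bytes(v for row in coarse for v in row)
```

The single cell of class 3 in the top-left block disappears from `coarse`: each cell keeps one value only, so minority features inside a cell are lost.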
Raster Data Structures: Tessellation
Raster based data structures
To effectively increase data processing performance and reduce the demand for data storage, two issues are involved in raster data structures:
• Compression methods: how to store the data more efficiently, and
• Scan order: how to scan the data in an array, which affects performance in terms of data processing
Run-length Coding
Describes the interior of an area by run-lengths, instead of the boundary
Run-Length Codes:
• Row 9: 2,3; 6,6; 8,10
• Row 10: 1,10
• Row 11: 1,9
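A small sketch of this row-wise coding, assuming each pair in the codes above is a 1-based (start column, end column) of a run of cells inside the area. The 10-cell rows below are invented so that they reproduce the three codes; the original figure is not available here.

```python
def row_runs(row):
    """Return 1-based (start, end) column pairs for each run of filled cells."""
    runs = []
    start = None
    for col, filled in enumerate(row, start=1):
        if filled and start is None:
            start = col                    # a run begins
        elif not filled and start is not None:
            runs.append((start, col - 1))  # the run ended on the previous cell
            start = None
    if start is not None:
        runs.append((start, len(row)))     # run reaches the end of the row
    return runs


# Hypothetical rows (1 = inside the area, 0 = outside)
rows = {
    9:  [0, 1, 1, 0, 0, 1, 0, 1, 1, 1],
    10: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    11: [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
}
codes = {number: row_runs(cells) for number, cells in rows.items()}
```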
Raster Compression
Run Length Compression
• One of the widely used raster data representation and compression techniques
• E.g.: Code the raster (shown in the example image) using run-length coding in row order
• Run-length Codes: 14,3; 2,7; 4,3; 4,7; 4,3; 3,7; 9,3; 2,7; 6,3; 4,7; 5,3; 3,7; 4,3
• Original image size (assume that each pixel is coded using 1 byte) is 8x8 = 64 bytes
• The run-length code file needs 13x2 = 26 bytes
• The compression ratio is 64:26 = 2.46:1
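The arithmetic above can be checked with a short sketch. It assumes each pair is (run length, cell value) read in row order, which is consistent with the run lengths adding up to 64; the function names are chosen for this example.

```python
def rle_encode(values):
    """Collapse a sequence into (run length, value) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1          # extend the current run
        else:
            runs.append([1, value])   # start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (run length, value) pairs back into the cell sequence."""
    cells = []
    for length, value in runs:
        cells.extend([value] * length)
    return cells


# The 13 codes from the slide
codes = [(14, 3), (2, 7), (4, 3), (4, 7), (4, 3), (3, 7), (9, 3),
         (2, 7), (6, 3), (4, 7), (5, 3), (3, 7), (4, 3)]

pixels = rle_decode(codes)
image = [pixels[i:i + 8] for i in range(0, len(pixels), 8)]  # 8 x 8 raster

raw_bytes = len(pixels)        # 1 byte per pixel
coded_bytes = 2 * len(codes)   # 1 byte for the length + 1 for the value
ratio = raw_bytes / coded_bytes
```

Decoding and re-encoding returns the same 13 pairs, so this coding is lossless.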
Quad-tree Coding
Raster Compression Cont’d
Quad-tree Compression
• Used widely for spatial data indexing
• Quadtree codes with N-order (Peano key–based)
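A minimal quadtree sketch: a square block is stored as one value if it is uniform, otherwise it is split into four quadrants. It assumes a square raster whose side is a power of two. The NW, NE, SW, SE child order and the sample grid are choices made for this example and need not match the ordering used in the slide's figure.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value for a uniform block, else a list of 4 subtrees."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),                # NW
        build_quadtree(grid, row, col + half, half),         # NE
        build_quadtree(grid, row + half, col, half),         # SW
        build_quadtree(grid, row + half, col + half, half),  # SE
    ]


def count_leaves(node):
    """Number of stored values in the tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1


# Hypothetical 4 x 4 raster
grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(grid)
```

Here 16 cells collapse to 7 leaves; only the mixed south-east quadrant is subdivided.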
Raster Ordering
Raster is two-dimensional
2-D ordering is developed to create a 1-D representation of the 2-D raster, in order to improve the efficiency of raster access.
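One common way to build such a 1-D ordering is to interleave the bits of the row and column numbers (a Morton key). Treating this as the N-order or Peano key of the previous slide is an assumption. Which axis takes the low bit, and which way the rows run, decides whether the path looks like an N or a Z. The function names are illustrative.

```python
def interleave_bits(row, col, bits=16):
    """Build a 1-D key by alternating the bits of col (low) and row (high)."""
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)
        key |= ((row >> i) & 1) << (2 * i + 1)
    return key


def scan_order(n_rows, n_cols):
    """Cells of an n_rows x n_cols raster sorted by their interleaved key."""
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    return sorted(cells, key=lambda rc: interleave_bits(rc[0], rc[1]))


order_2x2 = scan_order(2, 2)
order_4x4 = scan_order(4, 4)
```

In `order_4x4` the first four cells form the top-left 2 x 2 block, the next four the block beside it. Cells that are close in 2-D tend to stay close in the 1-D sequence, which is the access-efficiency argument made above.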
Raster Compression Cont’d
“lossless” compression vs “lossy” compression
• Can you reproduce exactly the original data from the compressed data?
Zip (2-5:1)
GIF (2-4:1), JPEG (10-40:1), MPEG (50:1)
ECW, MrSID etc.
Raster Data Structures
Advantages
• Simple data structures
• Location-specific manipulation of attribute data is easy
• Many kinds of spatial analysis and filtering may be used
• Mathematical modeling is easy because all spatial entities have a simple, regular shape
Raster Data Structures
Disadvantages
• Large data volumes
• Using large grid cells to reduce data volumes reduces spatial resolution; loss of information & inability to recognize phenomena that have logically defined structures
• Crude raster maps are inelegant though graphic elegance is becoming less of a problem
• Coordinate transformations are difficult and time consuming unless special algorithms are employed
Distortion of shapes in raster data
GIS Data Models: Raster vs. Vector
Choices: Raster vs. Vector
THANK YOU