Fundamentals of GIS and Database Management for Disaster Management

Geographic Information System and Database Management

Dept. of Disaster Science and Management

University of Dhaka

GEOGRAPHIC INFORMATION SYSTEM AND DATABASE MANAGEMENT


L-1

Definition:

A geographic information system (GIS) integrates hardware, software, and data for capturing, managing, analyzing, and displaying all forms of geographically referenced information. (Sir)

A geographic information system (GIS) is a system designed to capture, store,

manipulate, analyze, manage, and present all types of spatial or geographical data.

A geographic information system is a system for storing and manipulating geographical information on a computer.

A geographic information system (GIS) lets us visualize, question, analyze, and interpret

data to understand relationships, patterns,

and trends.

Components of GIS:

Hardware

Software

People

Data

Vector and Raster Data:

Vector data model: A representation of the world using points, lines, and

polygons. Vector models are useful for storing data that has discrete boundaries, such as

country borders, land parcels, and streets.

Raster data model: A representation of the world as a surface divided into

a regular grid of cells. Raster models are useful for storing data that varies continuously,

as in an aerial photograph, a satellite image, a surface of chemical concentrations, or

an elevation surface.

Concept of Layer:

Layers are the mechanism used to display geographic datasets in ArcMap, ArcGlobe,

and ArcScene. Each layer references a dataset and specifies how that dataset is

portrayed using symbols and text labels. When you add a layer to a map, you specify its

dataset and set its map symbols and labeling properties.

Remote Sensing:

Remote sensing is the science and art of acquiring information (spectral, spatial, and temporal) about objects, areas, or phenomena without coming into physical contact with the objects, areas, or phenomena under investigation. Without direct contact, some means of transferring information through space must be utilized.

RS Applications:

Change Detection

Land use

Sea Surface Temperature

Rainfall estimation

Yield Monitoring

Drought Monitoring

Urban sprawl Monitoring

GPS:

A GPS receiver calculates its position by precisely timing the signals sent by GPS satellites high above the Earth. Each satellite continually transmits messages that include the time the message was transmitted and the satellite's position at that time.
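The timing idea can be illustrated with simple arithmetic. The sketch below converts a signal travel time into a distance using the speed of light. The function name and the sample travel time are illustrative assumptions, and a real receiver also has to solve for its own clock error using several satellites.

```python
# Speed of light in vacuum, in metres per second (exact by definition).
SPEED_OF_LIGHT_M_S = 299_792_458.0

def range_from_travel_time(t_transmit_s, t_receive_s):
    """Distance to a satellite from the signal travel time (hypothetical helper)."""
    return SPEED_OF_LIGHT_M_S * (t_receive_s - t_transmit_s)

# Illustrative value only: a signal that took 0.07 s to reach the receiver.
distance_m = range_from_travel_time(0.0, 0.07)
print(round(distance_m / 1000), "km")
```

Because the signal travels so fast, a timing error of one microsecond already corresponds to roughly 300 m of range error, which is why precise timing matters.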

GPS Applications:

New data capturing

Navigation

Vehicle Tracking System

Geo-referencing etc.

L-2

ArcGIS Suite:

Desktop GIS

Server GIS

Online GIS

Mobile GIS

ESRI Data



ArcGIS Suite: Functionality:

L-3

RS and GIS Tools for Early Warning and Response

Use of GIS in Early Warning system and Response:

FFWC and BMD use GIS for weather and flood forecasting

Storm surge inundation

Flood inundation

Salinity intrusion

Cyclone map

Disaster incidence database (DIDb)

GIS-based building inventory database

Building age and building density

Damage and loss estimation



GIS in Bangladesh

In Bangladesh, many organizations are providing Geographic Information Services in

various fields like infrastructure planning, agriculture, weather forecasting, change

monitoring, damage assessment etc.

Key Geographic Information Service Providers are -

1. Space Research and Remote Sensing Organization (SPARRSO)

2. Survey of Bangladesh (SoB)

3. Local Government Engineering Department (LGED)

4. Forest Department

5. Bangladesh Meteorological Department (BMD)

6. Flood Forecasting and Warning Centre (FFWC)

7. Geological Survey of Bangladesh (GSB)

8. Roads and Highways Department (RHD)

9. Directorate of Land Records and Surveys (DLRS)

10. Soil Resources Development Institute (SRDI)

11. Bangladesh Agricultural Research Council (BARC)

12. Institute of Water Modeling (IWM)

13. Centre for Environmental and Geographic Information Services (CEGIS)

14. Comprehensive Disaster Management Programme, MoDMR

L-4 GIS as an Information System

Information System:

A combination of hardware, software, infrastructure and trained personnel organized to

facilitate planning, control, coordination, and decision making in an organization.

Geographic Information System:

A geographic information system (GIS) integrates hardware, software, and data for capturing, managing, analyzing, and displaying all forms of geographically referenced information. (Sir)



Problems in Geospatial Data Management (BD):

No uniform standard

Poorly maintained

Out of date

Inaccurate

No data sharing

No data retrieval service

Benefits of GIS Implementation:

Geospatial data are better maintained in a standard format

Revision and updating are easier

Geospatial data and information are easier to search, analyze and represent.

More value-added products

Geospatial data can be shared and exchanged freely.

Staff productivity and efficiency improve.

Time and money are saved.

Better decisions can be made.

Difference between GIS and Manual Works:

MAPS | GIS | MANUAL WORKS
STORAGE | Standardized & integrated | Different scales on different standards
RETRIEVAL | Digital Database | Paper Maps, Census, Tables
UPDATING | Search by Computer | Manual Check
OVERLAY | Very Fast | Expensive & Time consuming
SPATIAL ANALYSIS | Easy | Complicated
DISPLAY | Cheap & Fast | Expensive

Basic Functions of GIS:

Functions | Sub Functions
Data Acquisition and Preprocessing | Digitizing, Editing, Topology Building, Projection Transformation, Format Conversion
Database Management and Retrieval | Data Archival, Hierarchical Modeling, Network Modeling, Relational Modeling, Attribute Query etc.
Spatial Measurement and Analysis | Buffering, Overlay Operations, Connectivity Operations
Graphic Output and Visualization | Scale Transformation, Generalization, Topographic Map, Statistical Map

Area of GIS Applications:

Area | GIS Application
Facilities Management | Locating underground pipes & cables, planning facility maintenance
Environmental and Natural Resources Management | Environmental impact analysis, disaster management and mitigation
Street Network | Locating houses and streets, car navigation, transportation planning, workforce distribution
Planning and Engineering | Urban planning, regional planning, development of public facilities
Land Information | Taxation, land use zonation, land acquisition

Technologies that Contribute to GIS:

Geography

Cartography

Computer aided design

Surveying (GPS)

Photogrammetry

Statistics

Remote Sensing

Data-base Design

Modelling

GIS Information Infrastructures:

Social Infrastructure: Land Use, Religious Institutions, Cadastre etc.

Urban Infrastructure: Fire Stations, Cable and Pipe Networks, Transportation

Environmental Infrastructure: Natural Resources, Pollution, Climate Change, Disaster

Economic Infrastructure: Marketing, Banking, Car Navigation

Educational Infrastructure: School Location, Literacy, Enrollment/Dropout



L-4 GIS Data Model

Data Model:

A data model in geographic information systems is a mathematical construct for representing geographic objects or surfaces as data. For example, the vector data model represents geography as collections of points, lines, and polygons; the raster data model represents geography as cell matrices that store numeric values.

Types of geometric data model

– The vector model uses discrete points, lines and areas corresponding to discrete objects, each with a name or code number as its attribute.

– The raster model uses regularly spaced grid cells in a specific sequence. An element of the grid is called a pixel (picture element).

Cell value: Each cell has a value.

Cell size: Each cell has a width and height and is a portion of the entire area represented by the raster.

Cell location: The location of each cell is defined by its row and column location within the raster matrix.
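The three cell properties can be sketched in a few lines of Python. The raster values, the origin coordinates and the helper name cell_center below are illustrative assumptions. The sketch also assumes a north-up raster whose row 0 is the top row.

```python
def cell_center(row, col, x_min, y_max, cell_size):
    """Map coordinates of the centre of cell (row, col).

    Assumes row 0 is the top (northern) row and cells are square.
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y

# A 3 x 4 raster of made-up cell values (e.g. elevation in metres).
raster = [
    [12, 14, 15, 17],
    [11, 13, 16, 18],
    [10, 12, 14, 19],
]

value = raster[1][2]  # cell value at row 1, column 2
x, y = cell_center(1, 2, x_min=500000.0, y_max=2600000.0, cell_size=30.0)
print(value, x, y)
```

The cell location (row, column) is all that is stored; the real-world position follows from the raster origin and the cell size.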

Vector Data Model:

Geometry

– The real world features/objects can be classified as –

Point object such as electric pole,

Line object such as Upazila road

Area object such as lake

Topology

Means the relationships or connectivity between the spatial objects.

Point

Points are zero-dimensional objects that contain only a single coordinate pair. Points are typically used to model singular, discrete features such as buildings, wells etc. Points have only the property of location. Other types of point features include the node and the vertex. Specifically, a point is a stand-alone feature, whereas vertices are the bends along a line or polygon feature that are not intersections of lines or polygons.



Node

An intersection of more than two lines or strings or polygons, or the start or end point of a string, with a node number

Chain/Edge/Arc

A line or a string with a chain number, start and end node numbers, and left and right neighboring polygons

Polygon

An area with polygon number, series of chains that form the area in clockwise order

(minus sign is assigned in case of anti-clockwise order).

Chain/Arc/Edge:

Chain ID, Start Node ID, End Node ID, Attributes.

Node:

Node ID, (x, y), adjacent chain IDs (positive for to node, negative for from node).

Chain geometry:

Chain ID, Start Coordinates, Point Coordinates, End Coordinates.

Chain topology:

Chain ID, Start Node ID, End Node ID, Left Polygon ID, Right Polygon ID, (Attributes).

Polygon topology:

Polygon ID, Series of Chain ID, in clockwise order (Attributes).
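The topology tables above can be sketched as small Python records. The example is hypothetical: two unit squares (polygons 1 and 2) share one chain, and polygon 0 stands for the outside area. The class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Node:
    node_id: int
    xy: tuple
    chain_ids: list  # signed: positive = chain ends here, negative = chain starts here

@dataclass
class Chain:
    chain_id: int
    start_node: int
    end_node: int
    left_polygon: int
    right_polygon: int

@dataclass
class Polygon:
    polygon_id: int
    chain_ids: list  # clockwise order; negative ID = chain traversed in reverse

nodes = {
    1: Node(1, (1.0, 1.0), [1, -2, -3]),
    2: Node(2, (1.0, 0.0), [-1, 2, 3]),
}
chains = {
    1: Chain(1, start_node=2, end_node=1, left_polygon=0, right_polygon=1),
    2: Chain(2, start_node=1, end_node=2, left_polygon=0, right_polygon=2),
    3: Chain(3, start_node=1, end_node=2, left_polygon=2, right_polygon=1),
}
polygons = {
    1: Polygon(1, [1, 3]),
    2: Polygon(2, [2, -3]),
}

def neighbours(polygon_id):
    """Polygons sharing a chain with polygon_id, found from chain topology alone."""
    found = set()
    for signed_id in polygons[polygon_id].chain_ids:
        chain = chains[abs(signed_id)]
        if chain.right_polygon == polygon_id:
            other = chain.left_polygon
        else:
            other = chain.right_polygon
        if other != 0:  # 0 is the outside area in this sketch
            found.add(other)
    return found

print(neighbours(1), neighbours(2))
```

Adjacency is answered without touching any coordinates, which is the practical benefit of storing left and right polygon IDs with each chain.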

L-5 Raster Data Model

Raster Model

- The raster model uses regularly spaced grid cells in a specific sequence. An element of the grid is called a pixel (picture element).

Raster Data Model

The JPEG, BMP, and TIFF file formats (among others) are based on the raster data model.

If you zoom deeply into such an image, you will notice that it is composed of an array of tiny square pixels (or picture elements). Each of these uniquely colored pixels, when viewed as a whole, combines to form a coherent image.



These pixels are used as building blocks for creating points, lines, areas, networks, and surfaces.

Accordingly, the vast majority of available raster GIS data are built on the square pixel.

The raster data model is referred to as a grid-based system. Each cell in a raster carries a

single value, which represents the characteristic of the spatial phenomenon at a location

denoted by its row and column. The data type for that cell value can be either integer

or floating-point.

The raster model will average all values within a given pixel to yield a single value.

Therefore, the more area covered per pixel, the less accurate the associated data

values.

The area covered by each pixel determines the spatial resolution of the raster model from

which it is derived. Specifically, resolution is determined by measuring one side of the

square pixel. A raster model with pixels measuring 1 km by 1 km (1 square kilometer) in

the real world would be said to have a spatial resolution of 1 km.
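The effect of pixel size on accuracy can be illustrated by averaging blocks of fine cells into coarser cells. This is a minimal sketch with made-up values; the function name aggregate_mean is an assumption, and it expects raster dimensions that are exact multiples of the factor.

```python
def aggregate_mean(raster, factor):
    """Average factor x factor blocks of cells into one coarser cell."""
    rows, cols = len(raster), len(raster[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [raster[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse

# Made-up fine raster (e.g. 500 m cells); aggregating by 2 gives 1 km cells.
fine = [
    [10, 12, 30, 34],
    [14, 16, 38, 42],
    [1, 1, 5, 7],
    [1, 1, 9, 11],
]
coarse = aggregate_mean(fine, 2)
print(coarse)
```

Each coarse cell keeps only the mean of its block, so the variation inside the block is lost once the pixel covers a larger area.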

Raster Data Encoding:

a. Cell by cell Encoding

b. Run-Length Encoding

c. Quad tree Encoding
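As an illustration of the second method, the sketch below run-length encodes one raster row as (value, run length) pairs. This is one common variant, chosen here as an assumption; actual GIS formats may store runs differently, for example as start and end columns.

```python
from itertools import groupby

def run_length_encode(row):
    """Collapse consecutive equal cell values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]

def run_length_decode(pairs):
    """Expand (value, count) pairs back into the full row of cells."""
    return [value for value, count in pairs for _ in range(count)]

row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 2]
encoded = run_length_encode(row)
print(encoded)
```

Cell-by-cell encoding would store all ten values of this row, while run-length encoding stores four pairs; the saving grows with the length of uniform runs.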

Cell by cell Encoding



Quad Tree Encoding:

Comparison of Vector and Raster Model Data

Advantages:

Raster model | Vector model
It is a simple data structure. | It provides a more compact data structure.
Overlay operations can be done easily. | Topology building is easy and hence good for network analysis.
Representation of high spatial variability is possible. | Smooth graphical representation.

Disadvantages:

Raster model | Vector model
It is less compact and requires more storage capacity. | It is a relatively complex data structure.
Topological relationships are more difficult to represent. | Overlay operations are more difficult to implement.
Blocky appearance does not give a smooth image, especially along the edges. | It is discrete, so it is difficult to represent high spatial variability.

L-7 Map Projection

Coordinate System:

A coordinate system is a reference system used to represent the locations of

geographic features, imagery, GPS locations etc. within a common geographic

framework.

Each coordinate system is defined by:

Its measurement framework, which is either geographic (in which spherical coordinates are measured from the earth's center) or planimetric (in which the earth's coordinates are projected onto a two-dimensional planar surface).

Unit of measurement (feet or meters for projected coordinate systems or decimal

degrees for latitude–longitude).

The definition of the map projection for projected coordinate systems.

Other parameters such as spheroid of reference, datum, and projection parameters like

one or more standard parallels, central meridian, shifts in the x- and y-directions.

Types of Coordinate Systems:

Geographic coordinate system (GCS):

A global or spherical coordinate system such as latitude/longitude. A geographic coordinate system is a coordinate system that enables every location on the Earth to be specified by a set of numbers or letters.

Projected coordinate system:

A projected coordinate system is based on a map projection, which provides various mechanisms to project maps of the earth's spherical surface onto a two-dimensional Cartesian coordinate plane. Projected coordinate systems are sometimes referred to as map projections.

Geographic Coordinate Systems

A geographic coordinate system (GCS) uses a three-dimensional spherical surface to define locations on the earth. A GCS includes an angular unit of measure, a prime meridian, and a datum (based on a spheroid). A point is referenced by its longitude and latitude values. Longitude and latitude are angles measured from the earth's center to a point on the earth's surface. The angles often are measured in degrees (or in grads).

Some Definition:

1. Sphere

2. Spheroid

3. Meridian

4. Great circle

5. Equator

DATUM

Types of Datum:

Geocentric Datum

Local Datum

PROJECTED COORDINATE SYSTEM

Map Projection

Map projections refer to the methods and procedures that are used to transform the

spherical three-dimensional earth into two-dimensional planar surfaces. Specifically, map

projections are mathematical formulas that are used to translate latitude and longitude

on the surface of the earth to x and y coordinates on a plane.

Since there are an infinite number of ways this translation can be performed, there

are an infinite number of map projections. Generally, the paper is either flat and

placed tangent to the globe (a planar or azimuthal projection) or formed into a cone

or cylinder and placed over the globe (cylindrical and conical projections).

Every map projection distorts distance, area, shape, direction, or some combination

thereof.

Concept of map projection

To illustrate the concept of a map projection, imagine that we place a light bulb in the center of a globe. When we turn the light bulb on, the outline of the continents and the graticule will be “projected” as shadows on a nearby surface.

This is what is meant by map “projection.”



Distortion due to map projection

Distortions are inevitable during projection from a 3D surface to a 2D plane.

Map projections introduce distortions in shape, area, distance, and direction. A series of trade-offs will need to be made with respect to such distortions, considering the purpose of the map.

Characteristics of a projected coordinate system

A projected coordinate system is defined on a flat, two-dimensional surface.

A projected coordinate system has constant lengths, angles, and areas across the two dimensions.

A projected coordinate system is always based on a geographic coordinate system that is based on a sphere or spheroid. In a projected coordinate system, locations are identified by x,y coordinates on a grid, with the origin at the center of the grid. The two values are called the x-coordinate and the y-coordinate; the coordinates at the origin are x = 0 and y = 0. Horizontal lines above the origin and vertical lines to the right of the origin have positive values; those below or to the left have negative values.

When working with data in a geographic coordinate system, it is sometimes useful to equate the longitude values with the X axis and the latitude values with the Y axis.

Types of map projection

A map projection uses mathematical formulas to relate spherical coordinates on

the globe to flat, planar coordinates.

Different projections cause different types of distortions. Some projections are designed

to minimize the distortion of one or two of the data's characteristics. A projection

could maintain the area of a feature but alter its shape.

Map projections are designed for specific purposes. One map projection might be used

for large-scale data in a limited area, while another is used for a small-scale map

of the world. Map projections designed for small-scale data are usually based on

spherical rather than spheroidal geographic coordinate systems.

Conformal projections (Shape)

Conformal projections preserve local shape. A map projection accomplishes this by maintaining all angles. In these projections, the meridians and parallels intersect at right angles. The area enclosed by a series of arcs may be greatly distorted in the process. No map projection can preserve the shapes of larger regions.



Equal area projections (Area)

Equal area projections preserve the area of displayed features.

To do this, the other properties—shape, angle, and scale—are distorted.

In these projections, the meridians and parallels may not intersect at right angles.

Equidistant projections (Distance)

Equidistant maps preserve the distances between certain points.

Most Equidistant projections have one or more lines in which the length of the line on a

map is the same length (at map scale) as the same line on the globe, regardless of

whether it is a great or small circle, or straight or curved.

For example, in the Sinusoidal projection, the equator and all parallels are their true

lengths.

In other Equidistant projections, the equator and all meridians are true.

No projection is equidistant to and from all points on a map.

True-direction projections (Direction)

The shortest route between two points on a curved surface such as the earth is along the great circle on which the two points lie.

Some True-direction projections are also conformal, equal area, or equidistant.

Map projection method

Because maps are flat, some of the simplest projections are made onto geometric

shapes that can be flattened without stretching their surfaces. These are called

developable surfaces. Some common examples are cones, cylinders, and planes. A

map projection systematically projects locations from the surface of a spheroid to

representative positions on a flat surface using mathematical algorithms.

The first step in projecting from one surface to another is creating one or more points of contact. Each contact is called a point (or line) of tangency.

A Planar projection is tangential to the globe at one point. Tangential cones and cylinders touch the globe along a line. If the projection surface intersects the globe instead of only touching its surface, the resulting projection is secant rather than tangent. Whether the contact is tangent or secant, the contact points or lines are significant because they define locations of zero distortion.



In general, distortion increases with the distance from the point of contact. Many

common map projections are classified according to the projection surface used: conic,

cylindrical, or planar.

Types of Method:

Conic

The simplest Conic projection is tangent to the globe along a line of latitude. This line is

called the standard parallel. The meridians are projected onto the conical surface,

meeting at the apex, or point, of the cone. Parallel lines of latitude are projected onto

the cone as rings.

The cone is then "cut" along any meridian to produce the final conic projection,

which has straight converging lines for meridians and concentric circular arcs for

parallels. The meridian opposite the cut line becomes the central meridian.

In general, the further you get from the standard parallel, the more distortion

increases. Thus, cutting off the top of the cone produces a more accurate

projection. You can accomplish this by not using the polar region of the projected data.

Conic projections are used for midlatitude zones that have an east–west orientation.

Somewhat more complex Conic projections contact the global surface at two

locations. These projections are called Secant projections and are defined by two

standard parallels. The distortion pattern for Secant projections is different between the

standard parallels than beyond them. Generally, a Secant projection has less overall

distortion than a Tangent projection. In still more complex Conic projections, the axis of the cone does not line up with the polar axis of the globe. These types of projections are called oblique.

Cylindrical

Like Conic projections, cylindrical projections can also have tangent or secant cases. The

Mercator projection is one of the most common cylindrical projections, and the equator

is usually its line of tangency. Meridians are geometrically projected onto the cylindrical

surface, and parallels are mathematically projected. This produces graticular angles of

90 degrees. The cylinder is "cut" along any meridian to produce the final cylindrical

projection.

The meridians are equally spaced, while the spacing between parallel lines of

latitude increases toward the poles. This projection is conformal and displays true

direction along straight lines. For more complex Cylindrical projections the cylinder is

rotated, thus changing the tangent or secant lines.
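The spacing behaviour described above can be checked with the standard spherical Mercator formulas, x = R(λ − λ0) and y = R ln tan(π/4 + φ/2). The sketch below is a simplification that treats the earth as a sphere; the radius value and the function name mercator are assumptions made for illustration.

```python
import math

def mercator(lon_deg, lat_deg, radius=6371000.0, lon0_deg=0.0):
    """Spherical Mercator with the equator as the line of tangency."""
    lam = math.radians(lon_deg - lon0_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y

# Parallels every 20 degrees: the gaps between them grow toward the pole.
ys = [mercator(0, lat)[1] for lat in (0, 20, 40, 60, 80)]
gaps = [b - a for a, b in zip(ys, ys[1:])]

# Meridians every 10 degrees: the gaps between them stay constant.
xs = [mercator(lon, 0)[0] for lon in (0, 10, 20, 30)]

print([round(g / 1000) for g in gaps])
print([round(x / 1000) for x in xs])
```

The y value grows without bound as latitude approaches 90°, which is why Mercator maps are cut off before the poles.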

Transverse Cylindrical projections, such as the Transverse Mercator, use a meridian as the tangential contact or lines parallel to meridians as lines of secancy. The standard lines then run north–south, along which the scale is true. Oblique cylinders are rotated around a great circle line located anywhere between the equator and the meridians. In these more complex projections, most meridians and lines of latitude are no longer straight. In all Cylindrical projections, the line of tangency or lines of secancy have no distortion.

Planar

It projects map data onto a flat surface touching the globe. This type of projection is

usually tangent to the globe at one point but may be secant. The point of contact

may be the North Pole, the South Pole, a point on the equator, or any point in between.

This point specifies the aspect and is the focus of the projection. The focus is

identified by a central longitude and a central latitude. Possible aspects are

Polar

Equatorial

Oblique

Polar aspects are the simplest form. Parallels of latitude are concentric circles centered on the pole, and meridians are straight lines. Patterns of area and shape distortion are circular about the focus. For this reason, Planar projections accommodate circular regions better than rectangular regions. Planar projections are used most often to map polar regions.

Some Planar projections view surface data from a specific point in space. The point of

view determines how the spherical data is projected onto the flat surface. The

perspective from which all locations are viewed varies between the different

Planar projections. The perspective point may be –

Gnomonic - the center of the earth

Stereographic - a surface point directly opposite from the focus

Orthographic - a point external to the globe, as if seen from a satellite or another planet

Map Projection – UTM

The Universal Transverse Mercator (UTM) conformal projection uses a 2-dimensional

Cartesian coordinate system to give locations on the surface of the Earth. The UTM system

is not a single map projection. The system instead divides the Earth into sixty zones, each

a six-degree band of longitude, and uses a secant transverse Mercator projection

in each zone.



Imagine making a straight north-south cut in the peel of an orange and repeating this cut at equal intervals until 60 strips or zones (6° each) have been detached. Each of these zones then forms the basis of a separate map projection. Flattening a strip results in a slight distortion of the geographical features within the zone, but because the zone is relatively narrow, the distortion is small and may be ignored by most map users.

The UTM system divides the Earth between 80°S and 84°N latitude into 60 zones, each 6° of longitude in width (360° ÷ 6° = 60). These zones are numbered 1 to 60.

Each zone is segmented into 20 latitude bands. Each latitude band is 8 degrees high and is lettered starting from "C" at 80°S, increasing up to "X", omitting "I" and "O". The last latitude band, "X", is extended an extra 4 degrees, so it ends at 84°N latitude, thus covering the northernmost land on Earth. Latitude bands "A" & "B" and "Y" & "Z" cover the polar regions using the UPS (Universal Polar Stereographic) coordinate system. The northern hemisphere starts from band "N".

Exceptions

UTM grid zones are uniform over the globe, except in two areas.

On the southwest coast of Norway, grid zone 32V is 9° wide, and grid zone 31V is correspondingly shrunk to 3° so that it covers only open water.

Around the Svalbard region, the four grid zones 31X (9° wide), 33X (12°), 35X (12°), and 37X (9° wide) are extended. The three grid zones 32X, 34X and 36X are not used.
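The zone and band rules, including the two exceptions, can be sketched as a small lookup. The function name utm_zone is a hypothetical helper written for these notes, and the exception boundaries used in the code (3°E to 12°E for 32V; 0°, 9°, 21°, 33° and 42°E for the Svalbard zones) are the commonly published limits, stated here as an assumption since the text above gives only the widths.

```python
def utm_zone(lon_deg, lat_deg):
    """Return a UTM grid zone designator such as '46Q' (illustrative sketch)."""
    if not -80.0 <= lat_deg <= 84.0:
        raise ValueError("outside UTM latitude range; UPS covers the poles")

    # 60 zones of 6 degrees each, numbered eastward from 180 W.
    zone = int((lon_deg + 180.0) // 6) % 60 + 1

    # 20 bands of 8 degrees from 80 S; I and O are skipped, X is 12 degrees high.
    bands = "CDEFGHJKLMNPQRSTUVWX"
    band = bands[min(int((lat_deg + 80.0) // 8), 19)]

    # Exception 1: southwest Norway, zone 32V widened to 9 degrees.
    if band == "V" and 3.0 <= lon_deg < 12.0:
        zone = 32
    # Exception 2: Svalbard, zones 32X, 34X and 36X are not used.
    elif band == "X" and 0.0 <= lon_deg < 42.0:
        if lon_deg < 9.0:
            zone = 31
        elif lon_deg < 21.0:
            zone = 33
        elif lon_deg < 33.0:
            zone = 35
        else:
            zone = 37
    return f"{zone}{band}"

print(utm_zone(90.4, 23.8))  # approximate coordinates of Dhaka
```

For the approximate coordinates of Dhaka used above the sketch returns zone 46 in band Q, on the northern side of the equator.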

Summary

Universal Transverse Mercator: conformal projection (shapes are preserved); cylindrical surface; two standard meridians; zones are 6 degrees of longitude wide. UTM is commonly used and is a good choice when the east-west width of the area does not exceed 6 degrees. The scale factor is 0.9996 along the central meridian of a zone. There is no scale distortion along the standard meridians. Scale distortion reaches unacceptable levels beyond the edges of the zones.

Scale and Scale Factor



1:100,000 is a ratio; on a map it is referred to as the map scale. The map scale is found by dividing a distance on the map by the corresponding distance in the real world. Suppose the distance from 100°E, 40°N to 100°E, 35°N measures 2.77 cm on a map, and the real-world distance is 55,493,612.86 cm.

Then the scale is 2.77 cm ÷ 55,493,612.86 cm ≈ 1:20,000,000. For this particular map 1:20,000,000 is the principal scale, i.e. the displayed scale. But map scale is not static; it is not the same everywhere on the map.

Scale depends on the projection used and how the projection distorts the world. At a new location, from 105°E, 30°N to 100°E, 30°N, the distance on the map measures 2.6 cm. The real distance on the earth is 48,239,311.01 cm. Then the scale is 2.6 cm ÷ 48,239,311.01 cm = 1:18,553,581.

Because this scale was taken from a location that is not along the standard parallels or meridians, it is referred to as the local scale. To find the scale factor, divide the local scale by the principal scale. Using the examples above, the scale factor is (1/18,553,581) ÷ (1/20,000,000) = 1.077959.

So the local scale is about 107.8% of the principal scale, i.e. exaggerated by about 7.8%.
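The arithmetic of this worked example can be reproduced directly from the measured distances. The sketch below uses the map and ground distances quoted in the text; the function and variable names are illustrative assumptions.

```python
def scale_denominator(map_cm, ground_cm):
    """Denominator N of a scale 1:N from a map distance and a ground distance."""
    return ground_cm / map_cm

# Principal (displayed) scale, rounded to 1:20,000,000 as in the text.
measured = scale_denominator(2.77, 55_493_612.86)
principal = 20_000_000

# Local scale measured along 30 N.
local = scale_denominator(2.6, 48_239_311.01)

# Scale factor = local scale / principal scale = (1/local) / (1/principal).
scale_factor = principal / local

print(round(measured), round(local), round(scale_factor, 6))
```

A scale factor above 1 means features at that location are drawn larger than the principal scale suggests; a value below 1 means they are drawn smaller.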

Map Transformation Method

Database Management

Database

A database is a collection of related, logically coherent data used by the application programs in an organization.

Database Management System (DBMS)

A database management system (DBMS) defines, creates and maintains a database.

The DBMS also allows controlled access to data in the database.

A DBMS is a combination of five components: hardware, software, data, users and

procedures.

Attribute Data management in GIS

Attribute data are stored in tables. A table is organized by rows and columns. Each row represents a spatial feature. Each column describes a characteristic/property. The intersection of a column and a row shows the value of a particular characteristic for a particular feature. A row is also called a record or tuple, and a column is also called a field, item or attribute.



Types of Attribute Table:

1. Feature Attribute Table:

A feature attribute table has access to the spatial data. Every vector data set must have a feature attribute table.

2. Non-Spatial Attribute Table:

A non-spatial attribute table does not have direct access to the geometry of features but has a field that can link the table to the feature attribute table whenever necessary.
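The link between the two kinds of table can be sketched with Python's built-in sqlite3 module. All table names, field names and values below are made up for illustration; the shared field union_code plays the role of the linking field.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Feature attribute table: one row per spatial feature (fid identifies the geometry).
CREATE TABLE feature_attr (fid INTEGER PRIMARY KEY, union_code TEXT, area_km2 REAL);
-- Non-spatial table: no geometry, only a field that can be linked.
CREATE TABLE census (union_code TEXT PRIMARY KEY, population INTEGER);
INSERT INTO feature_attr VALUES (1, 'U01', 12.5), (2, 'U02', 8.0);
INSERT INTO census VALUES ('U01', 25000), ('U02', 18000);
""")

rows = con.execute("""
    SELECT f.fid, f.union_code, f.area_km2, c.population
    FROM feature_attr AS f
    JOIN census AS c ON c.union_code = f.union_code
    ORDER BY f.fid
""").fetchall()
print(rows)
```

After the join, each feature carries its census value, so the population could be mapped even though the census table itself holds no geometry.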

Database Management

The presence of feature attribute table and non-spatial data tables means that a GIS

requires a database management system (DBMS) to manage these tables.

Database Model

1. Hierarchical

Stores data as records that are hierarchically related to each other. The records form a tree structure.

2. Network

In the network model, the entities are organized in a way that some entities

can be accessed through several paths.

3. Relational

A relational database is a collection of tables, also called relations, which

can be connected to each other by keys.

In a relational database, a table is a collection of data elements organized in terms of rows and columns. A table is also considered a convenient representation of a relation. But a table can have duplicate rows, while a true relation cannot.

A table is the simplest form of data storage. Below is an example of a Student table.

A Primary Key represents one or more attributes whose value can uniquely identify a

record in a table. Its counterpart in another table for the purpose of linkage is called

a Foreign Key. A key common to two tables can establish connections between

corresponding records in the tables.

Record:

A single entry in a table is called a record or row. A record in a table represents a set of related data. For example, the above Student table has 4 records. Following is an example of a single record.



Field:

A table consists of several records (rows), and each record can be broken into several smaller entities known as fields. The above Student table consists of four fields: ID, Name, Roll and Add. Each field name is the heading of a column.

Column:

In a relational table, a column is a set of values of a particular type. The term attribute is also used to represent a column. For example, in the Student table, Name is a column that represents the names of students.

Database Key

Keys are a very important part of a relational database. They are used to establish and identify relations between tables.

They also ensure that each record within a table can be uniquely identified by a combination of one or more fields within the table.

Types of Key:


1. Primary Key

A primary key is a candidate key that is most appropriate to be the main reference

key for the table. As its name suggests, it is the primary key of reference for the

table and is used throughout the database to help establish relationships with other

tables. As with any candidate key, the primary key must contain unique values, must never be null, and must uniquely identify each record in the table.

2. Composite Key

When a primary key is created from a combination of 2 or more columns, the

primary key is called a composite key. Each column may not be unique by itself

within the database table but when combined with the other column(s) in the

composite key, the combination is unique.



3. Foreign Key

A foreign key is generally a primary key from one table that appears as a field in

another where the first table has a relationship to the second. In other words, if we

had a table A with a primary key X that linked to a table B where X was a field in B,

then X would be a foreign key in B.
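The three key types can be tried out with Python's built-in sqlite3 module. The tables, columns and values below are hypothetical. Note that SQLite enforces foreign keys only after the PRAGMA shown in the code has been issued.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
con.executescript("""
CREATE TABLE student (
    id   INTEGER PRIMARY KEY,                       -- primary key
    name TEXT NOT NULL
);
CREATE TABLE enrolment (
    student_id  INTEGER NOT NULL REFERENCES student(id),  -- foreign key
    course_code TEXT NOT NULL,
    PRIMARY KEY (student_id, course_code)           -- composite key
);
INSERT INTO student VALUES (1, 'A'), (2, 'B');
INSERT INTO enrolment VALUES (1, 'GIS101'), (1, 'DBM201'), (2, 'GIS101');
""")

def try_insert(sql):
    """Run one INSERT and report whether the key constraints accepted it."""
    try:
        con.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        con.rollback()
        return "rejected"

results = {
    "duplicate primary key": try_insert("INSERT INTO student VALUES (1, 'C')"),
    "duplicate composite key": try_insert("INSERT INTO enrolment VALUES (1, 'GIS101')"),
    "unknown foreign key": try_insert("INSERT INTO enrolment VALUES (99, 'GIS101')"),
    "valid new row": try_insert("INSERT INTO enrolment VALUES (2, 'DBM201')"),
}
print(results)
```

In the enrolment table neither column is unique on its own (student 1 and course GIS101 both appear twice), but each combination occurs only once, which is the defining property of a composite key.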

Relation Types in Relational Database

One to one

One to Many

Many to One

Many to Many

Object-oriented models define a database as a collection of objects with features and

methods. A detailed discussion of object-oriented databases follows in an advanced

module.

Relational Database Design

The design of any database is a lengthy and involved task that can only be done by following a step-by-step procedure.

The first step normally involves interviewing potential users of the database [Needs Assessment].

The second step is to build an entity-relationship model (ERM) that defines the entities, the attributes of those entities and the relationships between those entities.

The E-R (entity-relationship) data model views the real world as a set of basic objects (entities) and relationships among these objects.

Entity-Relationship Model (ERM)

The database designer creates an entity-relationship (E-R) diagram to show the entities

for which information needs to be stored and the relationship between those

entities. E-R diagrams use several geometric shapes:

Rectangles - represent entity sets

Ellipses - represent attributes

Diamonds - represent relationship sets

Lines - link attributes to entity sets and link entity sets to

relationship sets.



Database Normalization

Normalization means breaking data into its related groups and defining the

relationships between those groups. Database normalization is a technique for organizing

the data in the database. It is a systematic approach of decomposing

tables to eliminate data redundancy and undesirable characteristics such as insertion,

update, and deletion anomalies. It is a multi-step process that puts data into tabular form

by removing duplicated data from the relation tables.

Normalization is the process of efficiently organizing data in a database. There are several

goals of the normalization process:

To avoid redundant data in tables that waste space in the database and may

cause data integrity problems

To ensure that attribute data in separate tables can be maintained and

updated separately and can be linked whenever necessary

To facilitate a distributed database

They are worthy goals as they reduce the amount of space a database

consumes and ensure that data is logically stored.

There are different steps or forms of normalization, normally denoted as 1NF (First

Normal Form), 2NF (Second Normal Form), and 3NF (Third Normal Form).

Normalization can also be thought of as a trade-off between data redundancy

and performance.

Normalizing a relation reduces data redundancy but introduces the need for

joins when all of the data is required by an application such as a report query.
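The trade-off can be illustrated with a minimal sketch using Python's sqlite3 module. The flat rows and the table names (district, shelter) are hypothetical. The district details are stored once after normalization, and a join is needed to rebuild the original report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized data: the district details are repeated for every shelter.
flat_rows = [
    ("Shelter A", "District X", "Division P"),
    ("Shelter B", "District X", "Division P"),
    ("Shelter C", "District Y", "Division Q"),
]

# Normalized design: district details live in their own table.
conn.execute("""
    CREATE TABLE district (
        district_id INTEGER PRIMARY KEY,
        name        TEXT UNIQUE,
        division    TEXT
    )""")
conn.execute("""
    CREATE TABLE shelter (
        shelter_id  INTEGER PRIMARY KEY,
        name        TEXT,
        district_id INTEGER REFERENCES district(district_id)
    )""")

for shelter_name, district_name, division in flat_rows:
    # Each district is inserted only once; repeats are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO district (name, division) VALUES (?, ?)",
        (district_name, division))
    (district_id,) = conn.execute(
        "SELECT district_id FROM district WHERE name = ?",
        (district_name,)).fetchone()
    conn.execute(
        "INSERT INTO shelter (name, district_id) VALUES (?, ?)",
        (shelter_name, district_id))

district_count = conn.execute("SELECT COUNT(*) FROM district").fetchone()[0]

# A report that needs all of the data must join the tables again.
report = conn.execute("""
    SELECT s.name, d.name, d.division
    FROM shelter AS s
    JOIN district AS d ON s.district_id = d.district_id
    ORDER BY s.shelter_id
""").fetchall()
```

Here the division of District X is stored once rather than twice, so an update touches a single row, at the price of a join in the report query.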



Advantages

The relational database is simple and flexible

Each table can be prepared, maintained and edited separately

The tables can remain separate until a query or analysis requires attribute

data from different tables, at which point a temporary link is generally established

Because the links are temporary, data management and data processing are easier

Object Base

Object-oriented models define a database as a collection of objects with features and

methods. A detailed discussion of object-oriented databases follows in an advanced

module.

Operation on Database

In a relational database we can define several operations to create new relations

based on existing ones.

Some operations are

Insert

Delete

Update

Select

Structured Query Language

Structured Query Language (SQL) is the language standardized by the American

National Standards Institute (ANSI) and the International Organization for Standardization

(ISO) for use on relational databases. It is a declarative rather than procedural language,

which means that users declare what they want without having to write a step-by-step

procedure. SQL was originally developed at IBM in the 1970s; the first commercial implementation was released in 1979

by Relational Software, Inc. (now Oracle Corporation), with various versions of SQL being released since then.

Insert

The operation inserts a new row into the relation. The insert operation uses the following

format.



“INSERT INTO [RELATION_NAME] VALUES ([VALUE1], [VALUE2], …)”

Delete operation

The operation deletes row(s) based on criteria from the relation. The delete operation uses

the following format:

“DELETE FROM [RELATION_NAME] WHERE [CRITERIA]”

Update Operation

The operation changes the values of some field(s) (attribute) of row(s) in a relation. The

update operation uses the following format

“UPDATE [RELATION_NAME] SET [FIELD1] = [VALUE1], [FIELD2] = [VALUE2], … WHERE [CRITERIA]”



Select Operation

The operation retrieves the rows of a relation that satisfy the given criteria. The tuples (rows)

in the resulting relation are a subset of the tuples in the original relation.

“SELECT [F1], [F2] FROM [RELATION_NAME] WHERE [CRITERIA]”

Select Distinct Operation

The SELECT DISTINCT statement is used to return only distinct (different) values.

“SELECT DISTINCT [F1], [F2] FROM [RELATION_NAME]”

“SELECT DISTINCT [FIELD_NAME] FROM [RELATION_NAME]”
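The insert, update, delete, select and select distinct formats above can be run as a short sketch with Python's sqlite3 module. The relation land and its fields (plot_id, land_type, elevation_m) are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE land (
        plot_id     INTEGER PRIMARY KEY,
        land_type   TEXT,
        elevation_m REAL
    )""")

# INSERT: add new rows to the relation.
conn.executemany("INSERT INTO land VALUES (?, ?, ?)", [
    (1, "Khas", 60.0),
    (2, "Private", 20.0),
    (3, "Khas", 10.0),
    (4, "Private", 75.0),
])

# UPDATE: change several fields of the rows that match the criteria.
conn.execute(
    "UPDATE land SET elevation_m = 55.0, land_type = 'Classified' "
    "WHERE plot_id = 4")

# DELETE: remove the rows that match the criteria.
conn.execute("DELETE FROM land WHERE elevation_m < 15")

# SELECT: the result is a subset of the rows in the relation.
selected = conn.execute(
    "SELECT plot_id, land_type FROM land "
    "WHERE elevation_m >= 50 ORDER BY plot_id").fetchall()

# SELECT DISTINCT: each land type is returned only once.
distinct_types = conn.execute(
    "SELECT DISTINCT land_type FROM land ORDER BY land_type").fetchall()
```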

Join Operation

The join operation is a binary

operation that combines two relations

on common attributes.
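A minimal sketch of a join, again using Python's sqlite3 module, is given here. The relations plot and elevation and the common attribute plot_id are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plot (plot_id INTEGER PRIMARY KEY, land_type TEXT)")
conn.execute("CREATE TABLE elevation (plot_id INTEGER, elevation_m REAL)")

conn.executemany("INSERT INTO plot VALUES (?, ?)",
                 [(1, "Khas"), (2, "Private"), (3, "Khas")])
conn.executemany("INSERT INTO elevation VALUES (?, ?)",
                 [(1, 60.0), (2, 20.0)])

# The two relations are combined on the common attribute plot_id.
# An inner join keeps only the rows that have a match in both relations,
# so plot 3 (no elevation record) does not appear in the result.
joined = conn.execute("""
    SELECT p.plot_id, p.land_type, e.elevation_m
    FROM plot AS p
    JOIN elevation AS e ON p.plot_id = e.plot_id
    ORDER BY p.plot_id
""").fetchall()
```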

Union Operation

The SQL UNION operator combines the result of two or more SELECT statements. Each

SELECT statement within the UNION must have the same number of columns. The columns

must also have similar data types. The columns in each SELECT statement must be in the

same order. The UNION operator selects only distinct values by default. To allow

duplicate values, use the ALL keyword with UNION.

“SELECT [FIELD_NAMES] FROM [RELATION1]

UNION

SELECT [FIELD_NAMES] FROM [RELATION2]”
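The difference between UNION and UNION ALL can be seen in a short sketch with Python's sqlite3 module. The relations survey_2019 and survey_2020 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey_2019 (district TEXT)")
conn.execute("CREATE TABLE survey_2020 (district TEXT)")
conn.executemany("INSERT INTO survey_2019 VALUES (?)", [("X",), ("Y",)])
conn.executemany("INSERT INTO survey_2020 VALUES (?)", [("Y",), ("Z",)])

# UNION returns only distinct values: district Y appears once.
union_rows = conn.execute("""
    SELECT district FROM survey_2019
    UNION
    SELECT district FROM survey_2020
    ORDER BY district
""").fetchall()

# UNION ALL keeps duplicates: district Y appears twice.
union_all_rows = conn.execute("""
    SELECT district FROM survey_2019
    UNION ALL
    SELECT district FROM survey_2020
    ORDER BY district
""").fetchall()
```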



Suitable Land for Cyclone Shelter Construction

Mapping units of land type: Khas land, Private land, Classified Land

Mapping units of Elevation: 0 – 100m

Set A: Khas land

Set B: Elevation >= 50m

X = A AND B finds all occurrences of Khas land with elevation >= 50 m

X = A OR B finds all occurrences of Khas land and all occurrences of elevation >= 50 m

X = A NOT B finds all occurrences that are Khas land where the elevation is less than 50 m

X = A XOR B finds all occurrences that are either Khas land or have elevation >= 50 m, but not both
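The four Boolean operations map directly onto Python set operators. In this sketch the plot records and their values are made up for illustration; set A holds the Khas land plots and set B holds the plots with elevation >= 50 m.

```python
# Hypothetical plots with a land type and an elevation in metres.
plots = [
    {"id": 1, "land_type": "Khas", "elevation_m": 60},
    {"id": 2, "land_type": "Khas", "elevation_m": 20},
    {"id": 3, "land_type": "Private", "elevation_m": 80},
    {"id": 4, "land_type": "Classified", "elevation_m": 10},
]

# Set A: Khas land. Set B: elevation >= 50 m.
A = {p["id"] for p in plots if p["land_type"] == "Khas"}
B = {p["id"] for p in plots if p["elevation_m"] >= 50}

a_and_b = A & B   # Khas land with elevation >= 50 m
a_or_b = A | B    # Khas land, or elevation >= 50 m, or both
a_not_b = A - B   # Khas land with elevation below 50 m
a_xor_b = A ^ B   # exactly one of the two conditions holds
```

Plot 4 satisfies neither condition, so it appears in none of the four results.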

Raster Algebra

Raster algebra is an algebraic framework for performing operations on data stored in a geographical

information system (GIS). It allows the user to model different problems and to obtain

new information from the existing data set through mathematical combinations of layers.

What you have just seen is the basis for the map algebra language in ArcGIS Grid and

Spatial Analyst, whose functions fall into four groups:

Local functions

Focal functions

Zonal functions

Global functions



a. Local

Sometimes called layer functions. They work on every single cell in a raster layer. Cells

are processed without reference to surrounding cells. Operations can be

arithmetic, trigonometric, exponential, logical or logarithmic functions. When the new layer

is a function of two or more input layers, the output value for each cell is a function of

the values of the corresponding cells in the input layers.

• Arithmetic operations: +, -, *, /, Abs, …

• Relational operators: >, <, …

• Statistic operations: Min, Max, Mean, Majority, …

• Trigonometric operations: Sine, Cosine, Tan, Arcsine, Arccosine, …

• Exponential and logarithmic operations: Sqr, sqrt, exp, exp2, …
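A local function can be sketched in plain Python with rasters held as lists of rows. The helper local_op and the two small layers are hypothetical; the point is that each output cell depends only on the corresponding cells of the input layers.

```python
def local_op(layer_a, layer_b, func):
    """Apply func cell by cell to two rasters of the same shape."""
    return [
        [func(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Two hypothetical input layers on the same 2 x 2 grid.
layer_1 = [[10, 20], [30, 40]]
layer_2 = [[1, 2], [3, 4]]

# Arithmetic operation: cell-by-cell sum.
total = local_op(layer_1, layer_2, lambda a, b: a + b)

# Statistic operation: cell-by-cell maximum.
larger = local_op(layer_1, layer_2, max)

# Relational and logical operation: 1 where both conditions hold, else 0.
both = local_op(layer_1, layer_2, lambda a, b: int(a > 25 and b > 2))
```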

b. Focal

Compute an output value for each cell as a

function of the cells that are within its

neighborhood.

Widely used in image processing with different

names

– Convolution, filtering, kernel or moving

window



The simplest and most common neighborhood is a 3 by 3 rectangular window.

Available neighborhood shapes include a rectangle, a circle, an annulus (a donut) or a wedge.
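A focal mean over a 3 by 3 moving window can be sketched as follows. The function name focal_mean is hypothetical, and the handling of edge cells (using only the neighbors that exist) is an assumption; software packages differ on this point.

```python
def focal_mean(grid, radius=1):
    """Mean of the square window around each cell (radius=1 gives 3 x 3).

    Edge cells use only the neighbors that fall inside the raster.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = sum(window) / len(window)
    return out


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
smoothed = focal_mean(grid)
```

The center cell is averaged over all nine cells, while the corner cell at row 0, column 0 is averaged over the four cells that exist in its window.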

c. Zonal

Compute a new value for each cell as a

function of the cell values within a zone

containing the cell

Zone layer

o defines zones

Value layer

o contains input cell values

Zonal Statistical Operation

• Calculate statistics for each cell by using all the cell values within a zone

o Zonal Mean, Zonal Median, Zonal Sum, Zonal Minimum, Zonal Maximum,

Zonal Range, Zonal Majority, Zonal Variety

Outputs of Zonal Operations

• Raster layer

– All the cells within a zone have the same value on the output raster layer

• Table

– Each row in the table contains the statistics for a zone.

– The first column is the value (or ID) of each zone.

– The table can be joined back to the zone layer.
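Both outputs of a zonal operation can be produced by a short sketch. The function zonal_mean and the sample zone and value layers are hypothetical.

```python
def zonal_mean(zone_layer, value_layer):
    """Return (table, raster) holding the mean of the values in each zone."""
    totals, counts = {}, {}
    for zone_row, value_row in zip(zone_layer, value_layer):
        for zone, value in zip(zone_row, value_row):
            totals[zone] = totals.get(zone, 0) + value
            counts[zone] = counts.get(zone, 0) + 1

    # Table output: one entry per zone, keyed by the zone value (ID).
    table = {zone: totals[zone] / counts[zone] for zone in totals}

    # Raster output: every cell in a zone receives that zone's statistic.
    raster = [[table[zone] for zone in zone_row] for zone_row in zone_layer]
    return table, raster


zones = [
    [1, 1, 2],
    [1, 2, 2],
]
values = [
    [10, 20, 30],
    [30, 50, 70],
]
table, raster = zonal_mean(zones, values)
```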



d. Global

Operations that compute an output raster where the value of each output cell is

a function of all the cells in the input raster

Global statistical operations

Distance operations.

o Euclidean distance

o Cost distance

1. Distance operations

Characterize the relationships between each cell and source cells (usually

representing features)

o Distance to nearest source cell

Euclidean Distance

Calculates the shortest straight-line distance from each

cell to its nearest source cell (EucDistance)

Assigns each cell the value of its nearest source cell

(EucAllocation)

Calculates the direction from each cell to its nearest source cell

(EucDirection)
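The distance and allocation outputs can be illustrated with a brute-force sketch in plain Python. This is not how EucDistance or EucAllocation are implemented; the function name euclidean and the sample source raster are hypothetical, and None stands in for NODATA.

```python
import math


def euclidean(source, cell_size=1.0):
    """Return (distance, allocation) rasters for a source raster.

    Cells holding None are non-source cells; any other value marks a source.
    """
    rows, cols = len(source), len(source[0])
    sources = [
        (r, c, source[r][c])
        for r in range(rows)
        for c in range(cols)
        if source[r][c] is not None
    ]
    distance = [[0.0] * cols for _ in range(rows)]
    allocation = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Nearest source by straight-line distance between cell centers.
            sr, sc, value = min(
                sources, key=lambda s: math.hypot(s[0] - r, s[1] - c))
            distance[r][c] = math.hypot(sr - r, sc - c) * cell_size
            allocation[r][c] = value
    return distance, allocation


source = [
    [7, None, None],
    [None, None, None],
    [None, None, 9],
]
distance, allocation = euclidean(source)
```

Checking every source for every cell is slow on large rasters, but it makes the definition explicit.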

Cost Distance

Compute the least accumulative cost from each

cell to its least-cost source cell

Source raster

o Representing features (points, lines, and

polygons)

o Non-source cells are set to the NODATA value

Friction raster

o Cost encountered while moving in a cell (distance, time, dollars and efforts)

o Unit is: cost per unit distance

o Can have barriers (NODATA cells)
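A least accumulative cost surface can be sketched with Dijkstra's algorithm. The assumptions here are not from the notes: movement is allowed only between the four edge-sharing neighbors, the cost of a move is the average friction of the two cells multiplied by the cell size, and None marks a barrier (NODATA). The function name cost_distance is hypothetical.

```python
import heapq


def cost_distance(sources, friction, cell_size=1.0):
    """Least accumulated cost from every cell to its least-cost source.

    sources  : list of (row, col) source cells
    friction : raster of cost per unit distance; None marks a barrier
    """
    rows, cols = len(friction), len(friction[0])
    acc = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        acc[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))

    while heap:
        cost, r, c = heapq.heappop(heap)
        if cost > acc[r][c]:
            continue  # a cheaper route to this cell was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if friction[nr][nc] is None:
                continue  # barriers cannot be entered
            step = cell_size * (friction[r][c] + friction[nr][nc]) / 2
            if cost + step < acc[nr][nc]:
                acc[nr][nc] = cost + step
                heapq.heappush(heap, (cost + step, nr, nc))
    return acc


friction = [
    [1, 1, 1],
    [1, None, 5],
    [1, 1, 1],
]
accumulated = cost_distance([(0, 0)], friction)
```

In this example the cheapest route to the bottom-right cell goes around the barrier through the low-friction cells rather than through the cell with friction 5.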

Resampling

Cell Size

Different raster datasets may not have the same cell resolution. But during processing

between multiple datasets, the cell resolution ideally should be the same. When multiple

raster datasets are input into any analysis and their resolutions are different, one or more

of the input datasets will be automatically resampled to the coarsest resolution of the

input datasets.

Resampling

To find the value of each cell in the resampled output raster, the center of each

cell in the output must be mapped to the original input coordinate system. Each cell

center coordinate is transformed backward to identify the location of the point on the

original input raster.

Once the input location is identified, a value can be assigned to the output location

based on the nearby cells in the input. It is rare that an output cell center will align exactly

with any cell center of the input raster. Therefore, techniques have been developed

to determine the output value depending on where the point falls relative to the

center of cells of the input raster and the values associated with these cells.

The three techniques for determining output values are:

• Nearest neighbor assignment,

• Bilinear interpolation,

• Cubic convolution.

Each of these techniques assigns values to the output differently. Thus the values assigned

to the cells of an output raster may differ according to the technique used.

Nearest Neighbor

The nearest neighbor technique assigns the value of the cell

whose center is closest to the center of the output cell. It is

the resampling technique of choice for discrete, or

categorical, raster data, such as land-use rasters, because

it does not change the value of the input cells.

In the image below, the output raster is resampled from a

rotated input raster. The cell centers of the input raster are in

gray. The value for one of the cells on the output raster (in

red) is derived by identifying the nearest cell center on the

input raster (the blue spot) and assigning its value to the

output cell center.
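Nearest neighbor assignment can be sketched for the simple case of an unrotated raster resampled to a different cell size. The function resample_nearest, the shared top-left origin, and the sample land-use codes are assumptions made for illustration.

```python
def resample_nearest(grid, in_cell, out_cell):
    """Resample a raster to a new cell size by nearest neighbor assignment.

    Both rasters are assumed to share the same top-left origin and to be
    unrotated, so the nearest input center is the input cell that contains
    the output cell center.
    """
    rows, cols = len(grid), len(grid[0])
    out_rows = int(rows * in_cell / out_cell)
    out_cols = int(cols * in_cell / out_cell)
    out = []
    for r in range(out_rows):
        # Map the output cell center back to an input row index.
        y = (r + 0.5) * out_cell
        in_r = min(int(y / in_cell), rows - 1)
        row = []
        for c in range(out_cols):
            x = (c + 0.5) * out_cell
            in_c = min(int(x / in_cell), cols - 1)
            row.append(grid[in_r][in_c])
        out.append(row)
    return out


# Hypothetical land-use codes on a 4 x 4 raster with 10 m cells.
land_use = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
fine = resample_nearest(land_use, in_cell=10, out_cell=5)
```

Every value in the output is one of the original codes, which is why this technique suits categorical data.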

Bilinear Interpolation

Bilinear interpolation uses the value of the four nearest input cell

centers to determine the value on the output raster. The new value

for the output cell is a weighted average of these four values,

adjusted to account for their distance from the center of the output

cell. Since the values for the output cells are weighted based on

distance and then averaged, bilinear interpolation is best used

for continuous data, where the distance from a known point or phenomenon

determines the value assigned to the cell. Examples are elevation, slope, and the magnitude of

an earthquake with distance from the epicenter.

In the image below, the output raster is resampled from a rotated input raster. The cell

centers of the input raster are in gray. The value for one of the cells on the output raster

(in red) is derived by identifying the four nearest cell centers on the input raster (the four blue spots) and assigning the weighted average of the four values to the output cell.
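The distance-weighted average of the four nearest cell centers can be written as a short sketch. The function bilinear and the coordinate convention (the value grid[r][c] sits at x = c, y = r, in units of input cells) are assumptions made for illustration.

```python
import math


def bilinear(grid, x, y):
    """Bilinear interpolation at (x, y), measured in input cell units.

    The value grid[r][c] is located at the point x = c, y = r.
    """
    rows, cols = len(grid), len(grid[0])
    # Upper-left of the four surrounding cell centers, kept inside the grid.
    c0 = min(max(int(math.floor(x)), 0), cols - 2)
    r0 = min(max(int(math.floor(y)), 0), rows - 2)
    tx, ty = x - c0, y - r0

    # Interpolate along x on the two rows, then along y between them.
    top = grid[r0][c0] * (1 - tx) + grid[r0][c0 + 1] * tx
    bottom = grid[r0 + 1][c0] * (1 - tx) + grid[r0 + 1][c0 + 1] * tx
    return top * (1 - ty) + bottom * ty


# Hypothetical elevation values at four cell centers.
elevation = [
    [10.0, 20.0],
    [30.0, 40.0],
]
center_value = bilinear(elevation, 0.5, 0.5)
edge_value = bilinear(elevation, 0.25, 0.0)
corner_value = bilinear(elevation, 1.0, 1.0)
```

Unlike nearest neighbor assignment, the result can be a value that does not occur in the input, such as the center value here.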

Cubic Convolution

Cubic convolution is a resampling technique similar to bilinear

interpolation except that the weighted average is calculated

from the values of the 16 nearest input cell centers. Compared

with bilinear interpolation, cubic convolution has a tendency to

sharpen the edges of the input data since more cells are involved

in the calculation of the output value.

In the image below, the output raster is resampled from a rotated

input raster. The cell centers of the input raster are in gray. The

value for one of the cells on the output raster (in red) is derived by identifying the sixteen

nearest cell centers on the input raster (the sixteen blue spots) and assigning the weighted

average of the sixteen values to the output cell.

Interpolation

Interpolation is the procedure of estimating the value of properties at unsampled points

or areas using a limited number of sampled observations.

1. Point wise interpolation

o Thiessen polygon

o Weighted average

2. Interpolation by curve fitting

2.1 Exact interpolation

o Nearest neighbor

o Linear interpolation

o Cubic interpolation

2.2 Approximate interpolation

o Moving average

o B-spline

o Curve fitting by least square method

3. Interpolation by surface fitting

3.1 Regular grid

o Bilinear interpolation

o Cubic interpolation

3.2 Random points

o TIN

Point wise interpolation

Point wise interpolation is used when the sampled points are not densely

located and have limited influence on, or continuity with, surrounding observations, for

example climate observations such as rainfall and temperature, or ground

water level measurements at wells.

a. Thiessen Polygon

Thiessen polygons can be generated using a distance operator, which

creates the polygon boundaries as the intersections of radial expansions from the

observation points. This method is also known as Voronoi tessellation.

Each Thiessen polygon contains only a single point input feature. Any location

within a Thiessen polygon is closer to its associated point than to any

other point input feature.

b. Weighted Average

A window of circular shape with the radius of dmax is drawn at a point

to be interpolated, so as to involve six to eight surrounding observed points.

Then the value of a point is calculated from the summation of the

product of the observed value zi and weight wi, divided by the summation

of the weights. The weight functions commonly used are functions of

distance, with the following properties:

Weight Function    Properties

0 order            Average without consideration of distance

1st order          Nearest points have a little influence

2nd order          Nearest points have moderate influence

3rd order          Nearest points have very strong influence
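The weighted average, the sum of wi times zi divided by the sum of wi, can be sketched as follows. The weight function used here, wi = 1 / di^n for order n, is an assumption, since the notes do not give the formula; it is a common choice that reproduces the properties listed above. The function name and the sample observations are hypothetical.

```python
import math


def weighted_average(x, y, observations, d_max, order=2):
    """Weighted average of the observations within d_max of (x, y).

    observations : list of (x, y, z) tuples
    order        : n in the assumed weight function w = 1 / d**n
    """
    numerator = denominator = 0.0
    for ox, oy, z in observations:
        d = math.hypot(ox - x, oy - y)
        if d > d_max:
            continue  # outside the circular window
        if d == 0 and order > 0:
            return z  # the point coincides with an observation
        w = 1.0 / d ** order
        numerator += w * z
        denominator += w
    if denominator == 0:
        return None  # no observation falls inside the window
    return numerator / denominator


# Hypothetical observations (x, y, z); the third lies outside the window.
observations = [(0, 0, 10.0), (3, 0, 40.0), (100, 100, 999.0)]

order_0 = weighted_average(1, 0, observations, d_max=5, order=0)
order_1 = weighted_average(1, 0, observations, d_max=5, order=1)
order_2 = weighted_average(1, 0, observations, d_max=5, order=2)
```

As the order rises, the estimate moves from the plain average (25) toward the value of the nearest observation (10).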



Curve Fitting

The principle of curve fitting is to interpolate the value at an unsampled point using

surrounding sampled points.

Curve fitting is an important type of interpolation in many applications of GIS.

Curve fitting is divided into two categories:

Exact interpolation: a fitted curve passes through all given points

Approximate interpolation: a fitted curve does not always pass through all given points

Exact Interpolation

1. Nearest Neighbor

The same value as that of the observation is given within the proximal

distance

2. Linear

A piecewise linear function is applied between two adjacent points

3. Cubic interpolation

A third order polynomial is applied between two adjacent points under

the condition that the first and second order differentials should be

continuous. Such a curve is called a "spline".
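The first two exact methods can be sketched in one dimension. The function names and the sample points are hypothetical; both fitted curves pass through every given point.

```python
def nearest_interpolate(points, x):
    """Return the value of the observation closest to x."""
    return min(points, key=lambda p: abs(p[0] - x))[1]


def linear_interpolate(points, x):
    """Piecewise linear interpolation; points are (x, z) sorted by x."""
    for (x0, z0), (x1, z1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return z0 + t * (z1 - z0)
    raise ValueError("x is outside the sampled range")


# Hypothetical observations along a profile: (position, value).
points = [(0, 0.0), (2, 10.0), (4, 6.0)]

nearest_value = nearest_interpolate(points, 0.9)
linear_value = linear_interpolate(points, 3)
at_sample = linear_interpolate(points, 2)
```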

(Figure panels: Nearest, Cubic, Linear)



Approximate Interpolation

a. Moving Average

A window with a range of -d to +d is set to average the observations

within the region
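A moving average in one dimension can be sketched as follows. The function name and the sample points are hypothetical.

```python
def moving_average(points, x, d):
    """Average of the observations whose position lies within x - d to x + d."""
    window = [z for px, z in points if x - d <= px <= x + d]
    if not window:
        return None  # no observation inside the window
    return sum(window) / len(window)


# Hypothetical observations along a profile: (position, value).
points = [(0, 1.0), (1, 3.0), (2, 8.0), (3, 4.0)]

estimate = moving_average(points, x=1, d=1)
```

The estimate at position 1 is 4.0 while the observation there is 3.0, which shows why this is an approximate rather than an exact interpolation.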

b. B-Spline

A cubic curve is determined by using four adjacent observations

c. Least Square Method

The least square method (sometimes called a regression model) is a statistical approach to estimate an

expected value or function with the highest probability from observations with

random errors. In the least square method, finding the highest probability is

replaced by minimizing the sum of the squares of the residuals.

A residual is defined as the difference between an observation and the

estimated value of a function.
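For the simplest case, a straight line z = slope * x + intercept, the values that minimize the sum of squared residuals have a closed form. The sketch below assumes that linear model; the function name fit_line and the sample observations are hypothetical.

```python
def fit_line(points):
    """Least square fit of z = slope * x + intercept to (x, z) observations."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in points)
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return slope, intercept


# Hypothetical observations with small random errors around z = 2x + 1.
points = [(0, 1.1), (1, 2.9), (2, 5.1), (3, 6.9)]

slope, intercept = fit_line(points)

# Residual: observation minus the estimated value of the function.
residuals = [z - (slope * x + intercept) for x, z in points]
sum_of_squares = sum(e ** 2 for e in residuals)
```

The fitted line does not pass through any of the four observations exactly, and for a fit with an intercept the residuals sum to zero up to rounding.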