ENGRG 59910 Introduction to GIS
Michael Piasecki
October 06, 2017
Lecture 05: GIS and Database Basics
• Basic geographic concepts
  • Introduction to GIS, coordinate system, projection, datum
• Data: Acquisition, Input and Management
  • Data model: vector vs. raster
  • Data source: map, attribute data (geocoding), GPS, remote sensing
  • Data input: digitizing
  • Data quality and metadata
  • Data management: database
• Analysis
• Output: map design
Where are we now?
October 4, 2017 ENGRG 59910 Intro to GIS 2
Context of what we are learning
Satellite Images
Aerial Photographs
Maps
Digitizing
GPS (later)
Non‐spatial data (Attribute Data)
Spatial Database
Geo‐Reference
Join, Relate, Geocoding
• Introduction to Database Management System (DBMS)
  • Recognize four database types
• Relational Database Basics
  • Basic understanding of database theory
• GeoDatabase Overview
Today’s Outline
Evolution of GIS Environments
• Flat Files
  • Flat files are easier to understand
  • Difficult to manage and manipulate
  • Large file size
• Database
  • Data is organized or structured using a database model
  • Reduces data redundancy
  • Improves data integrity
  • Can be “queried”: many databases use the same language, SQL (Structured Query Language), for formulating queries
Flat File vs. Database
DATABASE = Data file(s) + data organization + processing ready
•Goal for any DBMS: efficient searching and linking of tabular data.
•GIS DBMS Goal: efficient manipulation (including search and linking) of spatial objects (points, lines, polygons, polylines), relationships between objects and tabular data (i.e., topology, attributes).
Database Management Systems
Data Software Hardware
Database Management System (DBMS)
•Field: One item of information per object (column)
Forest      Trail            Feature
Nantahala   Bryson’s Knob    Vista
Cherokee    Slickrock Falls  Ogrth
Pisgah      Chimney Rock     Wlife
Field vs. Column; Record vs. Row
•Record: Information items about one object (row)
Typically you operate on a field (column) and select records (rows). The corresponding map object is highlighted when a row is selected.
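A minimal sketch of "operate on a field, select records", using Python's built-in sqlite3 module. The table name `trails` and its column names are assumptions chosen for illustration; the three rows are the ones in the slide's table.

```python
import sqlite3

# In-memory database holding the trail table from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trails (forest TEXT, trail TEXT, feature TEXT)")
conn.executemany(
    "INSERT INTO trails VALUES (?, ?, ?)",
    [
        ("Nantahala", "Bryson's Knob", "Vista"),
        ("Cherokee", "Slickrock Falls", "Ogrth"),
        ("Pisgah", "Chimney Rock", "Wlife"),
    ],
)

# The condition is written against a field (column) ...
# ... and the result is the set of records (rows) that satisfy it.
selected = conn.execute(
    "SELECT forest, trail FROM trails WHERE feature = ?", ("Vista",)
).fetchall()
print(selected)
```

In a GIS, the rows returned by such a selection are the ones whose map features would be highlighted.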
• There are four basic database structures:
  • Traditional
    • Hierarchical
    • Network
    • Relational
  • Recent development
    • Object Oriented (O‐O)
• The relational database is the most widely used.
Ways of Organizing Information
Evolution of DBMS Technology
File System
Hierarchical DBMS
Network DBMS
Relational DBMS
Object-Oriented System (OODBMS)
Object-Relational (ORDBMS)
Example of Data Organization
The Basics of Relational Databases
Database includes multiple tables
Tables are joined by relationships
•Relational model is grounded in mathematics: relational algebra defines the mathematical rules by which tables are manipulated.
•Any kind of attribute search (lateral, vertical) is possible.
Examples of relational database programs: Microsoft Access, Microsoft® SQL Server™, Oracle, DB2, FoxPro, MySQL, PostgreSQL
Relational Database
Eliminate duplicate information
Assist in querying data
Simpler to manipulate data
Reduce disk space
Relational model has been most successful within GIS (and within the database world in general)
Why Use a Relational Database?
• Keys: used to create uniqueness and link tables together
  • Primary Key: uniqueness, eliminates redundancy
  • Foreign Key: links tables, establishes relationships between tables
Relational Database Terminology: Key fields
Primary Key
• Primary keys uniquely identify each record in a table.
• The primary key of one table becomes the foreign key in a related table.
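The two key types can be sketched with Python's built-in sqlite3 module. The table and column names (`forests`, `trails`, `forest_id`) are assumptions for illustration, loosely based on the forest and trail tables used later in this lecture. The primary key `forests.forest_id` reappears as a foreign key in `trails`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute(
    "CREATE TABLE forests (forest_id INTEGER PRIMARY KEY, forest_name TEXT)"
)
conn.execute(
    "CREATE TABLE trails ("
    " trail_name TEXT PRIMARY KEY,"
    " forest_id INTEGER REFERENCES forests(forest_id))"  # foreign key
)
conn.executemany(
    "INSERT INTO forests VALUES (?, ?)", [(1, "Nantahala"), (2, "Cherokee")]
)
conn.executemany(
    "INSERT INTO trails VALUES (?, ?)",
    [("Bryson's Knob", 1), ("Slickrock Falls", 2), ("North Fork", 1)],
)

# A primary key must be unique: a second forest with forest_id 1 is rejected.
try:
    conn.execute("INSERT INTO forests VALUES (1, 'Pisgah')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The foreign key links each trail to exactly one forest.
rows = conn.execute(
    "SELECT t.trail_name, f.forest_name"
    " FROM trails t JOIN forests f ON t.forest_id = f.forest_id"
    " ORDER BY t.trail_name"
).fetchall()
print(duplicate_rejected, rows)
```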
Foreign Key
Relational Database Terminology: Cardinalities
One‐to‐Many (1:M): ENGRG 59910 (ENG Courses), Students
Many students attend this class
One‐to‐One (1:1): Students, CCNY ID
Each CCNY student has a unique ID number (i.e., functional redundancy)
Many‐to‐Many (M:M): Students, Classes
Many students are enrolled in many classes
Each student is taking many classes
Each class has many students
One‐to‐one Relationships
• Only one matching record
• Uses primary key for both tables
• Use to limit access or isolate information
One‐to‐Many Relationships
• Most common type of relationship
• Records are related through primary and foreign keys
Many‐to‐Many Relationships
• Not directly supported between tables
• Use a junction table to relate
  • One order, many products
  • One product, many orders
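A minimal sketch of the junction-table idea for the orders and products example, using Python's built-in sqlite3 module. All table names, column names, and rows are hypothetical. Each row of `order_items` pairs one order with one product, which turns the many-to-many relationship into two one-to-many relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
    -- Junction table: one row per (order, product) pair.
    CREATE TABLE order_items (
        order_id   INTEGER REFERENCES orders(order_id),
        product_id INTEGER REFERENCES products(product_id),
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO products VALUES (10, 'map'), (20, 'compass');
    INSERT INTO order_items VALUES (1, 10), (1, 20), (2, 10);
    """
)

# One order, many products.
products_in_order_1 = [
    row[0]
    for row in conn.execute(
        "SELECT p.name FROM order_items oi"
        " JOIN products p ON p.product_id = oi.product_id"
        " WHERE oi.order_id = 1 ORDER BY p.name"
    )
]

# One product, many orders.
orders_with_map = [
    row[0]
    for row in conn.execute(
        "SELECT order_id FROM order_items WHERE product_id = 10 ORDER BY order_id"
    )
]
print(products_in_order_1, orders_with_map)
```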
Relational Database Terminology: Referential Integrity
PK
FK
• Maintains data accuracy
• Prevents orphan records
• Keeps relationships intact
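How a DBMS prevents orphan records can be sketched with Python's built-in sqlite3 module. The `owners` and `parcels` tables and their rows are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other DBMSs typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE owners (own_id INTEGER PRIMARY KEY, own_name TEXT)")
conn.execute(
    "CREATE TABLE parcels ("
    " parcel_id INTEGER PRIMARY KEY,"
    " own_id INTEGER NOT NULL REFERENCES owners(own_id))"
)
conn.execute("INSERT INTO owners VALUES (1, 'Smith')")
conn.execute("INSERT INTO parcels VALUES (100, 1)")


def try_sql(sql):
    """Run a statement; return False if it violates an integrity constraint."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A parcel that points to a non-existent owner would be an orphan record.
orphan_insert_ok = try_sql("INSERT INTO parcels VALUES (101, 99)")
# Deleting an owner that a parcel still references would also create an orphan.
parent_delete_ok = try_sql("DELETE FROM owners WHERE own_id = 1")
print(orphan_insert_ok, parent_delete_ok)
```

Both statements are rejected, so the relationship between the two tables stays intact.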
•Reduce Duplication
• Improve Accuracy
•Data Maintenance
Form Analysis: Normalization
Looking for: when creating tables, you usually try to assign a single subject to each table.
No duplication (redundancy)
Create relationships between tables.
Form Analysis: Normal Forms
• Notice the redundancies in this table
• Put the database into First Normal Form
• Basically, this is a rearrangement of the data
• Redundant columns are removed
• Functional dependencies are rampant
  • E.g., Tship_ID → Tship_name, Thall_add
  • Tship_ID = 12 is always named “Birch” and is located at latitude 15W
• Allows you to identify groups!
First Normal Form
Parcels
Parcel_ID
Alderman
Tship_ID
Thall_add
Tship_name
Own_ID
Own_name
Own_add
Fields in the parcel dataset
Second Normal Form
Parcel_ID
Alderman
Tship_ID
Tship_name
Own_ID
Thall_add
Own_name
Own_add
Functional dependency reduced by splitting data into like groups, possibly establishing a table to define the relationship between the first two
Third Normal Form
Tables in third normal form (key field listed first):
Parcel_ID | Alderman | Tship_ID
Tship_ID | Tship_name | Thall_add
Own_ID | Own_name | Own_add
Parcel_ID | Own_ID
Transitive functional dependencies are removed
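The split into third normal form can be sketched in plain Python. The field names come from the parcel dataset above. All values are hypothetical, except township 12 ("Birch", "15W"), which is the slide's own example. The check at the end shows that the four small tables still carry all the information in the flat table.

```python
FIELDS = ["Parcel_ID", "Alderman", "Tship_ID", "Tship_name",
          "Thall_add", "Own_ID", "Own_name", "Own_add"]

# Flat table: township and owner details repeat on every row (redundancy).
rows = [
    (1, "Jones", 12, "Birch", "15W", 7, "Smith", "1 Elm St"),
    (1, "Jones", 12, "Birch", "15W", 8, "Lee", "2 Oak Ave"),
    (2, "Jones", 12, "Birch", "15W", 7, "Smith", "1 Elm St"),
    (3, "Cruz", 14, "Grant", "35E", 8, "Lee", "2 Oak Ave"),
]
flat = [dict(zip(FIELDS, r)) for r in rows]

# One table per subject, keyed by its primary key.
parcels = {r["Parcel_ID"]: (r["Alderman"], r["Tship_ID"]) for r in flat}
townships = {r["Tship_ID"]: (r["Tship_name"], r["Thall_add"]) for r in flat}
owners = {r["Own_ID"]: (r["Own_name"], r["Own_add"]) for r in flat}
# Linking table for the many-to-many parcel/owner relationship.
parcel_owner = sorted({(r["Parcel_ID"], r["Own_ID"]) for r in flat})


def rebuild():
    """Join the normalized tables back into the original flat records."""
    out = []
    for parcel_id, own_id in parcel_owner:
        alderman, tship_id = parcels[parcel_id]
        tship_name, thall_add = townships[tship_id]
        own_name, own_add = owners[own_id]
        out.append(dict(zip(FIELDS, (parcel_id, alderman, tship_id, tship_name,
                                     thall_add, own_id, own_name, own_add))))
    return out


print(len(parcels), len(townships), len(owners), rebuild() == flat)
```

Each township and owner is now stored once, so a change of address is made in one place only.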
• Most GIS packages still use a hybrid solution: spatial data + attribute data (Arc+Info)
• The emergence of spatial databases is changing this. Many DBMSs now support spatial data: Oracle, DB2, MS SQL Server (commercial), and MySQL, PostgreSQL (open‐source)
Summary of Databases and GIS
Spatial Data
DBMS: Geometric
DBMS: Attribute
Geometric data
• Usually hierarchical
• Invisible to the user
Attribute data
• Almost entirely relational
• Manipulated by the user
• What is a geodatabase?
• Types of geodatabase
• Geodatabase objects
Geodatabase
What is a geodatabase?
A geodatabase (short for geographic database) is a physical store of geographic information (spatial, attribute, metadata, and relationships) inside a relational database management system (RDBMS).
• Personal Geodatabase for Microsoft Access
• File Geodatabase (new since V9.2)
• Workgroup Geodatabase (new since V9.2)
  • SQL Server Express
• Enterprise Geodatabase:
  • 5 supported DBMSs: DB2, Informix, Oracle, MS SQL Server, PostgreSQL
Geodatabase Types Since Ver9.2
Increasing size and functionality
http://resources.arcgis.com/en/help/main/10.1/index.html#//003n00000007000000
What does a Geodatabase look like?
Personal GeoDatabase (Access)
Geodatabase (file‐based)
Geodatabase objects
• Basic objects:
  ‐ feature classes
  ‐ feature datasets
  ‐ nonspatial tables
• Complex objects building on the basic objects:
  ‐ topology
  ‐ relationship classes
  ‐ geometric networks
• A feature class is a collection of geographic features of the same type; types include point, line, polygon, and annotation feature classes.
• Feature classes may exist independently in a geodatabase as stand‐alone feature classes or you can group them into feature datasets
Feature classes
The SouthAmerica geodatabase contains four stand-alone feature classes: a point feature class of cities, a dimension feature class of distances between cities, a polygon feature class of countries, and an annotation feature class of country names.
Source: www.esri.com
Feature datasets
•A feature dataset is composed of feature classes that have been grouped together so they can participate in topological relationships with each other. All the feature classes in a feature dataset must share the same spatial reference (or coordinate system)
• Edits you make to one feature class may result in edits being made automatically to some or all of the other feature classes in the feature dataset
In the CityWater geodatabase, three point feature classes and one line feature class were grouped into the PublicWater feature dataset to create a geometric network called WaterNet.
Source: www.esri.com
• Feature class tables and nonspatial attribute tables.
•Both types of tables are created and managed in ArcCatalog and edited in ArcMap. Both display in the traditional row‐and‐column format. The difference is that feature class tables have one or more columns that store feature geometry.
• Nonspatial tables contain only attribute data (no feature geometry) and display in ArcCatalog with the table icon. They can exist in a geodatabase as stand‐alone tables, or they can be related to other tables or feature classes.
Tables
The cfcc_desc table in the SantaBarbara geodatabase contains attribute data for the Roads feature class (stored inside the Roads feature dataset).
Source: www.esri.com
• Feature: a geographic representation of a spatial object
• Features: one row in a table represents one feature
• Feature Classes: one table or more than one table
• Feature Dataset: a set of feature classes
Organizing Geographic Features
GeoDatabase Elements
Feature class
Geometric network
Annotation class
Geodatabase
Relationship class
Table
Feature data set
• In a GIS, spatial relationships among feature classes in a feature dataset are defined by topology. You can choose whether to create topology for features.
• The primary spatial relationships that you can model using topology are adjacency, coincidence, and connectivity
• There are three types of topology available in the geodatabase: geodatabase topology (over 20 topology rules), map topology, and geometric network topology. Each type of topology is created from feature classes that are stored within a feature dataset. A feature class can participate in only one topology at a time
Topology
Example of Topology in a Geodatabase
Geometric Networks
• In the real world, examples of networks abound: streams joining together to form larger streams, pipes carrying water to homes and businesses throughout a city, and power lines carrying electricity.
• In a geodatabase, you can model each of these real‐world networks with a geometric network. Starting with simple point and line feature classes, you use ArcCatalog to create a geometric network that will enable you to answer questions such as: Which streams will be affected by a proposed dam? Which areas will be affected by a water main repair? What is the quickest route between two points in the network?
Source: www.esri.com
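The "which streams will be affected by a proposed dam?" question is a graph traversal at heart. The sketch below is a plain-Python illustration of that idea, not how ArcGIS implements geometric networks. The stream names and the `downstream` connectivity table are hypothetical.

```python
from collections import deque

# Each stream segment lists the segment(s) it flows into.
downstream = {
    "creek_a": ["river_1"],
    "creek_b": ["river_1"],
    "river_1": ["river_2"],
    "creek_c": ["river_2"],
    "river_2": [],
}


def affected_by_dam(segment):
    """Return every segment downstream of a dam on `segment` (breadth-first)."""
    affected = []
    queue = deque(downstream[segment])
    while queue:
        current = queue.popleft()
        if current not in affected:
            affected.append(current)
            queue.extend(downstream[current])
    return affected


print(affected_by_dam("creek_a"))
```

A dam on `creek_a` affects `river_1` and `river_2`, while `creek_b` and `creek_c` are untouched because they lie upstream of the confluences.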
Geometric Network example
Lateral
Service
Main
Feed
Valve
Feature Classes
Source: ESRI European User Conference
Geometric Network
Relationship Classes
• In a geodatabase, relationship classes provide a way to model real‐world relationships that exist between objects such as parcels and buildings or streams and water sample data. By using relationship classes, you can make your GIS database more accurately reflect the real world and facilitate data maintenance.
The relationships stored in a relationship class can be between two feature classes (such as buildings and parcels, top) or between a feature class and a nonspatial attribute table (such as streams and water quality sampling data, bottom).
Source: www.esri.com
• Provided by ESRI http://support.esri.com/index.cfm?fa=downloads.dataModels.gateway
•Goal: provide a practical template for implementing GIS projects
• Start to think about your final project now
• A great starting point for your GIS project
ESRI data models
• Address • Agriculture • Archiving • Atmospheric • Basemap • Biodiversity • Census‐Administrative Boundaries • Defense‐Intel • Energy Utilities • Energy Utilities ‐ MultiSpeak TM • Environmental Regulated Facilities • Forestry • Geology • GIS for the Nation • Groundwater
Industry‐specific Data models
• Health • Historic Preservation and Archaeology • Homeland Security • Hydro • International Hydrographic Organization (IHO) S‐57 for ENC
• Land Parcels • Local Government • Marine • National Cadastre • Petroleum • Pipeline • Raster • Telecommunications • Transportation • Water Utilities
From: http://support.esri.com/index.cfm?fa=downloads.dataModels.gateway
Data model: national GIS
You can download it from ESRI website directly
• Definition
  • A query is the action or result of selecting a subset of records based on specific attribute values
• General categories:
  • Attribute (tabular)
  • Spatial
• Two main methods:
  • Boolean operators (AND, OR, NOT)
  • SQL operators (< > + = …)
• Structured Query Language (SQL)
  • The mathematical basis of relational databases led to a standard language for querying data (SQL) that uses simple mathematical operators
• Relational databases allow the user to “nest” operations for complex queries
Queries
Relational Operators
=   Equal
<>  Not Equal
<   Less Than
>   Greater Than
<=  Less Than or Equal
>=  Greater Than or Equal
Set Operators
Union
Intersection
Difference
Product
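Union, intersection, difference, and product map onto the SQL keywords UNION, INTERSECT, EXCEPT, and CROSS JOIN. The sketch below runs them with Python's built-in sqlite3 module on two hypothetical tables, `water` and `power`, which list the areas served by each utility.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE water (area TEXT);
    CREATE TABLE power (area TEXT);
    INSERT INTO water VALUES ('A'), ('B'), ('C');
    INSERT INTO power VALUES ('B'), ('C'), ('D');
    """
)


def areas(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# Areas with water OR power.
union = areas("SELECT area FROM water UNION SELECT area FROM power ORDER BY area")
# Areas with water AND power.
intersection = areas(
    "SELECT area FROM water INTERSECT SELECT area FROM power ORDER BY area"
)
# Areas with water but NOT power.
difference = areas(
    "SELECT area FROM water EXCEPT SELECT area FROM power ORDER BY area"
)
# Product: every water row paired with every power row (3 x 3 rows).
product_size = conn.execute(
    "SELECT COUNT(*) FROM water CROSS JOIN power"
).fetchone()[0]
print(union, intersection, difference, product_size)
```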
Database Use: Structured Query Language (SQL)
Aggregate Functions: “Summarize”
• Sum of values for all rows for a given column
• Average of a given column
• Column maximum
• Column minimum
• Number of rows (count) that satisfy a condition
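The five summaries listed above correspond to the SQL aggregate functions SUM, AVG, MAX, MIN, and COUNT. A minimal sketch with Python's built-in sqlite3 module follows. The `cities` table and its population values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city TEXT, pop INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)", [("A", 100), ("B", 250), ("C", 400)]
)

# Four aggregates over the whole 'pop' column in one statement.
total, average, largest, smallest = conn.execute(
    "SELECT SUM(pop), AVG(pop), MAX(pop), MIN(pop) FROM cities"
).fetchone()

# Number of rows that satisfy a condition.
count_big = conn.execute(
    "SELECT COUNT(*) FROM cities WHERE pop > 200"
).fetchone()[0]
print(total, average, largest, smallest, count_big)
```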
• Two statements:
  • Females / Pop1990 < 0.55, then
  • Pop1990 > 100000 and Pop1990 < 200000
• Or, one statement: (Females/Pop1990 < 0.55) and ((Pop1990 > 100000) and (Pop1990 < 200000))
Compute all instances where the percentage of females in the 1990 population is less than 55%.
Then, among the population centers in the 1990 census where this is true, identify those with more than 100,000 and fewer than 200,000 people.
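The two forms of the query can be compared with Python's built-in sqlite3 module. The field names `Females` and `Pop1990` are from the slide; the table name `places` and its four rows are hypothetical. One SQLite-specific assumption matters here: dividing two integers truncates the result, so the ratio is multiplied by 1.0 to force a decimal result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, Females INTEGER, Pop1990 INTEGER)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?)",
    [
        ("Alpha", 60000, 120000),   # 50% female, population in range
        ("Beta", 90000, 150000),    # 60% female
        ("Gamma", 20000, 50000),    # population too small
        ("Delta", 100000, 250000),  # population too large
    ],
)


def names(sql):
    """Run a query and return the selected place names."""
    return [row[0] for row in conn.execute(sql)]


# One statement with both conditions combined.
one_statement = names(
    "SELECT name FROM places"
    " WHERE (1.0 * Females / Pop1990 < 0.55)"
    " AND ((Pop1990 > 100000) AND (Pop1990 < 200000))"
    " ORDER BY name"
)

# Nested form: the inner query is the first statement,
# and the outer query applies the second statement to its result.
nested = names(
    "SELECT name FROM"
    " (SELECT * FROM places WHERE 1.0 * Females / Pop1990 < 0.55)"
    " WHERE Pop1990 > 100000 AND Pop1990 < 200000"
    " ORDER BY name"
)
print(one_statement, nested)
```

Both forms select the same records; nesting simply makes the order of the two steps explicit.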
A Nested SQL Statement
• Point queries
  • What is at a particular location?
• Range queries
  • What is in a particular area?
• Nearest neighbor queries
  • Where is the nearest object to a particular location?
• Spatial join queries
  • Where are the areas that have water supply and power supply?
• Spatial aggregate queries
  • Where is the most populated region?
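The first three query types can be sketched in plain Python on a handful of point features. The `sites` dictionary and its coordinates are hypothetical, and a real spatial database would use a spatial index rather than scanning every feature as this sketch does.

```python
import math

# Point features: name -> (x, y) in arbitrary planar units.
sites = {"well": (1.0, 1.0), "pump": (4.0, 5.0), "tank": (9.0, 2.0)}


def point_query(x, y):
    """What is at a particular location?"""
    return sorted(name for name, xy in sites.items() if xy == (x, y))


def range_query(xmin, ymin, xmax, ymax):
    """What is in a particular (rectangular) area?"""
    return sorted(
        name
        for name, (x, y) in sites.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    )


def nearest_neighbor(x, y):
    """Where is the nearest object to a particular location?"""
    return min(sites, key=lambda name: math.dist(sites[name], (x, y)))


print(point_query(4.0, 5.0))
print(range_query(0, 0, 5, 5))
print(nearest_neighbor(8, 1))
```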
Spatial Queries
•Queries – selection operations that produce data subsets
• Join and Relate: bringing data together (one table with non‐spatial attributes, one table with features)
Common Attribute Operations
• General categories:
  • Tabular: based on some information within the attribute tables, e.g., a common field
  • Spatial: based on location: nearest, within or aggregate
  • Geodatabase relationship class
• Strategies
  • Join
  • Relate
Relations
County
Person
Age
Polygon_id = 157
Gpsid = 29
LC = Agriculture
Forest-ID ForestName
1 Nantahala
2 Cherokee
Join vs. Relate
• Join
  • Appends fields from a second table, with data for each record where a key field match is found (empty otherwise)
  • For 1:1 or M:1 only
  • In 1:M or M:M, it stops with the first hit (can’t add rows/records for additional relationships)
• Relate
  • Allows automatic access to a related table’s records; keeps tables physically separate
  • For 1:M or M:M
  • Doesn’t add records to the layer’s table, so not limited by the initial table’s size
Forest-ID  Trail_Name       Features  Trailhead
1          Bryson's Knob    Vista     X1, Y1
2          Slickrock Falls  Ogrth     X2, Y2
1          North Fork       Wfall     X3, Y3
2          Cade's Cave      Wlife     X4, Y4
1          Appalachian      Cmp       X5, Y5
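The difference between join and relate can be sketched in plain Python with the forest and trail tables above. This is a simplified model of the behavior described on the slide, not ArcGIS code; the variable names are assumptions.

```python
# Forest table (one row per forest) and trail table (many rows per forest).
forests = {1: "Nantahala", 2: "Cherokee"}
trails = [
    (1, "Bryson's Knob"),
    (2, "Slickrock Falls"),
    (1, "North Fork"),
    (2, "Cade's Cave"),
    (1, "Appalachian"),
]

# Join, M:1 direction (trails -> forests): each trail record gains the
# forest name; no rows are added or lost.
joined_m1 = [(fid, trail, forests.get(fid)) for fid, trail in trails]

# Join, 1:M direction (forests -> trails): the forest table cannot grow,
# so only the first matching trail is appended to each forest record.
first_hit = {}
for fid, trail in trails:
    first_hit.setdefault(fid, trail)
joined_1m = [(fid, name, first_hit.get(fid)) for fid, name in forests.items()]

# Relate: tables stay separate; all matching trails are looked up on demand.
related = {fid: [t for f, t in trails if f == fid] for fid in forests}

print(joined_1m)
print(related[1])
```

The 1:M join keeps only "Bryson's Knob" for Nantahala, while the relate returns all three Nantahala trails.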
• Relates two attribute tables via a shared location in space, rather than a common field
• Concept: A type of join operation in which fields from one layer’s attribute table are appended to another layer’s attribute table based on the relative locations of the features in the two layers.
• Examples
  • Finding the nearest feature
  • Finding what's inside a polygon
  • Finding what intersects a feature
  • Where is there an incompatibility between zoning and current land use?
  • How many Toxic Release Inventory sites are there in each county and what are the total releases per county?
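The last example (sites and total releases per county) can be sketched in plain Python. The point-in-polygon test is the standard ray-casting method, swapped in here for whatever a GIS package uses internally. The county outlines, site coordinates, and release values are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Polygon layer: county name -> list of vertices.
counties = {
    "East": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "West": [(-10, 0), (0, 0), (0, 10), (-10, 10)],
}
# Point layer: (x, y, release amount) for each site.
sites = [(2, 3, 5.0), (8, 8, 1.5), (-4, 5, 2.0)]

# Spatial join: attach each site to the county that contains it, then summarize.
summary = {name: {"count": 0, "total": 0.0} for name in counties}
for x, y, release in sites:
    for name, outline in counties.items():
        if point_in_polygon(x, y, outline):
            summary[name]["count"] += 1
            summary[name]["total"] += release
            break

print(summary)
```

No common field links the two layers; the match is made purely on location.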
Spatial Join
URL: http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=Learn%20more%20about%20spatial%20relationships
• Input layers: population in US cities; US states
• Objective: the city population in each state; how many cities in each state
• Solutions:
  • Count/summarize manually?
  • Simple query in DBMS (very easy, but in many cases limited by how the data are organized): select state, sum(pop) as cityPop, count(city) as totalcity from cities group by state
  • Spatial join
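The simple DBMS query above can be run as written with Python's built-in sqlite3 module, provided the `cities` table already has a `state` column. That is the data-organization limit the slide mentions: without such a column, a spatial join is needed to find each city's state. The rows below are made up, and an ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city TEXT, state TEXT, pop INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [("A", "S2", 100), ("B", "S2", 250), ("C", "S1", 300)],
)

# Query from the slide: total city population and number of cities per state.
per_state = conn.execute(
    "select state, sum(pop) as cityPop, count(city) as totalcity"
    " from cities group by state order by state"
).fetchall()
print(per_state)
```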
Spatial Join: An example
Spatial Join Example
• Introduction to Database Management System (DBMS)
  • 4 types of DBMS
• Relational Database Basics
  • 2 keys
  • 3 relations
  • Data normalization (form analysis)
• Spatial Database Overview and Operation
  • Join vs. relate
  • Spatial join
What did we learn today?