Post on 03-Apr-2022
Chapter 2. Concepts & arch of DB sys
1. Introduction
processing & analysis of spatial data - becoming increasingly dependent on the use of DBMS rather than conventional GIS → WHY ?
1. cost effective sys security, DB integrity, backup & recovery, data replication
2. close integration w/ mainstream business computing environments
2. DBs & DB sys
2.1 DB terminology
problem space : business functions the DB is designed to address
data model : describing the problem space in a comprehensible way
DB schema : description of the DB including its tables & the relations among them
DB engine (or DB server): a collection of programs that manipulate the data in DB
data dictionary : describes the contents of the DB
DB integrity rules : rules to be enforced to protect the data in the DB
stored procedures : blocks of SQL code for defining, managing & querying data in DB
user interface : DB front-end by which users access & interact w/ DB
middleware : communication SW tools that support data transmission & data processing over networks
DBMS : composed of DB, DB engine, user interface, application programs, middleware
Figure 2-1. Database terminology
(diagram : software developer, database administrator & end user work w/ the DBMS thru development tools, application programs, user interface & middleware ; the DB engine / server manages the DB, which holds data files, data dictionary, integrity rules & stored procedures ; the DB follows the DB schema & data model derived from the problem space)
2.2 Computer data organization & DB
In early days - data were organized as independent data files - file processing systems
development of modern DB
stores data + characteristics of data + integrity rules
separate data from applications – independence from business functions
provide essential tools for info resources or asset management in business, education, government
Figure 2-2. File & DB processing
(a) File-based data processing : separate application programs in PL/1 & COBOL each work on their own zoning, building permit & assessment data files
(b) Integration of data files and processing procedures in a DBMS : user interfaces & other applications reach the data files, integrity rules & stored procedures thru the DB engine / server to produce zoning reports, building permits, assessment notifications & property tax invoices
3. DB operations
3.1 DB storage & manipulation
DB - large volumes of data ex. Banking & retailing : terabytes, RS (remote sensing) : several petabytes/year
thus, DB system usually stores data in secondary storage devices
(cf. primary storage : main memory directly accessed by the central processing unit (CPU))
even bigger are tertiary storage devices : terabyte storage capacities
transmission of data - by disk blocks (storage locations of 4,000-16,000 bytes of data), not by data files
DB systems control the storage of and access to data by means of the DB engine
: 2 data manipulation SW modules in DB engine
a. buffer manager : handles the main memory by allocating the data read from secondary storage to a specific page
b. file manager : keeps track of the location of data files & their relationships w/ disk blocks
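A minimal sketch of what a buffer manager does, assuming a least-recently-used (LRU) replacement policy. The notes do not say which policy a DB engine uses, and the class name, `fake_disk` and the 4,096-byte block size are illustrative assumptions only.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # assumed block size, inside the 4,000-16,000 byte range noted above


class BufferManager:
    """Keeps recently used disk blocks in main-memory pages (LRU eviction, an assumption)."""

    def __init__(self, read_block, capacity=3):
        self.read_block = read_block  # callable standing in for secondary storage
        self.capacity = capacity      # number of pages available in main memory
        self.pages = OrderedDict()    # block number -> page contents, oldest first
        self.disk_reads = 0           # how many times secondary storage was touched

    def get(self, block_no):
        if block_no in self.pages:
            # Page hit: no disk access, just mark the page as most recently used.
            self.pages.move_to_end(block_no)
            return self.pages[block_no]
        # Page miss: read the whole block from "disk" and allocate it to a page.
        data = self.read_block(block_no)
        self.disk_reads += 1
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[block_no] = data
        return data


def fake_disk(block_no):
    """Hypothetical disk: every block is BLOCK_SIZE identical bytes."""
    return bytes([block_no % 256]) * BLOCK_SIZE


buf = BufferManager(fake_disk, capacity=2)
for block in (0, 1, 0, 2, 1):
    buf.get(block)
```

With two pages, the third request (block 0 again) is served from memory, while the last request for block 1 has to go back to disk because block 1 was evicted to make room for block 2.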
3.2 DB security & integrity constraints
designed to protect the data in DB from being corrupted, compromised, destroyed
security : all rules & measures that are designed to protect the DB against unauthorized use, modification, destruction of contents
discretionary security : controls the ability of users to access specific data files ex. read-only, read-write
mandatory security : classifies users & data into different security levels - access control
integrity : to protect the value of the data by safeguarding their accuracy, correctness, validity
enforced by applying certain rules - called business rules
form an integral part of the DB schema
3 integrity constraints
: domain c. : specify the types of data values ex. Numeric, character, Boolean
key & relationship c. : govern the use of entities as primary, secondary, foreign keys
semantic integrity c. : written rules stating what is allowed & not allowed in data structure & data management
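The three kinds of integrity constraints can be tried out w/ a small sketch. It uses Python's built-in sqlite3 module as a stand-in DB engine (an assumption, not the system these notes describe), and the tables `lu_codes` and `parcels` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE lu_codes (
    lu_code TEXT PRIMARY KEY
);
CREATE TABLE parcels (
    parcel_id TEXT PRIMARY KEY,                            -- key constraint
    lu_code   TEXT NOT NULL REFERENCES lu_codes(lu_code),  -- relationship constraint
    area      REAL NOT NULL CHECK (typeof(area) = 'real'), -- domain constraint
    CHECK (area > 0)                                       -- semantic (business) rule
);
INSERT INTO lu_codes VALUES ('agr'), ('res');
INSERT INTO parcels VALUES ('P1', 'agr', 4500.0);
""")


def rejected(sql, params):
    """Return True if the DB engine refuses the statement on integrity grounds."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


insert = "INSERT INTO parcels VALUES (?, ?, ?)"
results = {
    "domain": rejected(insert, ("P2", "res", "large")),    # text where a number is required
    "key": rejected(insert, ("P1", "res", 10.0)),          # duplicate primary key
    "relationship": rejected(insert, ("P3", "xyz", 10.0)), # unknown land use code
    "semantic": rejected(insert, ("P4", "res", -5.0)),     # negative area
    "valid": rejected(insert, ("P5", "res", 12.5)),        # passes every rule
}
```

The rules live in the schema, so every application that writes to the DB is held to them w/o repeating the checks in its own code.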
3.3 DB query
query : question / task a user asks of a DB
handled by a DB tool called query manager : turns users' SQL into a sequence of operations
query manager - performs query optimization : answer the query in the most efficient way
use an index created for the DB
query is made up of one / more operators
ex. relational DB model – 8 operators
: SELECT : lists all of the rows / those which match specific criteria
PROJECT : generate a subset of columns from a table, removing duplicate values from the result
JOIN : combine each row of a table w/ matching rows from another table, using the relationship between their columns
PRODUCT : concatenate every row in one table w/ every row in another table
UNION : generate a new table by appending rows from one table to those of another table
INTERSECT : generate a new table consisting of all rows appearing in both of two tables
DIFFERENCE : generate a new table consisting of all rows that appear in the 1st table but not in the 2nd of two tables
DIVIDE : create a new table consisting of all values of one column of the binary table that are matched, in the other column, w/ all values in the unary table
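The eight operators can be sketched in a few lines each. This is a minimal illustration, assuming a table is a list of rows and a row is a dict ; the function names and the sample rows are hypothetical, and a real query manager works on disk blocks & indexes rather than in-memory lists.

```python
def select(rel, pred):
    return [r for r in rel if pred(r)]

def project(rel, cols):
    out = []
    for r in rel:
        row = {c: r[c] for c in cols}
        if row not in out:  # duplicates are removed from the result
            out.append(row)
    return out

def natural_join(a, b):
    out = []
    for ra in a:
        for rb in b:
            common = set(ra) & set(rb)  # column names shared by both rows
            if all(ra[c] == rb[c] for c in common):
                out.append({**ra, **rb})
    return out

def product(a, b):
    # Column names are prefixed so that same-named columns do not collide.
    return [{**{"R." + k: v for k, v in ra.items()},
             **{"S." + k: v for k, v in rb.items()}} for ra in a for rb in b]

def union(a, b):
    return a + [r for r in b if r not in a]

def intersect(a, b):
    return [r for r in a if r in b]

def difference(a, b):
    return [r for r in a if r not in b]

def divide(binary, unary, key, col):
    required = {r[col] for r in unary}
    return [k for k in project(binary, [key])
            if required <= {r[col] for r in binary if r[key] == k[key]}]

# Sample tables (illustrative values only).
R = [{"K": 1, "x": "A", "y": 32}, {"K": 2, "x": "B", "y": 74}, {"K": 3, "x": "C", "y": 56}]
S = [{"K": 1, "z": 27}, {"K": 3, "z": 74}, {"K": 9, "z": 88}]
S2 = [{"K": 8, "x": "X", "y": 31}, {"K": 1, "x": "A", "y": 32}, {"K": 7, "x": "C", "y": 62}]
pairs = [{"K": k, "v": v} for k, v in
         [(5, "X"), (5, "Y"), (5, "Z"), (6, "Q"), (6, "W"), (7, "M"),
          (7, "N"), (8, "G"), (8, "X"), (8, "Y"), (8, "V")]]
wanted = [{"v": "X"}, {"v": "Y"}]

joined = natural_join(R, S)                   # rows of R & S that agree on K
quotient = divide(pairs, wanted, "K", "v")    # K values paired w/ both X and Y
```

DIVIDE is the least intuitive of the eight : here it keeps only the K values (5 and 8) that are paired w/ every value in the unary table.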
Figure 2-3. Relational operators
(panels : (a) SELECT, (b) PROJECT, (c) JOIN - equi-JOIN & natural JOIN, (d) PRODUCT, (e) UNION, (f) INTERSECT, (g) DIFFERENCE, (h) DIVIDE ; each panel shows small example tables R & S and the table produced by the operator)
3.4 DB transactions
transactions : changing values in DB
transaction design principles
: atomicity : transaction can never be completed only partially
consistency preservation : data remain in a consistent state as specified by the DB schema, constraints, integrity rules
isolation : transactions to be independent of each other
durability / permanency : after a transaction completes, its results are permanent & can always be traced
transaction is controlled by the transaction manager of the DB engine
4 transaction control mechanisms
: concurrency control : locks the data items involved in the transaction
logging the transactions : keeps track of all changes made to the DB in a redo log
transaction commitment : prevents any changes to the DB unless the transaction is ready to complete
rollback : allows the DB to undo an incomplete transaction process
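Atomicity, commitment & rollback can be seen in a short sketch. It assumes Python's built-in sqlite3 module as the DB engine and a hypothetical `accounts` table w/ a rule that balances may not go negative ; neither comes from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"  # integrity rule the transaction must respect
)
conn.execute("INSERT INTO accounts VALUES ('a', 100.0), ('b', 50.0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates happen, or neither does."""
    try:
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        conn.commit()    # transaction commitment: changes become permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # rollback: undo the half-finished transaction
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "a", "b", 30.0)       # succeeds
failed = transfer(conn, "a", "b", 500.0)  # second update would make 'a' negative
final = balances(conn)
```

In the failed transfer the credit to 'b' has already been applied when the debit is refused, so w/o the rollback the DB would be left in an inconsistent state.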
Figure 2-4. Steps in a DB transaction
(diagram : numbered steps between the client computer & the DB engine on the server computer, whose shared main memory holds the DB buffer cache & redo log buffer ; labelled steps include retrieve data, lock, update, process update, save update, write to log, record update, commit, unlock & backup, w/ the DB, redo log & control files on disk)
3.5 DB backup & recovery
to restore the DB in a catastrophic failure such as a disk crash
copy periodically the entire DB & the transaction log on to an external storage medium
3.6 DB replication & synchronization
to support business needs in distributed DB systems : improve system performance, DB availability
process of making a copy of a DB onto one / more additional computers located at different sites
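Backup and replication both start from copying a whole DB somewhere else. A minimal sketch, assuming Python's sqlite3 module and two in-memory DBs standing in for the primary site & the remote copy ; real systems also ship the transaction log, which this sketch leaves out.

```python
import sqlite3

# "Primary" DB with a hypothetical land use table.
primary = sqlite3.connect(":memory:")
primary.execute("CREATE TABLE lu_2002 (parcel_id TEXT PRIMARY KEY, lu_code TEXT, area REAL)")
primary.execute("INSERT INTO lu_2002 VALUES ('12322', 'res', 4500.0)")
primary.commit()

# Full copy onto a second DB (stand-in for external storage or another site).
replica = sqlite3.connect(":memory:")
primary.backup(replica)


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM lu_2002").fetchone()[0]


# A later change on the primary is not visible in the copy ...
primary.execute("INSERT INTO lu_2002 VALUES ('12323', 'agr', 900.0)")
primary.commit()
before_sync = (row_count(primary), row_count(replica))

# ... until the copy is synchronized again.
primary.backup(replica)
after_sync = (row_count(primary), row_count(replica))
```

The gap between `before_sync` and `after_sync` is the reason synchronization is listed next to replication : a copy is only useful if it is refreshed.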
3.7 Structured query language (SQL)
standard language for querying & managing DB
non-procedural computer language – does not have IF, FOR, WHILE, GOTO, CASE
DB sub-language of about 200 words
SQL statements can be used in 5 ways
: interactive processing thru a command-line user interface
embedded in a high-level computer language
using a call level interface(CLI)
using 2 standard Java application programming interface(API) protocols : Java DB connectivity(JDBC), embedded SQL for Java(SQLJ)
in stand-alone application program modules in the form of stored procedures, functions, packages
used in typical DB operations
: DB query, data definition, data manipulation, DB connection & access control, data sharing, data integrity
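Using SQL from a host program follows the same pattern whichever interface is chosen : the program hands an SQL string & its parameters to an API and receives rows back. A minimal sketch, assuming Python's DB-API (sqlite3) as a stand-in for a CLI or JDBC ; the function name and the sample rows are illustrative.

```python
import sqlite3


def parcels_by_land_use(conn, lu_code):
    """Run a parameterized query through the API and return the matching rows."""
    cur = conn.execute(
        "SELECT parcel_id, area FROM lu_2002 WHERE lu_code = ? ORDER BY parcel_id",
        (lu_code,),  # the value is passed separately, not pasted into the SQL text
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lu_2002 ("
    " parcel_id TEXT PRIMARY KEY,"
    " lu_code TEXT NOT NULL,"
    " area REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO lu_2002 VALUES (?, ?, ?)",
    [("12322", "res", 4500.0), ("12323", "agr", 900.0), ("12324", "agr", 1200.5)],
)
rows = parcels_by_land_use(conn, "agr")
```

The SQL itself stays non-procedural ; the loops & conditions that SQL lacks are supplied by the host language.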
Figure 2-5. Examples of DB operations using SQL
SQL Function / Example / Explanation
(a) Data retrieval
    SELECT parcel_id, area
    FROM lu_2002
    WHERE lu_code = 'agr'
  These SQL commands retrieve the identification numbers and areas of parcels whose land use code is 'agr' (agricultural) from data table lu_2002
(b) Data definition
    CREATE TABLE lu_2002 (
      parcel_id VARCHAR2(8) PRIMARY KEY,
      lu_code VARCHAR2(3) NOT NULL,
      area NUMBER(8,2) NOT NULL,
      survey_date DATE DEFAULT SYSDATE)
  The SQL command CREATE TABLE specifies the structure and data types of the data table lu_2002. In the example, VARCHAR2, NUMBER and DATE specify the data types of the four columns of the table; the numbers in the brackets specify the number of characters and digits for string and numeric data types respectively. The value of survey_date will be set to the system date automatically if no date value is supplied during data entry. Note how the constraints PRIMARY KEY and NOT NULL are specified.
(c) Data manipulation
    INSERT INTO lu_2002
    VALUES ('12322', 'res', 4500.00, '06-may-2002')
  These SQL statements add data into the table lu_2002. Character strings are put inside quotation marks but numerical values are not. Since a date value is given, the default system date as specified in the data structure in the previous example will not be used.
(d) Database connection and access control
    CONNECT system/passwordxxx
    CREATE ROLE db_maintenance
    GRANT INSERT, UPDATE ON lu_2002 TO db_maintenance
    GRANT db_maintenance TO john.young
  These SQL commands connect to the system, create a database maintenance role, grant it INSERT and UPDATE privileges on lu_2002, and give this role to a user called john.young
(e) Data sharing
    CREATE TRIGGER copy_data
    AFTER INSERT ON lu_2002
    FOR EACH ROW
    BEGIN
      INSERT INTO lu_2002_copy@fes.uwaterloo.ca
      VALUES (:new.parcel_id, :new.lu_code, :new.area, :new.survey_date);
    END;
    /
  This sequence of SQL commands copies newly input data to a data table at a remote site
(f) Data integrity
    CREATE TABLE lu_2002 (
      parcel_id VARCHAR2(8) NOT NULL,
      lu_code VARCHAR2(3) NOT NULL,
      area NUMBER(8,2) NOT NULL,
      survey_date DATE DEFAULT SYSDATE,
      UNIQUE (parcel_id),
      CONSTRAINT area_chk CHECK (area > 0.00))
  This example shows how data input constraints are imposed to ensure that the parcel_id is a unique number for each land parcel and that the area is a positive value
4. HW & SW arch
4.1 Centralized & distributed DB arch
1) Centralized DB arch
early generations of DB systems
all processing is done in a single computer, all data are stored in the same secondary memory
2) Distributed DB arch
since 1990s, most DB systems today
include several components
: computers, OS, DB system SW, network cards & SW, communication network protocols(ex. TCP/IP),
DB engine, transaction processor(manager)
allow physical separation of processes & data
allow partitioning of a particular process / data file into smaller units
it has a number of transparency features
: distribution transparency : access any DB w/o knowing where & how the data are stored
performance transparency : behave as if it were a single centralized DB system
transaction transparency : update the DB at different sites
heterogeneity transparency : integration of different DB systems under a common schema
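Distribution transparency can be illustrated w/ a toy sketch : the caller reads & writes rows by key and never names a site. The class, the hash-based partitioning and the site names are all assumptions for illustration ; the notes do not prescribe how data are partitioned.

```python
import zlib


class DistributedTable:
    """Toy table partitioned across sites; callers never say where a row lives."""

    def __init__(self, site_names):
        self._names = list(site_names)
        self.sites = {name: {} for name in self._names}  # one dict per "site"

    def _site_for(self, key):
        # Assumed partitioning rule: a stable hash of the key picks the site.
        return self._names[zlib.crc32(key.encode("utf-8")) % len(self._names)]

    def put(self, key, row):
        self.sites[self._site_for(key)][key] = row

    def get(self, key):
        return self.sites[self._site_for(key)].get(key)


table = DistributedTable(["site_a", "site_b", "site_c"])  # hypothetical site names
for n in range(10):
    table.put(f"parcel-{n}", {"area": 100.0 * n})

row = table.get("parcel-7")  # same call whichever site holds the row
stored = sum(len(rows) for rows in table.sites.values())
```

The rows are spread over the three sites, yet `get` looks exactly like a lookup on a single centralized table, which is the point of the transparency features listed above.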
4.2 Client/server computing
client - requests services, server - provides services
can be thin / fat client / server
two-tier client / server arch : one server one client
three-tier client / server arch : extension of the two-tier
client : used for interaction w/ the DB
application server : application programs are stored & executed
DB server : used for storage & retrieval of data
web-based DB systems are three-tier
Figure 2-8. Client-server computing for DB systems
(diagram : panels (a)-(e) show how the three functions - query and display, application processing, database management - are divided between client side & server side, from centralised database systems to distributed database systems w/ two-tier & three-tier architecture)
4.3 DB SW
DB SW comprises different modules of application programs
DB engine : storing, retrieving, manipulating, conversion, transaction logging, memory management
SQL (or program written in SQL extensions) : managing data storage & retrieval routines
network middleware : DB connectivity middleware ex. CORBA, JDBC, ODBC, ADO
* middleware tool : made up of API on client / communication SW on client & server
Figure 2-9. SW layers of a DB system
(diagram : layers from the operating system (OS) commands and functions up thru the database engine, SQL & network middleware to compiled database application interfaces & dynamic data analysis tools ; on top are data analysis routines, prepackaged applications & application development tools supplied by the database software vendor or third parties, and client-side applications)
4.4 Web-based DB arch
WWW has had a significant impact on the development of DB systems
Web browser interface enables a user to access data anywhere in the world
: universal access arch is introduced based on Internet standards
web-based DB – accessible to internal & external users
three-tier client/server configuration
components : a. client computer – user submits a request by HTML-formatted page thru HTTP
b. web server – equipped w/ program called a server-side extension
= web-to-DB middleware : understand, validate, process DB queries
using CGI / API protocol
functionality of web browser can be enhanced by adding client-side extensions
: plug-ins, Java, JavaScript, ActiveX, VBScript
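The job of the web-to-DB middleware (understand, validate, process) can be sketched as one function. This is an illustration only : the function name, the `lu_code` request parameter and the allowed values are assumptions, sqlite3 stands in for the DB engine, and no real web server or CGI / API layer is involved.

```python
import html
import sqlite3
from urllib.parse import parse_qs

ALLOWED_CODES = {"agr", "res"}  # assumed list of valid land use codes


def handle_request(conn, query_string):
    """Turn the query string of an HTTP request into an HTML page of results."""
    params = parse_qs(query_string)                 # understand the request
    codes = params.get("lu_code", [])
    if len(codes) != 1 or codes[0] not in ALLOWED_CODES:
        return 400, "<html><body><p>Invalid request</p></body></html>"  # validate
    rows = conn.execute(                            # process the DB query
        "SELECT parcel_id, area FROM lu_2002 WHERE lu_code = ? ORDER BY parcel_id",
        (codes[0],),
    ).fetchall()
    items = "".join(f"<li>{html.escape(str(pid))}: {area}</li>" for pid, area in rows)
    return 200, f"<html><body><ul>{items}</ul></body></html>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lu_2002 (parcel_id TEXT PRIMARY KEY, lu_code TEXT, area REAL)")
conn.executemany(
    "INSERT INTO lu_2002 VALUES (?, ?, ?)",
    [("12322", "res", 4500.0), ("12323", "agr", 900.0)],
)

status_ok, page = handle_request(conn, "lu_code=agr")
status_bad, _ = handle_request(conn, "lu_code=agr' OR '1'='1")
```

Validating the parameter and passing it separately from the SQL text keeps a request from an external user from being executed as SQL.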
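Figure 2-10. Arch of a typical web-based DB
(diagram : numbered steps 1-9 over a TCP/IP network ; the web server on the server computer passes an HTML / script page request to the web-to-DB middleware, which queries the DB engine & DB of the database system and returns the result as an HTML page)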
5. Data structure
5.1 Logical data structure
how data are organized in a DB for optimal performance & ease of administration
ex. Oracle Optimal Flexible Arch
: logical structure
table space – data file - dbf
TBLSPC-1 consists of two datafiles, Datafile-1 and Datafile-2, and the table d01.dbf is allocated to Datafile-1
TBLSPC-2 consists of one datafile, Datafile-3, which houses two tables, d22.dbf and d45.dbf
Figure 2-11. Relationship between tablespace, data files & tables in Oracle's optimal flexible arch(OFA) logical DB structure
5.2 Physical data structure
actual organization & placement of data files in the DB
dependent on the model of its DB system ex. relational - values of attributes are stored in a table in a certain data type
* data types
character / string data type
numeric data type
date data type
others including BLOB & user-defined abstract data type(ADT)
5.3 DB indexing
an index = an element of data structure, used to speed up access to a specific part of the DB
many indexing methods have been proposed
ex. B-tree – the most commonly used form
use case ) users issue a command to index a column → a row identification (ROWID) column + an index table are created
index table : root block - branch block - leaf block structure
since all leaves are at the same depth, all retrievals require the same amount of I/O
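The root - branch - leaf structure and the "same amount of I/O" property can be shown w/ a simplified sketch. It builds a static, read-only index bottom-up from sorted keys ; a real B-tree also splits & merges blocks on insert and delete, which is left out here, and the function names & fanout of 4 are assumptions.

```python
from bisect import bisect_right


def build_index(entries, fanout=4):
    """Build index levels from (key, rowid) pairs; levels[0] holds the leaf blocks."""
    entries = sorted(entries)
    if not entries:
        raise ValueError("cannot index an empty table")
    # Leaf blocks hold the actual (key, rowid) pairs.
    level = [entries[i:i + fanout] for i in range(0, len(entries), fanout)]
    levels = [level]
    while len(level) > 1:
        # Each parent block stores (smallest key in child, child position) pairs.
        refs = [(block[0][0], pos) for pos, block in enumerate(level)]
        level = [refs[i:i + fanout] for i in range(0, len(refs), fanout)]
        levels.append(level)
    return levels  # levels[-1] contains the single root block


def lookup(levels, key):
    """Return (rowid or None, number of index blocks read)."""
    block_reads = 0
    pos = 0  # the root is the only block on the top level
    for depth in range(len(levels) - 1, 0, -1):
        block = levels[depth][pos]
        block_reads += 1
        keys = [k for k, _ in block]
        # Follow the last child whose smallest key is <= the search key.
        pos = block[max(bisect_right(keys, key) - 1, 0)][1]
    block_reads += 1  # finally read the leaf block itself
    for k, rowid in levels[0][pos]:
        if k == key:
            return rowid, block_reads
    return None, block_reads


# 50 even keys, each mapped to a hypothetical row identifier.
levels = build_index([(k, f"row{k}") for k in range(0, 100, 2)])
found = lookup(levels, 42)
missing = lookup(levels, 43)
```

W/ 50 keys and 4 entries per block the index has three levels, so every lookup, successful or not, reads exactly three blocks.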
Figure 2-12. A B-tree index
(diagram : index file w/ a root (header) block, branch blocks & leaf blocks covering key ranges 0000-9999 ; each leaf entry pairs a key w/ a ROWID that points to a row of the data file, whose columns are ROWID, FID, Attribute 1, Attribute 2, ...)