7/30/2019 FIT328 06 Physical Database Design and Performance
LECTURE 6
PHYSICAL DATABASE DESIGN AND PERFORMANCE
[FIT328] Database Systems
Credits
Modern Database Management, 8th edition
by: Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
© 2007 by Prentice Hall
Objectives
Definition of terms
Describe the physical database design process
Choose storage formats for attributes
Select appropriate file organizations
Describe three types of file organization
Describe indexes and their appropriate use
Translate a database model into efficient structures
Know when and how to use denormalization
Physical Database Design
Purpose: translate the logical description of data into the technical specifications for storing and retrieving data
Goal: create a design for storing data that will provide adequate performance and ensure database integrity, security, and recoverability
Physical Design Process
Inputs:
Normalized relations
Volume estimates
Attribute definitions
Response time expectations
Data security needs
Backup/recovery needs
Integrity expectations
DBMS technology used
Leads to
Decisions:
Attribute data types
Physical record descriptions (doesn't always match logical design)
File organizations
Indexes and database architectures
Query optimization
Figure 6-1 Composite usage map (Pine Valley Furniture Company)
Figure 6-1 Composite usage map (Pine Valley Furniture Company) (cont.)
Access frequencies (per hour)
Figure 6-1 Composite usage map (Pine Valley Furniture Company) (cont.)
Usage analysis:
140 purchased parts accessed per hour
80 quotations accessed from these 140 purchased part accesses
70 suppliers accessed from these 80 quotation accesses
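The usage analysis above reduces to simple ratios. A minimal sketch, assuming the three hourly figures from Figure 6-1 are the only inputs; the variable names are illustrative and not part of the source:

```python
# Hourly access counts read off the composite usage map (Figure 6-1).
purchased_part_accesses = 140
quotation_accesses = 80      # reached from the 140 purchased-part accesses
supplier_accesses = 70       # reached from the 80 quotation accesses

# Fraction of accesses on one entity that continue on to the next entity.
quotations_per_part_access = quotation_accesses / purchased_part_accesses
suppliers_per_quotation_access = supplier_accesses / quotation_accesses

print(f"{quotations_per_part_access:.2f} quotation accesses per purchased-part access")
print(f"{suppliers_per_quotation_access:.3f} supplier accesses per quotation access")
```

A high ratio between two entities may suggest they are worth considering together when choosing file organizations or denormalizing.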
Designing Fields
Field: smallest unit of data in database
Field design
Choosing data type
Coding, compression, encryption
Controlling data integrity
Choosing Data Types
CHAR: fixed-length character
VARCHAR2: variable-length character (memo)
LONG: large variable-length character data (up to 2 GB in Oracle)
NUMBER: positive/negative number
INTEGER: positive/negative whole number
DATE: actual date
BLOB: binary large object (good for graphics, sound clips, etc.)
Figure 6-2 Example code look-up table (Pine Valley Furniture Company)
Code saves space, but costs an additional lookup to obtain actual value
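The trade-off can be sketched in a few lines. The codes, values, and names below are hypothetical and are not taken from Figure 6-2:

```python
# Hypothetical look-up table mapping short codes to full values.
FINISH_CODES = {"A": "Birch", "B": "Maple", "C": "Oak"}

# Product records store only the one-character code, which saves space...
products = [
    {"product_no": "B100", "finish": "C"},
    {"product_no": "B120", "finish": "A"},
]

def finish_name(product):
    # ...but reading the actual value costs one extra lookup per record.
    return FINISH_CODES[product["finish"]]

for p in products:
    print(p["product_no"], finish_name(p))
```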
Field Data Integrity
Default value: assumed value if no explicit value
Range control: allowable value limitations (constraints or validation rules)
Null value control: allowing or prohibiting empty fields
Referential integrity: range control (and null value allowances) for foreign-key to primary-key match-ups
Sarbanes-Oxley Act (SOX) legislates importance of financial data integrity
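The four field-level controls can be tried out with Python's built-in sqlite3 module. This is a sketch only: SQLite stands in for the DBMS, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL                      -- null value control
                REFERENCES customer(customer_id),     -- referential integrity
    quantity    INTEGER NOT NULL
                CHECK (quantity BETWEEN 1 AND 1000),  -- range control
    status      TEXT NOT NULL DEFAULT 'OPEN'          -- default value
);
""")

conn.execute("INSERT INTO customer (customer_id) VALUES (1)")
conn.execute("INSERT INTO customer_order (order_id, customer_id, quantity) VALUES (10, 1, 5)")
status = conn.execute("SELECT status FROM customer_order WHERE order_id = 10").fetchone()[0]

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

range_rejected = violates("INSERT INTO customer_order VALUES (11, 1, 0, 'OPEN')")
null_rejected = violates("INSERT INTO customer_order (order_id, quantity) VALUES (12, 5)")
fk_rejected = violates("INSERT INTO customer_order VALUES (13, 99, 5, 'OPEN')")
print(status, range_rejected, null_rejected, fk_rejected)
```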
Physical Records
Physical Record: A group of fields stored in adjacent memory locations and retrieved together as a unit
Page: The amount of data read or written in one I/O operation
Blocking Factor: The number of physical records per page
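A blocking factor can be estimated with integer division. The sketch below assumes fixed-length records that may not span pages and ignores any per-page overhead; the page and record sizes are hypothetical.

```python
def blocking_factor(page_size_bytes, record_size_bytes):
    # Records are assumed fixed-length and not allowed to span pages,
    # so only whole records count.
    return page_size_bytes // record_size_bytes

page_size = 4096      # hypothetical page size in bytes
record_size = 300     # hypothetical physical record size in bytes

bf = blocking_factor(page_size, record_size)
wasted = page_size - bf * record_size   # bytes left unused in each page
print(bf, wasted)
```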
Denormalization
Transforming normalized relations into unnormalized physical record specifications
Benefits:
Can improve performance (speed) by reducing number of table lookups (i.e., reduce number of necessary join queries)
Costs (due to data duplication):
Wasted storage space
Data integrity/consistency threats
Common denormalization opportunities:
One-to-one relationship (Fig. 6-3)
Many-to-many relationship with attributes (Fig. 6-4)
Reference data (1:N relationship where 1-side has data not used in any other relationship) (Fig. 6-5)
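The one-to-one case can be sketched with sqlite3. The tables, columns, and data are hypothetical; the point is only that the denormalized table answers the same question without a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: two relations in a one-to-one relationship.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, campus_address TEXT);
CREATE TABLE application (application_id INTEGER PRIMARY KEY,
                          student_id INTEGER UNIQUE REFERENCES student(student_id),
                          qualifications TEXT);
INSERT INTO student VALUES (1, '12 Elm Hall');
INSERT INTO application VALUES (500, 1, 'Honors');

-- Denormalized: one physical record holds both sets of fields.
CREATE TABLE student_denorm (student_id INTEGER PRIMARY KEY, campus_address TEXT,
                             application_id INTEGER, qualifications TEXT);
INSERT INTO student_denorm VALUES (1, '12 Elm Hall', 500, 'Honors');
""")

# Normalized design: a join is needed to assemble the full record.
joined = conn.execute("""
    SELECT s.student_id, s.campus_address, a.application_id, a.qualifications
    FROM student s JOIN application a ON a.student_id = s.student_id
    WHERE s.student_id = 1
""").fetchone()

# Denormalized design: a single-table lookup returns the same record.
single = conn.execute(
    "SELECT student_id, campus_address, application_id, qualifications "
    "FROM student_denorm WHERE student_id = 1"
).fetchone()
print(joined == single)
```

In the merged table, the application fields would be null for any student without an application.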
Figure 6-3 A possible denormalization situation: two entities with one-to-one relationship
Figure 6-4 A possible denormalization situation: a many-to-many relationship with nonkey attributes
Extra table access required
Null description possible
Figure 6-5 A possible denormalization situation: reference data
Extra table access required
Data duplication
Partitioning
Horizontal Partitioning: Distributing the rows of a table into several separate files
Useful for situations where different users need access to different rows
Three types: Key Range Partitioning, Hash Partitioning, or Composite Partitioning
Vertical Partitioning: Distributing the columns of a table into several separate relations
Useful for situations where different users need access to different columns
The primary key must be repeated in each file
Combinations of Horizontal and Vertical
Partitions often correspond with User Schemas (user views)
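The three horizontal partitioning types differ only in how a row is assigned to a partition. A minimal sketch, with hypothetical partition names, key columns, and boundaries:

```python
import zlib

def key_range_partition(order_year):
    # Key range partitioning: each partition covers a contiguous range of key values.
    if order_year < 2018:
        return "orders_before_2018"
    elif order_year < 2019:
        return "orders_2018"
    return "orders_2019_onward"

def hash_partition(customer_id, n_partitions=4):
    # Hash partitioning: a hash of the key tends to spread rows across partitions.
    # zlib.crc32 is used here only because it is deterministic across runs.
    return zlib.crc32(str(customer_id).encode()) % n_partitions

def composite_partition(order_year, customer_id):
    # Composite partitioning: range first, then hash within each range.
    return (key_range_partition(order_year), hash_partition(customer_id))

print(key_range_partition(2018), hash_partition(42), composite_partition(2020, 42))
```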
Data Replication
Purposely storing the same data in multiple locations of the database
Improves performance by allowing multiple users to access the same data at the same time with minimum contention
Sacrifices data integrity due to data duplication
Best for data that is not updated often
Designing Physical Files
Physical File: A named portion of secondary memory allocated for the purpose of storing physical records
Tablespace: named set of disk storage elements in which physical files for database tables can be stored
Extent: contiguous section of disk space
Constructs to link two pieces of data:
Sequential storage
Pointers: field of data that can be used to locate related fields or records
Figure 6-6 Physical file terminology in an Oracle environment
Figure 6-7a Sequential file organization
Records of the file (1, 2, ..., n) are stored in sequence by the primary key field values
If not sorted: average time to find desired record = n/2
If sorted: every insert or delete requires resort
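The n/2 figure can be checked by counting how many records a linear scan examines. A sketch assuming an unsorted file of n distinct keys and one successful search per key; the function name is illustrative:

```python
import random

def sequential_search(records, key):
    """Scan records in stored order; return the number of records examined."""
    for examined, record_key in enumerate(records, start=1):
        if record_key == key:
            return examined
    return len(records)  # key absent: every record was examined

n = 1000
records = list(range(1, n + 1))
random.Random(0).shuffle(records)        # stored order is not sorted

# Average over one successful search for each key in the file.
average = sum(sequential_search(records, k) for k in records) / n
print(average)   # (n + 1) / 2, i.e. roughly n/2
```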
Indexed File Organizations
Index: a separate table that contains organization of records for quick retrieval
Primary keys are automatically indexed
Oracle has a CREATE INDEX operation, and MS ACCESS allows indexes to be created for most field types
Indexing approaches:
B-tree index, Fig. 6-7b
Bitmap index, Fig. 6-8
Hash Index, Fig. 6-7c
Join Index, Fig. 6-9
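The effect of creating an index can be observed with sqlite3. This is a sketch only: SQLite stands in for Oracle or MS Access, and the table, column, and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [(f"Customer {i}", f"City {i % 50}") for i in range(1000)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT name FROM customer WHERE city = 'City 7'"
plan_before = plan(query)    # no index on city yet: the table is scanned

conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
plan_after = plan(query)     # the optimizer can now search via the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, so only the presence of the index name is meaningful here.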
Figure 6-7b B-tree index
Uses a tree search
Average time to find desired record = depth of the tree
Leaves of the tree are all at same level: consistent access time
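Because access time tracks the depth of the tree, a rough depth estimate is useful. The sketch below assumes an idealized tree in which every node holds the same number of entries (the fanout); real B-trees are only partly full, so actual depth may be larger.

```python
def btree_levels(n_keys, fanout):
    """Smallest number of levels such that fanout ** levels >= n_keys."""
    levels = 1
    capacity = fanout
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels

# Hypothetical sizes: one million keys, 100 entries per node.
print(btree_levels(1_000_000, 100))
```

Even for a large file, the number of levels stays small, which is why the search cost grows slowly with the number of records.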
Figure 6-8 Bitmap index organization
Bitmap saves on space requirements
Rows: possible values of the attribute
Columns: table rows
Bit indicates whether the attribute of a row has the value
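A bitmap index of this shape can be built in a few lines. The column data and function names below are hypothetical:

```python
# Hypothetical table column: one attribute value per row (row numbers 0..5).
finish_column = ["Oak", "Birch", "Oak", "Maple", "Birch", "Oak"]

def build_bitmap_index(values):
    # One bitmap per distinct attribute value; bit i is 1 when row i has that value.
    index = {}
    for row, value in enumerate(values):
        index.setdefault(value, [0] * len(values))[row] = 1
    return index

def rows_with(index, value):
    # Row numbers whose bit is set in the bitmap for `value`.
    return [row for row, bit in enumerate(index.get(value, [])) if bit]

bitmap_index = build_bitmap_index(finish_column)
print(bitmap_index["Oak"], rows_with(bitmap_index, "Birch"))
```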
Rules for Using Indexes (cont.)
6. Avoid use of indexes for fields with long values; perhaps compress values first
7. DBMS may have limit on number of indexes per table and number of bytes per indexed field(s)
8. Null values will not be referenced from an index
9. Use indexes heavily for non-volatile databases; limit the use of indexes for volatile databases
Why? Because modifications (e.g., inserts, deletes) require updates to occur in index files
RAID
Redundant Array of Inexpensive Disks
A set of disk drives that appear to the user to be a single disk drive
Allows parallel access to data (improves access speed)
Pages are arranged in stripes
Figure 6-10 RAID with four disks and striping
Here, pages 1-4 can be read/written simultaneously
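Striping can be sketched as a mapping from page number to disk and stripe. A simple round-robin layout is assumed here; the function name is illustrative, and real RAID levels add parity or mirroring that this sketch ignores.

```python
def locate_page(page_no, n_disks=4):
    """Map a 1-based page number to (disk, stripe), both 1-based, round-robin."""
    disk = (page_no - 1) % n_disks + 1
    stripe = (page_no - 1) // n_disks + 1
    return disk, stripe

for page in range(1, 9):
    print(page, locate_page(page))
```

Pages 1-4 land on four different disks in the same stripe, so they can be transferred in parallel; page 5 starts the next stripe on the first disk.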
Figure 6-11 Database Architectures
Legacy Systems
Current Technology
Data Warehouses