ITEC313 Database Programming
Lecture 1: Database Design Methodology : Introduction
Learning Objectives
• Database Design Terminology
• Purpose of Database Design
• Phases of Database Design
Database and Database System
• A database is a shared collection of logically related data designed to meet the information needs of an organization.
• Components of a Database System
  – Database
  – Hardware
  – Software (DBMS)
  – Users
Database
• The data in the database is expected to be both integrated and shared, particularly on multi-user systems
• Integration - the database may be thought of as a unification of several otherwise distinct files, with any redundancy among these files eliminated
• Shared - individual pieces of data in the database may be shared among several different users
Hardware
The hardware consists of the secondary storage devices on which the database physically resides, together with the associated I/O devices, device controllers, etc.
DBMS
Examples of DBMS products:
• Oracle
• Informix
• Access
• DB2
• FoxPro
• dBase
• SQL Server
• MySQL
Typical Functions of a DBMS
• Data storage, retrieval and update
• A user-accessible catalog
• Transaction support
• Concurrency control services
• Recovery services
• Authorization services
• Support for data communication
• Integrity services
• Services to promote data independence
• Utility services
Users
• Application Programmer - writes programs that use the database
• Database Designer - designs the conceptual and logical database
• Database Administrator (DBA)
• Data Administrator
• End user - interacts with the system from an on-line terminal by using query languages etc.
Data & Database Administration
• Data Administrator – a business manager responsible for controlling the overall corporate data resources
• Database Administrator (DBA) - a technical person responsible for development of the total system
Sample Applications
• Student records
• Banking
• Insurance
• Billing systems, e.g. electricity, phone
• ISPs
• Accounting systems
• Reservation systems, e.g. airline, hotel
• Medical records
• Stock control
• Personnel systems
• Product catalogues
• Telephone directories
• Train timetables
• Airline bookings
• Credit card details
• Customer histories
• Stock market prices
• Discussion boards
• Web indexes
• Library catalogues
Advantages
• Control of data redundancy
• Data consistency
• Multipurpose use of data
• Sharing of data
• Enforcement of standards
• Economy of scale
• Balancing of conflicting user requirements
• Improved data accessibility and responsiveness
• Increased productivity
• Improved maintenance through data independence
• Increased concurrency
• Improved backup and recovery services
Disadvantages
• Complexity
• Size
• Cost of DBMS
• Additional hardware costs
• Cost of conversion
Data Independence
• Software maintenance is a large part (50%) of information system budgets
• Reduce impact of changes by separating database description from applications
• Change database definition with minimal effect on applications that use the database
Three Schema Architecture
Database Architecture
External Level – concerned with the way users perceive the database
Conceptual Level – concerned with abstract representation of the database in its entirety
Internal Level – concerned with the way data is actually stored
Differences among Levels
• External
  – Course registration form
  – Instructor load assignments
• Conceptual
  – Tables: student, course, takes, …
• Internal
  – Files needed to store the tables
  – Extra files to improve performance
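The three levels can be sketched with Python's built-in sqlite3 module. This is only an illustration: the column names, the view name registration_form and the index name idx_takes_course are assumptions, not part of the lecture. The tables stand for the conceptual level, the view for one external view, and the index for an internal-level "extra file" that only affects performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: tables describing the database in its entirety.
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE takes (
        student_id INTEGER REFERENCES student(student_id),
        course_id  TEXT REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    -- Internal level: an extra structure that exists only for performance.
    CREATE INDEX idx_takes_course ON takes(course_id);

    -- External level: the data as one user group (a registration form) sees it.
    CREATE VIEW registration_form AS
        SELECT s.name AS student, c.title AS course
        FROM takes t
        JOIN student s ON s.student_id = t.student_id
        JOIN course  c ON c.course_id  = t.course_id;

    INSERT INTO student VALUES (1, 'Ayse');
    INSERT INTO course  VALUES ('ITEC313', 'Database Programming');
    INSERT INTO takes   VALUES (1, 'ITEC313');
""")

# The user of the external view never mentions tables or indexes.
rows = conn.execute("SELECT student, course FROM registration_form").fetchall()
print(rows)
```

The query at the end refers only to the view, which is the point of the external level: users see their own perception of the data, not how it is organised or stored.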
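Architecture of a Database System
[Figure: Applications 1, 2 and 3 access the Database through the DBMS, which spans the External, Conceptual and Internal Levels. Logical Data Independence is marked between the External and Conceptual Levels, and Physical Data Independence between the Conceptual and Internal Levels.]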
Data Independence
• Logical Data Independence – users and user programs are independent of the logical structure of the database
• Physical Data Independence – the separation of structural information about the data from the programs that manipulate and use the data, i.e. the immunity of application programs to changes in the storage structure and access strategy
Data Independence
• Different applications need different views of the same data; a part of the database that an application is not interested in need not be included in its view. This feature is also important for controlling access to parts of the database
• The DBA must have the freedom to change the storage structure or access strategy in response to changing requirements, without having to modify the existing applications
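A minimal sketch of both kinds of independence, assuming a hypothetical view student_names that the application reads. The table and column names are invented for illustration. The base table is restructured and an index is added, yet the application function is not modified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO student VALUES (1, 'Ali Kaya');
    CREATE VIEW student_names AS
        SELECT student_id, full_name AS name FROM student;
""")

def application_query(connection):
    """The application knows only the view, never the base tables."""
    return connection.execute(
        "SELECT name FROM student_names ORDER BY student_id"
    ).fetchall()

before = application_query(conn)

# Logical change: the name is now stored in two columns of a new table.
# The view is redefined so that it presents the same shape as before.
conn.executescript("""
    DROP VIEW student_names;
    CREATE TABLE student_v2 (
        student_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
    );
    INSERT INTO student_v2 VALUES (1, 'Ali', 'Kaya');
    DROP TABLE student;
    CREATE VIEW student_names AS
        SELECT student_id, first_name || ' ' || last_name AS name
        FROM student_v2;
""")

# Physical change: a new access path; the application is again untouched.
conn.execute("CREATE INDEX idx_student_last ON student_v2(last_name)")

after = application_query(conn)
print(before == after)
```

Only the view definition, which is the DBA's responsibility, had to change; application_query ran unmodified before and after.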
Client-Server Architecture
[Figure: three client-server configurations]
a) Client, server, and database on the same computer
b) Multiple clients and one server on different computers
c) Multiple servers and databases on different computers
Database Development
• In the past, many software development projects were unsuccessful because:
  – requirements were not properly collected or specified
  – a development methodology was lacking
• The stages in the DB development cycle have been identified. They are:
  – clearly specified
  – not strictly sequential, but involve some repetition
  – connected by feedback loops (even back to the requirements stage)
Db Development Life Cycle
• Database planning
• System definition
• Requirement collection and analysis
• Database design
• DBMS selection
• Application design
• Prototyping
• Implementation
• Data conversion and loading
• Testing
• Operational maintenance
Database Design
[Figure: Database Application Lifecycle. Stages shown: Database Planning, Systems Definition, Requirements Analysis, Conceptual Design, Logical Design, Distributed DB Design, Physical Design, DBMS Selection, Application Design, Prototyping, Implementation, Data Loading, Testing, Maintenance. One part of the figure is labelled Optional.]
Database Application Lifecycle
• Database Planning: management activities that allow the stages of the database application to be realized as efficiently as possible
• System Definition: the scope and boundaries of the application, including its major application areas and user groups
• Requirements Analysis: encompasses tasks that determine the needs or conditions to meet for a new or altered product, taking account of the possibly conflicting, vague and incomplete requirements of the various stakeholders
• Application Design: design of the user interface and the application programs that use and process the database
• Prototyping: building a working model of a database application
• Implementation: physical realization of the database and application design
• Data Conversion and Loading: transferring any existing data into the new database and converting any existing processes to run on the new database
• Testing: process of executing the application programs with the intent of finding errors
• Operational Maintenance: process of monitoring and maintaining the system following installation
Planning
Planning Factors
• The work to be done
• The resources to do it
• The cost
Planning Objectives
• Organisational units: consist of various departments
• Locations: list of operational locations
• Business functions: identify related business processes
• Entity types: something for which data is collected
Two stages
System Definition
• Identify boundaries
  – Want to know at a very high level what the boundaries of the system are, e.g.
    current users, current application areas
• Identify interfaces within the organization
Requirements Analysis
• Database design should reflect the information within the organisation
• Many ways of gathering information
  – interviewing
  – observing
  – examining documents
  – using questionnaires
  – using experience from the design of other systems
  – …
• Critical information
  – Main application areas and user groups
  – Documentation used
  – Details of transactions needed
• A prioritized user requirement specification
• Amount gathered depends on size of organization and scope of application
• Documentation is VERY important
  – DFD, matrices etc.
• Identifying the required functionality for a database system is crucial
  – systems with inadequate functionality will fail
Database Design
MAIN AIMS
• To represent data & relationships required by users and applications
• To provide a data model which supports transactions
• To specify a design that meets performance requirements
Database Design Approaches
• Bottom-up: begins at the level of attributes and then adds entities as new relationships are seen. Normalization is an example of this.
• Top-down: starts with the development of a data model that contains a few high-level entities and then refines them in ever-increasing detail. Data modeling comes under this.
Phases of database Design
• Remember the main phases:
  – Conceptual Database Design
  – Logical Database Design
  – Distributed Database Design (optional)
  – Physical Database Design
Conceptual Database Design
• Create a conceptual data model
  – Use data modeling to understand
    each user's perspective of the data, the data itself, and the use of data across applications
• Independent of any implementation details
  – DBMS or physical aspects are immaterial
• Based on user requirements specification
  – assists in understanding data
  – facilitates communication
Logical database design
• The data model created in the previous phase is refined
• At this point you know
  – which type of DBMS you will be implementing in, e.g. relational, object-oriented …
  – but not the actual DBMS
• Test the correctness of the data model through
  – Normalization
  – Validation against user transactions
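One way to make the normalization check concrete is to compute attribute closures over functional dependencies. The sketch below is illustrative only: the relation takes(student_id, course_id, student_name, grade) and its dependencies are hypothetical, and closure is a helper written for this example, not part of any DBMS.

```python
def closure(attributes, dependencies):
    """Return every attribute determined by `attributes` under `dependencies`.

    `dependencies` is a list of (left_side, right_side) tuples of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Hypothetical relation and functional dependencies.
all_attrs = {"student_id", "course_id", "student_name", "grade"}
fds = [
    (("student_id",), ("student_name",)),
    (("student_id", "course_id"), ("grade",)),
]

key = ("student_id", "course_id")
is_key = closure(key, fds) == all_attrs

# student_name is determined by only part of the key: a partial dependency,
# so the relation as given would not be in second normal form.
partial = "student_name" in closure(("student_id",), fds)
print(is_key, partial)
```

A result like this would suggest moving student_name into a separate student relation keyed by student_id.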
Database selection
A crucial stage in the database application lifecycle is choosing the DBMS.
The aim is to choose a system that
• allows expansion
• enables speedy retrieval
• gives easy application development etc.
All data should have been collected and documented before DBMS selection.
Many organizations in practice choose a DBMS purely on the basis of cost.
Steps:
• Define terms of reference
  – the scope of the study should be stated
  – potential list of the products to be assessed
  – the criteria to be used, timescales …
• Identify products
  – hardware
  – compatibility with existing systems
  – cost ..
  – user support
  – upgrades …
• Produce shortlist of products
  – Shortlist 2-3 products
• Evaluate products
  – Ask vendors
  – Involve users
• Recommend selection and produce report
  – Give details of criteria used
  – Compare/contrast alternatives
Physical Database Design
HOW to physically implement the logical data model
  – derive tables & constraints
  – identify storage structures and access methods
  – design security features
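As a rough sketch of "derive tables & constraints" and "identify access methods", the example below creates a hypothetical account table in SQLite, adds a secondary index, and asks the engine which access path it would choose. The table, column and index names are assumptions made for illustration, and the plan text printed differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table and constraints derived from the logical model.
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        owner      TEXT NOT NULL,
        branch     TEXT NOT NULL
    );
    -- Access method chosen for lookups by branch.
    CREATE INDEX idx_account_branch ON account(branch);
""")

# EXPLAIN QUERY PLAN reports how SQLite intends to execute the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT owner FROM account WHERE branch = ?",
    ("Famagusta",),
).fetchall()

# The last column of each plan row is the human-readable detail.
plan_text = " ".join(str(row[-1]) for row in plan)
uses_index = "idx_account_branch" in plan_text
print(plan_text)
```

Checking the plan is one simple way to confirm that an index chosen during physical design is actually used by the transactions it was meant to support.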
Application Design
Design of software programs which will process the data
• Design transactions
  – data to be used by transactions
  – functions of the transactions
  – output of transactions
  – programs
• Design human interface
  – Various guidelines
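A transaction design names the data used, the function performed and the output. The sketch below shows one possible shape for a funds-transfer transaction using sqlite3. The account table, the CHECK rule and the transfer function are hypothetical; the point is only that both updates succeed or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(connection, source, target, amount):
    """Move `amount` between accounts; return True if committed."""
    try:
        # The connection context manager commits on success
        # and rolls back if an exception escapes the block.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused a negative balance.
        return False

ok = transfer(conn, 1, 2, 30)         # enough funds
rejected = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = conn.execute(
    "SELECT id, balance FROM account ORDER BY id"
).fetchall()
print(ok, rejected, balances)
```

In the failed call the credit to account 2 is undone together with the rejected debit, so the balances reflect only the first transfer.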
Prototyping
• Used to check
  – developer’s understanding of what is required
  – interpretation of requirements
• Building a working model
• Inexpensive & quick to build
Implementation
• Database created using DDL
• Implement application programs using selected language
• Implement security & integrity controls
Data Loading/Conversion
• Transfer any existing data
• Insert any new data
• Usually there is a facility within the DBMS to load data into a database
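Where no bulk-load facility is used, existing data can be converted and loaded by a short script. The sketch below assumes a hypothetical legacy export in CSV form (held in memory here so the example is self-contained) and a hypothetical student table.

```python
import csv
import io
import sqlite3

# Stand-in for a legacy export file; a real script would open a file instead.
legacy_csv = io.StringIO("student_id,name\n1,Ayse\n2,Mehmet\n")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Conversion step: turn text fields into the types the new schema expects.
reader = csv.DictReader(legacy_csv)
rows = [(int(r["student_id"]), r["name"].strip()) for r in reader]

# Loading step: insert all rows in one transaction.
with conn:
    conn.executemany("INSERT INTO student VALUES (?, ?)", rows)

loaded = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(loaded)
```

Loading inside a single transaction means a bad row leaves the table empty rather than half-filled, which makes a failed load easier to repeat.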
Testing
• The process of executing the application programs with the intention of finding errors
  – Use realistic data
  – Involve users
• There are various strategies that can be used:
  – White box testing
  – Black box testing
Maintenance
• Monitoring performance
  – Various tools are available
• Maintaining and upgrading
Overview of Database Design
Purpose of Data Modeling
• Assist in understanding the semantics of data
• Facilitate communication about information requirements
Criteria for Optimal Data Models
Shareability
Diagrammatic Representation
Extensibility
Expressibility
Structural Validity
Nonredundancy
Integrity
Simplicity
Database Design Methodology
• A structured approach that uses procedures, techniques, tools and documentation aids to support and facilitate the process of design
[Figure labels: Interaction with users; Structured methodology; Data-driven approach; Structural and integrity considerations; Data dictionary; Validate; Diagrams; DBDL; Repeat]
Broad Goals of Database Development
• Develop a common vocabulary
• Define data meaning
• Ensure data quality
• Provide efficient implementation
Develop a Common Vocabulary
• Diverse groups of users
• Difficult to obtain acceptance of a common vocabulary
• Compromise to find least objectionable solution
• Unify organization by establishing a common vocabulary
Define Meaning of Data
• Business rules support organizational policies
• Restrictiveness of business rules
  – Too restrictive: reject valid business interactions
  – Too loose: allow erroneous business interactions
• Exceptions allow flexibility
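The effect of restrictiveness can be sketched by expressing a business rule as a CHECK constraint. The rule used here, that a course carries between 1 and 6 credits, is a hypothetical policy invented for the example, as are the helper accepts and the three candidate constraints.

```python
import sqlite3

def accepts(check_clause, credits):
    """Return True if a course row with `credits` passes the given CHECK rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE course ("
        " code TEXT PRIMARY KEY,"
        f" credits INTEGER CHECK ({check_clause}))"
    )
    try:
        conn.execute("INSERT INTO course VALUES ('X', ?)", (credits,))
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()

# Assumed policy: valid courses carry 1 to 6 credits.
too_restrictive = "credits = 3"
too_loose = "credits IS NOT NULL"
reasonable = "credits BETWEEN 1 AND 6"

rejects_valid = not accepts(too_restrictive, 4)   # a valid 4-credit course is refused
allows_error = accepts(too_loose, -2)             # a negative credit value slips through
balanced = accepts(reasonable, 4) and not accepts(reasonable, -2)
print(rejects_valid, allows_error, balanced)
```

The first two rules fail in opposite directions; the third encodes the assumed policy directly, which is usually the aim when defining the meaning of data.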
Data Quality
• Poor data quality leads to poor decision making
  – Difficult customer communication
  – Inventory shortages
• Cost-benefit tradeoff to achieve desired level of data quality
• Long-term effects of poor data quality
Data Quality Measures
• Completeness
• Lack of ambiguity
• Timeliness
• Correctness
• Consistency
• Reliability
Data Quality Measures
• Completeness: database represents all important parts of an information system
• Lack of ambiguity: each part of a database has only one meaning
• Timeliness: business changes are posted to a database without excessive delays
• Correctness: database contains values perceived by the user
• Consistency: different parts of a database do not conflict
• Reliability: failures or interference do not corrupt database
Importance of each measure depends on the database, system, and organization
Each measure can be quantified
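As one possible way to quantify a measure, the sketch below scores completeness as the share of required fields that are actually filled in. This field-level definition is an assumption chosen for illustration (the lecture defines completeness more broadly), and the customer records are made up.

```python
def completeness(records, required_fields):
    """Fraction of required field values that are present (not None or empty)."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0  # nothing is required, so nothing is missing
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total

# Hypothetical customer records; the second one is missing contact details.
customers = [
    {"name": "Ayse", "phone": "555-0101", "email": "ayse@example.com"},
    {"name": "Mehmet", "phone": "", "email": None},
]

score = completeness(customers, ["name", "phone", "email"])
print(round(score, 2))
```

Four of the six required values are present, so the score is 4/6. Tracking such a number over time is one way to weigh the cost of improving data quality against its benefit.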
Efficient Implementation
• Supersedes other goals
• Optimization problem
  – Maximize performance
  – Subject to constraints of data quality, data meaning, and resource usage
• Difficult problem:
  – Number of choices
  – Relationships among choices
  – DBMS specific
Database Development Phases
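[Figure: each phase takes the output of the previous one as input, starting from Data requirements]
• Conceptual Data Modeling: produces the ERD
• Logical Database Design: produces Tables
• Distributed Database Design (OPTIONAL): produces the Distribution Schema
• Physical Database Design: produces the Internal Schema and the Populated DB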
Database Design
• Conceptual database design - the process of constructing a model of the information used in an organization, independent of all physical considerations
Step 1 Build local conceptual data model for each user view
Database Design
• Logical database design for the relational model - the process of constructing a model of the information used in an organization based on a specific data model, but independent of a particular DBMS and other physical considerations
Step 2 Build and validate local logical data model for each user view
Step 3 Build and validate global logical data model
Database Design
• Physical database design for relational databases - the process of producing a description of the implementation of the database on secondary storage
Step 4 Translate global logical data model for target DBMS
Step 5 Design physical representation
Step 6 Design security mechanisms
Step 7 Monitor and tune the operational system
Phases of Database Design
• Conceptual Database Design: process of constructing a model of the information used in an enterprise, independent of all physical considerations
• Logical Database Design: process of constructing a model of the information used in an enterprise based on a specific data model, but independent of a particular DBMS or any other physical considerations
• Distributed Database Design (optional): process of deciding on the placement of data across the sites of a computer network. Involves designing the network itself, as well as the distribution of DBMS software, DB applications and data
• Physical Database Design: description of the implementation of the database on secondary storage. It describes the storage structures and access methods for efficient access
Overview of Database Design
Conceptual
  1. Build local conceptual data model for each user view
Logical
  2. Build and validate local logical data model for each user view
  3. Build and validate global logical data model
Physical
  4. Translate global logical data model for target DBMS
  5. Design physical representation
  6. Design security mechanisms
  7. Monitor and tune operational system
Centralized Approach to Managing Multiple User Views
Pearson Education © 2009
View Integration Approach to Managing Multiple User Views
Conceptual Database Design
1. Build local conceptual data model for each user view
  1.1 Identify entity types
  1.2 Identify relationship types
  1.3 Identify and associate attributes with entity or relationship types
  1.4 Determine attribute domains
  1.5 Determine candidate and primary key attributes
  1.6 Specialize/generalize entity types
  1.7 Draw Entity-Relationship diagram
  1.8 Review local conceptual data model with user
Logical Database Design
2. Build and validate local logical data model
  2.1 Map local conceptual data model to local logical data model
  2.2 Derive relations from local logical data model
  2.3 Validate model using normalization
  2.4 Validate model against user transactions
  2.5 Draw Entity-Relationship diagram
  2.6 Define integrity constraints
  2.7 Review local logical data model with user
Logical Database Design
3. Build and validate global logical data model
  3.1 Merge local logical data models into global model
  3.2 Validate global logical data model
  3.3 Check for future growth
  3.4 Draw final Entity-Relationship diagram
  3.5 Review global logical data model with users
Physical Database Design
4. Translate global logical data model for target DBMS
  4.1 Design base relations for target DBMS
  4.2 Design enterprise constraints for target DBMS
5. Design physical representation
  5.1 Analyze transactions
  5.2 Choose file organizations
  5.3 Choose secondary indexes
  5.4 Consider introduction of controlled redundancy
6. Design security mechanisms
  6.1 Design user views
  6.2 Design access rules
7. Monitor and tune operational system
END OF LECTURE