Date post: 15-Mar-2020
Page 1: FIT1004 Database Topic 1: Introduction · • Provides advantages over file system management approach – Eliminates inconsistency, data anomalies, data dependency, and structural

www.infotech.monash.edu.au/FIT1004/

FIT1004 Database
Topic 1: Introduction

Learning Objectives:
• Data, Database, DBMS
• Understand the motivation for the Database Approach
• The Database System Environment
• Objectives of Database Technology
• DBMS Functions
• DB Models
• Relational Database Model

References:
• Rob, P. & Coronel, C., Database Systems, 7th Edition, Chapter 1, Chapter 2 - Sections 2.1, 2.3, 2.4


Where We Are

• Introduction to Database Systems
• The Relational Model
• Conceptual Design
• Logical Design
• Normalisation
• Database Lifecycle
• Physical Design
• SQL (DML)
• SQL (DDL & DCL)
• Implementation
• Transaction Management
• Database Administration
• Data Warehousing & Data Mining


Data vs. Information

• Data are raw facts: facts concerning people, places, events, or other objects/concepts.
• Data by themselves are useless unless they are somehow aggregated, organised and prepared in a form convenient for decision making or other organisational activities.
• Data that are processed to reveal their meaning become information, e.g. total sales per quarter.
• A lack of data leads to inadequate information and thus to ill-informed decisions and business failure.
• Data are a valuable corporate resource that needs adequate integrity and security controls.
• Data management is a discipline that focuses on the proper generation, storage and retrieval of data.
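The step from data to information can be illustrated with the "total sales per quarter" example mentioned above. The following is only a sketch: the sales records, the `quarter_of` helper and the variable names are hypothetical and are not part of the lecture material.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw data: one (sale date, amount) pair per sale.
sales = [
    (date(2020, 1, 15), 120.0),
    (date(2020, 2, 3), 80.0),
    (date(2020, 4, 21), 200.0),
    (date(2020, 11, 30), 50.0),
]


def quarter_of(d: date) -> str:
    """Return a label such as '2020-Q1' for the quarter containing d."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


# Processing step: aggregate the raw facts into total sales per quarter.
totals_per_quarter = defaultdict(float)
for sale_date, amount in sales:
    totals_per_quarter[quarter_of(sale_date)] += amount

for label in sorted(totals_per_quarter):
    print(label, totals_per_quarter[label])
```

The individual sales are the data; the per-quarter totals are the kind of processed result a decision maker would actually use.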


Data vs. Information


Database and DBMS

• A database is a shared, integrated computer structure that houses a collection of:
– end user data
– metadata, or data about data, through which the data is integrated and managed
• A database management system (DBMS):
– is a collection of programs that manages the database structure and controls access to the data stored in the database
– contains a query language


Traditional File Systems

Databases are often contrasted with traditional file systems, although these are now rarely used. Because the problems of file systems were the impetus for the development of the “Database Concept”, some of them are worth noting.

• Problems:
– Requires extensive programming in a third-generation language (3GL)
– Data and structural dependence
– Data redundancy


Traditional File Systems

• Requests for information (reports) required a data processing (DP) specialist to write programs for the department that required the report.
• File systems developed to address needs.
• Data was organised in the files according to expected use, which led to islands of information.
• Retrieving data required extensive programming in a third-generation language (3GL); this was time consuming and made ad hoc queries impossible.
• Often the same data was stored in many different locations, e.g. agent details occurred in both the CUSTOMER and AGENT files.


Traditional File Systems

• In the past, as new applications were written, they either used existing files or created a new file for their own use.
• Sometimes several existing files needed to be sorted and merged to obtain the new file. Often several files contained the same information stored in different ways; in other words, there would be redundant and possibly inconsistent data.

• Example of an insurance company file


Traditional File Systems

• Data Dependence
– Changes in a file’s data characteristics require modification of data access programs
– Must tell program what to do and how
– Makes file systems cumbersome from programming and data management views
• Structural Dependence
– Change in file structure requires modification of related programs
• Data Redundancy
– Different and conflicting versions of same data
– Results of uncontrolled data redundancy
> Data anomalies: modification, insertion, deletion


Traditional File Systems

• Data Redundancy (cont)
– Data inconsistency
> Lack of data integrity
• File Terminology
– Field
> group of characters with specific meaning
– Record
> logically connected fields that describe a person, place, or thing
– File
> collection of related records


Traditional File Systems

• Applications were often considered in relative isolation.
• Data that should have been together was not.
• The potential for flexible enquiry and reporting was limited.
• All validations were in the programs.
• Procedures were required for backup and recovery.
• All programmers had access to all records.
• There was limited concurrent access.


Database Systems

• Consists of logically related data stored in a single repository
• Provides advantages over file system management approach
– Eliminates inconsistency, data anomalies, data dependency, and structural dependency problems
– Stores data structures, relationships, and access paths
• The centralised control of data means that for many applications the data already exists.
• The data is no longer related by application programs, but by the structure defined in the database.


Database vs. File Systems


The Database System Environment


Objectives of Database Technology

• Data Independence
• Minimal Data Redundancy
• Increased Data Sharing
• Improved Data Quality
• Improved Security of Data
• Improved Access to Data
• Reduced Program Maintenance
• Inter-relate data through the model


Data Independence

• Is the property of being able to change the logical or physical structure of data without requiring changes to application programs that manipulate that data
• Data is stored independently of the programs
• It can be judged by the degree to which descriptions of data are embedded in application programs.
• Can the database structure be changed with no impact on programs?
• Role of the database catalog or dictionary.
• Without data independence, program maintenance costs are high!


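Logical vs Physical Data Independence

[Figure: three levels. Application program local views sit above the global logical database description, which sits above the physical files. Logical data independence lies between the local views and the global logical description; physical data independence lies between the global logical description and the physical files.]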
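Both kinds of independence can be tried out on a small scale. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS; the `customer_names` function plays the role of an application program, and the table contents and index name are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (CNbr INTEGER PRIMARY KEY, CName TEXT, CAddress TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Smith', '1 Old Road')")


def customer_names(connection):
    """Stand-in for an application program; it names only the columns it needs."""
    return connection.execute(
        "SELECT CNbr, CName FROM customer ORDER BY CNbr"
    ).fetchall()


before = customer_names(conn)

# Logical change: add a column that the program does not use.
conn.execute("ALTER TABLE customer ADD COLUMN CBalance REAL DEFAULT 0")
# Physical change: add an index, which affects storage and access paths only.
conn.execute("CREATE INDEX customer_name_idx ON customer (CName)")

after = customer_names(conn)
print(before == after)
```

The program is unchanged and returns the same rows after both changes, because the description of the data lives in the database catalog rather than in the program.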


Minimal Data Redundancy

• Minimise the duplication of data
– data stored in more than one location

CUSTOMER (CNbr, CName, CAddress)
INVOICE (InvNbr, InvDate, CNbr, InvTotal, CAddress)

CUSTOMER (CNbr, CName, CAddress, Last_InvNbr, Last_InvDate)
INVOICE (InvNbr, InvDate, CNbr, InvTotal)

CUSTOMER (CNbr, CName, CAddress, CBalance)
INVOICE (InvNbr, InvDate, CNbr, InvTotal)
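To see why the first pair of schemas is a problem, the sketch below stores CAddress in both CUSTOMER and INVOICE and then updates only one copy. It uses Python's sqlite3 module as a stand-in DBMS; the row values are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (CNbr INTEGER PRIMARY KEY, CName TEXT, CAddress TEXT);
    -- Redundant design: CAddress is repeated on every invoice.
    CREATE TABLE invoice (InvNbr INTEGER PRIMARY KEY, InvDate TEXT,
                          CNbr INTEGER, InvTotal REAL, CAddress TEXT);
    INSERT INTO customer VALUES (1, 'Smith', '1 Old Road');
    INSERT INTO invoice VALUES (100, '2020-03-01', 1, 250.0, '1 Old Road');
""")

# The customer moves, but only the CUSTOMER copy of the address is updated.
conn.execute("UPDATE customer SET CAddress = '2 New Street' WHERE CNbr = 1")

# The two stored copies now disagree: a modification anomaly.
inconsistent = conn.execute("""
    SELECT i.InvNbr, c.CAddress, i.CAddress
    FROM invoice AS i JOIN customer AS c ON c.CNbr = i.CNbr
    WHERE c.CAddress <> i.CAddress
""").fetchall()
print(inconsistent)
```

With the address held only in CUSTOMER, a single UPDATE would be enough and the two values could not drift apart.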


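Sharing of Data

• The DBMS should support multiple concurrent users of the same data and ensure that the data remains consistent at all times.

[Figure: two transactions, TX 1 and TX 2, update Part 2 at the same time. Both read Part 2 with QOH 10. TX 1 applies QOH=QOH+10 and writes QOH 20; TX 2 applies QOH=QOH-5 and writes QOH 5. Whichever write happens last overwrites the other, so one update is lost.]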


Sharing of Data

Initial state: Part # P1, QOH 20

Trans 1                 Trans 2
Xlock(P1)               Attempt to Lock
Read P1 (20)            Wait for Trans 1
QOH = QOH + 15
Write P1 (35)
Unlock
                        Read P1 (35)
                        QOH = QOH - 10
                        Write P1 (25)

After Trans 1: Part # P1, QOH 35
After Trans 2: Part # P1, QOH 25

To avoid concurrency problems, transactions must be made logically serial. One common technique used is record locking. That is, a transaction can lock a record, preventing update by another transaction, until the update has completed.
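The locking idea can be imitated in a few lines. This is only an analogy, not how a DBMS implements record locks: a Python `threading.Lock` stands in for the exclusive lock on record P1, and the `apply_change` function is a hypothetical name for one transaction's read-modify-write step.

```python
import threading

qoh = {"P1": 20}            # quantity on hand, starting value as in the slide
p1_lock = threading.Lock()  # stands in for the exclusive lock on record P1


def apply_change(part: str, delta: int) -> None:
    """Read, modify and write QOH for a part while holding its lock."""
    with p1_lock:                      # Xlock(P1); another transaction waits here
        current = qoh[part]            # Read P1
        qoh[part] = current + delta    # Write P1
    # Leaving the with-block releases the lock (Unlock).


trans1 = threading.Thread(target=apply_change, args=("P1", 15))
trans2 = threading.Thread(target=apply_change, args=("P1", -10))
trans1.start()
trans2.start()
trans1.join()
trans2.join()

print(qoh["P1"])
```

Because each read-modify-write happens entirely inside the lock, the two updates are applied one after the other in some order, and both orders give the same final quantity.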


Data Integrity

• A condition in which given data always yield the same result
• Validation or integrity rules should be defined and automatically invoked at run time by the DBMS regardless of the source of update, i.e. application program, web page or query language.
• Significant variation exists among DBMSs in the level of support for data integrity.
• ANSI/ISO suggest that 100% of all enterprise rules should be held in the conceptual schema, and specifically none in application programs.
• An area of significant development during the 1990s.
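A rule held in the schema is applied to every update, whatever issues it. The sketch below declares such a rule with a CHECK constraint, using Python's sqlite3 module as a stand-in DBMS; the `part` table and the rule "QOH may never be negative" are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rule: a part's quantity on hand may never be negative.
conn.execute("""
    CREATE TABLE part (
        PartNbr TEXT PRIMARY KEY,
        QOH     INTEGER NOT NULL CHECK (QOH >= 0)
    )
""")
conn.execute("INSERT INTO part VALUES ('P1', 20)")

try:
    # Any source of update (program, web page, ad hoc query) hits the same rule.
    conn.execute("UPDATE part SET QOH = QOH - 50 WHERE PartNbr = 'P1'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

final_qoh = conn.execute(
    "SELECT QOH FROM part WHERE PartNbr = 'P1'"
).fetchone()[0]
print(rejected, final_qoh)
```

The invalid update is refused by the DBMS itself, so no application program has to repeat the validation.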


Security

• Protecting data against accidental or intentional use by unauthorised users
• Each user requires identification with a user-id and password.
• Users can be limited in the data they can see and what actions they can perform on that data.
• The DBMS encrypts and decrypts data as it is stored and retrieved.
• Many DBMSs now provide data-value-sensitive security.
• Views are often used to limit users’ access to data
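As a rough illustration of the last point, a view can expose only some rows and columns of a table. The sketch uses Python's sqlite3 module; the view name `emp_dept30` and the rows are hypothetical. SQLite has no user accounts or GRANT statement, so the step of giving users access to the view instead of the base table is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (EMPNO INTEGER PRIMARY KEY, ENAME TEXT,
                      SAL REAL, DEPTNO INTEGER);
    INSERT INTO emp VALUES (7839, 'KING', 5000, 10);
    INSERT INTO emp VALUES (7698, 'BLAKE', 2850, 30);
    -- Hypothetical view: department 30 staff only, with the salary hidden.
    CREATE VIEW emp_dept30 AS
        SELECT EMPNO, ENAME FROM emp WHERE DEPTNO = 30;
""")

visible_rows = conn.execute("SELECT * FROM emp_dept30").fetchall()
print(visible_rows)
```

A user who can query only the view sees neither the salary column nor the employees of other departments.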


Easy Access to Data

• Objects and data in the database can be created, modified and accessed by executing structured query language (SQL) statements
• SQL provides easy access to data

20 READ #1, CUSTNO, NAME, BAL
30 IF END #1 GO TO 80
40 IF BAL > 200
50 PRINT CUSTNO, NAME, BAL
60 ENDIF
70 etc

SELECT CUSTNO, NAME, BAL FROM CUST WHERE BAL > 200;
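The contrast between the record-at-a-time program and the single SELECT statement can be reproduced as follows. This is a sketch using Python's sqlite3 module; the three customer rows are invented, while the table and column names follow the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUST (CUSTNO INTEGER PRIMARY KEY, NAME TEXT, BAL REAL)")
conn.executemany(
    "INSERT INTO CUST VALUES (?, ?, ?)",
    [(1, "Adams", 150.0), (2, "Brown", 320.0), (3, "Clark", 201.0)],
)

# Procedural style: the program reads every record and tests each one itself.
procedural = []
for custno, name, bal in conn.execute(
    "SELECT CUSTNO, NAME, BAL FROM CUST ORDER BY CUSTNO"
):
    if bal > 200:
        procedural.append((custno, name, bal))

# Declarative style: state the condition and let the DBMS find the rows.
declarative = conn.execute(
    "SELECT CUSTNO, NAME, BAL FROM CUST WHERE BAL > 200 ORDER BY CUSTNO"
).fetchall()

print(procedural == declarative)
```

Both approaches return the same customers, but the second states what is wanted rather than how to find it.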


Data Access Using SQL

[Figure: an SQL statement is entered, the statement is sent to the database, and the data is displayed.]

SQL> SELECT loc
  2  FROM dept;

LOCATION
----------------------------
NEW YORK
CHICAGO
BOSTON


Overview of SQL

DDL Data Definition Language
• the language component of a DBMS that is used to describe the logical, and sometimes physical, structure of a database
• is used to specify the conceptual and internal schemas for the database

DML Data Manipulation Language
• a language component of a DBMS that is used to access and modify the contents (data) of a database


Overview of SQL

• DDL
– the SQL commands for data definition are CREATE, ALTER, DROP
– CREATE TABLE
> define table
– ALTER TABLE
> add new columns or modify existing columns
– DROP TABLE
> delete table
• DML
– the SQL commands for data manipulation are SELECT, INSERT, UPDATE, DELETE
– SELECT
> retrieve data from table
– INSERT
> add a single row or copy rows from other table(s)


Overview of SQL

• DML (cont)
– UPDATE
> modify column values
– DELETE
> delete rows of data
• Data Control
– COMMIT
> commit changes to the database
– ROLLBACK
> rollback (undo) changes
• Data Security
– GRANT
> grant access privileges to users
– REVOKE
> remove access privileges
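The commands listed in this overview can be run end to end in a small session. The sketch below uses Python's sqlite3 module with `isolation_level=None`, so that the transaction boundaries are exactly the BEGIN, COMMIT and ROLLBACK statements written out; the DEPT rows are sample values. GRANT and REVOKE are left out because SQLite does not support them.

```python
import sqlite3

# isolation_level=None: the module issues no implicit BEGIN, so the
# transaction boundaries below are exactly the ones written out.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: define and then alter a table.
conn.execute("CREATE TABLE dept (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT)")
conn.execute("ALTER TABLE dept ADD COLUMN LOC TEXT")

# DML inside a transaction that is committed.
conn.execute("BEGIN")
conn.execute("INSERT INTO dept VALUES (10, 'ACCOUNTING', 'NEW YORK')")
conn.execute("INSERT INTO dept VALUES (20, 'RESEARCH', 'DALLAS')")
conn.execute("UPDATE dept SET LOC = 'BOSTON' WHERE DEPTNO = 20")
conn.execute("COMMIT")

# DML inside a transaction that is rolled back: the DELETE is undone.
conn.execute("BEGIN")
conn.execute("DELETE FROM dept")
conn.execute("ROLLBACK")

rows = conn.execute(
    "SELECT DEPTNO, DNAME, LOC FROM dept ORDER BY DEPTNO"
).fetchall()
print(rows)

# DDL: remove the table definition together with its data.
conn.execute("DROP TABLE dept")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

The committed INSERT and UPDATE survive, the rolled-back DELETE leaves no trace, and after DROP TABLE the table no longer exists.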


DBMS Functions

• Data dictionary management
– stores the definitions of data and their relationships (metadata) in a data dictionary; any changes made are automatically recorded in the data dictionary.
• Data storage management
– creates and manages the complex structures required for data storage.
• Data transformation and presentation
– transforms entered data to conform to the data structures that are required to store the data


DBMS Functions

• Security management
– creates a security system and enforces security within that system.
• Multi-user access control
– creates complex structures that allow multiple-user access to the data.
• Backup and recovery management
– performs backup and data recovery procedures to ensure data safety.


DBMS Functions

• Data integrity management
– promotes and enforces integrity rules to eliminate data integrity problems
• Database language and application programming interfaces
– provides access to the data via utility programs and from programming language interfaces.
• Database communication interfaces
– provides end-user access to data within a computer network environment.


Database Models

• Collection of logical constructs used to represent data structure and relationships within the database
– Conceptual models: logical nature of data representation
– Implementation models: emphasis on how the data are represented in the database
• Relationships in Conceptual Models
– One-to-one (1:1)
– One-to-many (1:M)
– Many-to-many (M:N)
• Implementation Database Models
– Hierarchical
– Network
– Relational


Database Models

• Hierarchical
– Logically represented by an upside down tree
> Each parent can have many children
> Each child has only one parent
• Network
– Each record can have multiple parents
> Composed of sets
> Each set has owner record and member record
> Member may have several owners
• Relational
– Perceived by user as a collection of tables for data storage
– Tables are a series of row/column intersections
– Tables related by sharing common entity characteristic(s)


Relational Database Model

• Advantages
– Structural independence
– Improved conceptual simplicity
– Easier database design, implementation, management, and use
– Ad hoc query capability with SQL
– Powerful database management system
• Disadvantages
– Substantial hardware and system software overhead
– Poor design and implementation is made easy
– May promote “islands of information” problems


Definition of a Relational Database

A relational database is a collection of relations or two-dimensional tables.

Table Name: DEPT

DEPTNO  DNAME       LOC
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
40      OPERATIONS  BOSTON

Table Name: EMP

EMPNO  ENAME  JOB        DEPTNO
7839   KING   PRESIDENT  10
7698   BLAKE  MANAGER    30
7782   CLARK  MANAGER    10
7566   JONES  MANAGER    20


Relational Database Management System

[Figure: a server holding the user tables and the data dictionary.]


Relational Database Terminology

[Figure: the EMP table below, with callouts numbered 1 to 6 marking parts of the table.]

EMPNO  ENAME   JOB        MGR   HIREDATE   SAL   COMM  DEPTNO
-----  ------  ---------  ----  ---------  ----  ----  ------
7839   KING    PRESIDENT        17-NOV-81  5000        10
7698   BLAKE   MANAGER    7839  01-MAY-81  2850        30
7782   CLARK   MANAGER    7839  09-JUN-81  2450        10
7566   JONES   MANAGER    7839  02-APR-81  2975        20
7654   MARTIN  SALESMAN   7698  28-SEP-81  1250  1400  30
7499   ALLEN   SALESMAN   7698  20-FEB-81  1600  300   30
7844   TURNER  SALESMAN   7698  08-SEP-81  1500  0     30
7900   JAMES   CLERK      7698  03-DEC-81  950         30
7521   WARD    SALESMAN   7698  22-FEB-81  1250  500   30
7902   FORD    ANALYST    7566  03-DEC-81  3000        20
7369   SMITH   CLERK      7902  17-DEC-80  800         20
7788   SCOTT   ANALYST    7566  09-DEC-82  3000        20
7876   ADAMS   CLERK      7788  12-JAN-83  1100        20
7934   MILLER  CLERK      7782  23-JAN-82  1300        10


Relational Database Terminology

Relation
• a named collection of attributes

Tuple
• the collection of values that compose one row of a relation

Attribute
• a named characteristic or property of an entity

Properties of Relations
• the tuples of a relation have no ordering (top to bottom)
• the attributes of a relation have no ordering (left to right)
• the entries in the table (attribute values) are single valued

The degree of a relation is the number of attributes in that relation.
The cardinality is the number of tuples in the relation.
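Degree and cardinality can be read straight off a small relation. In the sketch below the DEPT relation is modelled, purely for illustration, as a tuple of attribute names plus a Python set of tuples; the set is chosen because it has no ordering, matching the properties listed above.

```python
# A relation sketched as attribute names plus an unordered set of tuples.
attributes = ("DEPTNO", "DNAME", "LOC")
tuples = {
    (10, "ACCOUNTING", "NEW YORK"),
    (20, "RESEARCH", "DALLAS"),
    (30, "SALES", "CHICAGO"),
    (40, "OPERATIONS", "BOSTON"),
}

degree = len(attributes)    # number of attributes in the relation
cardinality = len(tuples)   # number of tuples in the relation

print(degree, cardinality)
```

Adding a department changes the cardinality; adding a column changes the degree.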


Relational Database Terminology

Relationships between entities are supported by attributes which are common to both entities

Primary Key
• an attribute or attributes that uniquely identify a record instance or tuple in a relation

Foreign Key
• an attribute or combination of attributes of one relation R2 whose values are required to match those of the PK of relation R1, where R1 and R2 are not necessarily distinct
• a FK and the corresponding PK should be defined on the same domain


Relating Multiple Tables

• Each row of data in a table is uniquely identified by a primary key (PK).
• You can logically relate data from multiple tables using foreign keys (FK).

Table Name: EMP (primary key: EMPNO; foreign key: DEPTNO)

EMPNO  ENAME  JOB        DEPTNO
7839   KING   PRESIDENT  10
7698   BLAKE  MANAGER    30
7782   CLARK  MANAGER    10
7566   JONES  MANAGER    20

Table Name: DEPT (primary key: DEPTNO)

DEPTNO  DNAME       LOC
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
40      OPERATIONS  BOSTON
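The EMP and DEPT tables can be related in practice with a primary key, a foreign key and a join. The sketch below uses Python's sqlite3 module as a stand-in DBMS; the rejected row (employee 9999 in department 50) is a made-up value used only to show the foreign key being enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE dept (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT, LOC TEXT);
    CREATE TABLE emp (
        EMPNO  INTEGER PRIMARY KEY,
        ENAME  TEXT,
        JOB    TEXT,
        DEPTNO INTEGER REFERENCES dept (DEPTNO)
    );
    INSERT INTO dept VALUES (10, 'ACCOUNTING', 'NEW YORK');
    INSERT INTO dept VALUES (20, 'RESEARCH', 'DALLAS');
    INSERT INTO dept VALUES (30, 'SALES', 'CHICAGO');
    INSERT INTO dept VALUES (40, 'OPERATIONS', 'BOSTON');
    INSERT INTO emp VALUES (7839, 'KING', 'PRESIDENT', 10);
    INSERT INTO emp VALUES (7698, 'BLAKE', 'MANAGER', 30);
    INSERT INTO emp VALUES (7782, 'CLARK', 'MANAGER', 10);
    INSERT INTO emp VALUES (7566, 'JONES', 'MANAGER', 20);
""")

# Relate the two tables through the shared DEPTNO values.
joined = conn.execute("""
    SELECT e.ENAME, d.DNAME, d.LOC
    FROM emp AS e JOIN dept AS d ON d.DEPTNO = e.DEPTNO
    ORDER BY e.EMPNO
""").fetchall()

try:
    # Hypothetical bad row: department 50 has no matching PK value in DEPT.
    conn.execute("INSERT INTO emp VALUES (9999, 'NOBODY', 'CLERK', 50)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(joined, fk_rejected)
```

Each employee row picks up its department name and location through DEPTNO, and the DBMS refuses an employee whose DEPTNO matches no department.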


Summary

• Information is derived from data, which are usually stored in a database

• A DBMS is software that implements and manages a database

• Databases were developed to address the weaknesses of file systems

• A DBMS
– presents to the user a single data repository that promotes data sharing
– enforces data integrity, eliminates redundancy and promotes data security

