
Introduction to Database


1-1 Source: Database System Concepts, Silberschatz et al., 2001

Edited: Wei-Pang Yang, IM.NDHU, 2005

Introduction to Database

CHAPTER 1

INTRODUCTION
• Purpose of Database Systems
• View of Data
• Data Models
• Data Definition Language
• Data Manipulation Language
• Database Users
• Database Administrator
• Transaction Management
• Storage Management
• Overall System Structure

1-2

Contents

Chapter 1: Introduction

PART 1 DATA MODELS
• Chapter 2: Entity-Relationship Model
• Chapter 3: Relational Model

PART 2 RELATIONAL DATABASES
• Chapter 4: SQL
• Chapter 5: Other Relational Languages
• Chapter 6: Integrity and Security
• Chapter 7: Relational Database Design

PART 3 OBJECT-BASED DATABASES AND XML

PART 4 DATA STORAGE AND QUERYING
• Chapter 11: Storage and File Structure
• Chapter 12: Indexing and Hashing

1-3

Database System: Introduction

Database Management System (DBMS)
• Contains large bodies of information
• Collection of interrelated data (the database)
• Set of programs to access the data

Goal of a DBMS: to provide a way to store and retrieve database information that is both
• convenient and
• efficient.

Functions of a DBMS: Management of Data (MOD)
• Defining structures for storing data
• Providing mechanisms for manipulating data
• Ensuring the safety of data (system crashes, unauthorized access, misuse, …)
• Concurrency control in a multi-user environment

Computer scientists have developed many concepts and techniques for MOD
• These concepts and techniques form the focus of this book and this course

1-4

1.1 Database System Applications

Database applications:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades, student profiles, …
• Sales: customers, products, purchases
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions

Databases touch all aspects of our lives

1-5

1.2 Database Systems vs. File Systems

In the early days, database applications were built on top of file systems.

Drawbacks of using file systems to store data:
• Data redundancy and inconsistency
  • Multiple file formats, duplication of information in different files
• Difficulty in accessing data
  • Need to write a new program to carry out each new task
• Data isolation: multiple files and formats
• Integrity problems
  • Integrity constraints (e.g. account balance > 0) become part of program code
  • Hard to add new constraints or change existing ones
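
To illustrate the integrity point: in a DBMS the constraint can be declared once in the schema instead of being coded into every program. The following is a minimal sketch using Python's built-in sqlite3 module, not anything from the original slides. The underscore in account_number (SQLite does not accept the slides' hyphenated names without quoting), the second account A-102, and the sample balances are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint lives in the schema, not in application code.
conn.execute(
    "create table account ("
    " account_number char(10),"
    " balance integer check (balance > 0))"
)
conn.execute("insert into account values ('A-101', 500)")

try:
    # Violates balance > 0, so the DBMS itself rejects the row.
    conn.execute("insert into account values ('A-102', -20)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("select account_number, balance from account").fetchall()
print(rejected, rows)
```

Changing the rule later means changing one table definition rather than hunting through every program that writes balances.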

1-6

Drawbacks of using file systems (cont.)

Drawbacks of using file systems to store data (cont.):
• Atomicity of updates
  • Failures may leave the database in an inconsistent state with partial updates carried out
  • E.g. a transfer of funds from one account to another should either complete or not happen at all
• Concurrent access by multiple users
  • Concurrent access is needed for performance
  • Uncontrolled concurrent accesses can lead to inconsistencies, e.g. two people reading a balance and updating it at the same time
• Security problems

Database systems offer solutions to all of the above problems.
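
The "two people reading a balance and updating it at the same time" problem can be reproduced in a few lines. This is a single-connection simulation under stated assumptions, not a real multi-user test: the interleaving is written out by hand, and the table name, account number, and amounts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), balance integer)")
conn.execute("insert into account values ('A-101', 100)")

def read_balance():
    return conn.execute(
        "select balance from account where account_number = 'A-101'"
    ).fetchone()[0]

# Uncontrolled interleaving: both users read 100 before either one writes.
seen_by_user1 = read_balance()
seen_by_user2 = read_balance()
conn.execute("update account set balance = ?", (seen_by_user1 - 50,))
conn.execute("update account set balance = ?", (seen_by_user2 - 50,))
lost_update_balance = read_balance()  # one withdrawal has been lost

# Letting the DBMS do the read-modify-write as one statement avoids the problem.
conn.execute("update account set balance = 100")
conn.execute("update account set balance = balance - 50")
conn.execute("update account set balance = balance - 50")
correct_balance = read_balance()

print(lost_update_balance, correct_balance)
```

Two withdrawals of 50 from a balance of 100 should leave 0; the uncontrolled version leaves 50 because the second write overwrites the first.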

1-7

1.3 View of Data: Levels of Abstraction

• Physical level: describes how a record (e.g., customer information) is stored on disk, e.g. by sequential file, pointer, or hash structure, …

• Logical level: describes the data stored in the database and the relationships among the data.

    type customer = record
        name : string;
        street : string;
        city : string;
        income : integer;
    end;

• View level: application programs hide details of data types. Views can also hide information (e.g., income) for security purposes.
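
A short sketch of a view that hides the income field, using Python's built-in sqlite3 module. The view name customer_public and the sample row (including the income figure) are assumptions for illustration. SQLite has no user accounts, so the sketch shows only the hiding, not the access control a multi-user DBMS would add.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table customer (name text, street text, city text, income integer)"
)
conn.execute("insert into customer values ('Johnson', 'Alma', 'Palo Alto', 50000)")

# The view exposes everything except income.
conn.execute(
    "create view customer_public as select name, street, city from customer"
)

cursor = conn.execute("select * from customer_public")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```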

1-8

View of Data: Three Levels

An architecture for a database system

1-9

1.3.2 Instances and Schemas

Schema: the logical structure of the database
• E.g., the database consists of information about a set of customers and accounts and the relationship between them
• Analogous to the type information of a variable in a program
• Physical schema: database design at the physical level
• Logical schema: database design at the logical level

    type customer = record
        name : string;
        street : string;
        city : string;
    end;

    create table account (account-number char(10), balance integer)

[Figure: the account table]

1-10

Instances and Schemas (cont.)

Instance: the actual content of the database at a particular point in time
• Analogous to the value of a variable

Physical Data Independence: the ability to modify the physical schema without changing the logical schema
• Applications depend on the logical schema
• In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.

    create table account (account-number char(10), balance integer)

[Figure labels: Schema, Instance]
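
The schema/instance distinction and physical data independence can be observed directly. This is a minimal sketch with Python's built-in sqlite3 module; the helper names schema() and instance(), the index name, and the inserted row are assumptions for illustration. Inserting a row changes the instance, and adding an index changes the physical organization, but the logical schema reported by the system stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), balance integer)")

def schema():
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("pragma table_info(account)")]

def instance():
    return conn.execute(
        "select account_number, balance from account order by account_number"
    ).fetchall()

schema_before = schema()
instance_before = instance()

conn.execute("insert into account values ('A-101', 500)")
# A physical-level change: add an index on balance.
conn.execute("create index account_balance_idx on account (balance)")

schema_after = schema()
instance_after = instance()
print(schema_before == schema_after, instance_before, instance_after)
```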

1-11

View of Data: Three Levels

An architecture for a database system

Physical Data Independence

1-12

0.2 Architecture for a Database System, View 2: Three Levels (supplement)

[Figure: three-level architecture. Users A1, A2, B1, B2, and B3 each work through a host language (e.g. C, C++) plus a DSL (data sublanguage, e.g. SQL). External View A and External View B are connected to the Conceptual View by external/conceptual mappings A and B, and the Conceptual View is connected to the stored database (Internal View) by the conceptual/internal mapping. The database management system (DBMS) holds a dictionary (e.g. the system catalog). The DBA builds and maintains the schemas and mappings: the storage structure definition (internal schema), the conceptual schema, and external schemas A and B.]

1-13

1.4 Data Models

A collection of tools for describing
• data (entities, objects)
• data relationships
• data semantics
• data constraints

Data models:
• Entity-Relationship model
• Relational model
• Object-oriented model
• Semi-structured data models
• Older models:
  • Network model and
  • Hierarchical model

1-14

1.4.1 Entity-Relationship Model

Example: schema in the Entity-Relationship model

[Figure: E-R diagram, annotated with: depositor; account; customer (depositor, borrower, credit-card holder)]

1-15

Relational Database: A Sample

Account A-101 is held by customer Johnson

1-16

Entity Relationship Model (cont.)

E-R model of the real world
• Entities (objects)
  • E.g. customers, accounts, bank branches
• Relationships between entities
  • E.g. account A-101 is held by customer Johnson
  • The relationship set depositor associates customers with accounts

Widely used for database design
• A database design in the E-R model is usually converted to a design in the relational model (coming up next), which is used for storage and processing

E-R model (ch. 2)

Relational Model (ch. 3)
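
One common way to carry an E-R design over to the relational model is to turn each entity set into a table and each relationship set into a table of key pairs. The sketch below assumes that conversion and uses Python's built-in sqlite3 module. The column choices and underscore names are assumptions; the sample values come from the slides' Johnson / A-101 example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entity sets become tables.
    create table customer (customer_id text primary key, customer_name text);
    create table account  (account_number text primary key);
    -- The relationship set depositor becomes a table of key pairs.
    create table depositor (
        customer_id    text references customer (customer_id),
        account_number text references account (account_number)
    );
    insert into customer  values ('192-83-7465', 'Johnson');
    insert into account   values ('A-101');
    insert into depositor values ('192-83-7465', 'A-101');
""")

# "Account A-101 is held by customer Johnson"
holders = conn.execute("""
    select customer_name
    from customer join depositor using (customer_id)
    where account_number = 'A-101'
""").fetchall()
print(holders)
```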

1-17

1.4.2 Relational Model

Example: tabular data (instances) in the relational model

    customer-name  customer-id   customer-street  customer-city  account-number
    Johnson        192-83-7465   Alma             Palo Alto      A-101
    Smith          019-28-3746   North            Rye            A-215
    Johnson        192-83-7465   Alma             Palo Alto      A-201
    Jones          321-12-3123   Main             Harrison       A-217
    Smith          019-28-3746   North            Rye            A-201

(The column headings are the attributes.)

1-18

Relational Database: A Sample

1-19

1.4.3 Other Data Models

Hierarchical Data Model

Network Data Model

Object-oriented Data Model

Object-relational Data Model

Extensible Markup Language (XML)


1-21

1.5 Database Languages

Data Definition Language (DDL):
• Specification notation for defining the database schema
• E.g.

    create table account (account-number char(10), balance integer)

Data Manipulation Language (DML):
• To express database queries or updates
• E.g.

    select account-number from account where balance > 1000

SQL (Structured Query Language): a single language for both

1-22

1.5.1 Data Definition Language (DDL)

Specification notation for defining the database schema
• E.g.

    create table account (account-number char(10), balance integer)

The DDL compiler generates a set of tables stored in a data dictionary

The data dictionary contains metadata (i.e., data about data)
• Database schema
• Data storage and definition language
  • Language in which the storage structure and access methods used by the database system are specified
  • Usually an extension of the data definition language

(See p.1-12)
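
As one concrete instance of a data dictionary, SQLite keeps its metadata in a catalog table named sqlite_master, which can be queried like any other table. This is a sketch of that one system using Python's built-in sqlite3 module, not a description of how every DBMS stores its dictionary. The underscore in account_number is an adaptation, since SQLite does not accept the hyphenated name without quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), balance integer)")

# sqlite_master is SQLite's catalog table: one row per schema object.
catalog = conn.execute("select type, name, sql from sqlite_master").fetchall()
print(catalog)
```

The stored row records that a table named account exists, together with the DDL text that defined it: data about data.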

1-23

1.5.2 Data Manipulation Language (DML)

Language for accessing and manipulating the data organized by the appropriate data model
• DML is also known as a query language
• For retrieval, insertion, deletion, modification (update)

Two classes of languages
• Procedural: the user specifies what data is required and how to get those data
  • E.g. … in C
• Declarative (nonprocedural): the user specifies what data is required without specifying how to get those data
  • E.g. in SQL: select account-number from account where balance > 700

SQL is the most widely used query language
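
The two classes can be put side by side. In this sketch the procedural version is an explicit Python loop (standing in for the slide's "… in C"), and the declarative version is the slide's SQL query run through Python's built-in sqlite3 module. The account balances are made up for illustration, and an order by clause is added only to make the result order deterministic.

```python
import sqlite3

accounts = [("A-101", 500), ("A-215", 700), ("A-217", 750)]

# Procedural: the program spells out how to find the rows.
procedural = []
for account_number, balance in accounts:
    if balance > 700:
        procedural.append(account_number)

# Declarative: the query states only what is wanted.
conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), balance integer)")
conn.executemany("insert into account values (?, ?)", accounts)
declarative = [
    row[0]
    for row in conn.execute(
        "select account_number from account where balance > 700"
        " order by account_number"
    )
]
print(procedural, declarative)
```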

1-24

SQL (Structured Query Language)

SQL: widely used non-procedural language
• E.g. find the name of the customer with customer-id 192-83-7465

    select customer.customer-name
    from customer
    where customer.customer-id = '192-83-7465'

[Figure: the customer table]

Output:

    customer-name
    Johnson

1-25

SQL (Structured Query Language)

• E.g. find the balances of all accounts held by the customer with customer-id 192-83-7465

    select account.balance
    from depositor, account
    where depositor.customer-id = '192-83-7465' and
          depositor.account-number = account.account-number

Application programs generally access databases through one of
• Language extensions to allow embedded SQL
• Application program interfaces (e.g. ODBC/JDBC), which allow SQL queries to be sent to a database
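
A sketch of the application-program-interface route, with Python's built-in sqlite3 module playing the role that ODBC/JDBC play on the slide. The function name balances_for, the underscore column names, and the balances are assumptions for illustration; the slides do not give balances.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table depositor (customer_id text, account_number text);
    create table account   (account_number text, balance integer);
    insert into depositor values ('192-83-7465', 'A-101');
    insert into depositor values ('192-83-7465', 'A-201');
    insert into depositor values ('019-28-3746', 'A-215');
    insert into account values ('A-101', 500);
    insert into account values ('A-201', 900);
    insert into account values ('A-215', 700);
""")

def balances_for(customer_id):
    # The customer id is passed as a parameter rather than pasted into the SQL.
    rows = conn.execute(
        "select account.balance"
        " from depositor, account"
        " where depositor.customer_id = ?"
        " and depositor.account_number = account.account_number"
        " order by account.balance",
        (customer_id,),
    )
    return [row[0] for row in rows]

print(balances_for("192-83-7465"))
```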

1-26

1.6 Database Users and Administrators

[Figure: the three-level architecture of p.1-12, showing users A1, A2, B1, B2, and B3 working through a host language (e.g. C, C++) plus a DSL (data sublanguage, e.g. SQL), and the DBA, who builds and maintains the schemas and mappings.]

1-27

1.6.1 Database Users

• Application programmers
  • Interact with the system through DML calls
• Sophisticated (complex, multi-purpose) users
  • Submit queries without writing programs
  • E.g. OLAP (online analytical processing), data mining tools
• Specialized users
  • Write specialized database applications that do not fit into the traditional data processing framework
  • E.g. CAD, expert systems, complex data types (graphics, audio)
• Naive (simple) users, i.e. end users
  • Invoke one of the permanent application programs that have been written previously
  • E.g. people accessing a database over the web, bank tellers, clerical staff (clerks)

1-28

1.6.2 Database Administrator

Database Administrator:
• Coordinates all the activities of the database system
• Has a good understanding of the enterprise's information resources and needs

Database Administrator's duties:
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting of authorization for data access
• Routine maintenance
  • Periodically back up the database
  • Upgrade the system, e.g. disks
  • Monitor performance …


1-30

1.7 Transaction Management

Transaction:
• A transaction is a collection of operations that performs a single logical function in a database application
• Atomicity: all or nothing

The transaction-management component ensures that the database remains in a consistent (correct) state
• Failure recovery manager, which handles failures such as:
  • system failures (e.g., power failures and operating system crashes)
  • transaction failures
• Concurrency-control manager
  • Controls the interaction among concurrent transactions, to ensure the consistency of the database
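
A minimal sketch of "all or nothing", using a funds transfer in Python's built-in sqlite3 module. The function name transfer, the account numbers, the amounts, and the balance >= 0 rule used to trigger a failure are all assumptions for illustration. The connection's with block commits if both updates succeed and rolls back if either raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (account_number text, balance integer check (balance >= 0))"
)
conn.executemany(
    "insert into account values (?, ?)", [("A-101", 100), ("A-215", 0)]
)
conn.commit()

def transfer(source, target, amount):
    """Move amount between accounts; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "update account set balance = balance + ? where account_number = ?",
                (amount, target),
            )
            conn.execute(
                "update account set balance = balance - ? where account_number = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("select account_number, balance from account"))

ok = transfer("A-101", "A-215", 60)
after_ok = balances()
failed = transfer("A-101", "A-215", 60)  # would overdraw A-101
after_failed = balances()
print(ok, after_ok, failed, after_failed)
```

In the second call the credit to A-215 has already been applied when the debit fails, yet the rollback undoes it, so no money appears from nowhere.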

1-31

1.8 Database System Structure

Components of a database system
• Storage Manager
  • Databases require a large amount of space
  • The data cannot all be stored in main memory
  • Disk speed is slower
  • Minimize the need to move data between disk and main memory
• Query Processor
  • Helps to simplify access to data
  • High-level view
  • Users are not burdened unnecessarily with the physical details

[Figure: a query passes through the DBMS layers Language Processor, Optimizer, Operation Processor, Access Method, and File Manager to reach the database]

Goal of a DBMS: to provide a way to store and retrieve data that is both convenient and efficient. (p.1)

1-32

Overall System Structure

[Figure: overall system structure; labels include "low-level data stored" and "database"]

1-33

1.8.1 Storage Management

Storage Manager
• A program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

Tasks of the Storage Manager:
• Interacts with the file manager (part of the operating system)
• Translates DML into low-level file-system commands, i.e. it is responsible for storing, retrieving, and updating data in the database

Data structures of the Storage Manager
• Data files: store the database itself
• Data dictionary: stores metadata
• Indices: provide fast access to data items that hold particular values
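
To make "fast access to data items that hold particular values" concrete, here is a toy hash index in plain Python. It is only a sketch of the idea: the records, the choice to index on balance, and the function name find_by_balance are assumptions, and real systems use disk-based structures such as B-trees or hash files.

```python
from collections import defaultdict

# Data file: records stored in arrival order.
records = [
    ("A-101", 500),
    ("A-215", 700),
    ("A-201", 900),
    ("A-217", 700),
]

# Index on balance: value -> positions of the records holding that value.
balance_index = defaultdict(list)
for position, (_, balance) in enumerate(records):
    balance_index[balance].append(position)

def find_by_balance(balance):
    # Jump straight to the matching records instead of scanning the whole file.
    return [records[pos][0] for pos in balance_index.get(balance, [])]

print(find_by_balance(700))
```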

1-34

Storage Management (cont.)

Components of the Storage Manager:
• Authorization and Integrity Manager
  • Tests for the satisfaction of integrity constraints
  • Checks the authority of users to access data
• Transaction Manager
  • Ensures that the database remains in a consistent (correct) state after failures
  • Ensures that concurrent transaction executions proceed without conflicting
• File Manager
  • Manages the allocation of space on disk
  • Manages the data structures used to represent stored data
• Buffer Manager
  • Fetches data from disk into main memory
  • Decides what data to cache in main memory
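
A sketch of the buffer manager's two jobs, fetching blocks and deciding what stays cached. The slides do not name a replacement policy; least-recently-used (LRU) is assumed here because it is a common choice. The class name BufferManager, the dictionary standing in for the disk, and the access pattern are all hypothetical.

```python
from collections import OrderedDict

class BufferManager:
    """Keeps at most `capacity` disk blocks in memory, evicting the least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk             # stand-in for the file manager: block id -> bytes
        self.capacity = capacity
        self.buffer = OrderedDict()  # block id -> cached contents, oldest first
        self.disk_reads = 0

    def get_block(self, block_id):
        if block_id in self.buffer:
            self.buffer.move_to_end(block_id)  # mark as most recently used
            return self.buffer[block_id]
        self.disk_reads += 1                   # miss: fetch from disk
        data = self.disk[block_id]
        self.buffer[block_id] = data
        if len(self.buffer) > self.capacity:
            self.buffer.popitem(last=False)    # evict the least recently used block
        return data

disk = {1: b"block-1", 2: b"block-2", 3: b"block-3"}
manager = BufferManager(disk, capacity=2)
for block_id in [1, 2, 1, 3, 1, 2]:
    manager.get_block(block_id)
print(manager.disk_reads, list(manager.buffer))
```

Six requests cost only four disk reads here, because block 1 is still cached on both of its repeat requests.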

1-35

1.8.2 The Query Processor

• DDL Interpreter
  • Interprets DDL statements and writes the definitions (schema, view, …) into the data dictionary
• DML Compiler
  • Translates DML statements into an evaluation plan (or several alternative evaluation plans) consisting of low-level instructions
  • Query optimization: picks the lowest-cost evaluation plan
• Query Evaluation Engine
  • Executes the low-level instructions generated by the DML Compiler
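
One way to watch an optimizer choose among evaluation plans is SQLite's explain query plan statement, run here through Python's built-in sqlite3 module. This is a sketch of one system's behavior: the exact wording of the plan text differs between SQLite versions, so the code prints it rather than assuming it. The table contents and index name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), balance integer)")
conn.executemany(
    "insert into account values (?, ?)",
    [("A-101", 500), ("A-215", 700), ("A-217", 750)],
)

QUERY = "select balance from account where account_number = 'A-101'"

def plan():
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("explain query plan " + QUERY)]

plan_without_index = plan()
conn.execute("create index account_number_idx on account (account_number)")
plan_with_index = plan()

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both runs; only the available access paths changed, and the optimizer switches from a full scan to the index.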

1-36

Example: A Simple Query Processing (supplement)

Query in SQL:

    SELECT CUSTOMER.NAME
    FROM CUSTOMER, INVOICE
    WHERE REGION = 'N.Y.' AND
          AMOUNT > 10000 AND
          CUSTOMER.C# = INVOICE.C#

Internal form:

    ( (S SP)

Operator:

    SCAN C using region index, create C'
    SCAN I using amount index, create I'
    SORT C' and I' on C#
    JOIN C' and I' on C#
    EXTRACT name field

Calls to access method:

    OPEN SCAN on C with region index
    GET next tuple
    . . .

Calls to file system:

    GET 10th to 25th bytes from block #6 of file #5

[Figure: the DBMS layers Language Processor, Optimizer, Operator Processor, Access Method (e.g. B-tree, index, hashing), and File System above the database, grouped into Query Processor and Storage Manager]
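
The operator sequence above (SCAN, SORT, JOIN, EXTRACT) can be mimicked in plain Python. This is a sketch under stated assumptions, not the slide author's implementation: the scans are simple filters rather than index scans, the key C# is written c_no, it is assumed to be unique in the customer table, and the customer and invoice rows are invented.

```python
def scan(rows, predicate):
    """SCAN: keep the rows that satisfy the selection predicate."""
    return [row for row in rows if predicate(row)]

def sort_on(rows, key):
    """SORT: order rows on the join key."""
    return sorted(rows, key=lambda row: row[key])

def merge_join(left, right, key):
    """JOIN two inputs sorted on key; assumes key is unique in left."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            out.append({**left[i], **right[j]})
            j += 1
    return out

customers = [
    {"c_no": 1, "name": "Adams", "region": "N.Y."},
    {"c_no": 2, "name": "Baker", "region": "L.A."},
    {"c_no": 3, "name": "Clark", "region": "N.Y."},
]
invoices = [
    {"c_no": 3, "amount": 25000},
    {"c_no": 1, "amount": 500},
    {"c_no": 2, "amount": 40000},
    {"c_no": 3, "amount": 12000},
]

c_selected = scan(customers, lambda row: row["region"] == "N.Y.")
i_selected = scan(invoices, lambda row: row["amount"] > 10000)
joined = merge_join(sort_on(c_selected, "c_no"), sort_on(i_selected, "c_no"), "c_no")
names = [row["name"] for row in joined]  # EXTRACT name field
print(names)
```

Clark appears twice because two of Clark's invoices exceed 10000; the SQL query, which has no distinct, would return the same duplicates.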

1-37

1.9 Application Architectures

Application structure
• Users use the database at the site where it runs
• Users use the database through a network
  • Client: where remote database users work
  • Server: where the database system runs

Partition of database applications
• Two-tier architecture
• Three-tier architecture

1-38

1.9 Application Architectures

Two-tier Architecture: e.g. client programs using ODBC/JDBC to communicate with a database

Three-tier Architecture: e.g. web-based applications, and applications built using “middleware”

ODBC/JDBC

1-39

1.10 History of Database Systems

1950s to early 1960s:
• Tapes: sequential access
• Application: payroll; input: punched card decks; output: printer

Late 1960s to 1970s:
• Disks: direct access
• Codd proposed the relational model, … Turing Award

1980s:
• System R: IBM Research Lab
• IBM DB2, Oracle, Ingres, DEC Rdb
• Relational systems replaced the network and hierarchical models
• Research: parallel databases, distributed databases, object-oriented databases, …

Early 1990s:
• Parallel databases
• Object-relational databases

Late 1990s:
• Explosive growth of the World Wide Web
• Databases were used much more than ever before
• Databases had to support Web interfaces to data

1-40

History of Database Systems (supplement)

[Table: evolution of database systems over the periods 1950-1965, 1965-1979, 1980-1989, 1990-1995, and 1995-present, with rows Data Model, Database Hardware, User Interface, Program Interface, and Presentation and Display Processing. The cell layout did not survive extraction; the entries, grouped as they appear on the slide, are:]
• Network; Hierarchical; Mainframes; None; Forms; Procedural; Reports; Processing data
• Relation proposed; Mainframes, Minis, PCs; DL/I, COBOL + DL/I; Embedded query, non-procedural; Report generators; Information and transaction processing
• Semantic, Object-oriented, Logic, Relation; Faster PCs, Workstations, Database machines; Graphics, Menus; SQL, QUEL; Query-by-forms; 4GL, Logic programming; Business graphics, Image output, Knowledge processing
• Merging data models, knowledge-base, Relation; Parallel; Optical memories; Natural language, Speech input; Integrated database and programming language
• Object-oriented, OO-relation, XML, Relation; WWW, Web interface; Multimedia

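1-41

The Nobel Prize of Computer Science: the Turing Award (Kun-Mao Chao, 趙坤茂)

The Nobel Prize, a symbol of the highest academic honor, has been awarded since 1901. Following the will of the Swedish inventor Alfred Nobel, it comprises five prizes: physics, chemistry, physiology or medicine, literature, and peace. A Nobel prize in economics was added in 1969. Have you ever wondered why there is no Nobel Prize in mathematics? A popular story has it that Nobel's wife once had an affair with Mittag-Leffler, an accomplished Swedish mathematician, so Nobel decided not to establish a mathematics prize.

The British mathematician Alan Turing (1912-1954) never had the chance to receive a Nobel Prize in his lifetime, but the Turing Award, established in memory of his contributions to the theory of digital computation, is recognized as the highest honor in computer science.

The Turing Award has been presented since 1966, and its recipients are all master scholars who have had a profound influence on computer science. Examples include Cook, for outstanding contributions to computational complexity theory; Ritchie, creator of the C programming language; Thompson, creator of the Unix operating system; and Codd, a pioneer of database management systems.

In 1936 Turing proposed a hypothetical computing device called the Turing machine. The machine has a long storage tape with infinitely many cells, each of which is blank or holds a symbol. Attached to the tape is a read/write head that can move left or right along the cells and read, write, or erase the cell at each move. A finite-state control uses changes of state, together with the current position of the head, to decide these move, read, and write actions.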

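1-42

Mail: National Taiwan University Hospital Information Office, national defense service recruitment

Dear Professor Yang,

Recruitment notice: National Taiwan University Hospital Information Office, national defense service positions

Qualifications: one master's graduate and one doctoral graduate in information, electrical engineering, or biomedical engineering related fields who qualify for national defense service selection

Compensation: on par with a university lecturer; those who perform well are offered opportunities for in-service study

Job description:

The hospital's current hospital information system (HIS) was planned and developed starting twenty years ago. It runs on an IBM mainframe and its hierarchical database. The HIS handles registration, consultation, examination, prescription, billing, and dispensing for outpatient, emergency, and inpatient care, through to the claims filed with the Bureau of National Health Insurance. After twenty years of maintenance by many hands, with additions and changes driven by new regulations and business needs, the existing HIS has become a behemoth that almost no one can comfortably control.

In addition, because mainframe maintenance costs remain high, there is a real need to develop a new HIS on an open platform with a new architecture, such as a relational database. In the past, most hospitals outsourced projects of this kind. However, over-reliance on outsourcing left the Information Office without control of the source code, which made effective maintenance difficult. This time we are therefore organizing a new development team responsible for programming, which will work with the system analysis done by the office's existing staff to complete the new HIS on an open platform.

The outpatient system is expected to be completed this year, followed by the inpatient, emergency, and administrative systems. Medical information systems are a specialized field with permanent market demand. The complexity of NTU Hospital means that this know-how can cover most hospitals in Taiwan. Besides the main hospital (including the Gongguan campus), NTU Hospital currently has the Yunlin branch, the Bei-Hu branch, and others, covering the medical center, regional hospital, and district hospital levels. In the future it will extend to allied primary-care clinics, forming a complete health care system, and the results of this development can then be promoted to other medical institutions (including the military hospital system).

For the team members' long-term career planning, we will work toward encouraging and assisting them to found an incubated company. HIS know-how will be that company's greatest asset. Those interested should email a résumé and autobiography to [email protected] (Section Chief 翟家屏) by 94/3/15 (ROC year 94, i.e. 2005).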

