Database Design
Relational Database
  Before
    File system
      • organized data
    Hierarchical and Network database
      • data + metadata + data structure ⇒ database
      • addressed limitations of file system
      • tied to complex physical structure
  After
    Conceptual simplicity
      • store a collection of related entities in a "relational" table
    Focus on logical representation (human view of data)
      • how data are physically stored is no longer an issue
    Database / RDBMS / application
      • conducive to more effective design strategies
Database System 2
Logical View of Data
  Entity
    a person, place, event, or thing about which data is collected
      • e.g. a student
  Entity Set
    a collection of entities that share common characteristics
    named to reflect its content
      • e.g. STUDENT
  Attributes
    characteristics of the entity
      • e.g. student number, name, birthdate
    named to reflect its content
      • e.g. STU_NUM, STU_NAME, STU_DOB
  Tables
    contain a group of related entities (an entity set)
    2-dimensional structure composed of rows and columns
    also called relations
Relational DB Table: Characteristics
  2-dimensional structure with rows & columns
    Rows (tuples)
      • represent a single entity occurrence
    Columns
      • represent attributes
      • have a specific range of values
        → attribute domain
  Each table must have a primary key
    A primary key is an attribute (or a combination of attributes) that uniquely identifies each row
  Relational database vs. file system terminology
    Rows == Records, Columns == Fields, Tables == Files
column (attribute, field)
row (tuple, record)
Table Characteristics
  Table and column names
    Max. 8 and 10 characters, respectively, in older DBMS
    Cannot use special characters (e.g. */.)
    Use descriptive names (e.g. STUDENT, STU_DOB)
  Column characteristics
    Data type
      • number, character, date, logical (Boolean)
    Format
      • 999.99, Xxxxxx, mm-dd-yy, Yes/No
    Range
      • 0-4, 35-65, {A,B,C,D}
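As a rough illustration of the column characteristics above, the sketch below declares data type, format, and range rules as SQLite constraints through Python's standard sqlite3 module. The STU_GPA and STU_GRADE columns and the sample rows are assumptions made for this example; they do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        STU_NUM   INTEGER PRIMARY KEY,                         -- data type: number
        STU_NAME  TEXT NOT NULL,                               -- data type: character
        STU_DOB   TEXT CHECK (STU_DOB LIKE '__-__-__'),        -- format: mm-dd-yy (shape only)
        STU_GPA   REAL CHECK (STU_GPA BETWEEN 0 AND 4),        -- range: 0-4 (hypothetical column)
        STU_GRADE TEXT CHECK (STU_GRADE IN ('A','B','C','D'))  -- range: {A,B,C,D} (hypothetical column)
    )
    """
)


def try_insert(row):
    """Return True if the row satisfies every column constraint."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert((1, "John Doe", "12-12-85", 3.5, "B"))
bad_range = try_insert((2, "Jane Doe", "10-10-85", 4.7, "A"))     # 4.7 is outside 0-4
bad_domain = try_insert((3, "GI Joe", "11-11-82", 3.0, "E"))      # 'E' is not in {A,B,C,D}
bad_format = try_insert((4, "Jane Dew", "1982/11/11", 3.0, "A"))  # not shaped like mm-dd-yy

print(ok, bad_range, bad_domain, bad_format)
```

Only the first row is stored; the other three are rejected by the CHECK constraints. The LIKE pattern checks the shape of the date string only, not whether it is a real calendar date.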
Relational DB Table: Example
  8 rows & 9 columns
  Row = single entity occurrence
    row 1 describes a student named John Doe
  Column = an attribute
    has specific characteristics (data type, format, value range)
      • stClass: char(2), {fr,jr,so,sr}
    all values adhere to the attribute characteristics
  Each row/column intersection contains a single data value
  Primary key = stID
Table: Keys
  Consists of one or more attributes that determine other attributes
    → Given the value of a key, you can look up (determine) the values of other attributes
      e.g., student_ID ⇒ student's name, major, status, grade, etc.
  Composite key: composed of more than one attribute
      e.g., building name + room number ⇒ location, size, function/purpose, etc.
  Superkey
    any key that uniquely identifies each row
  Candidate key
    any key that uniquely identifies each row without redundancies
  Primary Key (PK)
    the candidate key selected as the unique identifier
  Foreign Key (FK)
    an attribute whose values match the primary key values in a related table
    joins tables to derive information
  Secondary Key
    facilitates querying of the database
    a restrictive secondary key narrows the search result (e.g. STU_LNAME vs. STU_DOB)
Table: Keys
  Superkey: attribute(s) that uniquely identify each row
      • STU_ID; STU_SSN; STU_ID + any; STU_SSN + any; STU_DOB + STU_LNAME + STU_FNAME?
  Candidate Key: minimal superkey (without redundancies)
      • STU_ID; STU_SSN; STU_DOB + STU_LNAME + STU_FNAME?
  Primary Key: candidate key selected as the unique identifier
      • STU_ID
  Foreign Key: primary key from another table
      • DEPT_CODE
  Secondary Key: attribute(s) used for data retrieval
      • STU_LNAME + STU_DOB
DEPT_CODE  DEPT_NAME
243        Astronomy
245        Computer Science
423        Sociology

STU_ID  STU_SSN      STU_DOB     STU_LNAME  STU_FNAME  DEPT_CODE
12345   111-11-1111  12/12/1985  Doe        John       245
12346   222-22-2222  10/10/1985  Dew        John       243
12348   123-45-6789  11/11/1982  Dew        Jane       423
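The superkey and candidate key definitions can be tried out on the three student rows above. The following is a minimal sketch; the helper names is_superkey and is_candidate_key are made up for this example. Note the limitation: a key is a design decision about all possible rows, so a sample can only rule a key out, never prove it.

```python
from itertools import combinations

COLUMNS = ["STU_ID", "STU_SSN", "STU_DOB", "STU_LNAME", "STU_FNAME", "DEPT_CODE"]
ROWS = [
    (12345, "111-11-1111", "12/12/1985", "Doe", "John", 245),
    (12346, "222-22-2222", "10/10/1985", "Dew", "John", 243),
    (12348, "123-45-6789", "11/11/1982", "Dew", "Jane", 423),
]


def is_superkey(attrs):
    """True if no two sample rows share the same values for attrs."""
    positions = [COLUMNS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in positions) for row in ROWS]
    return len(set(projected)) == len(ROWS)


def is_candidate_key(attrs):
    """True if attrs is a superkey and no proper subset of it is one (minimal)."""
    if not is_superkey(attrs):
        return False
    return not any(
        is_superkey(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_superkey(["STU_ID"]))                    # unique in the sample
print(is_superkey(["STU_LNAME"]))                 # "Dew" appears twice
print(is_candidate_key(["STU_ID", "STU_LNAME"]))  # superkey, but STU_LNAME is redundant
```

With only three rows, STU_LNAME + STU_FNAME also passes the candidate key test, which is exactly why the slide marks name-based keys with a question mark: a larger table could easily contain two students with the same name.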
Integrity Rules
  Entity Integrity
    Ensures uniqueness of entities
      • Primary key values must be unique and not empty
        → e.g., no department can have a duplicate or null DEPT_CODE
  Referential Integrity
    Prevents invalid data entry
    Foreign key value is null or matches a primary key value in the related table
        → i.e., a foreign key cannot contain values that do not exist in the related table
  Most RDBMSs enforce integrity rules automatically.
STU_ID  STU_LNAME  STU_FNAME  DEPT_CODE
12345   Doe        John       245
12346   Dew        John       243
22134   Dew        James      (null)
23456   Doe        Jane       249

DEPT_CODE  DEPT_NAME
243        Astronomy
244        Computer Science
245        Sociology
243 246    Physics
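Both integrity rules can be observed with rows like those in the tables above. This is a minimal sketch using SQLite through Python's sqlite3 module; the helper name accepted is hypothetical. One SQLite-specific detail: foreign keys are checked only after PRAGMA foreign_keys = ON is issued for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE DEPARTMENT (
        DEPT_CODE INTEGER PRIMARY KEY,
        DEPT_NAME TEXT NOT NULL
    );
    CREATE TABLE STUDENT (
        STU_ID    INTEGER PRIMARY KEY,
        STU_LNAME TEXT,
        STU_FNAME TEXT,
        DEPT_CODE INTEGER REFERENCES DEPARTMENT (DEPT_CODE)
    );
    INSERT INTO DEPARTMENT VALUES
        (243, 'Astronomy'), (244, 'Computer Science'), (245, 'Sociology');
    """
)


def accepted(sql, params):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


NEW_STUDENT = "INSERT INTO STUDENT VALUES (?, ?, ?, ?)"
results = {
    # Entity integrity: DEPT_CODE 243 already exists, so a second 243 is rejected.
    "duplicate_pk": accepted("INSERT INTO DEPARTMENT VALUES (?, ?)", (243, "Physics")),
    # Referential integrity: 245 exists in DEPARTMENT, so the row is accepted.
    "valid_fk": accepted(NEW_STUDENT, (12345, "Doe", "John", 245)),
    # A null foreign key is allowed.
    "null_fk": accepted(NEW_STUDENT, (22134, "Dew", "James", None)),
    # 249 does not exist in DEPARTMENT, so the row is rejected.
    "dangling_fk": accepted(NEW_STUDENT, (23456, "Doe", "Jane", 249)),
}
print(results)
```

The duplicate department code and the student pointing at department 249 are refused; the student with no department is accepted because a null foreign key does not violate referential integrity.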
Example: Simple RDB
Database Systems: Design, Implementation, & Management: Rob & Coronel
Relationships in RDB
  Representation of relationships among entities
    By shared attributes between tables (RDB model)
      • primary key matched to foreign key
    E-R model provides a simplified picture
  One-to-One (1:1)
    Could be due to improper data modeling
      • e.g. PILOT (id, name, dob) to EMPLOYEE (id, name, dob)
    Commonly used to represent entities with uncommon attributes
      • e.g. PILOT (id, license) & MECHANIC (id, certificate) to EMPLOYEE (id, name, dob, title)
  One-to-Many (1:M)
    Most common relationship in RDB
    Primary key of the One should be the foreign key in the Many
  Many-to-Many (M:N)
    Should not be accommodated in RDB directly
    Implement by breaking it into a set of 1:M relationships
      • Create a composite/bridge entity
Figure: EMPLOYEE, PILOT, and MECHANIC tables
M:N to 1:M Conversion
M:N to 1:M Conversion
M:N (before conversion)

STU_ID  STU_Name  Sex  CLS_ID
1234    John Doe  M    IT-s16
1234    John Doe  M    DB-s16
2345    Jane Doe  F    IT-s16
2345    Jane Doe  F    DB-s16
3456    GI Joe    M    DB-s16

CLS_ID  CRS_Name       Room  STU_ID
IT-s16  Web Authoring  403   1234
IT-s16  Web Authoring  403   2345
DB-s16  Database       421   1234
DB-s16  Database       421   2345
DB-s16  Database       421   3456

1:M (after conversion)

STU_ID  STU_Name  Sex
1234    John Doe  M
2345    Jane Doe  F
3456    GI Joe    M

CLS_ID  CRS_Name       Room
IT-s16  Web Authoring  403
DB-s16  Database       421

STU_ID  CLS_ID  grade
1234    IT-s16  B
1234    DB-s16  C
2345    IT-s16  A
2345    DB-s16  A
3456    DB-s16  A

Composite Table:
      • Must contain at least the primary keys of the original tables
        → Contains multiple occurrences of the foreign key values
      • Additional attributes may be assigned as needed
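The conversion can be sketched in SQLite with the same rows. The table names STUDENT, CLASS, and ENROLL follow the labels in the slide's diagram; making (STU_ID, CLS_ID) the composite primary key of the bridge table is an assumption of this sketch, one common choice rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE STUDENT (STU_ID INTEGER PRIMARY KEY, STU_Name TEXT, Sex TEXT);
    CREATE TABLE CLASS   (CLS_ID TEXT PRIMARY KEY, CRS_Name TEXT, Room INTEGER);

    -- Bridge (composite) table: one row per student-class pairing.
    CREATE TABLE ENROLL (
        STU_ID INTEGER REFERENCES STUDENT (STU_ID),
        CLS_ID TEXT    REFERENCES CLASS (CLS_ID),
        grade  TEXT,                      -- additional attribute of the pairing
        PRIMARY KEY (STU_ID, CLS_ID)      -- assumed composite key
    );

    INSERT INTO STUDENT VALUES
        (1234, 'John Doe', 'M'), (2345, 'Jane Doe', 'F'), (3456, 'GI Joe', 'M');
    INSERT INTO CLASS VALUES
        ('IT-s16', 'Web Authoring', 403), ('DB-s16', 'Database', 421);
    INSERT INTO ENROLL VALUES
        (1234, 'IT-s16', 'B'), (1234, 'DB-s16', 'C'),
        (2345, 'IT-s16', 'A'), (2345, 'DB-s16', 'A'),
        (3456, 'DB-s16', 'A');
    """
)

# Follow the two 1:M links (STUDENT-ENROLL and CLASS-ENROLL) to list one class roster.
roster = conn.execute(
    """
    SELECT s.STU_Name, e.grade
    FROM ENROLL AS e
    JOIN STUDENT AS s ON s.STU_ID = e.STU_ID
    JOIN CLASS   AS c ON c.CLS_ID = e.CLS_ID
    WHERE c.CRS_Name = 'Database'
    ORDER BY s.STU_ID
    """
).fetchall()
print(roster)
```

Each student name and each course name is now stored once; only the key values repeat in ENROLL.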
Figure: STUDENT and CLASS linked directly (M:N), then linked through ENROLL (1:M)
Data Redundancy
  Uncontrolled Redundancy
    Unnecessary duplication of data
      • Repeated attribute values → Normalize (e.g., M:N to 1:M conversion)
      • Derived attributes → Compute as needed
  Controlled Redundancy (necessary duplication)
    Shared attributes in multiple tables
      • Makes RDB work (e.g. foreign key)
    For information requirements or transaction speed
      • e.g. INV_Price records historical product price
      • e.g. Account Balance = accounts receivable - payments
PRODUCT
PRD_ID  PRD_Name  PRD_Price
C1234   Chainsaw  $100
H2341   Hammer    $10

INVOICE
INV_ID  PRD_ID  Date        INV_Price  CUST_ID
121     C1234   2015/12/24  $80        KY123
122     H2341   2015/12/25  $5         JJ122
123     C1234   2016/01/11  $100       SH002
CUSTOMER
Kiduk Yang's Account Balance? (CUST_ID = KY123)

INVOICE          PAYMENT
15/11/01  $280   16/01/01  $100
15/11/15  $120
15/12/24  $ 80

280 + 120 + 80 - 100 = $380
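The balance arithmetic above is an example of a derived attribute. A minimal sketch of computing it on demand follows; the function name account_balance is hypothetical, and the amounts are the invoice and payment figures listed for CUST_ID KY123.

```python
from decimal import Decimal

# (date as written on the slide, amount) for customer KY123
invoices = [
    ("15/11/01", Decimal("280")),
    ("15/11/15", Decimal("120")),
    ("15/12/24", Decimal("80")),
]
payments = [
    ("16/01/01", Decimal("100")),
]


def account_balance(invoices, payments):
    """Derived attribute: total invoiced minus total paid."""
    billed = sum(amount for _, amount in invoices)
    paid = sum(amount for _, amount in payments)
    return billed - paid


balance = account_balance(invoices, payments)
print(balance)
```

Storing the balance as a column would be controlled redundancy: it avoids re-adding every invoice and payment on each lookup, at the cost of keeping the stored value in sync with them.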
Data Integrity
  Nulls
    No data entry
      • a "not applicable" condition: non-existing data
        → e.g., middle initial, fax number
      • an unknown attribute value: non-obtainable data
        → e.g., birthdate of John Doe
      • a known, but missing, attribute value: uncollected data
        → e.g., date of hospitalization, cause of death
    Can create problems
      • when functions such as COUNT, AVERAGE, and SUM are used
    Not permitted in primary key
      • should be avoided in other attributes
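The aggregate problem can be seen in a small sketch. The PATIENT table and its ages are assumptions made up for this example; the point is only that SQL aggregate functions skip nulls, so counts and averages may not mean what a reader expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PATIENT (PAT_ID INTEGER PRIMARY KEY, PAT_AGE INTEGER)")
conn.executemany(
    "INSERT INTO PATIENT VALUES (?, ?)",
    [(1, 30), (2, None), (3, 50)],  # patient 2 has a null (unknown) age
)

count_rows, count_age, avg_age, sum_age = conn.execute(
    "SELECT COUNT(*), COUNT(PAT_AGE), AVG(PAT_AGE), SUM(PAT_AGE) FROM PATIENT"
).fetchone()

print(count_rows)  # counts every row, including the one with a null age
print(count_age)   # counts only rows where PAT_AGE is not null
print(avg_age)     # averages over the non-null ages only
print(sum_age)     # sums the non-null ages only
```

Here COUNT(*) reports three patients while COUNT(PAT_AGE) reports two, and the average is taken over two values, not three.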
Indexes
  Composed of an index key and a set of pointers
    Pointers point to data locations (e.g. table rows)
  Make retrieval of data faster
  Each index is associated with only one table
ACTOR_NAME     ACTOR_ID
James Dean     12
Henry Fonda    23
Robert DeNiro  34

(row)  MOVIE_ID  MOVIE_NAME           ACTOR_ID
1      231       Rebel without Cause  12
2      352       Twelve Angry Men     23
3      455       Godfather 2          34
4      460       Godfather II         34
5      625       On Golden Pond       23

index key (ACTOR_ID)  pointers
12                    1
23                    2, 5
34                    3, 4
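The index key and pointer picture above can be mimicked with a dictionary that maps each ACTOR_ID to the row numbers where it occurs. This is only a sketch of the idea: the names build_index and movies_for are hypothetical, and real DBMS indexes commonly use disk-based structures such as B-trees rather than an in-memory dictionary.

```python
from collections import defaultdict

# MOVIE rows in table order: (MOVIE_ID, MOVIE_NAME, ACTOR_ID)
MOVIES = [
    (231, "Rebel without Cause", 12),
    (352, "Twelve Angry Men", 23),
    (455, "Godfather 2", 34),
    (460, "Godfather II", 34),
    (625, "On Golden Pond", 23),
]


def build_index(rows, key_position):
    """Map each index key value to the 1-based row numbers that contain it."""
    index = defaultdict(list)
    for row_number, row in enumerate(rows, start=1):
        index[row[key_position]].append(row_number)
    return dict(index)


actor_index = build_index(MOVIES, key_position=2)


def movies_for(actor_id):
    """Follow the pointers instead of scanning every row."""
    return [MOVIES[n - 1][1] for n in actor_index.get(actor_id, [])]


print(actor_index)
print(movies_for(23))
```

Looking up actor 23 touches only rows 2 and 5 rather than scanning the whole table, which is the speed-up an index provides.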
Data Dictionary & Schema
  Data Dictionary
    Detailed description of a data model
      • for each table in a database
        → list all the attributes & their characteristics, e.g. name, data type, format, range
        → identify primary and foreign keys
    Human view of entities, attributes, and relationships
      • blueprint & documentation of a database
        → design & communication tool
  Relational Schema
    Specification of the overall structure/organization of a database
      • e.g. visualization of a structure
    Shows all the entities and relationships among them
      • tables w/ attributes
      • relationships (linked attributes)
        → primary key matched to foreign key
      • relationship type
        → 1:M, M:N, 1:1
Data Dictionary
  Lists attribute names and characteristics for each table in the database
    record of design decisions and blueprint for implementation
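Once tables exist, part of a data dictionary can be read back from the DBMS catalog. The sketch below assumes SQLite and a small two-table schema chosen for illustration; data_dictionary is a hypothetical helper that combines SQLite's PRAGMA table_info and PRAGMA foreign_key_list. Formats and ranges are design notes and would still have to be documented by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DEPARTMENT (
        DEPT_CODE INTEGER PRIMARY KEY,
        DEPT_NAME TEXT NOT NULL
    );
    CREATE TABLE STUDENT (
        STU_ID    INTEGER PRIMARY KEY,
        STU_LNAME TEXT NOT NULL,
        STU_DOB   TEXT,
        DEPT_CODE INTEGER REFERENCES DEPARTMENT (DEPT_CODE)
    );
    """
)


def data_dictionary(conn):
    """Return one entry per attribute: table, name, type, required, PK, FK target."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    entries = []
    for table in tables:
        # foreign_key_list rows: (id, seq, referenced table, from column, to column, ...)
        fk_targets = {
            row[3]: row[2] for row in conn.execute(f"PRAGMA foreign_key_list({table})")
        }
        # table_info rows: (cid, name, type, notnull, default value, pk)
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info({table})"
        ):
            entries.append(
                {
                    "table": table,
                    "attribute": name,
                    "type": col_type,
                    "required": bool(notnull) or bool(pk),
                    "pk": bool(pk),
                    "fk_to": fk_targets.get(name),
                }
            )
    return entries


entries = data_dictionary(conn)
for entry in entries:
    print(entry)
```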
Relational Schema
  A diagram of linked tables w/ attributes
- from https://www.fiverr.com/mohsinejaz7/design-database-erd-diagram-and-relation-schema -