Technology Review
Professor Martin
Professor Xiong
CSUS
This lecture is based primarily on Romney & Steinbart (2003). It also draws on Martin (2002).
Updated on: Monday, January 27, 2003
Agenda
- Database Management–An Introduction
- Relational Database
- Entity-Relationship Diagram
WHAT IS DATA MANAGEMENT?
(The first seven slides are based on Watson (2002))
- The management of organizational memory
- Involves designing, using, and managing memory systems of modern organizations
EXAMPLES OF INDIVIDUAL AND ORGANIZATIONAL MEMORY SYSTEMS
INDIVIDUAL
- Internal memory
- External memory (diaries, bookmarks, address books)
ORGANIZATIONAL
- Examples: people, filing cabinets, policy manuals, planning boards, and computers. (Do organizations have external memories?)
- Characteristics similar to individual memory
DESIRABLE ATTRIBUTES OF ORGANIZATIONAL MEMORY
- Shareable: readily accessed by more than one person at a time
- Transportable: easily moved to a decision maker
- Secure: protected from destruction and unauthorized use
- Accurate: reliable, precise records
- Timely: current and up-to-date
- Relevant: appropriate to the decision
TYPICAL PROBLEMS WITH FILE-BASED SYSTEMS
Organizational memory may be seen as a vast, disorganized data warehouse. Problems include:
- Redundancy: same data stored in different memories
- Data control: data not managed as a valuable resource
- Interface: difficult to access data
- Delays: long delays in responding to requests for data
- Lack of reality: data do not reflect the complexity of the real world
- Lack of data integration: data dispersed across different systems; also, where data is stored may not be known
File-Oriented Approach
[Figure: Application program #1 and Application program #2, with two separate files. File #1 holds Item A, Item B, Item C. File #2 holds Item B, Item D, Item E. Item B appears in both files.]
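The redundancy problem in the figure can be illustrated with a minimal Python sketch. The two dictionaries stand in for File #1 and File #2; the item values, the `update_item` helper, and the assumption that one program updates only File #1 are hypothetical and chosen only for illustration.

```python
# Hypothetical stand-ins for the two files in the figure.
# Item B is stored in both files, so it exists in two copies.
file_1 = {"Item A": "a-1", "Item B": "b-1", "Item C": "c-1"}
file_2 = {"Item B": "b-1", "Item D": "d-1", "Item E": "e-1"}


def update_item(target_file, item, value):
    """Update one item in a single file, as a program that knows
    about only that file would do."""
    target_file[item] = value


# One program changes Item B in File #1 only (an assumed scenario).
update_item(file_1, "Item B", "b-2")

# The two copies of Item B now disagree.
inconsistent = file_1["Item B"] != file_2["Item B"]
print(inconsistent)
```

Because nothing ties the two copies together, every program that writes Item B would have to remember to update every file that holds it.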
DATABASE APPROACH TO MANAGING PERSISTENT DATA
The database approach emphasizes the integration and sharing of data across the organization.
Database Approach
[Figure: Application program #1, Application program #2, and Application program #3 access a single Database (Item A, Item B, Item C, Item D, Item E) through the Database management system.]
BENEFITS OF THE DATABASE APPROACH
- Redundancy can be reduced
  - Thus, inconsistency can be avoided
- Integration of data
- Data can be shared among applications
- Standards can be enforced by the DBA
  - formats, representation, naming, documentation
- Security restrictions can be applied
- Data integrity can be maintained
  - by minimizing inconsistency
  - by having controls to check against incorrect updates, especially in the multi-user context
- Data independence
  - Broadly: the immunity of applications to change in storage structure and access technique
  - Logical: capacity to change conceptual schema without changing application programs (e.g., adding an attribute or an entity type)
  - Physical: capacity to change internal schema without having to change external or conceptual schema (e.g., creating additional access structures to improve retrieval performance)
- Ease of application development
- Data accessibility and responsiveness enhanced
- Reduced program maintenance
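Data independence can be sketched with Python's built-in sqlite3 module standing in for a DBMS. The `customer` table, its columns, and the `customer_names` function are hypothetical names used only for illustration. The sketch adds an attribute (a logical change) and an index (a physical change) and shows that the existing "application program" returns the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Chen')")


def customer_names(connection):
    """A hypothetical application program that uses only the name column."""
    rows = connection.execute("SELECT name FROM customer ORDER BY customer_no")
    return [name for (name,) in rows]


before = customer_names(conn)

# Logical change: add an attribute to the table definition.
conn.execute("ALTER TABLE customer ADD COLUMN zip_code TEXT")

# Physical change: add an access structure to speed up retrieval.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

after = customer_names(conn)

# The application code was not modified and its result is unchanged.
print(before == after)
```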
SOME DEFINITIONS
What is a database?
- A shared collection of logically related persistent data, designed to meet the needs of multiple users, usually within an organization.
What is a database management system?
- A DBMS is a collection of programs that enables users to define, construct, and manipulate a database. (More detailed definition later.)
What is a database system?
FUNCTIONS OF A DBMS
- Data definition using a data definition language (DDL)
- Data manipulation using a data manipulation language (DML)
- Data security and integrity
- Data recovery and concurrency control
- Data dictionary
- Satisfactory performance
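A minimal sketch of the first two functions, using Python's built-in sqlite3 module as the DBMS. The `inventory` table and its rows are hypothetical. The final query reads SQLite's own catalog table, `sqlite_master`, which plays a role loosely comparable to a data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute(
    "CREATE TABLE inventory ("
    " stock_no INTEGER PRIMARY KEY,"
    " color TEXT,"
    " price REAL)"
)

# DML: insert, update, delete, and retrieve data.
conn.execute("INSERT INTO inventory VALUES (?, ?, ?)", (100, "red", 9.5))
conn.execute("INSERT INTO inventory VALUES (?, ?, ?)", (101, "blue", 12.0))
conn.execute("UPDATE inventory SET price = ? WHERE stock_no = ?", (10.0, 100))
conn.execute("DELETE FROM inventory WHERE stock_no = ?", (101,))
rows = conn.execute("SELECT stock_no, color, price FROM inventory").fetchall()

# Catalog lookup: SQLite records the tables it knows about in sqlite_master.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(rows)
print(catalog)
```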
STEPS IN DATABASE DEVELOPMENT PROCESS
- Analysis
  - creation of the Entity-Relationship Model
- Design
  - Logical Database Design
    - creation of normalized relations
  - Physical Database Design
    - specification of storage technology requirements
    - specification/creation of appropriate file structures
Schemas
What are schemas?
- A schema describes the logical structure of a database.
- There are three levels of schemas:
  1. Conceptual-level schema
  2. External-level schema
  3. Internal-level schema
- The conceptual-level schema is an organization-wide view of the entire database.
- The external-level schema consists of a set of individual user views of portions of the database, also referred to as a subschema.
- The internal-level schema provides a low-level view of the database.
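One common way to realize an external-level view in a relational DBMS is an SQL view. The sketch below uses Python's built-in sqlite3 module; the `employee` table, its columns, and the `employee_directory` view are hypothetical. The view exposes only part of the table, so a user of the view never sees the address or salary columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level (simplified): the full employee table.
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"
    " name TEXT,"
    " address TEXT,"
    " salary REAL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Lisa', '12 Oak St', 50000.0)")

# External level: a user view (subschema) covering only part of the data.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_no, name FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory")
columns = [column[0] for column in cursor.description]
directory = cursor.fetchall()

print(columns)
print(directory)
```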
Relational Databases
- A data model is an abstract representation of the contents of a database.
- The relational data model represents everything in the database as being stored in the form of tables.
- Technically, these tables are called relations.
Basic Requirements of the Relational Data Model
1. Primary keys must be unique.
2. Every foreign key must either be null or have a value corresponding to the value of a primary key in another relation.
3. Each column in a table must describe a characteristic of the object identified by the primary key.
4. Each column in a row must be single-valued.
5. The value in every row of a specific column must be of the same data type.
6. Neither column order nor row order is significant.
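Requirements 1 and 2 can be enforced by the DBMS itself. The following is a minimal sketch using Python's built-in sqlite3 module; the `customer` and `invoice` tables, their rows, and the `is_rejected` helper are hypothetical. SQLite checks foreign keys only when the `foreign_keys` pragma is switched on, so the sketch enables it first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

conn.execute(
    "CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, last_name TEXT)"
)
conn.execute(
    "CREATE TABLE invoice ("
    " invoice_no INTEGER PRIMARY KEY,"
    " customer_no INTEGER REFERENCES customer (customer_no))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Lee')")
conn.execute("INSERT INTO invoice VALUES (500, 1)")     # foreign key matches a primary key
conn.execute("INSERT INTO invoice VALUES (501, NULL)")  # a null foreign key is allowed


def is_rejected(statement):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        return True
    return False


# Requirement 1: a second invoice 500 would duplicate a primary key.
duplicate_key_rejected = is_rejected("INSERT INTO invoice VALUES (500, 1)")

# Requirement 2: customer 99 does not exist, so the foreign key has no match.
dangling_key_rejected = is_rejected("INSERT INTO invoice VALUES (502, 99)")

invoice_count = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
print(duplicate_key_rejected, dangling_key_rejected, invoice_count)
```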
Accessing Records
Records are typically updated, stored, and retrieved using an identifier called a primary key
- customer number for customer file
- invoice number for invoice file
- stock number for inventory file
Accessing Records
- A secondary key is another field used to identify a record
- Secondary keys do not uniquely identify individual records
- Examples of secondary keys
  - invoice due date
  - zip code
  - bank customer last name
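The difference between the two kinds of key can be sketched in plain Python. The customer records below are hypothetical. A lookup by primary key returns exactly one record, while a lookup by a secondary key such as zip code may return several.

```python
from collections import defaultdict

# Hypothetical customer records: (customer number, last name, zip code).
customers = [
    (1001, "Lee", "95819"),
    (1002, "Lee", "95825"),
    (1003, "Garcia", "95819"),
]

# Primary key index: each customer number maps to exactly one record.
by_customer_no = {
    customer_no: (last_name, zip_code)
    for customer_no, last_name, zip_code in customers
}

# Secondary key index: one zip code may map to many customer numbers.
by_zip_code = defaultdict(list)
for customer_no, last_name, zip_code in customers:
    by_zip_code[zip_code].append(customer_no)

print(by_customer_no[1002])
print(by_zip_code["95819"])
```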
Accessing Records
Foreign key:
- attribute (field) in one table (file) that matches the primary key in another table
- Used to link tables together
Relational Database
Product Table
Product Number (Primary Key)   Vendor Code (Foreign Key)
123467                         ZDG
243893                         CFC
277883                         TBT
476556                         BBC
775622                         DFF
Vendor Table
Vendor Code   Ship Mode
ACC           TRK
BAD           ARP
BBC           TRK
CAC           UPS
Lookup: go to top of Vendor Table; search sequentially until ‘BBC’ is found.
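The lookup in the figure can be sketched as a join on the foreign key, using Python's built-in sqlite3 module. The rows are the ones shown in the two tables; the SQL table and column names are assumptions. Foreign key enforcement is left off because the Vendor Table excerpt does not list every vendor code that appears in the Product Table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendor (vendor_code TEXT PRIMARY KEY, ship_mode TEXT)")
conn.execute("CREATE TABLE product (product_no INTEGER PRIMARY KEY, vendor_code TEXT)")

# Rows copied from the Vendor Table and Product Table in the figure.
conn.executemany(
    "INSERT INTO vendor VALUES (?, ?)",
    [("ACC", "TRK"), ("BAD", "ARP"), ("BBC", "TRK"), ("CAC", "UPS")],
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(123467, "ZDG"), (243893, "CFC"), (277883, "TBT"),
     (476556, "BBC"), (775622, "DFF")],
)

# Follow the foreign key from product 476556 to its vendor row.
row = conn.execute(
    "SELECT p.product_no, v.vendor_code, v.ship_mode"
    " FROM product AS p"
    " JOIN vendor AS v ON v.vendor_code = p.vendor_code"
    " WHERE p.product_no = ?",
    (476556,),
).fetchone()

print(row)
```

The DBMS decides how to locate the matching vendor row; the sequential search described in the figure is one possible strategy.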
Relational Databases
Formal Term   Less Formal Term   Data Processing Term
relation      table              file
tuple         row                record
attribute     column             field
ENTITY-RELATIONSHIP MODEL (proposed by Chen, 1976)
A detailed logical representation of data for an organization or business area
Four Basic Constructs:
- Entity
- Relationship
- Attribute
- Cardinality (participation)
1. ENTITY: Entities are named objects in the universe of discourse
Types of entities
- Thing (truck, building)
- Person (customer, employee)
- Event
  - Instant duration (sale, purchase, cash receipt)
  - Extended duration (month-long use of a truck, a course offering that starts on Jan 3 and ends on May 15)
- Concept (category of customer, course)
SYMBOL -- Rectangle
[Figure: rectangles labeled Customer and Course offering]
2. RELATIONSHIP: Association between two (or more?) entities
Examples:
- employee “assigned to” building
- customer “participates in” sale
- professor “teaches” course-offering
SYMBOL -- Diamond
[Figure: Customer, diamond labeled “participates in”, Sale]
3. ATTRIBUTE: Characteristics or elementary properties of entities or relationships. They are used for actual communication about the real-world phenomena represented by entities or relationships
- Example attributes for the entity INVENTORY: stock#, color, price, cost, weight
- A primary key is a special attribute used to represent an instance of an entity or relationship in a database
  - Must be unique and universal
  - Can be a concatenated (combined) key
  - “No representation without identification”
  - For this class, we assume that relationships are identified by the keys of their participating entities
SYMBOL -- small connected circle (filled in for primary key)
[Figure: entity Inventory with attributes Stock#, Color, Price]
4. Participation CARDINALITY (min, max): These show the correspondence of entities and relationships
[Figure: entity A and entity B connected through relationship “rel”, with a (min, max) pair marked for entity A]
Entity “A” participates in relationship “rel” at a minimum of
- “0” times (optional)
- “1” time (mandatory)
Entity “A” participates in relationship “rel” at a maximum of
- “1” time (single time only)
- “n” times (many times)
4. Participation CARDINALITY (min, max): (other side of relationship)
[Figure: entity A and entity B connected through relationship “rel”, with a (min, max) pair marked for entity B]
Entity “B” participates in relationship “rel” at a minimum of
- “0” times (optional)
- “1” time (mandatory)
Entity “B” participates in relationship “rel” at a maximum of
- “1” time (single time only)
- “n” times (many times)
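The (min, max) notation can be sketched as a small Python class. The class name, its methods, and the two example constraints are hypothetical and serve only to show how the minimum and maximum are read; `None` is used here to stand for "n".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """A (min, max) participation constraint for one entity in one relationship."""
    minimum: int            # 0 = optional, 1 = mandatory
    maximum: Optional[int]  # 1 = single time only, None stands for "n" (many)

    def allows(self, count: int) -> bool:
        """Return True if participating `count` times satisfies the constraint."""
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum

    def label(self) -> str:
        """Render the constraint the way it is written on the diagram."""
        upper = "n" if self.maximum is None else str(self.maximum)
        return f"({self.minimum}, {upper})"


# Hypothetical constraints for illustration only.
a_in_rel = Cardinality(minimum=0, maximum=None)  # optional, many times
b_in_rel = Cardinality(minimum=1, maximum=1)     # mandatory, single time only

print(a_in_rel.label(), a_in_rel.allows(0), a_in_rel.allows(5))
print(b_in_rel.label(), b_in_rel.allows(0), b_in_rel.allows(2))
```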
An Example
Assuming two entities (EMPLOYEE and COURSE), draw an E-R diagram for the following (sample data). Assume that Employee_name and Course_title are unique. Also assume other attributes such as Employee Address and Course Credits.
Employee_name   Course_title   Date_completed
Chen            C++            06/98
Chen            Java           09/98
Lisa            C++            06/98
Lisa            SQL            03/99
Trina           Java           03/98
Heikki          Perl           06/98
Heikki          Java           09/98
…               …              …
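One possible reading of the sample data, offered as a sketch rather than the intended solution: an employee may complete many courses and a course may be completed by many employees, with Date_completed describing the relationship. Under that assumption the data can be stored in three tables using Python's built-in sqlite3 module. The `completes` table name is hypothetical, and the address and credits columns are left null because the sample data does not supply them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employee_name TEXT PRIMARY KEY, employee_address TEXT);
CREATE TABLE course (course_title TEXT PRIMARY KEY, course_credits INTEGER);
CREATE TABLE completes (
    employee_name  TEXT NOT NULL REFERENCES employee (employee_name),
    course_title   TEXT NOT NULL REFERENCES course (course_title),
    date_completed TEXT,
    PRIMARY KEY (employee_name, course_title)
);
""")

# Sample rows copied from the slide.
completions = [
    ("Chen", "C++", "06/98"), ("Chen", "Java", "09/98"),
    ("Lisa", "C++", "06/98"), ("Lisa", "SQL", "03/99"),
    ("Trina", "Java", "03/98"),
    ("Heikki", "Perl", "06/98"), ("Heikki", "Java", "09/98"),
]

# Each employee and each course is stored once, however often it is referenced.
for name in sorted({name for name, _, _ in completions}):
    conn.execute("INSERT INTO employee VALUES (?, NULL)", (name,))
for title in sorted({title for _, title, _ in completions}):
    conn.execute("INSERT INTO course VALUES (?, NULL)", (title,))
conn.executemany("INSERT INTO completes VALUES (?, ?, ?)", completions)

java_completers = [
    name
    for (name,) in conn.execute(
        "SELECT employee_name FROM completes"
        " WHERE course_title = ? ORDER BY employee_name",
        ("Java",),
    )
]
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

print(java_completers)
print(employee_count)
```

The concatenated key on `completes` follows the class assumption that a relationship is identified by the keys of its participating entities.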
More Examples
- A company has a number of employees. The attributes of EMPLOYEE include NAME, ADDRESS, and BIRTH-DATE. The company also has several projects. The attributes of PROJECT include PROJECT_CODE, DESCRIPTION, and START_DATE. Each employee may be assigned to one or more projects, or may not be assigned to any project. A project is required to have at least one employee assigned, but may have several employees assigned.
- A university has a large number of courses in its catalog. Attributes of courses include CRS_NO, CRS_NAME, and UNITS. Each course may have one or more other courses as prerequisites, or may have no prerequisite.
Assignment
- A college course may have one or more scheduled sections, or may not have a scheduled section. COURSE attributes include CRS_ID, CRS_NAME, and UNITS. Attributes of SECTION include SECTION_NO and INSTRUCTOR.
- A laboratory has several chemists who work on various projects, and who may use certain kinds of equipment on each project. Attributes of CHEMIST include CHEMIST_ID, NAME, and PHONE. Attributes of PROJECT include PROJ_ID and START_DATE. Attributes of EQUIPMENT include EQUIP_NO and COST.
Topics Discussed
- Database Management–An Introduction
- Relational Database
- Entity-Relationship Diagram