Chapter 3 and Module C
DATABASES AND DATA WAREHOUSES
Supporting the Analytics-Driven Organization
Opening Case: Did You Know CDs Come from Dead Dinosaurs?
In 2010, more than half of all music was in digital form; physical music will never again be the norm
INTRODUCTION
• Business intelligence (BI) – knowledge about your customers, competitors, business partners, competitive environment, and internal operations to make effective, important, and strategic business decisions
• Analytics
  • Fact-based decision-making
  • Integrated use of IT and statistical techniques to create BI
Data Processing
IT tools help process information to create business intelligence according to…
  • OLTP
  • OLAP
Data Processing
• Online transaction processing (OLTP) – the gathering and processing of transaction information, and updating existing information to reflect the transaction
  • Databases support OLTP
  • Operational database – a database that supports OLTP
• Online analytical processing (OLAP) – the manipulation of information to support decision making
  • Databases can support some OLAP
  • Data warehouses support only OLAP, not OLTP
  • Data warehouses are special forms of databases that support decision making and help build BI
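To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The Product and Sale tables, their fields, and the sample figures are assumptions made for this illustration and do not come from the chapter. The function records one transaction and updates existing information to reflect it (OLTP); the final query manipulates the stored information to support a decision (OLAP).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT, OnHand INTEGER);
    CREATE TABLE Sale (SaleID INTEGER PRIMARY KEY, ProductID INTEGER,
                       Quantity INTEGER, Region TEXT);
    INSERT INTO Product VALUES (1, 'CD', 10), (2, 'Download card', 50);
    INSERT INTO Sale VALUES (1, 1, 2, 'East'), (2, 2, 5, 'West'), (3, 2, 3, 'East');
""")

def record_sale(product_id, quantity, region):
    """OLTP: capture one transaction and update existing information."""
    with conn:  # both statements commit together, or both roll back
        conn.execute(
            "INSERT INTO Sale (ProductID, Quantity, Region) VALUES (?, ?, ?)",
            (product_id, quantity, region),
        )
        conn.execute(
            "UPDATE Product SET OnHand = OnHand - ? WHERE ProductID = ?",
            (quantity, product_id),
        )

record_sale(1, 4, "West")

# OLAP: summarize the stored information to support a decision
units_by_region = dict(conn.execute(
    "SELECT Region, SUM(Quantity) FROM Sale GROUP BY Region"
).fetchall())
on_hand_cd = conn.execute(
    "SELECT OnHand FROM Product WHERE ProductID = 1"
).fetchone()[0]
```

In practice the analytical query would usually run against a data warehouse rather than the operational database; a single in-memory database is used here only to keep the sketch short.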
THE RELATIONAL DATABASE MODEL
• There are many types of databases
• The relational database model is the most popular
Relational database
Database Characteristics
1. Collections of information
2. Created with logical structures
3. Include logical ties within the information
4. Include built-in integrity constraints
2. Database – Logical Structure
Character, Field, Record, File (Table), Database, Data Warehouse
Advisor
Advisor ID  ALastName  AFirstName
101         Leonard    Lori
102         Aurigemma  Sal
103         Bajaj      Akhilesh
104         Platner    Steve
105         McCrary    Mike

Class
Class Synonym  Class Prefix  Class No  Class Section
10342          MIS           3003      3
10344          MIS           1123      2
10359          MIS           4133      2
10450          MIS           1123      1
10578          MIS           2013      3
10643          MIS           4053      1

Student-Class
Student ID  Class Synonym
1011        10342
1011        10643
1013        10578
1014        10342
1014        10359
1014        10450
1015        10578
1016        10342
1017        10344
1017        10450

Student
Student ID  SLastName  SFirstName  Advisor ID
1011        Berry      Jeff        101
1012        Smith      Tom         103
1013        Sanders    Tally       101
1014        Anderson   Cindy       103
1015        Whitman    Amy         102
1016        Jones      Kelsi       105
1017        Phillips   Susan       104
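One way to picture the levels of the logical structure is to walk down from the database to a single character. The sketch below models the Advisor table with plain Python containers. This is an illustrative assumption only; a real DBMS stores these levels very differently.

```python
# Hypothetical in-memory model of the logical structure
advisor_fields = ("Advisor ID", "ALastName", "AFirstName")

# File (table): a collection of records
advisor = [
    (101, "Leonard", "Lori"),
    (102, "Aurigemma", "Sal"),
    (103, "Bajaj", "Akhilesh"),
    (104, "Platner", "Steve"),
    (105, "McCrary", "Mike"),
]

# Database: a collection of related files (only Advisor is filled in here)
database = {"Advisor": advisor}

record = database["Advisor"][0]                     # Record: one advisor
field = record[advisor_fields.index("ALastName")]   # Field: one piece of information
character = field[0]                                # Character: one symbol in the field
```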
Logical Structure: Character
Logical Structure: Field
Logical Structure: Record
Logical Structure: File
Logical Structure: Database
Databases – Created with Logical Structures
• Databases have many tables
• In databases, the row number is irrelevant; not true in spreadsheet software
• In databases, column names are very important. Column names are created in the data dictionary
Database – Created with Logical Structures
Data dictionary – contains the logical structure for the information in a database
Before you can enter information into a database, you must define the data dictionary for all the tables and their fields. For example, when you create the Truck table, you must specify that it will have three pieces of information and that Date of Purchase is a field in Date format.
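As a rough sketch of what defining the data dictionary can look like, the example below creates a Truck table with three fields using Python's built-in sqlite3 module. Only Date of Purchase comes from the text; TruckNumber, TruckType, and the sample row are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define the structure before any information is entered.
# TruckNumber and TruckType are assumed field names; only
# Date of Purchase is named in the text.
conn.execute("""
    CREATE TABLE Truck (
        TruckNumber    INTEGER PRIMARY KEY,
        TruckType      TEXT,
        DateOfPurchase DATE
    )
""")

# Read the stored definition back as (field name, declared format) pairs
data_dictionary = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(Truck)")
]

# Only now can information be entered
conn.execute("INSERT INTO Truck VALUES (1, 'Flatbed', '2010-06-15')")
```

Note that SQLite records the declared Date format but does not strictly enforce it; other DBMSs are stricter about field formats.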
3. Databases – With Logical Ties Within the Information
• Logical ties must exist between the tables or files in a database
• Logical ties are created with primary and foreign keys
  • Primary key (PK)
  • Composite primary key (CPK)
  • Foreign key (FK)
Database – Logical Ties within the Information
Customer Number is the primary key for Customer and appears in Order as a foreign key
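The Customer and Order tie can be sketched as follows with Python's built-in sqlite3 module. Customer Number as primary key and foreign key follows the slide; the other fields (CustomerName, OrderNumber, OrderDate) and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerNumber INTEGER PRIMARY KEY,   -- primary key
        CustomerName   TEXT
    );
    -- ORDER is a reserved word in SQL, so the table name is quoted
    CREATE TABLE "Order" (
        OrderNumber    INTEGER PRIMARY KEY,
        CustomerNumber INTEGER REFERENCES Customer (CustomerNumber),  -- foreign key
        OrderDate      TEXT
    );
    INSERT INTO Customer VALUES (1, 'Acme Music'), (2, 'Dino Discs');
    INSERT INTO "Order" VALUES (100, 1, '2010-03-01'),
                               (101, 2, '2010-03-02'),
                               (102, 1, '2010-03-05');
""")

# Follow the logical tie from each order back to its customer
orders_with_names = conn.execute("""
    SELECT o.OrderNumber, c.CustomerName
    FROM "Order" AS o
    JOIN Customer AS c ON c.CustomerNumber = o.CustomerNumber
    ORDER BY o.OrderNumber
""").fetchall()
```

The customer's name is stored once, in Customer; each order reaches it through the foreign key instead of repeating it.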
Logical Ties – Keys
• A PK and an FK do not have to have the same name.
• If a record can be uniquely identified with only one PK, then the file should only have one.
• A PK (or a CPK) is required for each file.
• An FK may or may not exist for each file.
• Not all CPKs have to be FKs.
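The Student-Class table is a natural case for a composite primary key: neither Student ID nor Class Synonym is unique on its own, but each pairing is. The sketch below, using Python's built-in sqlite3 module and a few rows from that table, assumes the table is named StudentClass (the hyphen is dropped to keep the SQL simple).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StudentClass (
        StudentID    INTEGER,
        ClassSynonym INTEGER,
        PRIMARY KEY (StudentID, ClassSynonym)   -- composite primary key
    );
    INSERT INTO StudentClass VALUES (1011, 10342), (1011, 10643), (1014, 10342);
""")

# Repeating the same pairing violates the composite key
try:
    conn.execute("INSERT INTO StudentClass VALUES (1011, 10342)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM StudentClass").fetchone()[0]
```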
4. Databases – Built-In Integrity Constraints
Integrity constraints – rules that help ensure the quality of the information
Examples
• Primary keys must be unique
• Foreign keys must be present
• Sales price cannot be negative
• Phone number must have area code
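The four example rules above can be expressed directly in a table definition. The following is a sketch under stated assumptions, using Python's built-in sqlite3 module: the Customer and Sale tables are hypothetical, and "has an area code" is approximated as "exactly 10 digits", which is an assumption about phone format rather than a rule from the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE Customer (
        CustomerNumber INTEGER PRIMARY KEY,
        -- assumed rule: 10 digits means the area code is included
        Phone TEXT CHECK (length(Phone) = 10 AND Phone NOT GLOB '*[^0-9]*')
    );
    CREATE TABLE Sale (
        SaleID         INTEGER PRIMARY KEY,
        CustomerNumber INTEGER NOT NULL REFERENCES Customer (CustomerNumber),
        SalesPrice     REAL CHECK (SalesPrice >= 0)
    );
    INSERT INTO Customer VALUES (1, '4055551234');
    INSERT INTO Sale VALUES (1, 1, 19.99);
""")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

violations = {
    "duplicate primary key": is_rejected("INSERT INTO Customer VALUES (1, '4055559999')"),
    "missing foreign key":   is_rejected("INSERT INTO Sale VALUES (2, 99, 5.00)"),
    "negative sales price":  is_rejected("INSERT INTO Sale VALUES (3, 1, -5.00)"),
    "phone lacks area code": is_rejected("INSERT INTO Customer VALUES (2, '5551234')"),
    "valid sale":            is_rejected("INSERT INTO Sale VALUES (4, 1, 5.00)"),
}
```

Because the rules live in the database itself, every application that enters information is held to the same standard.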
Steps in Developing a Database
Step 1: Define Entity Classes (tables) and Primary Keys
Step 2: Define Relationships Among Entity Classes
  • ERD (entity relationship diagram)
  • Normalization: (1) eliminate M:M; (2) fields must depend on PK; (3) no derived fields
Step 3: Define Information For Each Relation
Step 4: Use A Data Definition Language To Create Your Database
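The four steps can be sketched on a cut-down version of the student and class tables, using Python's built-in sqlite3 module. The table name StudentClass, the reduced field lists, and the choice of "number of classes per student" as the derived value are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Steps 1 and 3: entity classes, primary keys, and their fields
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, SLastName TEXT);
    CREATE TABLE Class (ClassSynonym INTEGER PRIMARY KEY,
                        ClassPrefix TEXT, ClassNo INTEGER);

    -- Step 2, rule 1: the many-to-many tie between students and classes
    -- is replaced by an intersection table
    CREATE TABLE StudentClass (
        StudentID    INTEGER REFERENCES Student (StudentID),
        ClassSynonym INTEGER REFERENCES Class (ClassSynonym),
        PRIMARY KEY (StudentID, ClassSynonym)
    );

    INSERT INTO Student VALUES (1011, 'Berry'), (1014, 'Anderson');
    INSERT INTO Class VALUES (10342, 'MIS', 3003), (10359, 'MIS', 4133),
                             (10450, 'MIS', 1123), (10643, 'MIS', 4053);
    INSERT INTO StudentClass VALUES (1011, 10342), (1011, 10643),
                                    (1014, 10342), (1014, 10359), (1014, 10450);
""")

# Rule 3: the number of classes per student is derived, so it is
# computed when needed instead of being stored as a field
class_counts = dict(conn.execute("""
    SELECT s.SLastName, COUNT(sc.ClassSynonym)
    FROM Student AS s
    LEFT JOIN StudentClass AS sc ON sc.StudentID = s.StudentID
    GROUP BY s.StudentID
""").fetchall())
```

The CREATE TABLE statements are the data definition language of Step 4; the count is never stored, so it cannot fall out of step with the enrollment records.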
5 Components of a DBMS
1. DBMS engine
2. Data definition subsystem
3. Data manipulation subsystem
   • Views
   • Report generators
   • QBE tools
   • SQL
4. Application generation subsystem
5. Data administration subsystem
View
View – allows you to see the contents of a database file, make changes, and query it to find information
Report Generator
Report generator – helps you quickly define formats of reports and what information you want to see in a report
Query-by-Example Tool
QBE tool – helps you graphically design the answer to a question
Structured Query Language
SQL – standardized fourth-generation query language found in most DBMSs
• Sentence-structure equivalent to QBE
• Mostly used by IT professionals
• Non-procedural language, which makes it different from other programming languages
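The non-procedural point is easiest to see side by side. In this sketch (Python's built-in sqlite3 module, loaded with the Student table from earlier in the chapter), the SQL statement says what is wanted, while the loop spells out how to find it. The question asked, which students have advisor 103, is chosen here only as an example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, SLastName TEXT,
                          SFirstName TEXT, AdvisorID INTEGER);
    INSERT INTO Student VALUES
        (1011, 'Berry', 'Jeff', 101), (1012, 'Smith', 'Tom', 103),
        (1013, 'Sanders', 'Tally', 101), (1014, 'Anderson', 'Cindy', 103),
        (1015, 'Whitman', 'Amy', 102), (1016, 'Jones', 'Kelsi', 105),
        (1017, 'Phillips', 'Susan', 104);
""")

# Non-procedural: state WHAT is wanted; the DBMS decides how to find it
sql_result = [row[0] for row in conn.execute(
    "SELECT SLastName FROM Student WHERE AdvisorID = ? ORDER BY SLastName",
    (103,),
)]

# Procedural equivalent: spell out HOW, step by step
procedural_result = []
for student_id, last, first, advisor_id in conn.execute("SELECT * FROM Student"):
    if advisor_id == 103:
        procedural_result.append(last)
procedural_result.sort()
```

Both approaches return the same names; the SQL version leaves the search strategy to the DBMS.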
DATA WAREHOUSES AND DATA MINING
• Data warehouses support OLAP and decision making
• Data warehouses do not support OLTP
• Data warehouse
• Data mart
• Data mining
Data Warehouse Considerations
• Do you really need one, or does your database environment support all your functions?
• Do all employees need a big data warehouse or a smaller data mart?
• How up-to-date must the information be?
• What data-mining tools do you need?
INFORMATION OWNERSHIP
• Information is a resource you must manage and organize to help the organization meet its goals and objectives
• You need to consider
  • Strategic management support
  • Sharing information with responsibility
  • Information cleanliness
Strategic Management Support
• Data administration – function that plans for, oversees the development of, and monitors the information resource
• Database administration – function responsible for the more technical and operational aspects of managing organizational information
Sharing Information
• Everyone can share – while not consuming – information
• But someone must “own” it by accepting responsibility for its quality and accuracy
Information Cleanliness
• Related to ownership and responsibility for quality and accuracy
• No duplicate information
• No redundant records with slightly different data, such as the spelling of a customer name
• GIGO (garbage in, garbage out) – if you have garbage information, you get garbage information for decision making
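Redundant records with slightly different spellings can often be flagged automatically for a person to review. The sketch below uses difflib from Python's standard library; the customer records and the 0.85 similarity cut-off are assumptions for illustration, and a real cleanup would tune the cut-off against actual data.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical customer records: (customer number, name)
customers = [
    (1, "Jon Smith"),
    (2, "John Smith"),
    (3, "Maria Garcia"),
    (4, "Acme Music"),
]

def similarity(a, b):
    """Ratio between 0 and 1; 1 means identical apart from letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

THRESHOLD = 0.85  # assumed cut-off; tune against real data

# Pairs of customer numbers whose names are suspiciously alike
suspects = [
    (id_a, id_b)
    for (id_a, name_a), (id_b, name_b) in combinations(customers, 2)
    if similarity(name_a, name_b) >= THRESHOLD
]
```

A flagged pair is only a candidate; the information's owner still decides whether the two records describe the same customer.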