Date post: | 12-Jan-2016 |
PCLec 02 / 1
Lecture 2
This lecture will introduce some more terminology
- about primary keys, foreign keys, candidate keys, access keys and some design concepts.
There will be a brief mention of table structures, constraints and values.
There are some examples of the outputs of queries.
And we will also look at SQL.
PCLec 02 / 2
Relational Database Concepts
The relational data model was developed by Dr. E. F. Codd (a mathematician) in the late 1960s and early 1970s.
The theory of normalisation of data is closely linked.
Databases based on the relational model should be easy to use and understand.
There should be no need for the user to be aware of the physical structure of the underlying files.
Most databases developed for commercial use are now based on the relational model.
PCLec 02 / 3
Data Models
Codd suggests that any data model has three components:
the data structures;
the integrity constraints;
the data manipulation operators.
PCLec 02 / 4
The Relational Data Model
Data Structures - domain, attribute, relation, row (tuple), primary key, degree, cardinality.
Integrity Constraints - entity integrity and referential integrity.
Data Manipulation Operations - defined through relational algebra and equivalent relational calculus.
PCLec 02 / 5
The Beginning of the Relational Model
In 1969, Dr. Edgar F. Codd published an original paper titled ‘Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks’.
In 1970, there was a revised version titled ‘A Relational Model of Data for Large Shared Data Banks’.
PCLec 02 / 6
Dr. Codd’s Relational Theory
He also published:
• ‘Relational Completeness of Data Base Sublanguages’
• ‘A Data Base Sublanguage Founded on the Relational Calculus’
• ‘Further Normalization of the Data Base Relational Model’
• ‘Interactive Support for Non-Programmers: The Relational and Network Approaches’
PCLec 02 / 7
Dr. Codd’s Relational Theory
• And ‘Extending the Database Relational Model to Capture More Meaning’
Dr. Codd also produced papers relating to:
• Multiprogramming
• Natural Language processing
The Relational Model serves as the basis for the theory of data. He introduced predicate logic as a foundation for database management, and he defined both a relational algebra and a relational calculus as a basis for working with data in relational form.
PCLec 02 / 8
Dr. Codd’s Relational Theory
• The Original Paper (‘Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks’ - 1969) contains references to these aspects:
1 A Relational View of Data
2 Some Linguistic Aspects
3 Operations on Relations
4 Expressible, Named and Stored Relations
5 Derivability, Redundancy and Consistency
6 Data Bank Control
PCLec 02 / 9
Dr. Codd’s Relational Theory
• The relational model described ‘provides a means of describing data in terms of its natural structure’, with no machine representation details.
• The model also provided a basis for constructing a high-level retrieval language with ‘maximal data independence’, which led to the development of SQL.
PCLec 02 / 10
Meet Dr. E.F.Codd
PCLec 02 / 11
Relational Data Structure
Relation: EMPLOYEE
Heading (attributes):
Empno  Name   Gender  Mgr Empno
Body (rows):
E1     Jones  Male    E65
E6     Smith  Male    E28
E28    Jones  Female  -
Domains:
Gender: Female, Male
Empno: E1 to E125
And what about ‘expired’ Empnos?
PCLec 02 / 12
Domains
Relation: Employee
Attributes:
Empno  Name   Mgrno
E1     Red    -
E2     Brown  E1
E3     Black  E1
Domains:
Empno: E1, E2, E3, E4
Person Name: Red, Brown, Black, Blue
PCLec 02 / 13
Value Sets and Domains
• Domains in a relational database can be extensive and complex.
• A ‘domain’ (a restriction of value or expression) can be applied to the result of a function or of a derived value.
For example, multiplying a person’s age by the person’s I.D. would not lead to a realistic value.
A domain constraint would ensure that this process, if initiated, would not proceed and would result in an error message being displayed.
PCLec 02 / 14
Value Sets and Domains
• The arithmetic addition of an I.D. and a date of birth would also not give a realistic value.
• Domains can be used to limit which attributes can be associated with other attributes. This leads to interesting and complex processes: Rules and Procedures (Ingres) and Triggers and Constraints (Oracle).
• Access has the option of delving into Visual Basic.
• Does anyone know what SQL Server has available?
PCLec 02 / 15
Relational Data Structures
• The only structure available is a 2-dimensional file of data.
• This is known as a relation or table.
• Each entity corresponds to a table and each attribute to a column (or field) in that table.
• Each entity occurrence corresponds to a row of the table.
PCLec 02 / 16
Properties of Relations
• Data is held in tables
• There is no order of data in the tables, either by row or by attribute
• Primary Key - Foreign Key relationship
• Data Typing including NULLS
• Query Access - insert, update, delete, retrieval
• Indexing on Candidate (and Primary) keys
PCLec 02 / 17
Some Concepts
A database system is a computerised record keeping system
A database is a collection of structured data files and associated indexes
A database user must be able to add, retrieve, insert, update and delete data and files
A set is any collection of definite distinguishable things. Olympians for instance are a ‘set’ of people.
The term ‘distinguishable’ means that on inspection of any 2 things which fit into a set, it must be possible to decide whether they are identical or different.
PCLec 02 / 18
Some Concepts
The term ‘definite’ means that if the set is known, and the thing is known, a decision can be made that
(a) the thing belongs to the set or
(b) the thing does not belong to the set
For the set to be known, it is sufficient if the members are known
PCLec 02 / 19
A Relation
Relations exist between 2 or more things
There is a relation between Lleyton Hewitt and tennis
There is a relation between Steve Waugh and cricket
There is a relation between Tiger Woods and golf
We could present this as:
Name            Sport
Lleyton Hewitt  Tennis
Steve Waugh     Cricket
Tiger Woods     Golf
and we have a relation of degree 2. We can also have relations of any required degree: 3, 4, 5, ...
PCLec 02 / 20
A Relation
This is a table of ‘ordered pairs’ and the relationship is directional: Lleyton Hewitt plays Tennis, but Tennis doesn’t play Lleyton Hewitt. This is a binary relation.
The order is horizontal (within a row), and is limited to that row.
The order of the rows in the table is immaterial to the data.
In this example (and in any table) the relationship is the set of all ordered pairs.
(Question: what happens to this data if, for instance, Lleyton Hewitt is unable to play tennis?)
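The idea of a relation as a set of ordered pairs can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for the example, and Python sets stand in for the mathematical sets the slides describe.

```python
# A binary relation modelled as a Python set of ordered pairs (tuples).
# The names and sports come from the slide; the variable names are illustrative.
plays = {
    ("Lleyton Hewitt", "Tennis"),
    ("Steve Waugh", "Cricket"),
    ("Tiger Woods", "Golf"),
}

# Row order is immaterial: the same pairs listed in another order
# give exactly the same relation.
same_relation = {
    ("Tiger Woods", "Golf"),
    ("Lleyton Hewitt", "Tennis"),
    ("Steve Waugh", "Cricket"),
}

# Order inside a pair does matter: (name, sport) is not (sport, name).
reversed_pair = ("Tennis", "Lleyton Hewitt")

# Adding a pair that is already present changes nothing,
# because a set holds no duplicates.
with_duplicate = plays | {("Steve Waugh", "Cricket")}

degree = len(next(iter(plays)))  # number of attributes in each tuple
cardinality = len(plays)         # number of rows

print(plays == same_relation)    # True
print(reversed_pair in plays)    # False
print(len(with_duplicate))       # 3
print(degree, cardinality)       # 2 3
```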
PCLec 02 / 21
Another Relation
We could have this:
Name       Activity    Residence  Date of Death
Smith, J   Doctor      Clayton    22-09-1998
Ellis, T   Blacksmith  Colac      12-10-1976
Werija, K  Lecturer    Caulfield  ???
Brack, S   Premier     Ballarat   ???
This is a relation, or table, of degree 4.
Notice that each row has only 1 entry in each ‘column’ or attribute. This is called an ‘atomic value’.
PCLec 02 / 22
Strictly Speaking
A ‘set’ in mathematics has no duplicates
A relation is a set, so a relation shouldn’t have duplicates either
A relational database consists of tables
A table is not strictly a relation: the only difference is that a table may have duplicate rows (not a good idea)
Duplicate rows should be avoided and the duplicates erased
All relational databases should consist of relations
Relations must have unique names
PCLec 02 / 23
A Table
A Table is a named set of rows: an ordered row of one or more column names, together with zero or more unordered rows of data values.
Tables store data about a specific entity - each row in a Table describes a single occurrence of that entity.
The SQL Standard defines 3 types of tables - Base tables, Views, and Derived tables
PCLec 02 / 24
More on Tables
Base tables are created and managed with the Create Table, Alter Table and Drop Table statements.
Views are created and managed with the Create View and Drop View statements
Derived tables are created when a query is executed.
Tables are dependent on a Schema or a Module.
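The three kinds of table can be seen side by side in a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the Oracle system used in this course; the view name infosys_emp is made up for the example, and the rows are borrowed from the EMP relation shown later in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Base table: created with CREATE TABLE and stored by the DBMS.
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, name TEXT, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp (empno, name, deptno) VALUES (?, ?, ?)",
    [(3, "JONES", 650), (7, "ADAMS", 432), (18, "PHAN", 432)],
)

# View: a named query created with CREATE VIEW; it stores no rows of its own.
conn.execute("CREATE VIEW infosys_emp AS SELECT empno, name FROM emp WHERE deptno = 432")

# Derived table: the unnamed result produced when a query is executed.
derived = conn.execute("SELECT name FROM infosys_emp ORDER BY empno").fetchall()
print(derived)  # [('ADAMS',), ('PHAN',)]

# Dropping the view removes only its definition; the base table is untouched.
conn.execute("DROP VIEW infosys_emp")
remaining = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(remaining)  # 3
```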
PCLec 02 / 25
More on Tables
Column:
A column is a named component of a table: a set of similar data values which describe the same attribute of an entity. A column’s values all belong to the same data type or to the same Domain, and may vary over time.
A Column value is the smallest unit of data which can be selected from, or updated in, a table.
Columns are dependent on some table, and are created, altered, and dropped with column definitions in the Create Table and Alter Table statements.
PCLec 02 / 26
A Primary Key
• McFadden, Hoffer and Prescott define a Primary Key as:
An attribute (or combination of attributes) which uniquely identifies each row in a relation (table).
• Richard T. Watson has this to say:
The primary key definition block specifies a set of column values comprising the primary key. Once a Primary Key is defined, the system enforces its uniqueness by checking that the Primary Key of any new row does not already exist in the table.
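A minimal sketch of that uniqueness check, assuming SQLite via Python's sqlite3 module rather than the DBMS used in class. The second insert reuses an existing primary key value and is expected to be refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO emp VALUES ('E1', 'Jones')")

# A second row with the same primary key value should be rejected by the system.
try:
    conn.execute("INSERT INTO emp VALUES ('E1', 'Smith')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("SELECT empno, name FROM emp").fetchall()
print(duplicate_rejected)  # True
print(rows)                # [('E1', 'Jones')]
```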
PCLec 02 / 27
A Primary Key - What’s That?
• A key is a unique identifier
‘A key is said to be nonredundant if every attribute it contains is necessary for the purpose of unique identification - if any attribute of the key were removed, the remaining attributes would not be a unique identifier’
PCLec 02 / 28
And a Foreign Key?
• McFadden, Hoffer and Prescott’s definition:
An attribute (or attributes) in a relation (table) of a database which serves as the Primary Key of another relation (table) in the same database.
• Richard T. Watson says:
An attribute (or attributes) that is a Primary Key in the same table, or another table. It is the method of recording relations in a relational database.
And, both the Primary and Foreign Key(s) should be drawn from the same Domain.
PCLec 02 / 29
Other Keys
• Candidate Key - a key (an attribute, or attributes) which could be chosen as the Primary Key
• Access Key - an attribute, or attributes, other than the Primary (or Foreign) key, on which data will be retrieved from a table, e.g. postcode as in your second tutorial example
PCLec 02 / 30
SQL - An Introduction
• With SQL, the user does not ‘open’ nor ‘close’ tables
• A user normally has a subset of tables to which access is allowed, and privileges are granted to allow the user to perform some specific functions
• A query (an access to data in a table or tables) returns the whole result set all at once. All of the required rows are updated, inserted or deleted - or none of the rows are.
• The whole set involved in the ‘transaction’ works, OR the whole ‘transaction’ fails
PCLec 02 / 31
A Transaction
A transaction is a sequence of SQL statements which Oracle treats as a single unit
The set of changes is made permanent with the Commit statement
Part or all of a transaction can be undone with the Rollback statement
A transaction starts with the execution of the first SQL statement in the transaction and ends with either the Commit or Rollback statement
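Commit and Rollback can be tried out with a short sketch. It assumes SQLite through Python's sqlite3 module in its default transaction mode (a transaction is opened implicitly before the first UPDATE), not Oracle; the table and salary figures are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO emp VALUES (1, 'JONES', 40000)")
conn.execute("INSERT INTO emp VALUES (2, 'ADAMS', 50000)")
conn.commit()  # make the starting rows permanent


def salaries():
    """Return the current salaries in empno order."""
    return conn.execute("SELECT salary FROM emp ORDER BY empno").fetchall()


# Transaction 1: two updates treated as one unit, then undone.
conn.execute("UPDATE emp SET salary = salary + 1000 WHERE empno = 1")
conn.execute("UPDATE emp SET salary = salary + 1000 WHERE empno = 2")
conn.rollback()
after_rollback = salaries()

# Transaction 2: the same two updates, this time made permanent.
conn.execute("UPDATE emp SET salary = salary + 1000 WHERE empno = 1")
conn.execute("UPDATE emp SET salary = salary + 1000 WHERE empno = 2")
conn.commit()
after_commit = salaries()

print(after_rollback)  # [(40000,), (50000,)]
print(after_commit)    # [(41000,), (51000,)]
```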
PCLec 02 / 32
SQL - An Introduction
Transaction Control
• A transaction in SQL is either completely finished OR it is not done at all
• No partial results can be produced
• Work done can be committed (it becomes a permanent part of the database) or it can be rolled back (the database is restored to the state prior to the transaction commencing)
• SQL programmers need to be aware of the need for concurrency control - that is the sharing of the database contents among transactions (more about this later)
PCLec 02 / 33
A Transaction
Oracle guarantees statement-level read consistency (the data stays the same while Oracle is gathering and returning it)
If a transaction has multiple queries, then each query is consistent in itself, but the queries are not necessarily consistent with each other
Transaction-level read consistency can be achieved with the Set Transaction Read Only statement (queries only)
PCLec 02 / 34
SQL - An Introduction
SQL has some very specific rules
One is that every table has a structure
Another rule is that insertion, updating and deletion of rows in a table can only occur if the affected rows have the same structure as the rest of the rows in the table
This reinforces the rule that a table is a set of rows of one particular type
PCLec 02 / 35
SQL - An Introduction
A table has no ordering: data is not held in ‘ascending’, ‘descending’ or ‘date’ order
Columns are referenced by name only, not by their relative position in a table
The columns of a table can be re-arranged, and the SQL statements referencing that table are not affected
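The point about column names versus column positions can be checked with a small sketch, again assuming SQLite via Python's sqlite3 module. The two tables emp_a and emp_b are hypothetical: they hold the same row but declare their columns in a different order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same columns, declared in a different order in each table.
conn.execute("CREATE TABLE emp_a (empno INTEGER, name TEXT, deptno INTEGER)")
conn.execute("CREATE TABLE emp_b (deptno INTEGER, name TEXT, empno INTEGER)")

# Because the columns are named in the statement, the same INSERT works for both.
conn.execute("INSERT INTO emp_a (empno, name, deptno) VALUES (7, 'ADAMS', 432)")
conn.execute("INSERT INTO emp_b (empno, name, deptno) VALUES (7, 'ADAMS', 432)")

# The same query text, referencing columns by name, gives the same answer.
query = "SELECT name, deptno FROM {} WHERE empno = 7"
result_a = conn.execute(query.format("emp_a")).fetchall()
result_b = conn.execute(query.format("emp_b")).fetchall()
print(result_a == result_b)  # True

# A table has no inherent order; if an order is wanted, the query must ask for it.
conn.execute("INSERT INTO emp_a (empno, name, deptno) VALUES (3, 'JONES', 650)")
ordered = conn.execute("SELECT empno FROM emp_a ORDER BY empno").fetchall()
print(ordered)  # [(3,), (7,)]
```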
PCLec 02 / 36
Properties of Relations
Integrity Constraints included in the DBMS
– Attribute value ranges
– Referential Integrity
– Entity Integrity - No part of any Primary key may be null
Set retention constraints (how long to retain a set of data)
Domain constraints
User Defined Rules
Recovery Procedures (after failure)
PCLec 02 / 37
Properties of Relations
No explicit linkage between tables - set up at run time
Linking or Embedding database operations in a procedural language
The Database may be distributed across similar or different DBMS’s
PCLec 02 / 38
A Relational Database
Relation EMP
EMPNUM  NAME    Date of Birth  DEPTNUM
3       JONES   27/11/1967     650
7       ADAMS   14/10/1978     432
11      NGUYEN  9/05/1977      314
18      PHAN    30/06/1969     432
Relation Schema EMP(empnum, name, age, deptnum)
Relation DEPT
DEPTNUM  DEPTNAME
650      PRODUCTION
432      INFOSYS
314      FINANCE
Relation Schema DEPT(deptnum, deptname)
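Since there is no explicit linkage stored between EMP and DEPT, the two relations are matched on deptnum when a query runs. A minimal sketch of that, assuming SQLite via Python's sqlite3 module and leaving out the date of birth column to keep the example short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptnum INTEGER PRIMARY KEY, deptname TEXT)")
conn.execute("CREATE TABLE emp (empnum INTEGER PRIMARY KEY, name TEXT, deptnum INTEGER)")
conn.executemany(
    "INSERT INTO dept VALUES (?, ?)",
    [(650, "PRODUCTION"), (432, "INFOSYS"), (314, "FINANCE")],
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(3, "JONES", 650), (7, "ADAMS", 432), (11, "NGUYEN", 314), (18, "PHAN", 432)],
)

# The link between the two tables is expressed in the query itself,
# by matching emp.deptnum against dept.deptnum.
rows = conn.execute(
    """
    SELECT emp.name, dept.deptname
    FROM emp
    JOIN dept ON emp.deptnum = dept.deptnum
    ORDER BY emp.empnum
    """
).fetchall()
print(rows)
# [('JONES', 'PRODUCTION'), ('ADAMS', 'INFOSYS'), ('NGUYEN', 'FINANCE'), ('PHAN', 'INFOSYS')]
```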
PCLec 02 / 40
More Terminology
The degree of a relation is the number of attributes in that relation.
Degree  Name
1       unary
2       binary
3       ternary
n       n-ary
The cardinality is the number of rows in the relation (table).
PCLec 02 / 41
Primary Keys
Empnum: E110, E261, E311
Surname: Parkes, Kimball, Hurwitz
Given Name: John, John, Fred
Tax File No: 100-100-232, 101-111-222
A candidate key of a relation is a set of attributes that satisfies two time-independent properties:
Uniqueness - No two rows of the relation have the same values for the set of attributes forming the candidate key.
Minimality - No attributes can be discarded from the candidate key without destroying the uniqueness property.
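The two properties can be tested mechanically against a sample of rows. The sketch below is an illustration only: the function names are made up, the rows pair the slide's column values in the order listed (an assumption), and a check on sample data can only disprove a candidate key, because the real properties must hold for all time and not just for today's rows.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows share the same values for the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """Uniqueness plus minimality, checked against the rows supplied."""
    if not is_unique(rows, attrs):
        return False
    # Minimality: no proper subset of the attributes may already be unique.
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False
    return True


employees = [
    {"empnum": "E110", "surname": "Parkes", "given_name": "John"},
    {"empnum": "E261", "surname": "Kimball", "given_name": "John"},
    {"empnum": "E311", "surname": "Hurwitz", "given_name": "Fred"},
]

print(is_candidate_key(employees, ("empnum",)))            # True
print(is_candidate_key(employees, ("given_name",)))        # False: 'John' repeats
print(is_candidate_key(employees, ("empnum", "surname")))  # False: not minimal
```

Surname also happens to be unique in these three rows, which shows why the sample alone is not enough: nothing stops a second Parkes from being hired later.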
PCLec 02 / 42
Entity Integrity
Given Name: John, Fred
Surname: Parkes, Kimball, Hurwitz, Ashton
Salary: 40,000; 50,000; 60,000; 70,000
• No component of the Primary Key of a base relation is allowed to accept nulls.
What is the Primary Key?
PCLec 02 / 43
Foreign Key
• A foreign key is an attribute or attribute combination of one relation R2 whose values are required to match those of the primary key of relation R1 where R1 and R2 are not necessarily distinct. The foreign key and the corresponding primary key should be defined on the same domain(s).
Employee
Empnum  Surname  Worksfordept (foreign key)
E110    Parkes   d1
E261    Kimball  d3
E311    Hurwitz  d2
Dept
Dept  Dname
d1    Pay
d2    Tax
d3    Art
PCLec 02 / 44
Referential Integrity
If base relation R2 includes a foreign key FK matching the primary key PK of some base relation R1 then every value of FK in R2 must either
(a) be equal to the value of PK in some row of R1, or
(b) be wholly null.
Note that PK and FK may comprise more than one attribute and that R1 and R2 are not necessarily distinct.
(Stated more simply: a foreign key should match a valid primary key value, or the foreign key should be null.)
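The rule can be watched in action with a small sketch, assuming SQLite via Python's sqlite3 module (where foreign key checking must be switched on per connection). The Employee and Dept tables follow the Foreign Key slide; employee E999 and department d9 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on
conn.execute("CREATE TABLE dept (dept TEXT PRIMARY KEY, dname TEXT)")
conn.execute(
    """
    CREATE TABLE employee (
        empnum TEXT PRIMARY KEY,
        surname TEXT,
        worksfordept TEXT REFERENCES dept (dept)
    )
    """
)
conn.executemany(
    "INSERT INTO dept VALUES (?, ?)",
    [("d1", "Pay"), ("d2", "Tax"), ("d3", "Art")],
)

# Case (a): the foreign key equals a primary key value in dept.
conn.execute("INSERT INTO employee VALUES ('E110', 'Parkes', 'd1')")

# Case (b): the foreign key is wholly null (hypothetical employee with no department yet).
conn.execute("INSERT INTO employee VALUES ('E999', 'Example', NULL)")

# Neither case: 'd9' is not a department, so the row should be refused.
try:
    conn.execute("INSERT INTO employee VALUES ('E261', 'Kimball', 'd9')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(dangling_rejected, count)  # True 2
```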
PCLec 02 / 45
Recording Design Decisions
Formal design decisions can be recorded in the same graphical notation as an E-R diagram.
This is called a data structure diagram and is developed from normalised relations using a few simple steps.
PCLec 02 / 46
Recording Design Decisions
a) Treat each relation as an entity, represent it as a rectangle and enter its name.
b) Primary and Foreign keys are used to establish the relationships (Note: a foreign key can be part of a composite primary key).
If the primary key in one relation exists as the foreign key in another relation, then draw a line linking these two entities.
PCLec 02 / 47
Some E-R Examples
DEPARTMENT - EMPLOYEE
DEPARTMENT(DeptNo, Dname)
EMPLOYEE(Empnum, Ename, Salary, DeptNo)
STUDENT - RESULT - UNIT
STUDENT(StudentNo, Name)
UNIT(Unitcode, Title)
RESULT(StudentNo, Unitcode, Result)
PCLec 02 / 48
Open to Interpretation
Entities: Student, Course, Unit, Text
There are a number of ‘rules’ in this model, which determine the relationships.
They are known as Business Rules.
PCLec 02 / 49
The Rules?
• A student must be enrolled in 1 Course
• A Course may contain zero, or many students
• A student may be enrolled in many units, but at least 1
• A unit may attract many students (or no students?)
• Each Unit has one prescribed text
• Each text is associated with one unit
PCLec 02 / 50
Open to Interpretation
Entities: Customer, Invoice, Line, Product
Each Customer may generate one or more Invoices
Each Invoice is generated by one Customer
Each Invoice contains one or more lines
Each line is contained in an Invoice
Each line references one product
Each Product may be referenced in one or more lines
PCLec 02 / 51
Modelling to Processing
So,
how do we convert the conceptual design details into software which allows for the entry of data into the appropriate tables,
and for further processing to allow for the use of this data to respond to queries ?
PCLec 02 / 52
Something Different - or, how do we make this happen ?
An Introduction to SQL
PCLec 02 / 53
Some Comments Regarding SQL
In the next few overheads, there will be some terms and explanations which should help you to make the transition from the methods of data storage and file processing to that of the relational database style of storage and processing of data.
PCLec 02 / 54
An Introduction to SQL
Firstly some plusses for SQL.
1. SQL is the one industry standard for querying databases
2. Other ‘tools’ such as front ends don’t allow the developer to use all of the features of a database
3. Tools provided invariably do not exploit the full functionality of the underlying language
4. An SQL query in a client-server environment can be run in any application language and the result will always be the same
PCLec 02 / 55
Some SQL Basics
SQL acts as a bridge between
– the user
– the database management system (DBMS)
– the data tables
– the transactions which involve the previous 3 items
SQL also allows the ‘system’ to be administered and managed by a database administrator using the same format: procedural commands and data in tables. (.net ?)
SQL can be embedded into source code from C to Pascal
PCLec 02 / 56
Procedural and Non-Procedural Languages
SQL requires a different approach from that used in other programming languages
C, Fortran, Basic, Cobol, Pascal, PL/1 are procedural languages. They are characterised by statements which tell the executing computer what to do, and in a structured step-by-step way (even when loops are used).
SQL is a declarative language - the computer is told what the user wants to achieve, and the computer ‘decides’ how to achieve this requirement correctly.
The user sees the results.
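The contrast can be made concrete by answering one question both ways: which employees work in department 432? The sketch below assumes Python for the procedural version and SQLite (via Python's sqlite3 module) for the declarative one; the employee rows are borrowed from the EMP relation earlier in the lecture.

```python
import sqlite3

employees = [("JONES", 650), ("ADAMS", 432), ("NGUYEN", 314), ("PHAN", 432)]

# Procedural: spell out every step (loop, test, collect, sort).
procedural = []
for name, deptnum in employees:
    if deptnum == 432:
        procedural.append(name)
procedural.sort()

# Declarative: state what is wanted and let the DBMS decide how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, deptnum INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", employees)
declarative = [
    row[0]
    for row in conn.execute("SELECT name FROM emp WHERE deptnum = 432 ORDER BY name")
]

print(procedural)   # ['ADAMS', 'PHAN']
print(declarative)  # ['ADAMS', 'PHAN']
```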
PCLec 02 / 57
SQL Sets
SQL is a set-oriented language.
Many programmers are used to file-oriented languages.
A set is an unordered collection of items, all of which have the same type and structure
These sets become tables in SQL, and are made up of vertical attributes (or columns) and horizontal rows
PCLec 02 / 58
SQL - Data Manipulation
Data Retrieval (DML)
SELECT   retrieve data from table
Data Modification (DML)
INSERT   add a single row or copy rows from other table(s)
UPDATE   amend column values
DELETE   delete rows of data
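The four statements can be run one after another in a short sketch. It assumes SQLite via Python's sqlite3 module instead of Oracle, and the table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")

# INSERT: add single rows.
conn.execute("INSERT INTO emp (empno, name, salary) VALUES (1, 'Parkes', 40000)")
conn.execute("INSERT INTO emp (empno, name, salary) VALUES (2, 'Kimball', 50000)")

# UPDATE: amend a column value in the rows that match the condition.
conn.execute("UPDATE emp SET salary = 55000 WHERE empno = 2")

# DELETE: remove the rows that match the condition.
conn.execute("DELETE FROM emp WHERE empno = 1")

# SELECT: retrieve what is left.
rows = conn.execute("SELECT empno, name, salary FROM emp").fetchall()
print(rows)  # [(2, 'Kimball', 55000)]
```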
PCLec 02 / 59
Data Definition - DDL (Oracle)
Creating Tables
create table emp
  (empno  number(6,0),
   name   varchar2(20),
   salary number(6,0),
   age    number(3,0),
   deptno number(5,0));
A table is defined. Space is reserved.
The system catalogue (also known as the Data Dictionary) is updated.
Table and Column Names begin with alpha (A-Z) and are less than or equal to 12 characters
Table names contain (A-Z, 0-9)
Column names contain (A-Z, 0-9,)
PCLec 02 / 60
Data Definition (DDL)
Did you notice the entries such as
– Number(5,0)
– Varchar2(20)
– Number(6,0)
in the previous overhead?
These are ‘data types’ and further assist integrity by defining the actual data values which can exist for each attribute
The size (or number of bytes) of each attribute is also expressed (either explicitly or implicitly)
PCLec 02 / 61
Overview of SQL
Data Definition (DDL)
Create Table   define table and constraints
Create View    define user view of data
Alter Table    add new columns (Oracle)
Drop Table     delete table
Drop View      delete user view
PCLec 02 / 62
Overview of SQL
Data Control
Commit     commit changes to the database
Rollback   rollback previous changes
Data Security
Grant      grant access privileges to users
Revoke     revoke access privileges
PCLec 02 / 63
IBM Relational Products
DB2/nn   MVS/370 MVS/XA
SQL/DS   VM/CMS DOS/VSE
QMF      front-end to DB2 and SQL/DS
CSP      application development tool
Numerous other RDBMS
ORACLE 8, 8i, 9i
OPENINGRES from ASK Corp. (OSL, ABF)
AIM/RDB from Fujitsu
INFORMIX - now in DB2
VAXSQL/Rdb from DEC
NonStop SQL from Tandem
Relational DBMS Products
Microcomputer versions
SQL Server (as in MS 2000)
Quadbase-SQL
ORACLE
INGRES
dBASE V / Visual dBASE
microSQL
practically all micro DBMS
Other Oracle Products: Designer2000, Developer2000, Programmer2000, Discoverer2000
PCLec 02 / 64
3 people agree to buy an item for $30 and hand over $10 each.
• The salesperson discounts the item by $5, and refunds each person $1. Each person has therefore paid $9. ($5/3 does not give an even amount)
• He keeps the remaining $2 as a token of good will.
• Mathematically, 3 x $9 = $27 plus the additional $2 = $29
• The question is, where is the other $1???
Can you explain this?
PCLec 02 / 65
And, what are your views on this?
These are quotations :
Eye Drops Off Shelf
Wild Cow Injures Farmer with Axe
and Cold Wave Linked to Temperatures
PCLec 02 / 66
Relax - until the next session