Dbms Theory

Date post: 20-Nov-2014
Upload: bsr987

THEORY

DATA: The term data refers to known facts that can be recorded and stored on computer media.

Data consists of facts, text, graphics, images, sound, and video segments that have meaning in the user environment.

Kind of data   Example
Facts          Cust name, Address, Phone no
Text           Documents
Numeric        Digits
Images         Photos
Sound          Audio files
Video          Video files
Graphics       Excel graphs

DATABASE: A database is an organized collection of logically related data. Organized means the data are structured so that they can be easily stored, manipulated, and retrieved by users. Related means that the data describe a domain of interest to a group of users, who can use the data to answer questions concerning that domain.

Sr. no   Database       Data
1        Time table     Day, Hour, Subject
2        Sales person   Cust name, Address, Phone no

INFORMATION: Data that have been processed in such a way as to increase the knowledge of the person who uses the data.

EX:
Day   Hour   Subject
Mon   4th    SQL programming
Tue   1st    SQL programming
Thu   3rd    MDBMS
Fri   2nd    MDBMS

METADATA: Data that describes the properties or characteristics of data

Data item Value

Name Type Length Min Max Description

Course Alphanumeric 30 Course id and name

Hour Integer 1 6 Hours in a day

Subject Alphanumeric 15 Book name

Here the data type, length, and minimum and maximum values are considered metadata. Metadata describe the properties of data but do not include the data themselves.
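The distinction between metadata and data can be seen in a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical timetable table; the table name, column names, and declared types are illustrative and not part of the original notes. The metadata query returns only column names and declared types, while the data query returns the stored facts.

```python
import sqlite3

# Hypothetical timetable table; names and types are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timetable ("
    " day TEXT,"
    " hour INTEGER,"
    " subject VARCHAR(15))"
)
conn.execute("INSERT INTO timetable VALUES ('Mon', 4, 'SQL programming')")

# Metadata: PRAGMA table_info yields one row per column
# (cid, name, type, notnull, dflt_value, pk); keep name and declared type.
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(timetable)")]

# Data: the stored facts themselves.
data = conn.execute("SELECT day, hour, subject FROM timetable").fetchall()

print(metadata)
print(data)
```

Note that the metadata list stays the same no matter how many rows are inserted, which is the sense in which metadata describe data without including them.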

DISADVANTAGES OF FILE PROCESSING SYSTEMS

1. PROGRAM DATA DEPENDENCE: File descriptions are stored within each application program that accesses a given file. Any change to a file structure therefore requires changes to the file description in every program that accesses the file.

For ex, suppose it is decided to change the customer address field length in the records from 30 to 40 characters. The file description in each affected program would have to be modified. It is often difficult even to locate all the programs affected by such changes.

2. DUPLICATION OF DATA: Because applications are often developed independently in file processing systems, unplanned duplicate data files are the rule.

For ex, the order filling system contains an inventory master file while the invoicing system contains an inventory pricing file. Both files undoubtedly contain product descriptions, unit prices, and quantities on hand. This duplication is wasteful: it requires additional storage space and increased effort to keep all the files up to date.

3. LIMITED DATA SHARING: In the traditional file processing system each application has its own private files, and users have little opportunity to share data outside their own application.

For ex, users in the accounting dept have access to the invoicing system and its files, but they do not have access to the order filling system or the payroll system. It is very difficult to produce a requested report from several incompatible files in separate systems.

4. LENGTHY DEVELOPMENT TIMES: In the traditional file processing system there is little opportunity to reuse previous development efforts. Each new application requires new file formats and descriptions, so writing the file access logic for each new program requires lengthy development times.

5. EXCESSIVE PROGRAM MAINTENANCE: The preceding factors create a heavy program maintenance load. As much as 80 percent of the total information system development budget may be devoted to program maintenance in such organizations.

A database alone does not prevent these problems. For ex, if an organization develops many separately managed databases with little or no coordination of the metadata, then all the above said problems can occur.

THE RANGE OF DATABASE APPLICATIONS: The range of database applications can be divided into five categories.

1. PERSONAL DATABASES: Personal databases are designed to support one user. Ex: 1. Personal computers 2. Laptops. Personal digital assistants (PDAs) have incorporated personal databases into handheld devices. These function not only as computing devices but also as cellular phones, fax senders, and web browsers. A simple database application that stores customer information can be used from a PC or PDA, and the data can be easily transferred from one device to the other for backup and work purposes.

Figure 1.7 from pg no 15.

Disadvantages: The data cannot be easily shared with other users, so personal databases are suitable only for very small organizations.

2. WORKGROUP DATABASES: A workgroup is a relatively small team of people who collaborate on the same project or application, or on a group of similar projects or applications. A workgroup typically comprises fewer than 25 persons. A workgroup database allows the members of the group to share data and development work easily.

All the workgroup members are linked by a local area network (LAN). The database is stored on a central device called the database server, which is also connected to the network. Different types of group members (developer or project manager) may have different user views of the shared database.

Figure 1.8 page no 16

3. DEPARTMENT DATABASES: A department is a functional unit within an organization. A department is generally larger than a workgroup, typically between 25 and 100 persons. Department databases are designed to support the various functions and activities of a department.

4. ENTERPRISE DATABASES: An enterprise database is one whose scope is the entire organization or enterprise. Such databases are intended to support organization-wide operations and decision making. An organization may have several enterprise databases, because a single operational enterprise database is impractical for many medium to large organizations. An enterprise database needs information from many supporting departments.

Two major developments have shaped enterprise databases: 1. Enterprise resource planning (ERP) systems 2. Data warehousing implementations

ENTERPRISE RESOURCE PLANNING SYSTEMS: A business management system that integrates all functions of the enterprise, such as manufacturing, sales, finance, marketing, inventory, accounting, and human resources. ERP systems are software applications that provide the data necessary for the enterprise to examine and manage its activities.

All ERP systems are heavily dependent on databases to store the data required by the ERP applications.

DATA WAREHOUSE: An integrated decision support database whose content is derived from the various operational databases, including personal, workgroup, and department databases. Data warehouses allow users to work with historical data.

Fig 1.9 pg. no 19

INTERNET, INTRANET, EXTRANET DATABASES:

INTERNET: A global network of public computers that connects users of multiple platforms. Telephone wire, cable, and satellite links connect millions of computers around the world to each other.

WEB BROWSER: The interface through which users of multiple platforms can easily connect to this worldwide network.

INTERNET DATABASE: A database that users reach through a web browser is called an Internet database.

INTRANET: Use of Internet protocols to establish access to company data and information that is limited to the organization.

EXTRANET: Use of Internet protocols to establish limited access to company data and information by the company’s customers and suppliers.

ADVANTAGES OF DATABASE APPROACH:

PROGRAM DATA INDEPENDENCE:

The separation of data descriptions (metadata) from the application programs that use the data is called data independence. With the database approach, data descriptions are stored in a central location called the repository. This property of database systems allows an organization’s data to change without changing the application programs that process the data.

MINIMAL DATA REDUNDANCY: The database approach does not eliminate redundancy entirely, but it allows the designer to carefully control the type and amount of redundancy. For ex, each order in the order table contains a customer-id to establish the relationship between orders and customers.
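The customer-id example can be sketched as follows. This is a minimal illustration that assumes Python's built-in sqlite3 module; the table layouts, customer name, and addresses are hypothetical. The only repeated value is the customer_id key, so the customer address is stored once and a single update is seen by every order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT
);
-- "orders" is used because ORDER is a reserved word in SQL.
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'Customer A', '12 Main Street');
INSERT INTO orders VALUES (1001, 1);
INSERT INTO orders VALUES (1002, 1);
""")

join_sql = (
    "SELECT o.order_id, c.name, c.address "
    "FROM orders o JOIN customer c ON c.customer_id = o.customer_id "
    "ORDER BY o.order_id"
)
rows = conn.execute(join_sql).fetchall()

# The address lives in one place, so one UPDATE changes it for every order.
conn.execute("UPDATE customer SET address = '99 Oak Avenue' WHERE customer_id = 1")
addresses = [row[2] for row in conn.execute(join_sql)]

print(rows)
print(addresses)
```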

IMPROVED DATA CONSISTENCY: By eliminating or controlling data redundancy we greatly reduce the opportunities for inconsistency. If a customer’s data is stored only once, there can be no disagreement about the stored values. Updating data values is greatly simplified when each value is stored in only one place. We also avoid the wasted storage space that results from redundant storage.

USER VIEW: A logical description of some portion of the database that is required by a user to perform some task.

IMPROVED DATA SHARING: A database is a shared corporate resource. Authorized internal and external users are granted permission to use the database. Each user is provided one or more user views to facilitate this use. A user view is often a form or report that comprises data from more than one table.

INCREASED PRODUCTIVITY OF APPLICATION DEVELOPMENT:

A major advantage of the database approach is that it greatly reduces the cost and time for developing new business applications.

Assuming that the database and the related data capture and maintenance applications have already been designed and implemented, the programmer can concentrate on the specific functions required for the new application.

ENFORCEMENT OF STANDARDS: When the database approach is implemented with full management support, the database administration function should be granted single-point authority and responsibility for establishing and enforcing standards. These standards include naming conventions, data quality standards, and uniform procedures for accessing, updating, and protecting data.

IMPROVED DATA QUALITY: Database designers can specify integrity constraints that are enforced by the DBMS. A constraint is a rule that cannot be violated by database users.

IMPROVED DATA ACCESSIBILITY AND RESPONSIVENESS:

In a relational database, end users without programming experience can often retrieve and display data.

For ex, SELECT * FROM product WHERE product_name = 'computer desk' is an SQL command that displays the information about computer desks. (SQL string literals take single quotes.)
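To try the query, the sketch below runs it against an in-memory SQLite database through Python's built-in sqlite3 module. The columns of the product table and the sample rows and prices are assumptions made for illustration; the notes give only the table name and the product_name column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout for the product table; only product_name comes from the notes.
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER PRIMARY KEY,"
    " product_name TEXT,"
    " unit_price REAL)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "computer desk", 375.0), (2, "end table", 175.0)],
)

# The query from the text, with the string literal in single quotes.
rows = conn.execute(
    "SELECT * FROM product WHERE product_name = 'computer desk'"
).fetchall()
print(rows)
```

When the product name comes from user input, a parameter placeholder (`WHERE product_name = ?`) is the safer form of the same query.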

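REDUCED PROGRAM MAINTENANCE: Stored data must be changed for a variety of reasons: new data item types are added, data formats are changed, and so on. For ex, to rectify the year 2000 problem, the common two-digit year fields were extended to four digits. Because the data descriptions are kept separate from the programs, many such changes can be made without rewriting every program that uses the data.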

COSTS AND RISKS OF THE DATABASE APPROACH

NEW SPECIALIZED PERSONNEL:

An organization that adopts the database approach needs to hire and train individuals to design and implement databases, provide database administration services, and manage a staff of new people. Because of the rapid changes in technology, these new people will have to be retrained or upgraded on a regular basis.

INSTALLATION AND MANAGEMENT COST AND COMPLEXITY

A multiuser database management system is a large and complex suite of software that has high initial costs and requires a staff of trained personnel to install and operate. Installing such a system may also require upgrades to the hardware and data communication systems in the organization. Substantial training is normally required on an ongoing basis to keep up with new releases and upgrades.

CONVERSION COSTS: The term legacy system is widely used to refer to older applications in an organization that are based on file processing systems. The cost of converting these older systems to modern database technology is measured in terms of dollars, time, and organizational commitment.

NEED FOR EXPLICIT BACKUP AND RECOVERY: A shared corporate database must be accurate and available at all times. This requires that comprehensive procedures be developed and used for providing backup copies of data and for restoring a database when damage occurs. A modern database management system normally automates many of the backup and recovery tasks.

ORGANIZATIONAL CONFLICT: A shared database requires a consensus on data definitions and ownership, as well as on responsibilities for accurate data maintenance. Experience has shown that conflicts over data definitions, data formats and coding, and rights to update shared data are common. Handling these issues requires organizational commitment and organizationally astute database administrators.

COMPONENTS OF THE DATABASE ENVIRONMENT

1. COMPUTER AIDED SOFTWARE ENGINEERING (CASE) TOOLS: Automated tools used to design databases and application programs.

2. REPOSITORY: A centralized knowledge base of all data definitions, data relationships, screen and report formats, and other system components. A repository contains an extended set of metadata important for managing databases.

3. DATABASE MANAGEMENT SYSTEM (DBMS): A commercial software application that is used to create, maintain, and provide controlled access to user databases.

4. DATABASE: An organized collection of logically related data. The repository contains definitions of data, whereas the database contains occurrences of the data.

5. APPLICATION PROGRAMS: Computer programs that are used to create and maintain the database and provide information to users.

6. USER INTERFACE: Languages, menus, and other facilities by which users interact with various system components, such as CASE tools, application programs, the DBMS, and the repository.

7. DATA ADMINISTRATORS: Persons who are responsible for the overall information resources of an organization. Data administrators use CASE tools for system requirements analysis and program design.

8. SYSTEM DEVELOPERS: Persons such as systems analysts and programmers who design new application programs. System developers often use CASE tools for system requirements analysis and program design.

9. END USERS: Persons throughout the organization who add, delete, and modify data in the database and who request or receive information from it. All user interactions with the database must be routed through the DBMS.

THE DATABASE DEVELOPMENT PROCESS

DATABASE DEVELOPMENT WITHIN INFORMATION SYSTEMS DEVELOPMENT

In many organizations database development begins with enterprise data modeling.

ENTERPRISE DATA MODELING: The first step in database development, in which the scope and general contents of the organizational databases are specified

Fig 2.1 page no 37

INFORMATION SYSTEMS ARCHITECTURE (ISA): A conceptual blueprint or plan that expresses the desired future structure for the information systems in an organization. It consists of six key components:

1. DATA: Represented in fig 2.1.
2. PROCESSES: Which manipulate data. These can be represented by data flow diagrams.
3. NETWORK: Which transports data around the organization and between the organization and its key business partners.
4. PEOPLE: Who perform processes and are the source and receiver of data and information.
5. EVENTS AND POINTS IN TIME: When processes are performed. These can be shown by state transition diagrams.
6. REASONS: For events and rules that govern the processing of data. Some diagrammatic tools exist for rules, such as decision tables.

INFORMATION ENGINEERING: A data-oriented methodology to create and maintain information systems. Because of its data orientation, information engineering helps explain how databases are identified and defined. Information engineering follows top-down planning.

Information engineering includes four steps: 1. Planning 2. Analysis 3. Design 4. Implementation

TOP DOWN PLANNING: A generic information systems planning methodology that attempts to gain a broad understanding of the information system needs of the entire organization

INFORMATION SYSTEMS PLANNING: The goal of information systems planning is to align information technology with the business strategies of the organization. This planning phase includes three sections.

1. IDENTIFYING STRATEGIC PLANNING FACTORS

Planning factors           Examples
Organizational goals       Maintain 10% per year growth rate
                           Maintain 15% before tax return on investment
                           Avoid employee layoffs
                           Be a responsible corporate citizen
Critical success factors   High quality products
                           On time deliveries of finished products
                           High productivity of employees
Problem areas              Inaccurate sales forecast
                           Increasing competition
                           Stockouts of finished products

For ex, the problem area of inaccurate sales forecasts might cause information system managers to place additional historical sales data, new market research data, or data concerning results from test trials of new products in organizational databases.

IDENTIFYING CORPORATE PLANNING OBJECTS: The corporate planning objects define the business scope.

1. Organizational units: The various departments of the organization. Ex: sales, orders, accounting, manufacturing

2. Organizational locations: The places where business operations occur. Ex: corporate headquarters, Durango plant, western regional sales office, lumber mill

3. Business functions: Related groups of business processes that support some aspect of the mission of an enterprise. Ex: business planning, product development, materials management, marketing and sales

4. Entity types: Major categories of data about the people, places, and things managed by the organization. Ex: customer, product, raw material, order, work center, invoice

5. Information systems: The application software and supporting procedures for handling sets of data. Ex: transaction processing systems (order tracking, order processing, plant scheduling) and management information systems (sales management, inventory control, production scheduling)

DEVELOPING AN ENTERPRISE MODEL: A comprehensive enterprise model consists of a functional breakdown (or decomposition) model of each business function.

Functional decomposition: An iterative process of breaking down the description of a system into finer and finer details in which one function is described in greater detail by a set of other, supporting functions

Fig 2.2 in pg. 39

The figure shows an example of the decomposition of an order fulfillment function. Many databases are necessary to handle the full set of business functions and supporting functions. A particular database may support only a subset of the supporting functions.

An enterprise data model shows not only the entity types but also the relationships between data entities. A common format for showing the interrelationships between planning objects is the matrix.

A wide variety of planning matrices can be used, as follows:

LOCATION TO FUNCTION: indicates which business functions are being performed at which business locations

UNIT TO FUNCTION: identifies which business functions are performed by or are the responsibility of which business units

INFORMATION SYSTEM TO DATA ENTITY: Explains how each information system interacts with each data entity Ex: whether each system creates, retrieves, updates or deletes data in each entity

SUPPORTING FUNCTION TO DATA ENTITY: Identifies which data are captured, used, updated, or deleted within each function

INFORMATION SYSTEM TO OBJECTIVE: Shows which information systems support each business objective

SYSTEMS DEVELOPMENT LIFE CYCLE (SDLC):

A traditional process for conducting an information systems development project is called the systems development life cycle. The SDLC is a complete set of steps that a team of information systems professionals, including database designers and programmers, follows to specify, develop, maintain, and replace information systems.

THREE SCHEMA ARCHITECTURE DEVELOPMENT

1. CONCEPTUAL SCHEMA (during the analysis phase)
2. EXTERNAL SCHEMA OR USER VIEW (during the analysis and logical design phases)
3. PHYSICAL OR INTERNAL SCHEMA (during the physical design phase)

CONCEPTUAL SCHEMA: A conceptual schema is a detailed specification of the overall structure of organizational data that is independent of any database management technology.

A conceptual schema defines the whole database without reference to how data are stored in a computer’s secondary memory. Specifications for the conceptual schema are stored as metadata in a repository or data dictionary.

EXTERNAL SCHEMA OR USER VIEW: A user view was defined earlier as a logical description of some portion of the database that is required by a user to perform some task. A user view is defined both in logical (technology independent) terms and in programming language terms (that is, consistent with the syntax of the programming language). The original description of a user view is often a computer screen display for a business transaction. Ex: subscription renewal form

PHYSICAL SCHEMA: A physical schema contains the specifications for how data from a conceptual schema are stored in a computer’s secondary memory.

THREE TIERED DATABASE LOCATION ARCHITECTURE

Three tiers are most commonly considered:

1. Data on a client
2. Data on an application server or web server
3. Data on a database server

1. CLIENT TIER: A desktop or laptop computer that concentrates on managing the user-system interface and localized data. It is also called the presentation tier. Web scripting tasks may be executed on this tier.

2. APPLICATION/WEB SERVER TIER: Processes HTTP requests and scripting tasks, performs calculations, and provides access to data. It is also called the process services tier.

3. ENTERPRISE SERVER (DATABASE SERVER) TIER: Performs sophisticated calculations and manages the merging of data from multiple sources across the organization. It is also called the data services tier.

The three tiered architecture for databases and information systems is related to the concept of client/server architecture.

CLIENT/SERVER ARCHITECTURE: A local area network based environment in which database software on a server (called a database server or database engine) performs database commands sent to it from client workstations, and application programs on each client concentrate on user interface functions.

It allows for simultaneous processing on multiple processors for the same application.

It is possible to take advantage of the best data processing features of each computer platform.

You can mix client technologies and share common data

MODELING DATA IN THE ORGANIZATION

MODELING THE RULES OF THE ORGANIZATION: Business rules and policies govern creating, updating, and removing data in an information processing and storage system. For ex, the rule that a student in a university must have a faculty adviser forces the data about each student in a database to be associated with data about an adviser.

Business rules and policies are not universal. Different universities may have different policies for student advising. The policies of an organization may also change over time. A university may decide that a student does not have to be assigned a faculty adviser until the student chooses a major.

THE ROLE OF A DATABASE ANALYST

Identify and understand the rules that govern the data

Represent those rules so that they can be unambiguously understood by information systems developers and users

Implement those rules in database technology

OVERVIEW OF BUSINESS RULES: A business rule is a statement that defines or constrains some aspect of the business. It is intended to assert business structure or to control or influence the behavior of the business.

Ex: A student may register for a section of a course only if he or she has successfully completed the prerequisites for that course

A preferred customer qualifies for a 10 percent discount unless the customer has an overdue account balance
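The discount rule is stated declaratively, but the analyst's later task of implementing rules can be sketched in code. The function below is one hypothetical way to express it in Python; the function name, parameter names, and the use of Decimal for money are assumptions, not part of the notes.

```python
from decimal import Decimal


def discount_rate(is_preferred: bool, overdue_balance: Decimal) -> Decimal:
    """Return the discount rate allowed by the preferred-customer rule.

    A preferred customer gets 10 percent unless an overdue balance exists.
    """
    if is_preferred and overdue_balance <= 0:
        return Decimal("0.10")
    return Decimal("0")


print(discount_rate(True, Decimal("0")))
print(discount_rate(True, Decimal("25.00")))
print(discount_rate(False, Decimal("0")))
```

The rule itself says nothing about where this check runs; it could equally be enforced in the DBMS or in an application program.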

CHARACTERISTICS OF GOOD BUSINESS RULES:

CHARACTERISTIC: EXPLANATION

DECLARATIVE: A business rule is a statement of policy, not of how policy is enforced or conducted. The rule does not describe a process or implementation, but rather describes what a process validates.

PRECISE: Within the related organization, the rule must have only one interpretation among all interested people, and its meaning must be clear.

ATOMIC: A business rule makes one statement, not several. No part of the rule can stand on its own as a rule (the rule is indivisible).

CONSISTENT: A business rule must be internally consistent, that is, it must not contain conflicting statements.

EXPRESSIBLE: A business rule must be able to be stated in natural language, but it will be stated in a structured natural language so that there is no misinterpretation.

DISTINCT: Business rules are not redundant, but a business rule may refer to other rules.

BUSINESS ORIENTED: A business rule is stated in terms business people can understand. Since it is a statement of business policy, only business people can modify or invalidate a rule. Thus a business rule is owned by the business.

DATA NAMES AND DATA DEFINITIONS: Data objects must be named and defined before they can be used unambiguously.

DATA NAMES

RELATE TO BUSINESS, NOT TECHNICAL (HARDWARE OR SOFTWARE) CHARACTERISTICS: Customer is a good name, but File10, Bit7, and PayrollReportSortKey are not good names.

BE MEANINGFUL: The data name should be meaningful for documentation purpose. So avoid using generic words like “has” , “is”, “person” , “it”

BE UNIQUE: The name must differ from the name used for every other distinct object. Words that distinguish the object from similar objects should be included in the data name. Ex: Home address, campus address

READABLE: The name is structured the way the concept would most naturally be said. Ex: Grade point average is a good name; Average grade point relative to A is an awkward name

COMPOSED OF WORDS TAKEN FROM AN APPROVED LIST: Each organization chooses a vocabulary of significant words for data names. Ex: maximum, never upper limit, ceiling, or highest. Alternative or alias names can also be included in the complete set of database documentation. Words in the vocabulary may also have approved abbreviations. Ex: CUST for customer

REPEATABLE: Different people, or the same person at different times, should develop exactly or almost the same name. This means there is a standard hierarchy for data names. Ex: Birth date of a student would be StudentBirthDate; birth date of an employee would be EmployeeBirthDate

DATA DEFINITIONS: A definition is considered a type of business rule. A definition is an explanation of a term or a fact. A term is a word or phrase that has a specific meaning for the business. Ex: course, section, rental car, flight reservation, and passenger

Terms are often the keywords used to form data names

FACT: An association between two or more terms

Ex: A course is a module of instruction in a particular subject area. The sentence contains two terms: 1. Module of instruction 2. Subject area

Ex: A customer may request a model of car from a rental branch on a particular date. This fact, which defines a model rental request, associates four terms: customer, model of car, rental branch, and date.

THE ER MODEL

ENTITY RELATIONSHIP MODEL (E-R MODEL): A logical representation of the data for an organization or for a business area. The E-R model is expressed in terms of the entities in the business environment, the relationships among those entities, and their attributes.

ENTITY RELATIONSHIP DIAGRAM: A graphical representation of an E-R model, which is the form in which an E-R model is normally expressed.

Entities are represented by rectangles. Relationships between entities are represented by diamond symbols connected by lines to the related entities.

CUSTOMER: A person or organization who has ordered or might order products. EX: VRS & YRN COLLEGE

PRODUCT: A type of furniture made by Pine Valley Furniture, which may be ordered by customers

ORDER: The transaction associated with the sale of one or more products to a customer, identified by a transaction number from sales or accounting

ITEM: A type of component that goes into making one or more products and can be supplied by one or more suppliers. Ex: ball bearing

SUPPLIER: Another company that may provide items to Pine Valley Furniture

SHIPMENT: The transaction associated with items received in the same package by Pine Valley Furniture from a supplier

A SUPPLIER may supply many ITEMS (by “may supply” we mean the supplier may not supply any items). Each ITEM is supplied by any number of SUPPLIERS (by “is supplied” we mean it must be supplied by at least one supplier).

Each item must be used in the assembly of at least one PRODUCT and may be used in many products. Conversely each product must use one or more items.

A SUPPLIER may send many SHIPMENTS. On the other hand each shipment must be sent by exactly one SUPPLIER. A supplier may be able to supply an item, but may not yet have sent any shipments of that item.

A shipment must include one or more ITEMS. An ITEM may be included on several SHIPMENTS.

A CUSTOMER may submit any number of ORDERS. However, each ORDER must be submitted by exactly one CUSTOMER.

An ORDER must request one (or more) PRODUCTS. A given PRODUCT may not be requested on any ORDER, or may be requested on one or more ORDERS.

ENTITIES: An entity is a person, place, object, event, or concept in the user environment about which the organization wishes to maintain data.

Ex:
Person  : EMPLOYEE, STUDENT, PATIENT
Place   : STORE, WAREHOUSE, STATE
Object  : MACHINE, BUILDING, AUTOMOBILE
Event   : SALE, REGISTRATION, RENEWAL
Concept : ACCOUNT, COURSE, WORK CENTER

ENTITY TYPE: A collection of entities that share common properties or characteristics. We use capital letters for the names of entity types. In an E-R diagram the entity name is placed inside the box representing the entity type.

ENTITY INSTANCE: A Single occurrence of an entity type

Entity type: EMPLOYEE

Attributes:
Empno     char(10)
Name      char(25)
Address   char(30)
City      char(10)

Two instances of EMPLOYEE:

Empno    Name   Address       City
642-12   q      100 pacific   san francisco
534-34   p      450 redwood   redwood city
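The difference between an entity type and its instances can be mirrored in code. In the sketch below a Python dataclass (an illustrative choice, not something the notes prescribe) plays the role of the entity type EMPLOYEE, and two objects play the role of the two instances. The attribute values follow the instance listing above as best it can be read from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """Entity type EMPLOYEE: the attributes shared by every instance."""
    empno: str
    name: str
    address: str
    city: str


# Two entity instances: single occurrences of the entity type.
e1 = Employee("642-12", "q", "100 pacific", "san francisco")
e2 = Employee("534-34", "p", "450 redwood", "redwood city")
instances = [e1, e2]

for employee in instances:
    print(employee)
```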

STRONG ENTITY TYPE: An entity type that exists independently of other entity types. Ex: employee, student, automobile, and course

WEAK ENTITY TYPE: An entity type whose existence depends on some other entity type. Ex: Class

A class cannot be uniquely identified without a course number. Weak entities are indicated by a double-lined rectangle.

IDENTIFYING OWNER: The entity type on which the weak entity type depends

Identifying owner => course. The class doesn’t exist without the course.

IDENTIFYING RELATIONSHIP: The relationship between a weak entity type and its owner

Identifying relationship => instance-of. The relationship is indicated by the double-lined diamond symbol.
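A weak entity type is commonly implemented as a table whose primary key includes the key of its identifying owner. The sketch below assumes Python's built-in sqlite3 module; the column names, the section number used as the partial key, and the sample course are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE course (
    course_no TEXT PRIMARY KEY,
    title     TEXT
);
-- CLASS is weak: its key borrows course_no from the identifying owner.
CREATE TABLE class (
    course_no  TEXT NOT NULL REFERENCES course(course_no) ON DELETE CASCADE,
    section_no INTEGER NOT NULL,
    PRIMARY KEY (course_no, section_no)
);
INSERT INTO course VALUES ('DB101', 'DBMS');
INSERT INTO class VALUES ('DB101', 1);
INSERT INTO class VALUES ('DB101', 2);
""")

# A class whose course does not exist is rejected.
try:
    conn.execute("INSERT INTO class VALUES ('XX999', 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the owner removes the classes that depend on it.
conn.execute("DELETE FROM course WHERE course_no = 'DB101'")
remaining = conn.execute("SELECT COUNT(*) FROM class").fetchone()[0]

print(orphan_rejected, remaining)
```

The cascade on delete is a design choice for this sketch; a designer could instead forbid deleting a course that still has classes.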

NAMING AND DEFINING ENTITY TYPES:

1. SINGULAR NOUN: An entity type name is a singular noun (such as customer, student, or automobile).

2. SPECIFIC TO THE ORGANIZATION: An entity type name should be specific to the organization. One organization may use the entity type name CUSTOMER and another organization may use the term CLIENT. The name should be descriptive for the organization and distinct from all other entity type names within that organization.

3. CONCISE: An entity type name should be concise, using as few words as possible. For example, in a university database the entity type REGISTRATION, for the event of a student registering for a class, is a sufficient name; STUDENT REGISTRATION FOR CLASS is unnecessarily wordy.

4. ABBREVIATION OR SHORT NAME: An abbreviation or short name should be specified for each entity type name, and the abbreviation may be sufficient to use in the E-R diagram.

5. EVENT ENTITY TYPES: Event entity types should be named for the result of the event. Ex: The event of a project manager assigning an employee to work on a project results in an ASSIGNMENT.

ATTRIBUTES: An attribute is a property or characteristic of an entity type that is of interest to the organization.

STUDENT: Student_Id, Student_Name, Home_Address
AUTOMOBILE: Vehicle_Id, Color, Weight, Horsepower
EMPLOYEE: Employee_Id, Employee_Name

In naming attributes we use an initial capital letter followed by lower case letters. If an attribute name consists of two words, we use an underscore character to connect the words and we start each word with a capital letter, for ex: Employee_Name

In an E-R diagram we represent an attribute by placing its name in an ellipse with a line connecting it to its associated entity.

Entity 1

Student_Id = 455
Student_Name = Smith
Home_Address = 452 Walnut Street
Phone = 303-839
Major = DBMS

Entity 2

Student_Id = 555
Student_Name = Thoms
Home_Address = 944 Maple Street
Phone = 631-391
Major = VCPP

COMPOSITE ATTRIBUTE: An attribute that can be broken down into component parts.
Ex: Address

It can be broken down into Street_Address, City, State and Postal_Code.

SIMPLE ATTRIBUTE: An attribute that cannot be broken down into smaller components.
Ex: Color, Weight

SINGLE-VALUED ATTRIBUTE: An attribute that holds a single value for a single entity.

Ex: Customer 1, Branch 11

MULTI-VALUED ATTRIBUTE: An attribute that may take on more than one value for a given entity instance.

Ex: Tel_No: 234-5678 and 456-7839 Rollno and regno

DERIVED ATTRIBUTE: An attribute whose values can be calculated from related attribute values. For ex: the EMPLOYEE entity has a Date_Employed attribute. If users need to know how many years a person has been employed, that value can be calculated from Date_Employed and today's date, so it does not need to be stored.
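
The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name years_employed and the sample dates are assumptions, not part of the notes.

```python
from datetime import date

def years_employed(date_employed: date, today: date) -> int:
    """Derive whole years of service from the stored Date_Employed value."""
    years = today.year - date_employed.year
    # Subtract one year if this year's anniversary has not been reached yet.
    if (today.month, today.day) < (date_employed.month, date_employed.day):
        years -= 1
    return years

# Hypothetical sample dates, one day before and on the anniversary.
print(years_employed(date(2010, 6, 15), date(2024, 6, 14)))  # 13
print(years_employed(date(2010, 6, 15), date(2024, 6, 15)))  # 14
```

Because the value is recomputed from Date_Employed each time, it can never disagree with the stored date.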

IDENTIFIER: An attribute (or combination of attributes) that uniquely identifies individual instances of an entity type.

Entity type    Identifier
Student        Student_Id
Automobile     Vehicle_Id

Student_Name is not an identifier because many students may have the same name or may change their name. The identifier is underlined in the E-R diagram.

[Figure: STUDENT entity type with attributes Student_Id (underlined as the identifier), Student_Name and Other_Attributes]

COMPOSITE IDENTIFIER: An identifier that consists of a composite attribute.

Entity: FLIGHT
Composite identifier: Flight_Id

Flight_Id in turn has the component attributes Flight_Number and Date. This combination is required to uniquely identify individual occurrences of FLIGHT.
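
A minimal Python sketch of the idea, assuming a hypothetical in-memory store: the names FlightId, flights and add_flight, and the sample flight numbers and gates, are made up for illustration. The key is the pair (Flight_Number, Date), so the same flight number may appear on different dates but not twice on the same date.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class FlightId:
    """Composite identifier: neither component is unique on its own."""
    flight_number: str
    flight_date: date

flights = {}  # maps FlightId -> other attributes of the FLIGHT instance

def add_flight(flight_number, flight_date, **attributes):
    """Store a FLIGHT instance, rejecting a repeated composite identifier."""
    key = FlightId(flight_number, flight_date)
    if key in flights:
        raise ValueError(f"duplicate identifier: {key}")
    flights[key] = attributes
    return key

add_flight("AA100", date(2024, 1, 1), gate="B4")
add_flight("AA100", date(2024, 1, 2), gate="C1")  # same number, other date: allowed
print(len(flights))  # 2
```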

Naming and Defining Attributes:

1. NOUN: An attribute name is a noun (such as Customer_Id, Age, Product_Minimum_Price or Major).

2. UNIQUE: An attribute name should be unique. No two attributes of the same entity type may have the same name. For clarity, it is desirable that no two attributes across all entity types have the same name.

3. SHOULD FOLLOW A SPECIFIC FORMAT:

A common format is [Entity type name { [Qualifier] } ] Class, where

[…] indicates an optional clause

{…} indicates that the clause may repeat

Ex: Cust_Id

CLASS: A class is a phrase from a list of phrases defined by the organization that are permissible characteristics of entities.

Ex:
Class         Abbreviation
Name          Nm
Identifier    ID
Date          Dt
Amount        Amt

QUALIFIER: A qualifier is a phrase from a list of phrases defined by the organization.

Ex:
Qualifier     Abbreviation
Maximum       Max
Hourly        Hrly
State         St
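
As a rough sketch, the naming format can be applied mechanically by joining an optional entity abbreviation, any qualifiers and the class with underscores. The function attribute_name and the entity abbreviations "Cust" and "Emp" are assumptions for illustration; the class and qualifier abbreviations are the ones listed above.

```python
CLASS_ABBREVIATIONS = {"Name": "Nm", "Identifier": "ID", "Date": "Dt", "Amount": "Amt"}
QUALIFIER_ABBREVIATIONS = {"Maximum": "Max", "Hourly": "Hrly", "State": "St"}

def attribute_name(class_phrase, entity_abbrev=None, qualifiers=()):
    """Build [Entity type name { [Qualifier] } ] Class as an underscore-joined name."""
    parts = []
    if entity_abbrev:                      # optional entity type name
        parts.append(entity_abbrev)
    # Zero or more qualifiers, each taken from the approved list.
    parts.extend(QUALIFIER_ABBREVIATIONS[q] for q in qualifiers)
    parts.append(CLASS_ABBREVIATIONS[class_phrase])  # the class is required
    return "_".join(parts)

print(attribute_name("Identifier", "Cust"))         # Cust_ID
print(attribute_name("Amount", "Emp", ["Hourly"]))  # Emp_Hrly_Amt
```

A phrase that is not on either list raises a KeyError, which mirrors the rule that only organization-approved phrases may be used.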

Relationship type: A relationship type is a meaningful association between (or among) entity types.

Relationship instance: An association between (or among) entity instances, where each relationship instance includes exactly one entity from each participating entity type.

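Attributes on relationships: Attributes may be associated with a many-to-many (or one-to-one) relationship as well as with an entity. Suppose an organization wishes to record the date (month and year) when an employee completes each course. That date describes the pairing of one employee with one course, so it is an attribute of the relationship rather than of EMPLOYEE or COURSE.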

Relationship instances:

Employee Course

Chen C++

Melton Java

Ritche COBOL

Celko Basic

Gosling SQL

Fig 3.10 page 96 perl

Associative entity: An entity type that associates the instances of one or more entity types and contains attributes that are peculiar to the relationship between those entity instances.

The associative entity is represented with the diamond relationship symbol enclosed within the entity box. The purpose of this symbol is to preserve the information that the entity was initially specified as a relationship on the E-R diagram.
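
One possible way to picture an associative entity is as a table of its own whose key combines the identifiers of the entities it links. The sketch below uses Python's built-in sqlite3 module. The table and column names, the pairing of Chen with C++ and the completion date are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    employee_name TEXT NOT NULL
);
CREATE TABLE course (
    course_id    INTEGER PRIMARY KEY,
    course_title TEXT NOT NULL
);
-- Associative entity: one row per (employee, course) relationship instance.
CREATE TABLE certificate (
    employee_id    INTEGER NOT NULL REFERENCES employee(employee_id),
    course_id      INTEGER NOT NULL REFERENCES course(course_id),
    date_completed TEXT NOT NULL,   -- attribute of the relationship itself
    PRIMARY KEY (employee_id, course_id)
);
INSERT INTO employee VALUES (1, 'Chen');
INSERT INTO course VALUES (1, 'C++');
INSERT INTO certificate VALUES (1, 1, '2024-05');  -- invented sample date
""")

rows = conn.execute("""
    SELECT e.employee_name, c.course_title, k.date_completed
    FROM certificate AS k
    JOIN employee AS e ON e.employee_id = k.employee_id
    JOIN course   AS c ON c.course_id   = k.course_id
""").fetchall()
print(rows)  # [('Chen', 'C++', '2024-05')]
```

The composite primary key keeps each employee and course pair to a single row, which matches the rule that a relationship instance includes exactly one entity from each participating type.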


Degree of a relationship: The degree of a relationship is the number of entity types that participate in that relationship.

The three most common relationship degrees in E-R models are unary (degree 1), binary (degree 2) and ternary (degree 3).

Unary relationship: A relationship between the instances of a single entity type. (Unary relationships are also called recursive relationships.) In the first example, Is_married_to is shown as a one-to-one relationship between instances of the PERSON entity type.

In the second example, Manages is shown as a one-to-many relationship between instances of the EMPLOYEE entity type. Using this relationship we could identify the employees who report to a particular manager.
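
A small sketch of the Manages relationship, assuming made-up employee numbers: because both ends of the relationship are EMPLOYEE instances, a single mapping from employee to manager is enough. The names manager_of and direct_reports are hypothetical.

```python
# Each employee maps to the Employee_Id of their manager (None = no manager).
# One manager per employee, many employees per manager: one-to-many.
manager_of = {1: None, 2: 1, 3: 1, 4: 2}

def direct_reports(manager_id):
    """Return the employees who report to the given manager."""
    return sorted(emp for emp, mgr in manager_of.items() if mgr == manager_id)

print(direct_reports(1))  # [2, 3]
print(direct_reports(4))  # []
```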

Binary relationship: A relationship between the instances of two entity types. It is the most common type of relationship encountered in data modeling.

[Figure: EMPLOYEE, COURSE, CERTIFICATE]

Ex1 (one to one): An employee is assigned one parking place, and each parking place is assigned to only one employee.

Ex2 (one to many): A product line may contain several products, and each product belongs to only one product line.

Ex3 (many to many): A student may register for more than one course, and each course may have many student registrants.

Ternary relationship: A ternary relationship is a simultaneous relationship among the instances of three entity types. In this example vendors can supply various parts to warehouses. The relationship Supplies is used to record the specific parts that are supplied by a given vendor to a particular warehouse.

Thus there are three entity types: VENDOR, PART and WAREHOUSE. There are two attributes on the relationship Supplies: Shipping_Mode and Unit_Cost.
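
The point of a ternary relationship is that its attributes depend on all three participants at once. A minimal sketch, with invented vendor, part and warehouse codes and invented costs; the Supplies tuple and the unit_cost helper are hypothetical names.

```python
from typing import NamedTuple, Optional

class Supplies(NamedTuple):
    """One instance of the ternary relationship among VENDOR, PART, WAREHOUSE."""
    vendor: str
    part: str
    warehouse: str
    shipping_mode: str
    unit_cost: float

supplies = [
    Supplies("V1", "P1", "W1", "Air", 5.0),
    Supplies("V1", "P1", "W2", "Truck", 4.5),  # same vendor and part, other warehouse
    Supplies("V2", "P1", "W1", "Rail", 4.8),   # same part and warehouse, other vendor
]

def unit_cost(vendor: str, part: str, warehouse: str) -> Optional[float]:
    """Look up Unit_Cost; all three participants are needed to find it."""
    for s in supplies:
        if (s.vendor, s.part, s.warehouse) == (vendor, part, warehouse):
            return s.unit_cost
    return None

print(unit_cost("V1", "P1", "W2"))  # 4.5
```

Three separate binary relationships could not hold this information, because the cost for part P1 differs by vendor and by warehouse.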

CARDINALITY CONSTRAINTS: A cardinality constraint specifies the number of instances of one entity that can (or must) be associated with each instance of another entity.

For ex, consider a VIDEOSTORE that rents videotapes of movies. The store may stock more than one videotape for each movie, and it may not have any tapes of a given movie in stock at a particular time.

MINIMUM CARDINALITY: The minimum number of instances of one entity that may be associated with each instance of another entity.

In our VIDEOTAPE example the minimum number of videotapes for a movie is zero. When the minimum number of participants is zero, we say that the entity type (here VIDEOTAPE) is an optional participant in the relationship.

MAXIMUM CARDINALITY: The maximum cardinality of a relationship is the maximum number of instances of one entity that may be associated with each instance of another entity.

In our VIDEOTAPE example the maximum cardinality for the VIDEOTAPE entity is "many", that is, an unspecified number greater than one. This is indicated by the "crow's foot" symbol on the line next to the VIDEOTAPE entity.

A relationship is, of course, bidirectional, so cardinality is stated in both directions: from MOVIE to VIDEOTAPE and from VIDEOTAPE to MOVIE.

MANDATORY ONE: When the minimum and maximum cardinalities are both one, this is called mandatory one cardinality. In our example, each videotape must be a copy of exactly one movie. If the minimum cardinality is zero, participation is optional.

Ex: of mandatory cardinality fig 317
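
A cardinality constraint can be read as a simple range check on a count. The sketch below is one way to express that; the function name satisfies_cardinality and the convention that None stands for "many" are assumptions.

```python
def satisfies_cardinality(count, minimum, maximum=None):
    """Check a count of associated instances against (minimum, maximum).

    maximum=None stands for "many", meaning no upper bound.
    """
    if count < minimum:
        return False
    return maximum is None or count <= maximum

# A MOVIE may have zero or many VIDEOTAPEs (optional many: min 0, max many).
print(satisfies_cardinality(0, minimum=0))             # True
# A VIDEOTAPE must be a copy of exactly one MOVIE (mandatory one: min 1, max 1).
print(satisfies_cardinality(0, minimum=1, maximum=1))  # False
print(satisfies_cardinality(1, minimum=1, maximum=1))  # True
```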

Naming and Defining Relationship:

1. Verb phrase: A relationship name is a verb phrase (such as Assigned_to, Supplies or Teaches). Relationships represent actions being taken, usually in the present tense.

A relationship name

States the action                      Not the result of action
Employee is assigned_to a project      Employee is assigning a project

2. Avoid vague names: You should avoid vague names such as Has or Is_related_to. Use descriptive verb phrases found in the definition of the relationship.

(Vague means not clearly expressed.)

3. A relationship definition explains what action is being taken and possibly why it is important.

It may be important to state who or what does the action, but it is not important to explain how the action is taken.

4. It may also be important to give examples to clarify the action. For ex, for a relationship Registered_for between STUDENT and COURSE, it may be useful to explain that this covers both on-site and online registration and includes registrations made during the drop/add period.

5. Optional participation: The definition should explain any optional participation. You should explain what conditions lead to zero associated instances.

Subtype: A subgrouping of the entities in an entity type that is meaningful to the organization and that shares common attributes or relationships distinct from other subgroupings.
Ex: graduate student, undergraduate student

Supertype: A generic entity type that has a relationship with one or more subtypes.
Ex: student

Pg. 129 fig 4.1

Hourly employees: Employee_Number, Employee_Name, Address, Date_Hired, Hourly_Rate

Salaried employees: Employee_Number, Employee_Name, Address, Date_Hired, Annual_Salary

Contract consultant: Employee_Number, Employee_Name

Fig 4.2 page no 130

Attribute inheritance: Attribute inheritance is the property by which subtype entities inherit the values of all attributes of the supertype. This property makes it unnecessary to repeat the supertype attributes redundantly in the subtypes.

For ex, Employee_Name is defined as an attribute of EMPLOYEE and is not repeated in the subtypes of EMPLOYEE, which inherit it.

When to use supertype/subtype relationships:

Whether to use supertype/subtype relationships or not is a decision that the data modeler must make in each situation. You should consider using them when either of the following conditions is present:

1. There are attributes that apply to some (but not all) of the instances of an entity type.
2. The instances of a subtype participate in a relationship unique to that subtype.

Fig 4.3 page 131

The hospital entity type PATIENT has two subtypes: OUTPATIENT and RESIDENT PATIENT. The primary key is Patient_Id. All patients have an Admit_Date attribute as well as a Patient_Name. Every patient is cared for by a responsible physician who develops a treatment plan for the patient.

Each subtype has an attribute that is unique to that subtype. Outpatients have a Checkback_Date, while resident patients have a Date_Discharged. Resident patients also have a unique relationship that assigns each patient to a bed. Each bed may or may not be assigned to a patient.

According to attribute inheritance, each outpatient and resident patient inherits the attributes of the parent supertype PATIENT: Patient_Id, Patient_Name and Admit_Date.
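
Class inheritance in a programming language gives a loose analogy for attribute inheritance. In this sketch the Python class and field names are adapted from the PATIENT example; the use of dataclasses is an illustrative choice, not something the notes prescribe.

```python
from dataclasses import dataclass, fields

@dataclass
class Patient:                      # supertype
    patient_id: int
    patient_name: str
    admit_date: str

@dataclass
class Outpatient(Patient):          # subtype adds only its own attribute
    checkback_date: str

@dataclass
class ResidentPatient(Patient):     # subtype adds only its own attribute
    date_discharged: str

# The subtype carries the supertype attributes without redeclaring them.
print([f.name for f in fields(Outpatient)])
# ['patient_id', 'patient_name', 'admit_date', 'checkback_date']
```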

REPRESENTING GENERALIZATION AND SPECIALIZATION:

Generalization: The process of defining a more general entity type from a set of more specialized entity types. Thus generalization is a bottom-up process.

Fig 4.4 page 133

In the above example three entity types have been defined: CAR, TRUCK and MOTORCYCLE. We observe that the three entity types have a number of attributes in common: Vehicle_Id (identifier), Vehicle_Name (with components Make and Model), Price and Engine_Displacement. This fact suggests that each of the three entity types is really a version of a more general entity type.

This more general entity type is named VEHICLE. The entity CAR has the specific attribute No_of_Passengers, while TRUCK has two specific attributes, Capacity and Cab_Type. Thus generalization has allowed us to group entity types along with their common attributes.

The entity MOTORCYCLE is not included in the relationship. It does not satisfy the conditions for a subtype, because all attributes of MOTORCYCLE are common to all vehicles; there are no attributes specific to motorcycles. Further, MOTORCYCLE does not have a relationship to another entity type. Thus there is no need to create a MOTORCYCLE subtype.

Specialization: The process of defining one or more subtypes of the supertype and forming supertype/subtype relationships. Specialization is a top-down process, the direct reverse of generalization.

Fig 4.5 page 134

Fig 4.5a shows an entity type named PART together with several of its attributes. The identifier is Part_No, and other attributes include Description, Unit_Price, Location, Qty_on_Hand, Routing_Number and Supplier (the last attribute is multivalued, since there may be more than one supplier, with an associated unit price, for a part).

Here some parts are manufactured internally while others are purchased from outside suppliers. Thus Routing_Number applies only to manufactured parts, while Supplier_Id and Unit_Price apply only to purchased parts.

SPECIFYING CONSTRAINTS IN SUPERTYPE/SUBTYPE RELATIONSHIPS:

Specifying completeness constraints: A completeness constraint addresses the question whether an instance of a supertype must also be a member of at least one subtype. The completeness constraint has two possible rules: 1. total specialization 2. partial specialization

Total Specialization Rule: The total specialization rule specifies that each entity instance of the supertype must be a member of some subtype in the relationship.

Fig 4.6a pg 136

In this example the business rule is the following: a patient must be either an outpatient or a resident patient (there are no other types of patient in this hospital). Total specialization is indicated by the double line extending from the PATIENT entity type to the circle.

In this example, every time a new instance of PATIENT is inserted into the supertype, a corresponding instance is inserted into either OUTPATIENT or RESIDENT PATIENT. If the instance is inserted into RESIDENT PATIENT, an instance of the relationship Is_assigned is created to assign the patient to a hospital bed.

Partial Specialization Rule: The partial specialization rule specifies that an entity instance of the supertype is allowed not to belong to any subtype.

Fig 4.6b pg 136

A motorcycle is a type of vehicle, but it is not represented as a subtype in the model. Thus if a vehicle is a car it must appear as an instance of CAR. If it is a truck it must appear as an instance of TRUCK. However, if the vehicle is a motorcycle it cannot appear as an instance of any subtype. This is an example of partial specialization, and it is specified by the single line from the VEHICLE supertype to the circle.
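
The two completeness rules differ only in whether an instance with no subtype is acceptable. A minimal sketch, assuming each supertype instance is described by the set of subtype names it belongs to; the function name satisfies_completeness is hypothetical.

```python
def satisfies_completeness(subtypes_of_instance, total):
    """Check one supertype instance against the completeness constraint.

    subtypes_of_instance: set of subtype names the instance belongs to.
    total: True for total specialization, False for partial specialization.
    """
    if total:
        return len(subtypes_of_instance) >= 1
    return True  # partial specialization: belonging to no subtype is allowed

# PATIENT uses total specialization: a patient with no subtype is invalid.
print(satisfies_completeness(set(), total=True))            # False
print(satisfies_completeness({"OUTPATIENT"}, total=True))   # True
# VEHICLE uses partial specialization: a motorcycle has no subtype.
print(satisfies_completeness(set(), total=False))           # True
```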

SPECIFYING DISJOINTNESS CONSTRAINTS:

A disjointness constraint addresses the question whether an instance of a supertype may simultaneously be a member of two or more subtypes. The disjointness constraint has two possible rules: 1. the disjoint rule 2. the overlap rule

The disjoint rule: The disjoint rule specifies that if an entity instance is a member of one subtype, it cannot simultaneously be a member of any other subtype.

Fig 4.7a page 138

The business rule in this case is the following: at any given time a patient must be either an outpatient or a resident patient, but cannot be both. This is the disjoint rule, as specified by the letter 'd' in the circle joining the supertype and its subtypes. Note: The subtype of a PATIENT may change over time, but at a given time a PATIENT is of only one type.

Overlap rule: The overlap rule specifies that an entity instance can simultaneously be a member of two or more subtypes.

Fig 4.7b pg 138

In this example an instance of PART is a particular part number, that is, a type of part and not an individual part. This is indicated by the identifier, which is Part_No. For ex, consider part number 4000: some units of this part may be manufactured while others are purchased. The overlap rule is specified by placing the letter 'o' in the circle. Thus any part must be either a purchased part or a manufactured part, or it may simultaneously be both of these.
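
The disjointness rules can be sketched in the same style, using the diagram letters 'd' and 'o' as the rule codes. The function name satisfies_disjointness and the subtype name strings are assumptions for illustration.

```python
def satisfies_disjointness(subtypes_of_instance, rule):
    """Check one supertype instance against the disjointness constraint.

    rule: "d" for the disjoint rule, "o" for the overlap rule.
    """
    if rule not in ("d", "o"):
        raise ValueError(f"unknown rule: {rule!r}")
    # Overlap permits any number of subtypes; disjoint permits at most one.
    return rule == "o" or len(subtypes_of_instance) <= 1

both = {"MANUFACTURED PART", "PURCHASED PART"}
print(satisfies_disjointness(both, "o"))  # True: a PART may be both
print(satisfies_disjointness({"OUTPATIENT", "RESIDENT PATIENT"}, "d"))  # False
```

Whether an instance must belong to at least one subtype is a separate question, answered by the completeness constraint.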

DEFINING SUBTYPE DISCRIMINATORS: A subtype discriminator is an attribute of the supertype whose values determine the target subtype or subtypes.

Disjoint subtypes:

Fig 4.8 page 139

This example is for the EMPLOYEE supertype and its subtypes. Each employee must be either hourly, salaried or a consultant.

A new attribute (Employee_Type) has been added to the supertype to serve as the subtype discriminator. When a new employee is added to the supertype, this attribute is coded with one of three values: "H" (for hourly), "S" (for salaried) or "C" (for consultant). Depending on this code, the instance is then assigned to the appropriate subtype.

Thus, for example, the condition Employee_Type = "S" causes an entity instance to be inserted into the SALARIED EMPLOYEE subtype.
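
With disjoint subtypes the discriminator selects exactly one subtype, so a lookup table is enough. In this sketch the subtype names HOURLY EMPLOYEE and CONSULTANT are assumed by analogy with SALARIED EMPLOYEE, and target_subtype is a hypothetical helper.

```python
SUBTYPE_BY_CODE = {
    "H": "HOURLY EMPLOYEE",
    "S": "SALARIED EMPLOYEE",
    "C": "CONSULTANT",
}

def target_subtype(employee_type):
    """Map the Employee_Type discriminator value to its single target subtype."""
    try:
        return SUBTYPE_BY_CODE[employee_type]
    except KeyError:
        raise ValueError(f"unknown Employee_Type code: {employee_type!r}") from None

print(target_subtype("S"))  # SALARIED EMPLOYEE
```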

OVERLAPPING SUBTYPES:

When subtypes overlap, a slightly modified approach must be applied for the subtype discriminator. The reason is that a given instance of the supertype may require that we create an instance in more than one subtype.

Fig 4.9 page 140

A new attribute named Part_Type has been added to PART. Part_Type is a composite attribute with components Manufactured? and Purchased?. Each of these components is a boolean variable that takes only the values yes ("Y") and no ("N"). When a new instance is added to PART, these components are coded as follows:

Type of part                   Manufactured?   Purchased?
Manufactured only              "Y"             "N"
Purchased only                 "N"             "Y"
Manufactured and purchased     "Y"             "Y"
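
The table above can be read as two independent yes/no tests, one per subtype. A minimal sketch: the function name target_subtypes and the subtype names MANUFACTURED PART and PURCHASED PART are assumptions.

```python
def target_subtypes(manufactured, purchased):
    """Return every subtype selected by the two Part_Type components."""
    subtypes = []
    if manufactured == "Y":
        subtypes.append("MANUFACTURED PART")
    if purchased == "Y":
        subtypes.append("PURCHASED PART")
    return subtypes

print(target_subtypes("Y", "N"))  # ['MANUFACTURED PART']
print(target_subtypes("Y", "Y"))  # ['MANUFACTURED PART', 'PURCHASED PART']
```

Unlike the single code used for disjoint subtypes, each component is tested separately, so one PART instance can be placed in both subtypes.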

Defining supertype/subtype hierarchies: A supertype/subtype hierarchy is a hierarchical arrangement of supertypes and subtypes, where each subtype has only one supertype.

Ex: Suppose you are asked to model the human resources in a university. Using specialization, you might proceed as follows. Starting at the top of the hierarchy, model the most general entity type first. In this case it is PERSON; the attributes shown in fig 4.10 page 141 are SSN (identifier), Name, Address, Gender and Date_of_Birth. The entity type at the top of a hierarchy is sometimes called the root.

Next define all major subtypes of the root. In this example there are three subtypes of PERSON: EMPLOYEE (persons who work for the university), STUDENT (persons who attend classes) and ALUMNUS (persons who have graduated). Assuming there are no other types of persons of interest to the university, the total specialization rule applies, as shown in the figure. A person might belong to more than one subtype (for ex ALUMNUS and EMPLOYEE), so the overlap rule is used. Note: overlap allows for any overlap (a person may be simultaneously in any pair or in all three subtypes). If certain combinations are not allowed, then a more refined supertype/subtype hierarchy would have to be developed to eliminate the prohibited combinations.

Fig 4.10 page 141

EER modelling diagram pine valley furniture pg 143

ENTITY CLUSTERING: An entity cluster is a set of one or more entity types and associated relationships grouped into a single abstract entity type.

Pg 146 fig 4.13

SELLING UNIT: represents the SALESPERSON and SALES TERRITORY entity types and the Serves relationship.

CUSTOMER: represents the CUSTOMER entity supertype, its subtypes and the relationship between the supertype and its subtypes.

ITEM SALE: represents the ORDER entity type and the ORDER LINE associative entity as well as the relationship between them.

ITEM: represents the PRODUCT LINE and PRODUCT entity types and the Includes relationship.

MANUFACTURING : represents the WORK CENTER and EMPLOYEE supertype entity and its subtypes as well as the works in and supervises relationships and the relationship between the supertype and its subtypes.

MATERIAL: represents the RAW MATERIAL and VENDOR entity types, the SUPPLIER subtype, the supplies relationship, and the supertype/subtype relationship between VENDOR and SUPPLIER.

FIG 4.13 PG 146,147

