Module 2: Database Models - Lifelong...

Date post: 02-Feb-2021
  • Module 2: Database Models

    Objectives: In this module, the student will learn:

    1. About data modeling and why data models are important

    2. About the basic data-modeling building blocks

    3. What business rules are and how they influence database design

    4. How the major data models evolved

    5. How data models can be classified by level of abstraction

  • Data Modeling and Data Models

    • Data modeling – the first step in designing a database; refers to the process of creating a specific data model for a determined problem domain.

    • Problem domain – a clearly defined area within the real-world environment, with well-defined scope and boundaries, that is to be systematically addressed.

    • Database models – collections of logical constructs used to represent the data structure and relationships within the database.

  • Note on data models

    An implementation-ready data model should contain at least the following components:

    1. A description of the data structure that will store the end-user data.

    2. A set of enforceable rules to guarantee the integrity of the data.

    3. A data manipulation methodology to support the real-world data transformations.

  • Importance of data models

    • Data models can facilitate interaction among the designer, the applications programmer, and the end user.

    • A well-developed data model can even foster improved understanding of the organization for which the database design is developed.

    • Data models are a communication tool.

  • Data model basic building blocks

    • The basic building blocks of all data models are entities, attributes, relationships, and constraints.

    Entity

    - is anything (a person, a place, a thing, or an event) about which data are to be collected and stored.

    - represents a particular type of object in the real world.

    • Example:

    A CUSTOMER entity would have many distinguishable customer occurrences, such as Juan Dela Cruz, Pedro Dinamita, Tom Cruz, etc.

  • Data model basic building blocks

    Attribute

    - is a characteristic of an entity.

    - is the equivalent of a field in a file system.

    Example:

    A CUSTOMER entity would be described by attributes such as customer last name, customer first name, customer phone, customer address, and customer credit limit.

  • Data model basic building blocks

    Relationship

    - describes an association among entities.

    - defines how entities are related to each other.

    - is bi-directional.

    Example: a relationship exists between customers and agents that can be described as follows:

    an agent can serve many customers, and each customer may be served by one agent.

  • 3 types of relationships used in data models:

    1.) one-to-one (1:1) – each occurrence of one entity is related to only one occurrence of the other entity, and vice versa.

    Example: Married relationship (one HUSBAND to one WIFE)

    2.) one-to-many (1:M) – one occurrence of an entity can be related to many occurrences of the other entity, while each of those is related to only one occurrence of the first.

    Example: A painter paints many different paintings, but each one of them is painted by only one painter. Thus, the PAINTER (the “one”) is related to the PAINTINGS (the “many”).

    3.) many-to-many (M:N) – each occurrence of either entity can be related to any number of occurrences of the other entity.

    Example: A student can take many classes and each class can be taken by many students, thus yielding the M:N relationship “STUDENT takes CLASS.”
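
    The 1:M and M:N examples above can be sketched as tables. This is a minimal illustration using Python's built-in sqlite3 module, not a prescribed implementation. All table and column names (PAINTER_NUM, ENROLL, and so on) and the sample rows are assumptions made up for the sketch. A 1:1 relationship could be sketched the same way by adding a UNIQUE constraint to the foreign key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
-- 1:M: each painting stores the key of its single painter
CREATE TABLE PAINTER (
    PAINTER_NUM  INTEGER PRIMARY KEY,
    PAINTER_NAME TEXT NOT NULL
);
CREATE TABLE PAINTING (
    PAINTING_NUM   INTEGER PRIMARY KEY,
    PAINTING_TITLE TEXT NOT NULL,
    PAINTER_NUM    INTEGER NOT NULL REFERENCES PAINTER (PAINTER_NUM)
);
-- M:N: a bridge table (hypothetical name ENROLL) holds one row per student/class pair
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_NAME TEXT NOT NULL);
CREATE TABLE CLASS (CLASS_CODE INTEGER PRIMARY KEY, CLASS_NAME TEXT NOT NULL);
CREATE TABLE ENROLL (
    STU_NUM    INTEGER NOT NULL REFERENCES STUDENT (STU_NUM),
    CLASS_CODE INTEGER NOT NULL REFERENCES CLASS (CLASS_CODE),
    PRIMARY KEY (STU_NUM, CLASS_CODE)
);
INSERT INTO PAINTER VALUES (1, 'Painter A');
INSERT INTO PAINTING VALUES (10, 'Work X', 1), (11, 'Work Y', 1);
INSERT INTO STUDENT VALUES (1, 'Student A'), (2, 'Student B');
INSERT INTO CLASS VALUES (100, 'Class P'), (200, 'Class Q');
INSERT INTO ENROLL VALUES (1, 100), (1, 200), (2, 100);
""")

# One painter, many paintings
paintings_by_painter = conn.execute(
    "SELECT COUNT(*) FROM PAINTING WHERE PAINTER_NUM = 1").fetchone()[0]
# One student, many classes; one class, many students
classes_of_student_1 = conn.execute(
    "SELECT COUNT(*) FROM ENROLL WHERE STU_NUM = 1").fetchone()[0]
students_in_class_100 = conn.execute(
    "SELECT COUNT(*) FROM ENROLL WHERE CLASS_CODE = 100").fetchone()[0]
```

    The bridge table is one common way to implement an M:N relationship as two 1:M relationships; other designs are possible.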

  • Data model basic building blocks

    Constraints

    - A constraint is a restriction placed on the data.

    - Constraints are important because they help to ensure data integrity.

    - Constraints are normally expressed in the form of rules.

    Example of constraints:

    • An employee’s salary must have values that are between 6,000 and 350,000.

    • A student’s GPA must be between 0.00 and 4.00.

    • Each class must have one and only one teacher.
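
    The first two example constraints can be expressed as rules that the DBMS itself enforces. The sketch below uses CHECK constraints in SQLite through Python's sqlite3 module. The table and column names and the helper try_insert are hypothetical, chosen only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM    INTEGER PRIMARY KEY,
    EMP_SALARY REAL NOT NULL CHECK (EMP_SALARY BETWEEN 6000 AND 350000)
);
CREATE TABLE STUDENT (
    STU_NUM INTEGER PRIMARY KEY,
    STU_GPA REAL NOT NULL CHECK (STU_GPA BETWEEN 0.00 AND 4.00)
);
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


salary_ok = try_insert("INSERT INTO EMPLOYEE VALUES (?, ?)", (1, 50000))
salary_too_low = try_insert("INSERT INTO EMPLOYEE VALUES (?, ?)", (2, 500))
gpa_ok = try_insert("INSERT INTO STUDENT VALUES (?, ?)", (1, 3.50))
gpa_too_high = try_insert("INSERT INTO STUDENT VALUES (?, ?)", (2, 4.50))
```

    The third example ("one and only one teacher") would typically be a NOT NULL foreign key from the class to the teacher, rather than a CHECK constraint.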

  • Q: How do you properly identify entities, attributes, relationships, and constraints?

    A: The first step is to clearly identify the business rules for the problem domain you are modeling.

  • Business Rules

    - A business rule is a brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization.

    - Business rules help to create and enforce actions within that organization’s environment.

    - They must be rendered in writing and updated to reflect any change in the organization’s operational environment.

    Examples of business rules:

    • A customer may generate many invoices.

    • An invoice is generated by only one customer.

    • A training session cannot be scheduled for fewer than 10 employees or for more than 30 employees.

  • The process of identifying and documenting business rules is essential to database design for several reasons:

    • They help standardize the company’s view of data.

    • They can be a communications tool between users and designers.

    • They allow the designer to understand the nature, role, and scope of the data.

    • They allow the designer to understand business processes.

    • They allow the designer to develop appropriate relationship participation rules and constraints and to create an accurate data model.

  • Translating business rules to data model components

    As a general rule, a noun in a business rule will translate into an entity in the model, and a verb (active or passive) associating nouns will translate into a relationship among the entities.

    For example, the business rule “a customer may generate many invoices” contains two nouns (customer and invoices) and a verb (generate) that associates the nouns.

    From this business rule, you could deduce that:

    • Customer and invoice are objects of interest for the environment and should be represented by their respective entities.

    • There is a “generate” relationship between customer and invoice.
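
    The deduction above can be sketched in code: the two nouns become two types and the verb becomes the link between them. This is only an illustration in Python. The class, field, and method names are assumptions, and the second rule used here ("an invoice is generated by only one customer") comes from the business rule examples earlier in the module.

```python
from dataclasses import dataclass, field


@dataclass
class Invoice:
    inv_number: int
    # "An invoice is generated by only one customer": a single reference
    customer: "Customer" = field(repr=False, compare=False)


@dataclass
class Customer:
    cus_name: str
    # "A customer may generate many invoices": zero or more invoices
    invoices: list = field(default_factory=list)

    def generate(self, inv_number: int) -> Invoice:
        """The verb in the business rule becomes the relationship."""
        invoice = Invoice(inv_number, self)
        self.invoices.append(invoice)
        return invoice


customer = Customer("Juan Dela Cruz")
first = customer.generate(1001)
second = customer.generate(1002)
```

    Read in both directions, the two rules give the 1:M connectivity: one CUSTOMER, many INVOICEs.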

  • Naming Conventions

    • Entity names should be descriptive of the objects in the business environment, and use terminology that is familiar to the users.

    • An attribute name should also be descriptive of the data represented by that attribute.

    • It is also a good practice to prefix the name of an attribute with the name of the entity (or an abbreviation of the entity name) in which it occurs.

  • Example of naming conventions:

    In the CUSTOMER entity, the customer’s credit limit may be called CUS_CREDIT_LIMIT.

    - The CUS indicates that the attribute is descriptive of the CUSTOMER entity, while CREDIT_LIMIT makes it easy to recognize the data that will be contained in the attribute.
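
    The prefixing convention is mechanical enough to sketch as a small helper. The function name attribute_name and the choice of a three-letter prefix are assumptions for illustration; the module only says that the entity name or an abbreviation of it is used.

```python
def attribute_name(entity: str, description: str, prefix_len: int = 3) -> str:
    """Build an attribute name such as CUS_CREDIT_LIMIT.

    The prefix is the first `prefix_len` letters of the entity name
    (an assumed abbreviation rule); the rest is the description in
    upper case with spaces replaced by underscores.
    """
    prefix = entity.strip().upper()[:prefix_len]
    body = "_".join(description.strip().upper().split())
    return f"{prefix}_{body}"


credit_limit = attribute_name("CUSTOMER", "credit limit")
ship_date = attribute_name("INVOICE", "ship date")
```

    Here credit_limit is "CUS_CREDIT_LIMIT" and ship_date is "INV_SHIP_DATE".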

  • Note on naming conventions:

    • The use of a proper naming convention will improve the data model’s ability to facilitate communication among the designer, application programmer, and the end user.

  • Database Models

    Hierarchical Model

    - developed in the 1960s to manage large amounts of data for complex manufacturing projects.

    - contains levels, or segments, where a segment is the equivalent of a file system’s record type.

    - Within the hierarchy, a higher layer is perceived as the parent of the segment directly beneath it, which is called the child.

    - The hierarchical model depicts a set of one-to-many (1:M) relationships between a parent and its children segments.

    - logically represented by an upside-down tree.

      ▫ Each parent can have many children

      ▫ Each child has only one parent
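
    The parent/child structure can be sketched as a tree in which every segment keeps a single parent reference. This is an in-memory illustration in Python, not how a hierarchical DBMS stores data. The class name Segment and the segment names are hypothetical.

```python
class Segment:
    """A node in the hierarchy: many children, at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a single reference, so only one parent is possible
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Names from the root down to this segment (the hierarchical path)."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


root = Segment("FINAL ASSEMBLY")
component_a = Segment("COMPONENT A", parent=root)
component_b = Segment("COMPONENT B", parent=root)
part = Segment("PART A1", parent=component_a)
```

    Reaching PART A1 means following the path from the root, which is why applications must know the hierarchical path.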

  • Fig. 2.1: Hierarchical Database Model

  • Hierarchical Model

    Advantages

    • Promotes data sharing.

    • Parent/child relationship promotes conceptual simplicity and data integrity.

    • Database security is provided and enforced by the DBMS.

    • Efficient with 1:M relationships.

    • Data independence.

    Disadvantages

    • Complex implementation requires knowledge of physical data storage characteristics.

    • Navigational system yields complex application development, management, and use; requires knowledge of the hierarchical path.

    • Changes in structure require changes in all application programs.

    • Implementation limitations (no multi-parent or M:N relationships).

    • Lack of standards.

    • No structural independence.

  • Database Models

    Network Model

    - created to represent complex data relationships more effectively than the hierarchical model, to improve database performance, and to impose a database standard.

    - the user perceives the network database as a collection of records in 1:M relationships.

    - the network model allows a record to have more than one parent.
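
    The difference from the hierarchical model can be sketched by letting a record keep a list of parents instead of a single one. This is a simplified in-memory illustration in Python. The names Record and connect and the sample records are assumptions, and the owner/member wording follows the terms used for this model later in the module.

```python
class Record:
    """A record that may take part in several 1:M relationships."""

    def __init__(self, name):
        self.name = name
        self.owners = []    # parents: more than one is allowed
        self.members = []   # children


def connect(owner, member):
    """Create one 1:M link from an owner record to a member record."""
    owner.members.append(member)
    member.owners.append(owner)


customer = Record("CUSTOMER")
sales_rep = Record("SALESREP")
invoice = Record("INVOICE")

connect(customer, invoice)   # CUSTOMER is one parent of INVOICE
connect(sales_rep, invoice)  # SALESREP is a second parent of INVOICE
```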

  • Network Model

    Advantages

    • Conceptual simplicity, like the hierarchical model.

    • Handles more relationship types, i.e., M:N and multi-parent.

    • Data access flexibility.

    • Data owner/member relationship promotes data integrity.

    • Conforms to standards.

    • Data independence.

    Disadvantages

    • System complexity limits efficiency; it is still a navigational system.

    • Lack of structural independence.

    • Structural changes require changes in all application programs.

  • Database Models

    Relational Database Model

    - introduced in 1970 by E. F. Codd (of IBM) in his landmark paper “A Relational Model of Data for Large Shared Data Banks” (Communications of the ACM, June 1970, pp. 377−387).

    - perceived by the user as a collection of tables for data storage.

    - tables are a series of row/column intersections

    - tables related by sharing common entity characteristic(s)
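
    How two tables relate through a shared characteristic can be sketched with a join on a common column. The example uses Python's built-in sqlite3 module. The AGENT and CUSTOMER tables echo the agent/customer example from earlier in the module, but the column names and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE INTEGER PRIMARY KEY,
    AGENT_NAME TEXT NOT NULL
);
CREATE TABLE CUSTOMER (
    CUS_CODE   INTEGER PRIMARY KEY,
    CUS_NAME   TEXT NOT NULL,
    AGENT_CODE INTEGER REFERENCES AGENT (AGENT_CODE)  -- the shared characteristic
);
INSERT INTO AGENT VALUES (501, 'Agent A'), (502, 'Agent B');
INSERT INTO CUSTOMER VALUES
    (1, 'Customer X', 501),
    (2, 'Customer Y', 501),
    (3, 'Customer Z', 502);
""")

# An ad hoc query: the tables are linked only by matching AGENT_CODE values,
# not by a predefined access path.
rows = conn.execute("""
    SELECT CUS_NAME, AGENT_NAME
    FROM CUSTOMER
    JOIN AGENT ON CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE
    ORDER BY CUS_CODE
""").fetchall()
```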

  • Fig. 2.2. A relational database model

  • Relational Database Models

    Advantages

    • Structural independence is promoted by the use of independent tables. Changes in a table’s structure do not affect data access or application programs.

    • Tabular view substantially improves conceptual simplicity, thereby promoting easier database design, implementation, management, and use.

    • Ad hoc query capability with SQL.

    • Powerful database management system isolates the end user from physical-level details and improves implementation and management simplicity.

  • Relational Database Models

    Disadvantages

    • Requires substantial hardware and system software overhead.

    • Conceptual simplicity gives relatively untrained people the tools to use a good system poorly, and if unchecked, it may produce the same data anomalies found in file systems.

    • It may promote “islands of information” problems as individuals and departments can easily develop their own applications.

  • Database Models

    Entity-Relationship Model

    - Peter Chen first introduced the ER data model in 1976.

    - graphical representation of entities and their relationships in a database structure.

    - complemented the relational data model concepts.

    - ER models are normally represented in an entity relationship diagram (ERD), which uses graphical representations to model database components.

  • Entity-Relationship Model

    The E-R Model is based on the following components:

    1) Entity

    - An entity is represented in the ERD by a rectangle, also known as an entity box.

    - The name of the entity, a noun, is written in the center of the rectangle.

    - The entity name is generally written in capital letters and in the singular form.

  • 2) Relationship

    - Describe associations among data.

    - Most relationships describe associations between two entities.

    - Three types of relationships among data were illustrated earlier: one-to-many (1:M), many-to-many (M:N), and one-to-one (1:1).

    - The ER model uses the term connectivity to label the relationship types.

    - The name of the relationship is usually an active or passive verb.

  • Two ER Notations:

    • Chen Notation

    - based on Peter Chen’s paper.

    - The connectivities are written next to each entity box.

    - Relationships are represented by a diamond connected to the related entities through a relationship line.

    - The relationship name is written inside the diamond.

  • Two ER Notations

    • Crow’s Foot Notation

    - The name “Crow’s Foot” is derived from the three-pronged symbol used to represent the “many” side of the relationship.

    - Connectivities are represented by symbols.

    - The “1” is represented by a short line segment, and the “M” is represented by the three-pronged “crow’s foot.”

  • Fig. 2.3. Two ER Notations

  • Entity-Relationship Database Model

    Advantages:

    • Visual modeling yields exceptional conceptual simplicity.

    • Visual representation makes it an effective communication tool.

    • It is integrated with the dominant relational model.

    Disadvantages:

    • There is limited constraint representation.

    • There is limited relationship representation.

    • Loss of information content occurs when attributes are removed from entities to avoid crowded displays.

  • Database Models

    Object-Oriented Data Models

    - The object-oriented data model (OODM) uses objects as the basic modeling structure.

    - An object resembles an entity in that it includes the facts that define it. But unlike an entity, the object also includes information about relationships between the facts, as well as relationships with other objects, thus giving its data more meaning.

  • Object-Oriented Database Model

    The OO database model is based on the following components:

    1) An object is an abstraction of a real-world entity.

    In general terms, an object may be considered equivalent to an ER model’s entity.

    More precisely, an object represents only one occurrence of an entity.

  • Object-Oriented Database Model

    2) Attributes describe the properties of an object.

    Example:

    a PERSON object includes the attributes Name, Social Security Number, and Date of Birth.

    3) Classes are organized in a class hierarchy.

    - The class hierarchy resembles an upside-down tree in which each class has only one parent.

  • 4) Objects that share similar characteristics are grouped in classes.

    class - a collection of similar objects with shared structure (attributes) and behavior (methods). In a general sense, a class resembles the ER model’s entity set.

    However, a class is different from an entity set in that it contains a set of procedures known as methods.

    method

    - represents a real-world action such as finding a selected PERSON’s name, changing a PERSON’s name, or printing a PERSON’s address.

    - the equivalent of procedures in traditional programming languages; methods define an object’s behavior.

    Inheritance – the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it.
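
    Classes, attributes, methods, and inheritance map directly onto an object-oriented language, so they can be sketched in a few lines of Python. PERSON and its name-changing method come from the examples above. The Student subclass, the gpa attribute, and the sample values are assumptions added to show inheritance.

```python
class Person:
    """A class: shared structure (attributes) and behavior (methods)."""

    def __init__(self, name, date_of_birth):
        self.name = name                    # attribute
        self.date_of_birth = date_of_birth  # attribute

    def change_name(self, new_name):
        """A method: a real-world action on the object."""
        self.name = new_name

    def describe(self):
        return f"{self.name} (born {self.date_of_birth})"


class Student(Person):
    """A hypothetical class below Person in the class hierarchy."""

    def __init__(self, name, date_of_birth, gpa):
        super().__init__(name, date_of_birth)  # inherited attributes
        self.gpa = gpa                         # attribute specific to Student


student = Student("Juan Dela Cruz", "2000-01-01", 3.5)  # one object = one occurrence
student.change_name("Juan D. Cruz")  # change_name is inherited from Person
summary = student.describe()         # describe is inherited from Person
```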

  • Notes on Object-Oriented Data Models

    • Object-oriented data models are typically depicted using Unified Modeling Language (UML) class diagrams.

    • Unified Modeling Language (UML) is a language based on OO concepts that describes a set of diagrams and symbols that can be used to graphically model a system.

    • UML class diagrams are used to represent data and their relationships within the larger UML object-oriented system’s modeling language.

  • Fig. 2.4. Comparison of ER and OO Models

    [Figure: the object representation of INVOICE (INV_DATE, INV_NUMBER, INV_SHIP_DATE, INV_TOTAL, with CUSTOMER and LINE, connectivities 1 and M) shown beside the ER model.]

  • Object-Oriented Database Model

    Advantages:

    • Semantic content is added.

    • Visual representation includes semantic content.

    • Inheritance promotes data integrity.

    Disadvantages:

    • Slow development of standards caused vendors to supply their own enhancements, thus eliminating a widely accepted standard.

    • It is a complex navigational system.

    • There is a steep learning curve.

    • High system overhead slows transactions.

  • Data Models: A Summary

    The evolution of DBMSs has always been driven by the search for new ways of modeling increasingly complex real-world data.

    In the evolution of data models, there are some common characteristics that data models must have in order to be widely accepted:

    • A data model must show some degree of conceptual simplicity without compromising the semantic completeness of the database.

    • A data model must represent the real world as closely as possible.

    • Representation of the real-world transformations (behavior) must be in compliance with the consistency and integrity characteristics of any data model.

  • Fig. 2.5. Evolution of Data Models

    [Figure: data models ordered by semantics in the data model, from least to most: hierarchical, network, relational, entity-relationship, semantic, object-oriented, extended relational.]

    Comments shown in the figure, in order:

    • Difficult to represent M:N relationships (hierarchical only)

    • Structural level dependence

    • No ad hoc queries (record-at-a-time access)

    • Access path predefined (navigational access)

    • Conceptual simplicity (structural independence)

    • Provides ad hoc queries (SQL)

    • Set-oriented access

    • Easy to understand (more semantics)

    • Limited to conceptual modeling

    • More semantics in data model

    • Support for complex objects

    • Inheritance (class hierarchy)

    • Behavior

    • Unstructured data (XML)

    • XML data exchanges

  • Data Abstraction

    One purpose of a database management system is to make it easier for users to access and modify the data stored in the system.

    A DBMS provides users with an abstracted view of the system, concealing certain “low-level” details of data storage and maintenance.

    Using levels of abstraction can be very helpful in integrating multiple (and sometimes conflicting) views of data as seen at different levels of an organization.

  • Levels of Data Abstraction

    External Model – the end users’ view of the data environment.

    End users – people who use the application programs to manipulate the data and generate information.

    A specific representation of an external view is known as an external schema.

  • Fig. 2.6. External Schema for a college

  • Notes on external schemas

    • Each external schema includes the appropriate entities, relationships, processes, and constraints imposed by the business unit.

    • Also note that although the application views are isolated from each other, each view shares a common entity with the other view.

    Example:

    the registration and scheduling external schemas share the entities CLASS and COURSE.
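
    In a relational DBMS, one way to approximate an external schema is with views: each business unit sees only its own subset of the shared tables. The sketch below uses SQLite views through Python's sqlite3 module. The view names, column names, and sample rows are assumptions. Only the shared CLASS and COURSE entities come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COURSE (
    CRS_CODE    TEXT PRIMARY KEY,
    CRS_TITLE   TEXT NOT NULL,
    CRS_CREDITS INTEGER NOT NULL
);
CREATE TABLE CLASS (
    CLASS_CODE INTEGER PRIMARY KEY,
    CRS_CODE   TEXT NOT NULL REFERENCES COURSE (CRS_CODE),
    CLASS_ROOM TEXT,
    CLASS_TIME TEXT
);
INSERT INTO COURSE VALUES ('DB101', 'Database Models', 3);
INSERT INTO CLASS VALUES (1, 'DB101', 'Room 204', 'MWF 09:00');

-- Registration sees which classes exist and what they are worth
CREATE VIEW REGISTRATION_VIEW AS
    SELECT CLASS_CODE, CRS_TITLE, CRS_CREDITS
    FROM CLASS JOIN COURSE ON CLASS.CRS_CODE = COURSE.CRS_CODE;

-- Scheduling sees where and when each class meets
CREATE VIEW SCHEDULING_VIEW AS
    SELECT CLASS_CODE, CLASS.CRS_CODE AS CRS_CODE, CLASS_ROOM, CLASS_TIME
    FROM CLASS JOIN COURSE ON CLASS.CRS_CODE = COURSE.CRS_CODE;
""")

registration_rows = conn.execute("SELECT * FROM REGISTRATION_VIEW").fetchall()
scheduling_rows = conn.execute("SELECT * FROM SCHEDULING_VIEW").fetchall()
```

    Both views draw on the same CLASS and COURSE tables, yet neither exposes the other's columns.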

  • The use of external views representing subsets of the database has some important advantages:

    1. It makes it easy to identify specific data required to support each business unit’s operations.

    2. It makes the designer’s job easy by providing feedback about the model’s adequacy. The model can be checked to ensure that it supports all processes as defined by their external models, as well as all operational requirements and constraints.

    3. It helps to ensure security constraints in the database design. Damaging an entire database is more difficult when each business unit works with only a subset of data.

    4. It makes application program development much simpler.

  • Levels of Data Abstraction

    Conceptual Model

    - represents a global view of the entire database as viewed by the entire organization.

    - The conceptual model integrates all external views (entities, relationships, constraints, and processes) into a single global view of the data in the enterprise.

    - Also known as a conceptual schema, it is the basis for the identification and high-level description of the main data objects.

  • Notes on Conceptual Models

    • The most widely used conceptual model is the ER model.

    • The ER model is illustrated with the help of the ERD, which is, in effect, the basic database blueprint.

    • The ERD is used to graphically represent the conceptual schema.

  • Fig. 2.6. Conceptual Model for a college

  • Advantages of Conceptual Models

    1. It provides a relatively easily understood bird’s-eye (macro level) view of the data environment.

    2. The conceptual model is independent of both software and hardware.

    • Software independence - the model does not depend on the DBMS software used to implement the model.

    • Hardware independence - the model does not depend on the hardware used in the implementation of the model.

    Changes in either the hardware or the DBMS software will have no effect on the database design at the conceptual level.

  • Levels of Data Abstraction

    Internal Model

    - the representation of the database as “seen” by the DBMS.

    - the internal model requires the designer to match the conceptual model’s characteristics and constraints to those of the selected implementation model.

    An internal schema depicts a specific representation of an internal model, using the database constructs supported by the chosen database.

  • Fig. 2.7 Conceptual vs. Internal Model

  • Notes on Internal Model

    • The development of a detailed internal model is especially important to database designers who work with hierarchical or network models because those models require very precise specification of data storage location and data access paths.

    • In contrast, the relational model requires less detail in its internal model because most RDBMSs handle data access path definition transparently.

    • Nevertheless, even relational database software usually requires data storage location specification, especially in a mainframe environment.

  • Notes on Internal Model

    • Because the internal model depends on specific database software, it is said to be software-dependent.

    A change in the DBMS software requires that the internal model be changed to fit the characteristics and requirements of the implementation database model.

    When you can change the internal model without affecting the conceptual model, you have logical independence.

    • However, the internal model is still hardware-independent because it is unaffected by the choice of the computer on which the software is installed.

    A change in storage devices or even a change in operating systemswill not affect the internal model.

  • Levels of Abstraction

    Physical Model

    - operates at the lowest level of abstraction, describing the way data are saved on storage media such as disks or tapes.

    - This is the “nuts-and-bolts” view of the system, where data structures and file formats used by the system are described.

    - This level is generally not visible to end users, and may be used only by the system programmers.

  • Fig. 2.8 Data Abstraction Levels

    [Figure: the end-user views map to external models, which are integrated into the conceptual model (designer’s view); below it are the internal model (DBMS view) and the physical model, with logical independence and physical independence marked between levels. The degree of abstraction runs from high through medium to low, with the ER, relational, object-oriented, network, and hierarchical models listed alongside. Characteristics, from high to low: hardware-independent and software-independent; hardware-independent and software-dependent; hardware-dependent and software-dependent.]

  • Table 2.1. Levels of Abstraction

    MODEL        Degree of Abstraction   Focus                                               Independent of
    External     High                    End-user views                                      Hardware/Software
    Conceptual                           Global view of data (database model-independent)    Hardware/Software
    Internal                             Specific database model                             Hardware
    Physical     Low                     Storage and access methods                          Neither

