PlanningDesigningYourDatabase DEL 20121214

    Some keystrokes and menu items are different on a Mac from those used in Windows and Linux. The table below

    gives some common substitutions for the instructions in this chapter. For a more detailed list, see the application

    Help.

    Windows or Linux                  Mac equivalent              Effect
    Tools > Options menu selection    LibreOffice > Preferences   Access setup options
    Right-click                       Control+click               Open a context menu
    Ctrl (Control)                    ⌘ (Command)                 Used with other keys
    F5                                Shift+⌘+F5                  Open the Navigator
    F11                               ⌘+T                         Open the Styles and Formatting window

    Contents

    Copyright
    Note for Mac users
    Introduction
    General information
    Specific information about this chapter
    Specific information about Base
    Java
    Memory cached tables
    Why some knowledge of database theory is necessary
    Problems when shutting down computer or a crash occurs
    Useful commands
    Goals: planning for the database
    Database descriptions
    Text databases
    Using text data sources with Base
    Flat database
    Using flat databases with Base
    Relational databases
    Using a relational database with Base
    Planning for a flat or relational database
    The plan: the outline
    Design based upon the plan
    Part 1: Purpose of our database and a design for it
    Part 2: Type of database and a design for it
    Database data and a structure for it
    Fields
    Field names
    Field types
    Field properties
    Tables
    Tables for list boxes
    Flat databases
    Relational databases
    First normal form
    Second normal form
    Third normal form
    Boyce-Codd normal form (BCNF)
    Relationships between tables
    Relationships
    View
    Forms
    Queries
    Principle #1: Determine the tables and fields we need
    Principle #2: Sorting the information in the query
    Principle #3: Determine the search conditions
    Principle #4: Set the type of output (list, calculations)
    Principle #5: Using Aliases to shorten or clarify headings
    Principle #6: Review the query design as to whether it meets our requirements or not
    Principle #7: Name and save the query
    Principle #8: Save the database file
    Reports

    Introduction

    General information

    The Base Guide describes what you need to know in order to use the Base component of LO effectively. It begins by

    describing a simple flat database in chapter 1. Then, beginning with chapter 2, we consider a more complex database

    type: relational databases. Some principles described from chapter 2 onwards also apply to flat databases.

    Chapter 2, Planning/Designing your Database, looks at the planning required when creating a database. This is one

    of the most important parts of the creation process: it is the outline to be used when the database is created.

    Chapter 3, Data Input and Removal, describes the beginning of the creation process. This chapter centers upon the

    parts of the database with which we input and remove data: tables and forms. Wizards used to create tables and

    forms are examined to reveal their principles and how to use them. We also examine the design dialogs used to

    create tables and forms. These dialogs give you much more control over the creation of tables and forms than the wizards do. These dialogs are also used to modify tables and forms.

    Chapter 4, Data Output, continues the creation process. This chapter centers upon the parts of the database that

    provide us with data output: the queries and reports. We describe the wizards used to create queries and reports.

    The query section also contains the explanation of the query design dialog. This dialog contains two parts: Design

    View and SQL View. The design view permits creation of more complex queries, and the SQL view permits an even

    more complex query using the SQL language.

    The report section shows how to use the report wizard and the report design view for creating reports. Use of the

    report design view (Report Builder dialog) to create or edit a report is discussed as well.

    Chapter 5, Exchanging Data, describes how to use Base with other components of LO: Calc, Impress, and Writer.

    Base can provide data for documents created by these components, and Base can input data from documents from

    other components of LO.

    Chapter 6, Customization of your Database, describes the customizing of the database design. This is a more detailed

    look at the design of the individual parts: tables, queries, forms, and reports.

    Chapter 7, More customizations, describes even more customization including macros, validation of data, generation

    of calculated data, working with multiple forms, and working with other LO documents.

    Chapter 8 discusses using Base at work. This includes advancing to an external server of database files, and usage of

    database files from other servers, for example: MySQL, Oracle, PostgreSQL, and MS Access.

    Specific information about this chapter

    How well a database works depends upon how well it is planned and designed. But the same can be said for creating

    a drawing (Draw), presentation (Impress), a spreadsheet (Calc), or text document (Writer). Help from others and

    one's own experiences provide information concerning how to do the planning and designing of any project.

    This chapter is designed to give you a background in how to do these things if you need it. If you already

    understand how to plan and design a database, you might not need to read this chapter. However, you might still
    find the content a useful refresher course.

    For those who are fairly new to databases, I recommend you follow the instructions of this chapter to plan a simple

    database. Then use the next two chapters to create it. Practice using the database for a while. Then plan and design a

    more complex database. This should give you even more confidence.

    Planning and designing a database begins with seeking as much information about the proposed database as

    possible. That means many, many questions need to be asked and answered. For more complex databases, the

    answers will often lead to more questions.

    This is only the beginning. Even as the database is constructed, surprises will occur. It is doubtful that any plan will

    be perfect when it is first created. Changes will have to be made in the plan. As the changes are made, the database

    data only requires a small fraction of that memory when the table is cached. This leads to a second advantage: when

    the database file is first opened, there is less writing of data to RAM, so it takes less time to open the file. And there is

    also a third advantage: the potential size of the database. You can work with databases whose size is hundreds of
    megabytes. That is not always possible if all of the database's data is written to memory.

    There is one drawback to cached tables: search time. When a query is run, the desired data has to be written into

    memory first. Only then will the query run. The HSQLDB User Guide contains some directions on how to design a

    query for quicker results. It still will not be as quick as when the data is already in memory before the query is

    started.

    Why some knowledge of database theory is necessary

    Base has a wizard with which you can create a new database file for your database. As long as you are only creating

    a simple embedded database, this wizard is fairly straightforward. However, if you want to connect to an external

    data source, some knowledge is necessary before you can accomplish that. This includes connecting to a

    spreadsheet.

    Base has a wizard for creating tables. It includes several suggested tables along with the suggested fields to put in the

    table. Then the wizard takes you through steps which determine what the field types and properties should be. You

    can accept the suggested field types in the wizard. But what if you think you might want to change a field's type
    in some way? Unless you know what you are doing, you would just be guessing, and you could make mistakes.
    The same thing can be said about selecting the field properties. And you will be asked to select a Primary

    Key. Do you know what this is, and why you need to have one?

    Base also gives you the option to create a table using Design View. Using it requires you to name your fields and
    define their field types and properties. The Primary Key also needs to be selected. Some knowledge of

    database theory helps greatly when using this option.

    Creating and working with Base databases is not hard if you know what you are doing. This is why we teach the

    basics of database theory as we describe what to do while creating and working with these databases.

    Note

    The basics of database theory are the same for
    different databases, but the application of the basics
    is usually different. If you have a background in
    these basics as applied in another database program,
    I advise you to look for the differences between
    how Base applies these basics and how the other
    database program does. It might save you some time and
    some headaches.

    Problems when shutting down computer or a crash occurs

    Caution

    When Base is using the HSQLDB engine, some
    special actions must be taken when Base is
    shut down incorrectly. This includes a computer
    crash or shutdown before Base is closed properly.

    This caution should have caught your attention. I intended to do exactly that! If you follow the directions we give,

    you will have a minimum of problems. If you do not follow them, you can have many more problems than you will

    want.

    The other components of LO have an AutoRecovery feature which you can configure in Tools > Options >
    Load/Save > General. For example, type something in Writer or enter data in Calc. If your computer crashes
    after the length of time set by AutoRecovery, the information you entered will be restored when LO has gone
    through its recovery process. Base does not use this feature.

    During the recovery process LO will try to recover databases, but in most cases the changes created during your last
    session cannot be recovered. The recovery process cannot recover opened forms and reports.

    Backup options under Tools > Options > Load/Save > General can create backup files in the backup folder;
    unfortunately, for a database you need to create backup files manually. This requires using Tools > SQL to type
    commands. They are mentioned in Useful commands.

    The sage advice of "save and save often" applies especially to Base when working with an embedded database. We

    explain the importance of doing this in Chapters 3 and 4 of the Base Guide: Data Input and Removal, and Data

    Output respectively.

    How often you save when using Base to connect to an external data source depends upon the source and what you

    are doing. When creating or modifying a table, query, or report, you should follow the advice given for the data source

    you are using.

    Base will prompt you to save your data under specific circumstances. When you close a form after entering or

    modifying data, this prompt appears unless the database has already saved this data.

    Data can also be added to or modified in a table opened in the Data Sources window. This data also needs to be

    saved. While there is a Save icon at the top of this window, it can be easy to forget to save. When you close the Data

    Sources window, you will be prompted to save the data unless it has already been saved.

    Tip

    Ways to save data

    Click the Save icon in the table or in the form you are working in.
    Move to another record of a simple form.
    Move to another record of the main form or subform of a complex form.

    Special care must be taken when saving any editing made to any table, query, form, or report. The same is the case

    when creating these. Saving these changes to the database requires two steps. First you must save your changes in the

    table, query, form, or report you are editing or creating. This writes what you have done to memory. Now these

    changes need to be written to the database file itself. To do so, click the Save icon in the Standard toolbar of the

    main database window. This window is labeled at the top. Its label contains the name of your database followed by

    LibreOffice Base.

    Any time you finish using a database, close it. If you have any data which has not been saved yet, you will be asked

    if you want to save the changes. You will be asked the same question if you have not yet saved changes made to the
    database structure. This means you may see two such dialog boxes in a row. In almost all circumstances, you will
    want to answer yes in both boxes. The only circumstance I can think of in which you would not want to answer yes
    is when you made a change that you do not want to keep and cannot reverse.

    This is why you need to make sure you save your work on a regular basis. But what do you do if your computer

    crashes, or the power goes off, or you shut down Base improperly, or someone shuts down the computer

    improperly? You will lose any data or changes to the database since the last time you saved them.

    If any of these problems happen while the database is being closed, you might lose the whole database. This is

    especially true if Base is in the process of writing to the database when the power goes off.

    Useful commands

    Chapter 9 of the HSQLDB user guide contains the commands HSQLDB uses and their syntax. Some of these are

    different from those used in Calc. For example, enter =NOW() into a Calc cell and you get today's date (date and
    time if the cell is formatted for both). If you want to use today's date in a query, you must use CURDATE(). This is
    found in the Built-in Functions and Stored Procedures section of chapter 9.

    The commands can be submitted directly to the database engine through the Tools > SQL... command window.

    There are two useful commands which you can use when you want to force changes to be written to the file. Use these
    commands when you have inserted a lot of data into tables and you want to avoid losing it.

    The first is CHECKPOINT; the second is SHUTDOWN.

    CHECKPOINT closes the database cleanly, then reopens it.

    When used with the DEFRAG option, CHECKPOINT also shrinks the data to its minimal size.

    SHUTDOWN closes the database open in Base; after that you need to close the ... .odb file and reopen it if you want
    to continue your work. I mention here only one option to this command; others will be discussed in Chapter 9.

    SHUTDOWN COMPACT rewrites the data and shrinks all files to the minimum size. This is a good time to create a

    backup copy of your file. Never create a backup copy of an open .odb file; it will not contain all the data.

    Goals: planning for the database

    A database is a collection of data. It might be a text file, a flat database, or a relational database. Whatever type it is,

    we must create a structure for the data before we can use any of it. This is a three-step process. First a plan needs to

    be made based upon the purpose of the database. Then a design is made based upon the plan. Finally, Base is used to

    create the database structure as it was designed.

    Begin with some general ideas of what you want to accomplish. Then break these goals down into smaller, more

    specific goals. These specific goals may need to be even further divided. You continue doing this until you are

    satisfied that the goals have become as specific as possible.

    The first decision to make is what type of database should be used. Then, beginning with the stated, written goals,

    determine the structure of that database. So we will begin by describing the three types of databases. Later we will

    divide our goals into more specific goals until we have the structure we need.

    Database descriptions

    We will discuss three types of databases: text, flat, and relational. Each type serves its own purposes.

    Text databases

    Text database files are exactly that: files that contain text only. They have a defined structure so that their information

    can be retrieved. These text files may be one of several different types. One common format is comma-separated
    values (CSV). Another common type is vCard. (The latter is used in some address books.) LDIF is another common

    format for an address book. (The vCard and LDIF formats are similar in structure.) Whether a given database engine

    can access the data in the file depends upon its ability to use the built-in structure in the given format.

    A variety of file extensions are used for text databases depending upon the format. For example CSV uses *.csv, and

    LDIF uses *.ldif. When creating a text file for data using a word processor, any file extension can be used as long as

    the file has a consistent structure that can be recognized by Base.

    This structure must contain field separators and text separators. If the file contains decimals, decimal separators must

    be present. (Although the database wizard includes a possible thousands separator, Base does not seem to recognize

    thousands separators.)

    Figure 1 is a sample CSV file showing its format. The first row contains the names of the fields for this data source,

    and the rest of the rows contain the data for these fields. All text is enclosed with double quotes, and the fields are

    separated by commas.

    Figure 1: Data source, CSV file

    When Base accesses the CSV file, it creates a table.

    Figure 2: Table for the CSV file
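
    The CSV layout described above (field names in the first row, text in double quotes, fields separated by commas) can be read into records with a few lines of code. The following is a minimal sketch using Python's standard csv module, not Base itself. The field names and all sample values except the name John Smith are assumptions made for illustration.

```python
import csv
import io

# Hypothetical CSV text in the format described above: the first row holds
# the field names, text is enclosed in double quotes, and fields are
# separated by commas. Only the name "John Smith" comes from the chapter;
# every other value is invented for illustration.
csv_text = (
    '"First Name","Last Name","Display Name","Nickname","Primary Email","Secondary Email"\n'
    '"John","Smith","John Smith","Johnny","john@example.com","jsmith@example.org"\n'
    '"Mary","Jones","Mary Jones","MJ","mary@example.com",""\n'
)

# DictReader takes the first row as field names and yields one dict per record.
reader = csv.DictReader(io.StringIO(csv_text))
field_names = reader.fieldnames
records = list(reader)

print(field_names)
for record in records:
    print(record["Display Name"], "-", record["Primary Email"])
```

    Each dictionary plays the role of one row of the table, and each key plays the role of one field.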

    Using text data sources with Base

    Base can recognize and access many text data source files. Maintaining these data source files requires using a

    separate program to enter, remove, or modify the text in the file or files. Base will create the queries and reports from

    the text files but will not modify the file by adding, changing, or removing text.

    Normally text data source files should be created and modified using the same program. This will limit the potential

    for errors. Figure 1 is an example of this. This file could be modified using a word processor, but you might then

    misplace the commas. This type of error might make the data unusable. However, using spreadsheet programs such

    as LO Calc, you can make your modifications and then save the file using the CSV format. Then all the commas will

    be placed right where they need to be.
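
    The same point, that a program places the separators more reliably than hand editing does, can be sketched in code. This is only an illustration using Python's standard csv module in place of a spreadsheet program. The records are hypothetical, and one value contains a comma on purpose.

```python
import csv
import io

# Hypothetical records; the last display name contains a comma on purpose.
rows = [
    ["First Name", "Last Name", "Display Name"],
    ["John", "Smith", "John Smith"],
    ["Mary", "Jones", "Jones, Mary"],
]

# QUOTE_ALL encloses every field in double quotes, so a comma inside a
# value cannot be mistaken for a field separator.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)
csv_output = buffer.getvalue()
print(csv_output)

# Reading the text back recovers exactly the same fields.
round_trip = list(csv.reader(io.StringIO(csv_output)))
```

    Because the writer adds the quotes and commas itself, the embedded comma survives the round trip as part of a single field.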

    A text data source (database) can contain more than one text file with the same file structure. This might be several

    address books using the CSV format: one address book for general personal use, another for business use, and

    another for a social organization. As long as they are all placed in the same folder, Base will access all the files with

    the format you tell Base to use.

    If you want to have more than one text database, you need to have separate folders for each one. Otherwise, you

    may find yourself accessing text files from more than one database, especially if the two text databases use the same

    file format.

    Caution

    When modifying a text data source file, I suggest you
    use the program used to create the data source files.
    Otherwise, you can make errors that will corrupt the data you
    see when accessing the file using Base.

    Flat database

    A flat database contains data just as a text data source does, but the flat database is structured in a different way.

    Figures 1 and 2 show the differences. The data is organized into one or more tables. (This example has only one

    table shown in Figure 2.) Then the data is further organized into columns with each column containing the same

    kind of data. (The first column contains only the first names of people, the second column contains only the last

    names of people, the third column contains display names, the fourth column contains only nicknames of people,

    the fifth column contains the primary email addresses, and the sixth column contains the secondary email addresses.)
    These columns are referred to as fields.

    The data is even further organized by the rows. All the data of a row refers to a specific person, place, or thing.

    Together they describe the specific person, place, or thing. (The data in the first row refers to John Smith.) These rows are

    referred to as records.

    Tables in a flat database are independent of each other. You work with one table at a time. Think of a flat database

    with more than one table as being a collection of tables that together serve a general purpose. Each table provides a

    part of that purpose. Each table has its own queries and reports that gather information for that one table. Each table

    can have its own form.

    Computer address books are flat databases. You can divide your addresses into groups (tables) according to your

    needs. You can have the same address in different groups if they meet the requirements for more than one group.

    One of these may contain only the person's name and business email; another may contain family information,

    personal email, and social network information.

    Consider a library database that includes book titles, authors' names, information on each author, and publication
    information. When the library contains two or more books by the same author, the author's name is duplicated, as
    is the author information.
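
    The duplication in the library example can be made concrete with a small sketch. The titles, author names, and author information below are hypothetical. The code only illustrates the idea of storing each author once and is not a description of how Base stores data.

```python
# Hypothetical flat table: every book row repeats the author's details.
flat_books = [
    {"title": "Book A", "author": "Pat Example", "author_info": "Born 1950"},
    {"title": "Book B", "author": "Pat Example", "author_info": "Born 1950"},
    {"title": "Book C", "author": "Lee Sample", "author_info": "Born 1972"},
]

# Splitting the data stores each author once and lets each book refer to
# the author by an id instead of repeating the name and information.
authors = {}  # author name -> {"id": ..., "author_info": ...}
books = []    # each book keeps only its title and the author's id
for row in flat_books:
    entry = authors.setdefault(
        row["author"], {"id": len(authors), "author_info": row["author_info"]}
    )
    books.append({"title": row["title"], "author_id": entry["id"]})

duplicated = len(flat_books) - len(authors)
print(f"{duplicated} duplicated author entry removed")
```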

    As an example of a database requiring a large number of fields, consider a family database. Some of the fields are
    the family member's first and last name (total of 2 fields). Additional fields are the spouse's first and last name. But
    then we would need the first and last names for all the spouses. One of my siblings has been married three times (six
    fields). So the total number of fields has increased to 8. Then we need the first and last names of the children. Well,
    one of my siblings has five children (10 fields). So the total number of fields has gone to 18. A relational database
    providing the same information requires only 11 fields. And with these 11 fields, a relational database will also
    identify the stepchildren and to which parent they belonged before the marriage. Even more fields would be needed
    to do this with a flat database.

    Note

    Base and Calc can create flat databases. Specifically, Calc

    spreadsheets can be saved in dBase format. To save more

    than one sheet in a spreadsheet, you have to access each

    sheet and save it one sheet at a time. (If you have 3 sheets,

    you have to save three times.) For more information on doing

    this, see Chapter 5 of the Base Guide.

    Using flat databases with Base

    There is nothing really complex about planning for a flat database. You first determine the data to be used in the

    database and what characteristics the data will have. The types of data and their characteristics will determine the

    number of fields and the characteristics of each field. (Each field should have one characteristic that distinguishes it

    from all the other fields.) You then name each field and define the characteristics for each field.

    Determine if you will need only one or more than one table. Since tables in a flat database are independent from

    each other, divide the fields into groups which have no defined relationships with the other groups.

    Will the database be contained in one file (as in those databases created with Base) or will it be contained in two or

    more files as in dBase or CSV files? If the latter, then all the files need to be placed in the same folder.

    Some database programs will create only flat databases. Most can create relational databases as well as flat ones. The

    latter can relate data held in separate tables, while the former cannot. But unless the program

    defines one or more relationships between data contained in separate tables of a database, it will be a flat one.

    Relational databases

    Like a flat database, a relational database contains one or more tables. Relational databases were originally designed to solve the

    problems of duplication of data and multiplication of fields in a flat database.

    A relational database has a more complex structure than does a flat one. Below are some of the general principles

    that have been found to produce good design in relational databases.

    Each table of the database must have one or more fields that uniquely identify each row of the table. For a

    given value in one field or given sequence of values in two or more fields, there can only be one row

    containing this value or sequence of values.

    The tables of a database must meet specific criteria known as first, second, third, and Boyce-Codd normal form. (These are

    discussed in the design phase. See First Normal Form [reference will be placed here.])
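
    The first principle above (one or more fields must uniquely identify each row) can be sketched in code. This is a minimal illustration, not how Base itself works. It uses SQLite through Python's standard sqlite3 module as a stand-in engine, and the table name person and its columns are hypothetical.

```python
import sqlite3

# SQLite (from Python's standard library) stands in here for the engine
# behind Base; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " first_name TEXT NOT NULL,"
    " last_name TEXT NOT NULL,"
    " PRIMARY KEY (first_name, last_name))"  # a key made of two fields
)
conn.execute("INSERT INTO person VALUES ('John', 'Smith')")
conn.execute("INSERT INTO person VALUES ('John', 'Jones')")  # allowed: the pair differs

# A second row with the same sequence of key values is rejected.
try:
    conn.execute("INSERT INTO person VALUES ('John', 'Smith')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(duplicate_rejected, row_count)
```

    The value John may appear in more than one row because the key is the pair of fields. Only a repeated pair is refused.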

    Tip

    value. A cell with a null value has no entry at all in

    it. Anything entered into a cell is considered to be a

    value. A cell is considered to be null until a value is

    entered into it. Similarly, a cell that has had its

    value deleted has gone from being not null to having

    the null value.

    2) The field or fields that uniquely determine a

    row have a name: primary key. This term is defined and explained in chapter 3 of the Base
    Guide, "Data Input and Removal."

    Relationships are defined by linking one or more fields of one table with one or more fields of another table.

    The link is made by virtue of the fact that the two fields have identical values. The field or fields from one of

    the tables will be the primary key for that table. The field or fields from the second table, which may or may not
    include that table's primary key, are called foreign keys. Since sometimes we may want to use the

    primary keys of two tables to define our relationship, it is possible that a primary key may also be a foreign

    key.

    Tip

    Relationships in a relational database are defined by

    primary-foreign key pairs.

    Example of a relational database and the constraints required of it

    We have family data that we want to put into a relational database. It begins with one table: the Family member table. It contains the family member's first and last names, the spouse's first and last names, and the child's first and last names. A sample table is shown below. (This table does not have a primary key yet.)

    Table 1: Family member table

    FM First name   FM Last name   S First name   S Last name   C First name   C Last name
    Sam             Silver         Martha         Silver        Bob            Silver
    Wendy           Whitter        Dave           Whitter       Alice          Whitter

    Next we complicate things because Sam has since divorced Martha and married Wilma. Sam and Wilma now have one daughter, Sandra. (Martha has kept her married name.)

    Rather than just adding two more fields for the new spouse (her first and last name), we will create a second table containing the first and last names of the spouses as well as the child's first and last names. Then we will add the fields needed to link these two tables. By doing so, we will be able to associate each family member with all of the spouses they have had.

    The new table will be named Spouse for clarity. To identify each row of the Family member table, we have added the field Family member ID (the primary key) to the Family member table and removed the four fields referring to the spouse and child. The Family member ID field contains distinct values so we can identify each family member by the value of the Family member ID field.

    The Spouse table contains the Spouse ID field (the primary key), the four fields removed from the Family member

    table, and the Family member ID field (foreign key). This last field is used to link the Spouse table to the Family

    member table. The first two rows of the Spouse table are both linked with the first row of the Family member table.

    For these rows, the value of the Family member ID field is the same in both tables: 0. At the same time, individual spouses are identified by the Spouse ID field. Using the Family member ID and Spouse ID fields together, we can determine who has been married to whom and what children belong to each one of these marriages. (See Tables 2, 3, and 4.)

    With the design we have now, we can list as many spouses for any given family member as we need. We do not

    have to add any fields. Furthermore, we could add one or more fields such as the wedding date and date of birth.

    Table 2: Modified Family member table

    FM First name   FM Last name   Family member ID
    Sam             Silver         0
    Wendy           Whitter        1

    Table 3: Spouse table

    Spouse ID   S First name   S Last name   C First name   C Last name   Family member ID
    0           Martha         Silver        Bob            Silver        0
    1           Wilma          Silver        Sandra         Silver        0
    2           Dave           Whitter       Alice          Whitter       1

    Wendy Whitter has now had a second child, David, so we need to apply the same principles to the Spouse table that we did to the Family member table. We will create a Child table containing the child's first and last names. We also need a field to identify each of the children listed. I have used Child ID (the primary key). The Child table also needs a field, Spouse ID (foreign key), to link the Child table to the Spouse table.

    Table 4: Modified Spouse table

    Spouse ID   S First name   S Last name   Family member ID
    0           Martha         Silver        0
    1           Wilma          Silver        0
    2           Dave           Whitter       1

    Table 5: Child table

    Child ID   C First name   C Last name   Spouse ID
    0          Bob            Silver        0
    1          Sandra         Silver        1
    2          Alice          Whitter       2
    3          David          Whitter       2

    We have three tables in our database with 11 fields. With this structure, we can list all the family members, their spouses, and their children regardless of how many times any of the family members have married, how many children were born in these

    marriages, and how many step-children there may be.

    With a flat database, it would take 10 fields to list the family members, their one spouse, and one child. For each

    additional spouse a family member may have, 2 more fields must be added. For each additional child, 2 more fields

    must be added.

    Figure 3 shows the family member, Sam Silver, with his two spouses, Martha and Wilma, and one child, Bob, who belongs to Sam and Martha. (A green arrow is on the left end of Martha's row.) Figure 4 shows that Sam has been married twice and that he and Wilma have had a child, Sandra, during their marriage. (The green arrow is on the left end of Wilma's row.)

    Figure 3: Showing the child from one spouse for a family member

    Figure 4: Showing a child from another spouse from the same family member

    Figure 5 shows that Wendy has been married once and that two children have resulted from that marriage.

    Figure 5: Showing two children from one marriage

    Defined relationships in a relational database serve another purpose. They permit queries to be created using the

    tables that have these relationships. So with this example, we can have a query that will use all three tables since all

    three tables are related. This query would use the First name and Last name fields of the Family member table, the First name and Last name fields of the Spouse table, and the First name and Last name fields of the Child table. By specifying the values for the names from the Family member and Spouse tables, the query will tell us the names of the children who belong to this specific couple. This works only because of these two relationships: the Family member ID field pair and the Spouse ID field pair.
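    The query described above can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module instead of Base, the table and column names are hypothetical lowercase versions of those in Tables 2, 4, and 5, and the helper function children_of is an assumption made for the example. The data rows are the ones listed in those tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE family_member (
        family_member_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE spouse (
        spouse_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
        family_member_id INTEGER REFERENCES family_member(family_member_id));
    CREATE TABLE child (
        child_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
        spouse_id INTEGER REFERENCES spouse(spouse_id));

    INSERT INTO family_member VALUES (0, 'Sam', 'Silver'), (1, 'Wendy', 'Whitter');
    INSERT INTO spouse VALUES (0, 'Martha', 'Silver', 0), (1, 'Wilma', 'Silver', 0),
                              (2, 'Dave', 'Whitter', 1);
    INSERT INTO child VALUES (0, 'Bob', 'Silver', 0), (1, 'Sandra', 'Silver', 1),
                             (2, 'Alice', 'Whitter', 2), (3, 'David', 'Whitter', 2);
""")


def children_of(member_first_name, spouse_first_name):
    """List the children of one couple by following both key pairs."""
    return conn.execute("""
        SELECT c.first_name, c.last_name
        FROM family_member AS fm
        JOIN spouse AS s ON s.family_member_id = fm.family_member_id  -- first pair
        JOIN child  AS c ON c.spouse_id = s.spouse_id                 -- second pair
        WHERE fm.first_name = ? AND s.first_name = ?
        ORDER BY c.child_id
    """, (member_first_name, spouse_first_name)).fetchall()


sam_and_wilma = children_of("Sam", "Wilma")
wendy_and_dave = children_of("Wendy", "Dave")
print(sam_and_wilma)
print(wendy_and_dave)
```

    For brevity the sketch matches on first names only. A real query would normally match on last names as well, or on the ID fields.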

    Reports are based upon one table, view, or query, but in relational databases, a query can contain fields from more

    than one table. In the above example, a report can be generated containing fields from all three tables: Family member, Spouse, and Child. This is because the report can be based upon a query that contains the three tables.

    Relational databases are the most common type of databases used today.

    Using a relational database with Base

    Planning for a relational database is very similar to planning for a flat database. You are still determining the tables,

    fields, and the tables to which the fields belong. The main difference comes from the additional restrictions that a

    relational database must have. They are necessary to define the relationships which exist among the fields of the

    database.


    Planning for a flat or relational database

    Tip

    Many of the principles taught here apply to both types of databases. So, rather than mentioning both all of the time, we will discuss flat databases. Then we will discuss the additional things that apply to a relational one.

    Regardless of what type of database will be used, the planning for creating the database begins at the same point: you have some data that you want to use to provide you with some information. So, you begin with the question: "What do I want the database to do?" The answer to this question is the general goal for the database.

    Planning for your database requires you to go from this general goal to more specific goals that will, when

    completed, accomplish the general goal. In many cases, you will need to divide the specific goals into even more

    specific goals. And you should continue this process until the specific goals cannot be further divided. When you

    have finished your planning, you have an outline of what the database should do and what its structure should be.

    This planning process involves asking and answering questions based upon what you already have. This is how you

    divide your goals into smaller parts. This is true whether you are creating the plan by yourself or someone works

    with you creating the plan.

    Make sure you take enough time to ask and answer the questions thoroughly, or the plan might not be as complete

    as it could have been. You might not notice this until later in the creation of the database. If this happens, you will need to come back to your plan to make changes and also change the things you have done since creating the

    original plan. It is common for people not to ask enough questions to make their plan complete. (When I have had

    to make changes when creating a database it was often because I did not even think of some possibilities.)

    Your plan needs to be in outline form. This permits you to list your main goal in general terms. These general terms

    will be enough for the top level of the outline. As you divide each of these general terms, you create a second level.

    Further division gives you additional levels of the outline.

    You should create your plan using a word processor (such as Writer). You can review the higher and lower levels of

    the outline to see the logic you have been using. This also permits you to add lines anywhere in the outline where they are needed at any time, even long after the planning is done if the plan needs to be updated.

    The outline of the plan can become rather complex because you will need to use several levels to get to the specifics.

    If you are comfortable working with an outline containing six or more levels, then use the complete outline.

    However, for most people, I suggest you break the outline into smaller parts. It should make your task much easier.

    The outline can be divided into three parts: the purposes for the database, the type of database you will use, and the data to be contained in the database. The first two parts are not very complex. However, describing the data to be

    used can produce a very complex outline. In this part, you will be describing the properties of the data in detail.

    The plan: the outline

    Note

    1) The outline for the database plan will contain comments inline. Each of the comments will precede the portion of the outline to which the comments apply.

    2) Each part will be fully developed to the lowest level of the outline before the next part is begun.

    The top level of the outline contains the most important question: the purpose of the database. The answer may contain only one item, or it may contain more than one (in which case you need a second level). On this level, you ask what you want each of these items to do. The individual answers to these questions may contain one item or a

    list. If an answer contains only one item, make sure the answer is detailed. If an answer contains a list, you go to a

    third level using the same basic question as on the second level.

    There is a pattern contained in the above paragraph. It is a pattern that you continue until all the lines of questions

    end in single item answers. I have listed only two levels here and listed three levels in the outline shown below.

    Some database plans require only one level, some require two, and others require three or more.

    The answer to Question 2 in Part 1 should be very general in nature since Part 3 asks you to describe the data in very specific terms. For example, data in a financial database will have to contain amounts, accounts, descriptions for transactions, dates, and perhaps more. A library database will contain information about the books, information about the people borrowing the books, and information about loans of books.

    The answers to Question 3 in Part 1 should also be general in nature since Part 3 also asks you to describe the queries and reports in specific terms including what fields will be used. The answers can be in the form of "I want the database to tell me ..." Your answers to Question 3 will fill in the ellipsis (...).

    Tip

    Remember that your database plan needs to answer questions that begin with the word "What." Your database design answers questions beginning with the word "How." While developing your plan, it is easy to drift from asking "What" to asking "How."

    Part 1: Purpose of the database

    1) What is the purpose of the database?
        a) What do you want (item in the list) to do?
            i) What do you want (item in the list) to do?
                • What do you want (item in the list) to do?
    2) What general types of data will be in the database?
    3) What information do you want from the database? (queries and reports)
        a) What information do you want to get from queries?
        b) What information do you want to get from printed reports?

    Part 2: Type of database

    Base works with two types of data sources: external ones that it accesses, and embedded databases that it has created.

    Since text and spreadsheet data sources are somewhat different from others accessed by Base, we will consider them

    separately from the others. (The Base wizard contains the list of data sources that Base can access as a drop-down list

    on page 1.)

    Question 1 contains mutually exclusive choices so you only have to answer the questions in the appropriate section.

    Question 2 needs to be answered by people who use a network or are concerned with computer security.

    1) What type of database file or files will you be using?
        a) Accessed database
            i) For text or spreadsheet data sources:
                • What program will you use to create, modify, or delete fields and tables from your data source?
                • Should a folder be created for the data source file or files? (A text data source will need to have a special folder.)
            ii) Other accessed databases:
                • What type of database file or files will you be accessing? (For example: Access, MySQL, PostgreSQL, or Oracle)
                • Should a folder be created for the data source file or files?
        b) Embedded databases
            i) Will you be using a flat or relational database?
            ii) Do you have a preference as to where to place the database file?
    2) Will you be accessing the database file or files on your computer or over a network?
        a) Who should have access to the database?
        b) Will you want to have user names and passwords to control people's access?

    Part 3: Data to be used in the database

    This part of the outline is very important. This is where you define the attributes (characteristics) of your data. Each

    attribute will be a field for the database. Once you have defined the attributes, you can separate them into tables

    based on common relations that exist among the fields. If you are creating a relational database, the tables you create

    in your plan will be modified and additional tables added as you design it based upon the plan outline you are

    creating now.

    Part 3a: Defining the attributes of the data


    Tip

    in the Table Design dialog. (These choices are discussed in more detail in the "Creating the table" section of Chapter 3 of the Base Guide.)

    1) What data will you be using?
    2) What names will you give to your data, the names of the fields in the database?
    3) What are the characteristics of this data?
        a) Name the characteristic for each field. (Choices: tiny integer, big integer, image, binary (variable or fixed), memo, fixed length text, number, decimal, integer, small integer, floating decimal, real number, double accuracy, variable length text, variable length text (ignore case), Yes/No, date, time, date/time, other)
        b) What are the additional properties for each field?
            i) Will you require that an entry always be made for the field?
            ii) What is the length of the field? (The maximum number of characters that any entry for the given field can have.)
            iii) Do you want a default value for this field that will be entered for each record in the table? If so, what is it?
            iv) Is there a specific format that you want to use for this field? (For example, you can specify currency format when selecting number or decimal as the field type, the date format, time format, or the date stamp format.)

    Part 3b: Planning the tables

    Another thing that may determine some of the tables we will need: some fields will have a limited number of distinct

    values. For each such field, we will create a table with a single field containing these distinct values. These will be

    used to provide drop-down lists for these fields in the form(s) containing them. (We will use list boxes for the drop-

    down lists.)

    For example: A field for the days of a week. This has seven distinct values. Or, consider a field for the months of the

    year. This will have 12 or 13 values. (Some non-Gregorian calendars have a leap month.)

    It is possible that our initial list contains some "fields" that actually are not fields at all, but alternative values of a

    single field. When this happens, list this field with the others. Name the table to be created for the field, and list the

    distinct values with the name of the table.

    For example: our budget database. The budget categories may seem like fields, but they are only distinct values for

    specific budget levels. We will develop the plan for this database as our example after we define what a plan should

    be.

    The next step is to look at the fields to see if there are any relationships existing between them. We will be separating

    the fields into tables based upon these relationships.

    Note

    A household inventory database might be a flat

    database with one table. This permits queries to

    provide detailed information needed for insurance

    purposes and the value of the inventory. It might

    also be a flat database with one table for each room

    of the house. This permits queries or views that

    display what is contained in each room. This might

    be very helpful for a moving company. (A view

    could also be created for each room displaying the

    same information from a single table.)
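    The single-table alternative mentioned in the note can be sketched as follows. This is a hedged illustration using Python's built-in sqlite3 module rather than Base. The inventory table, its columns, the view names, and the sample rows are all hypothetical. It shows one table serving both purposes: a total for insurance, and one view per room.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One flat table for the whole household (hypothetical sample data).
    CREATE TABLE inventory (
        item_id   INTEGER PRIMARY KEY,
        item      TEXT,
        room      TEXT,
        est_value REAL);
    INSERT INTO inventory VALUES
        (0, 'Sofa',       'Living room', 800.0),
        (1, 'Television', 'Living room', 450.0),
        (2, 'Bed',        'Bedroom',     600.0);

    -- One view per room, each reading the same single table.
    CREATE VIEW living_room AS
        SELECT item, est_value FROM inventory WHERE room = 'Living room';
    CREATE VIEW bedroom AS
        SELECT item, est_value FROM inventory WHERE room = 'Bedroom';
""")

# What a moving company might want: the contents of one room.
living_room_items = [row[0] for row in
                     conn.execute("SELECT item FROM living_room ORDER BY item")]

# What an insurance agent might want: the value of everything.
total_value = conn.execute("SELECT SUM(est_value) FROM inventory").fetchone()[0]

print(living_room_items, total_value)
```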

    1) What fields have a limited number of distinct values?
        a) Name the table for each of these fields.
        b) For each field, list the distinct values it has. For each field, list the field type and properties.
    2) What fields are really distinct values for a specific field?
        a) Name a table for each set of distinct values.
        b) Name a field for each set (if it does not already exist) and list it with the fields of the database.
    3) What are the relations that exist among the fields?
    4) Do these relations agree with the purposes of our database?
        a) Only if the tables suggested by the relations agree with the stated purposes of our database should we use these tables.


    Part 1 of the plan usually provides little specific information that can be used in the design phase, but it does provide

    the general structure needed. It therefore places limits upon the other two parts as we work through them.

    For example: the flat database described in Chapter 1 of the Base Guide, "Introducing Base." Part 1 of that plan requires that the database provide information that one's insurance agent could use to determine what the insurance policy should be for the contents of the house. This limits the number of tables to just one.

    Note

    Designing a database is an art. This means that more than one design may accomplish the goals listed in the database plan.

    Tip

    While designing a database, you should have a

    separate copy of your plan that you can view. This

    can be one or more sheets of paper or a text

    document to be viewed on a monitor. If you are

    designing it on a computer, you could have your

    plan and design for your database on your monitor

    at the same time (if it is big enough).

    Part 1: Purpose of our database and a design for it

    Part 1 provides the structure for the design's outline. So keep this part of the design in general terms. The more

    specific parts of the design will be filled in in Parts 2 and 3.

    Purpose of the database

    The plan has a list of things to be obtained from the database. These will add structure to it as well as place

    restrictions upon it. Design the general structure based on the answers to question 1 of Part 1 of the plan. Also design

    the restrictions the same way.

    1) What structure is required to accomplish each item listed in the answers to question 1a (including lower

    levels) of Part 1 of the plan?

    2) What restrictions are required because of these answers?

    General types of data

    These are described in greater detail in Part 3 of the design. In the plan, they served as an outline for all the data to

    be used. Part 3 of the plan used this outline to develop all of the data needed. We will not have to include them in

    Part 1 of the design.

    General description of the information desired from the database

    The general outline of the queries and reports is done here. In Part 3, these outlines are filled in based on the tables

    created from the data.

    Information from queries

    This is the time to list the query or queries to be created, if any. Then we need to give each one of them some basic

    structure: detailed or summary query. We should also list the general restrictions that apply to each query. Usually,

    these restrictions will be in WHERE clauses, but summary queries can use the GROUP BY clause. A few of them use

    the HAVING clause with GROUP BY. (This might not be obvious until you reach Part 3 of the design.)

    1) Name the queries to be created.
    2) Detailed queries: What restrictions are required using the WHERE clause?
    3) Summary queries:
        a) What restrictions require the WHERE clause?
        b) What restrictions require the GROUP BY clause? Will the HAVING clause be required as well?
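    The difference between a detailed query and a summary query can be sketched as follows. This is an illustration only, run through Python's built-in sqlite3 module rather than Base. The expense table, its columns, and its rows are hypothetical sample data for a budget-style database. It shows WHERE restricting individual rows, GROUP BY producing one row per category, and HAVING restricting the groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE expense (
        expense_id INTEGER PRIMARY KEY,
        category   TEXT,
        amount     REAL,
        spent_on   TEXT);   -- ISO dates stored as text
    INSERT INTO expense VALUES
        (0, 'Food',  40.0, '2012-11-03'),
        (1, 'Food',  60.0, '2012-12-01'),
        (2, 'Fuel',  30.0, '2012-12-02'),
        (3, 'Rent', 500.0, '2012-12-01');
""")

# Detailed query: WHERE restricts which individual rows are returned.
december_rows = conn.execute(
    "SELECT category, amount FROM expense "
    "WHERE spent_on >= '2012-12-01' ORDER BY expense_id"
).fetchall()

# Summary query: GROUP BY collapses the rows into one total per category.
totals = conn.execute(
    "SELECT category, SUM(amount) FROM expense "
    "GROUP BY category ORDER BY category"
).fetchall()

# HAVING restricts the groups themselves, after the totals are computed.
large_totals = conn.execute(
    "SELECT category, SUM(amount) FROM expense "
    "GROUP BY category HAVING SUM(amount) >= 100 ORDER BY category"
).fetchall()

print(december_rows)
print(totals)
print(large_totals)
```

    A restriction on a single record belongs in WHERE. A restriction on a computed total can only be expressed with HAVING, which is why it may not become obvious until the fields are known.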

    Information from reports

    We should look carefully at the report descriptions in the plan. If there is not a query or table upon which the report

    can be created, a query or view must be created before the report can exist.

    For all the reports that have tables or queries from which to get the data, we need to determine what type of report it

    should be: static, or dynamic. You also have the choice of using the Data window (F4) to insert data into a Writer

    document or Calc spreadsheet. For example, the report could take the form of a letter describing the status of

    investments for the previous year.

    Now that LibreOffice also has the Report Builder extension installed, more elaborate reports can be created with it than with the report wizard. These reports can take the form of a text document or spreadsheet.

    1) How will each report be generated? (Writer document, Calc spreadsheet, report wizard, or Report Builder?)
        • Include a brief outline of the desired report.
        • Report Builder: Will it be a spreadsheet or text document?
    2) What query, table, or view will the report use? (It can only use one.)
        a) If no existing one will do, what is the basic structure of the needed query or view?
        b) Name the query, table, or view if there is one.
    3) Answer for each report: will it be static or dynamic?

    Part 2: Type of database and a design for it

    The center of this part is the LO database document file (extension .odb). We may want the database to be embedded

    within this file. Or we may want to use the file to connect to a database (an accessed database).

    This raises even more questions to consider in the design. The type of database must be considered as well as the

    location of the database when it is an accessed database. This is especially true when a local area network is involved.

    Security also plays a part in the design. Base does not encrypt its file. But it can access databases which require a

    password. If on a network, access can be controlled by use of passwords and access rights. All of these become a

    part of the design.

    Tip

    When LO is on a network, the security issues belong to the IT people. They will tell you what you need to know when designing your database.

    We must consider the input and output of the database as well. With embedded databases, LO can create both data

    input (tables and forms) and data output (queries and reports). With external databases, LO cannot always add,

    modify, or delete data. It can create data output. Text databases and spreadsheets are the two examples for which LO

    cannot input or modify data. Neither can LO create a view for either. So, if you will be using either of these, you

    must decide what program you will use to input data into the database.

    1) Will the database be embedded within the database document file, or will this file be used to access the database?
    2) If the database is a text or spreadsheet, what will you use to add, modify, or remove data?
    3) For other accessed databases:
        a) How do you access it?
            i) What database driver is needed?
            ii) What settings are required for the database driver?
        b) Where will the database be located? (same computer, network server)
            i) User and password required?
            ii) What is the path to the database on the network?
                • What are the access rights to the database?

    Database data and a structure for it

    Part 1 of the design contains the general outline of the desired database while Part 2 of the design contains the outline of its type, its creation, and what is required to use it. Part 3 of the design describes it in very specific terms.

    This includes defining how data will be added, modified, or removed (data input and removal). Tables, views, and

    forms for it are defined where needed. And if a relational database will be used, the relationships between tables are

    defined.

    Part 3 also includes how the data will be used (data output). Queries and reports are defined including what fields and their tables are used, what functions are needed (if any), what restrictions are required, and how the output data

    should be sorted.

    We begin with Part 3 of the plan. The latter lists the specific fields and their possible tables for the database. Then we look at

    how we can use what we have, and we begin to ask questions again. When planning it, we asked ourselves what we wanted.

    Now we ask another question: how do we do it?

    Fields

    It may seem as if we have all the information we need about our fields in Part 3 of the plan section. But we must now

    take the same information and look at it from a different perspective. For example, some of the fields may not really


    be fields but only field values. (This will be more obvious when we begin our design of the tables.)

    We need to create one or more three-column lists to contain our fields. The first column contains the field names, the

    second column the field types, and the third column the field properties. Thus each row contains a field name, its

    field type, and its field properties.
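    One way to keep such a three-column list is sketched below. This is only an illustration: the FieldSpec class and the as_three_columns helper are hypothetical, the field names come from Table 2, and the lengths shown in the properties column are assumed values chosen for the example. A word processor table or a spreadsheet serves the same purpose.

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    """One row of the list: field name, field type, field properties."""
    name: str
    field_type: str
    properties: str


# Proposed Family member table (lengths are assumptions for illustration).
family_member_fields = [
    FieldSpec("Family member ID", "Integer [INTEGER]", "Primary key; AutoValue: Yes"),
    FieldSpec("FM First name", "Text [VARCHAR]", "Length: 50"),
    FieldSpec("FM Last name", "Text [VARCHAR]", "Length: 50"),
]


def as_three_columns(fields):
    """Format the fields as aligned text lines, one field per line."""
    rows = [(f.name, f.field_type, f.properties) for f in fields]
    # Each column is as wide as its longest entry.
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    return ["  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
            for row in rows]


listing = as_three_columns(family_member_fields)
print("\n".join(listing))
```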

    In the plan, we have already divided the fields into proposed tables for it. We should now create a list of fields for

    each proposed table so that we can place fields that share a relationship in the same table. It is also easier to spot

    errors or missing fields within a smaller list per table than might be possible when we put all of the fields into a single

    large list.

    What fields refer to the same thing?

    Field names

    We have already defined and named our fields when we created our plan. At this point we should review our field

    names for their usefulness.

    Do we want to change any of the field names?

    Do we want to add or remove any fields?

    Field types

    As with the field names, we need to review the field types we have given to our fields. We might need to make some

    changes. If we added a field when we reviewed the field names, we also need to assign a field type to the new field.

    Have all of our fields been given an appropriate field type? If not, correct the field type.

    Field properties

    These are divided into four parts: AutoValue and Auto-increment statement, or Entry required; Length, or Length and Decimal; Default value; and Format example.

    AutoValue and Auto-increment statement

    The AutoValue and Auto-increment statement properties apply only to the primary key of a table. The AutoValue

    determines whether this field's values are automatically generated by the table or not. The Auto-increment statement

    determines the value of the increments. (This statement is written in SQL.)
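    The effect of an automatically generated primary key can be sketched as follows. This is an illustration under stated assumptions: it uses Python's built-in sqlite3 module and SQLite's own AUTOINCREMENT keyword as a stand-in, so the SQL differs from the Auto-increment statement you would write for the engine used by Base. The child table and its columns are hypothetical. Note also that SQLite starts counting at 1, whereas the sample tables in this chapter start at 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE child (
        child_id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated by the table
        first_name TEXT
    )""")

# No value is supplied for child_id; the table assigns the next one itself.
for name in ("Bob", "Sandra", "Alice"):
    conn.execute("INSERT INTO child (first_name) VALUES (?)", (name,))

generated = conn.execute(
    "SELECT child_id, first_name FROM child ORDER BY child_id").fetchall()
print(generated)
```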

    Entry required

    For all fields of a table except for the primary key, you have a choice as to whether a value must be entered in every

    record of the table or not. (This is one of the questions in Part 3d of the plan.) You need to look again at each one of

    the fields to determine if the value for Entry required needs to be changed.

    Length

    Length describes the maximum number of characters contained in a field value. (Fields having a date field type do not

    have this property.) Its purpose is to prevent unnecessary space requirements in the database. Yet you want to make

    sure that each of your fields is long enough. For example, a field for six digit odometer readings does not need a

    length more than 6 digits. But a database for a very large bank may require much longer fields. Use your own

    judgment as to what is enough, and what is too much.

    Associated with the Length property is the Decimal property. This one determines how many decimal places the field

    value will have. This divides the length into two parts: the number of digits to the left of the decimal point, and the

    number of digits to the right of it.
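    How Length and Decimal divide a number can be sketched with a small check. This is an illustration only: the fits function is a hypothetical helper written with Python's standard decimal module, not something Base provides. It treats Length as the total number of digits and Decimal as the number of those digits that lie to the right of the decimal point.

```python
from decimal import Decimal


def fits(value, length, decimals):
    """Return True if value needs at most `length` digits in total,
    with at most `decimals` of them to the right of the decimal point."""
    _sign, digits, exponent = Decimal(value).as_tuple()
    fractional_digits = max(0, -exponent)
    whole_digits = max(0, len(digits) + exponent)
    return fractional_digits <= decimals and whole_digits <= length - decimals


# Length 6, Decimal 2: up to four digits on the left and two on the right.
print(fits("1234.56", 6, 2))   # four digits left, two right
print(fits("12345.6", 6, 2))   # five digits left is one too many
print(fits("0.125", 6, 2))     # three decimal places is one too many

# A six digit odometer reading: Length 6, Decimal 0.
print(fits("123456", 6, 0))
```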

    Default value

    You may want some fields to have a specific value until you change it. You specify what value you want the field to

    have. This may be text, a number, or a date. Fields with the field type Yes/No permit three choices for the default

    value: None, Yes, and No.
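    The Entry required and Default value properties can be sketched as follows. This is an illustration using Python's built-in sqlite3 module, not Base itself. The book table and its columns are hypothetical, NOT NULL stands in for "Entry required: Yes", and the Yes/No field is modeled as an integer where 0 means No.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE book (
        book_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,               -- Entry required: Yes
        on_loan INTEGER NOT NULL DEFAULT 0,  -- Yes/No field, default value No
        notes   TEXT                         -- Entry required: No, no default
    )""")

# Only the title is supplied; the other fields take their defaults.
conn.execute("INSERT INTO book (title) VALUES ('Base Guide')")
row = conn.execute("SELECT title, on_loan, notes FROM book").fetchone()

# Leaving out a required entry is refused by the database.
try:
    conn.execute("INSERT INTO book (notes) VALUES ('no title given')")
    missing_title_rejected = False
except sqlite3.IntegrityError:
    missing_title_rejected = True

print(row, missing_title_rejected)
```

    The notes field, which has neither a required entry nor a default, is left null.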

    Format example

    Field formatting is the last field property. It is tempting to apply all of the formatting that you can here. In most

    cases, this should not be done. Only use essential formatting unless you are going to view the data in its table.

    Field data is usually only seen in forms, queries, reports, and views. It is in these places that you should format the field according to the appearance you want in those places.

    1) Primary key:
        a) Is the AutoValue set properly for the primary key of each table?
            i) Is it set to Yes when the Field type is Integer [INTEGER]?
            ii) Is it set to No when the Field type is not Integer [INTEGER]?
        b) If an Auto-increment statement is desired, use SQL to write it.
    2) Entry required: Should a field have an entry in every record?
        a) If so, select Yes.
        b) If in doubt, select No.
    3) Length:
        a) For non-decimal fields, select a length that is appropriate for the data the field will contain.
        b) For fields that contain decimals, enter the appropriate number of decimal places.
    4) Default value: Should the field have a default value?
        • If the answer is yes, enter the value.
        • If the field is Yes/No (Boolean), select the value you will use.

    Tables

    In the planning phase, we placed the fields of the database into one or more tables based upon the relations that exist

    between the fields. In the design phase, we will consider these more closely. It may require us to make some changes

    to our tables: modifying some, and perhaps creating some.

    Tables for list boxes

    If you have identified one or more fields that have a limited number of distinct values, you have also named the

    tables that will contain these values. We must design a table for each set of values. If you have not done so already,

    for each field with distinct values, define its characteristics.

    If when planning you listed one or more fields having a limited number of distinct values, you need to design a table

    for each one of these fields. If so, you have a list of these fields, the table names associated with them, and their field

    types and properties. All of these will be used to define the tables needed to create list boxes for these fields in the

    forms.

    List boxes are very useful for fields with a limited number of distinct values. We use them to create drop-down lists

    in our forms. We also use them for consistency in our data entries for these fields.

    List each field with distinct values, the table holding its values, the characteristics of the table's field, and its

    distinct values.

    Because of the structure we have placed upon each table used for a list box, the following are true:

    The table contains only one field.

    This field is the primary key for the table.

    You can only enter distinct values into this field.

    Tip

    In most cases, you will be using VARCHAR as the

    field type for the field. Other field types can be

    used, but this will be the most common one.
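    The structure described above can be sketched in a few lines. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the database engine behind Base, and the table name "Category" and its three values are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in engine.
conn = sqlite3.connect(":memory:")

# The list-box table: one VARCHAR field, which is also the primary key.
conn.execute('CREATE TABLE "Category" ("Name" VARCHAR(50) PRIMARY KEY)')
conn.executemany('INSERT INTO "Category" VALUES (?)',
                 [("Auto",), ("Food",), ("Income",)])

# A second "Food" entry is rejected because the field is the primary key,
# so only distinct values can be stored.
try:
    conn.execute('INSERT INTO "Category" VALUES (?)', ("Food",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The values a drop-down list in a form would offer.
choices = [row[0] for row in
           conn.execute('SELECT "Name" FROM "Category" ORDER BY "Name"')]
print(duplicate_rejected, choices)
```

    Because the only field is the primary key, the table itself enforces the "distinct values" rule. The form does not need to check for duplicates.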

    Flat databases

    Flat databases contain tables that are independent of each other while relational databases have tables in which one

    or more fields of a table are related to one or more fields of another table. In fact, a field of a table may be related to

    another field of the same table. Because of this we design the tables of a flat database differently from those of a

    relational database.

    For a flat database, our table has already been designed when we designed the fields with their field type properties.

    The only thing left to do is to create a table that implements these fields. But since a field is likely to have several

    properties, we will divide these into individual properties as shown.

    Before doing this, consider in what order you want each field to appear in its table. It may well be sensible to have some consistency between the order of fields in a table and the form that contains it.

    So, create a list like the one below for each table of the database. Begin by listing the fields in the order you have

    now given them. For each table of the database, enter the characteristics of its fields as you have already defined

    them. You will be using these lists when you create the database.

    As you fill out these lists of field characteristics, you should review the fields you have defined.

    Are there any more fields that might need to be added?

    What are the characteristics of these fields?

    Tables 6 and 7 are really one table. Each one contains part of the characteristics for the four sample fields.

    Table 6: Sample Characteristics of the fields of a database table

    Field Name   Field type   AutoValue   Auto-Increment Statement   Entry Required

    ID [INTEGER] Yes (none) Yes

    Budget 1 [VARCHAR] Yes

    Amount [DECIMAL] Yes

    Reconcile [BOOLEAN] Yes

    Table 7: Sample Characteristics of the fields of a database table

    Field Name Length Decimal places Default value Format Example

    ID

    Budget 1 50

    Amount 10 2

    Reconcile 0 No

    Table 8: Sample table with values.

    ID Budget 1 Amount Reconcile

    0 Auto 17.21 Yes

    1 Food 12.00 No

    2 Income 0.03 Yes
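    As a rough sketch, the characteristics in Tables 6 and 7 and the values in Table 8 could be expressed as follows. The sketch uses Python's built-in sqlite3 module rather than Base itself, the table name "Budget" is hypothetical, and two readings of the sample tables are assumptions: that every field has Entry Required set to Yes, and that the default value for Reconcile is No. SQLite accepts declared lengths and decimal places but does not enforce them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE "Budget" (
        "ID"        INTEGER PRIMARY KEY,              -- AutoValue: Yes
        "Budget 1"  VARCHAR(50)   NOT NULL,           -- Length 50, entry required
        "Amount"    DECIMAL(10,2) NOT NULL,           -- Length 10, 2 decimal places
        "Reconcile" BOOLEAN       NOT NULL DEFAULT 0  -- assumed default: No
    )
""")

# IDs are given explicitly so that they match Table 8, which starts at 0.
conn.execute('INSERT INTO "Budget" VALUES (?, ?, ?, ?)', (0, "Auto", 17.21, 1))
# Reconcile is left out here, so the default value (No) is used.
conn.execute('INSERT INTO "Budget" ("ID", "Budget 1", "Amount") VALUES (?, ?, ?)',
             (1, "Food", 12.00))
conn.execute('INSERT INTO "Budget" VALUES (?, ?, ?, ?)', (2, "Income", 0.03, 1))

# Entry required: a record without an Amount is refused.
try:
    conn.execute('INSERT INTO "Budget" ("ID", "Budget 1") VALUES (?, ?)',
                 (3, "Rent"))
    missing_amount_rejected = False
except sqlite3.IntegrityError:
    missing_amount_rejected = True

rows = conn.execute(
    'SELECT "ID", "Budget 1", "Amount", "Reconcile" FROM "Budget" ORDER BY "ID"'
).fetchall()
for row in rows:
    print(row)
```

    Each property from the lists becomes one clause of the field definition, which is why it pays to fill out the lists before creating the table.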

    Relational databases

    These require a different method of design. While flat database tables are well defined from the time they became

    part of the plan, relational database tables must meet more stringent requirements in the design phase. Fields may be

    moved from one table to another. Additional tables may be needed and, in technical terms, the tables have to be

    normalized. Relationships existing between tables have to be defined using one or more fields of one table and one

    or more fields of another.

    Normalized table

    One that has been shown or modified to meet the requirements for first through Boyce-Codd normal form.

    When physically representing these tables, we will use the column headings (field names). For example, Table 8 would look

    like:

    ID Budget 1 Amount Reconcile

    When modifying or creating a table, we will use this to indicate what fields belong to it. Sometimes we may need to

    move one or more fields from one table to another. Or we may need to create a table with a field identical to one in

    another table. We must then use the same name for the field in both tables.

    One reason why fields have to be moved or duplicated and new tables created is to conform the tables to the rules

    required for relational databases. This is done over four levels called the first, second, third, and fourth normal form.

    Note

    While additional normal levels exist, relational

    databases that have fourth normal form will run

    very well. In many cases, adding another

    normal form level does not improve how well

    the database runs.

    Microsoft, and perhaps some others, have combined

    the original fourth normal form with an additional requirement to define what they

    call fourth normal form.

    While there are precise definitions of the normal

    forms, they are very technical in nature. Unless

    you have a good mathematical background,

    these definitions are very difficult, if not

    impossible, to understand.

    First normal form

    Before making a table first normal form, we have to look for two possible characteristics in the columns and rows

    that must be removed if they exist. Does the table contain two or more columns that are identical in nature? Does the

    table contain two or more rows with the same value for one field but different values for another field? If the answer

    to both questions is no, the table is first normal form.

    These two characteristics are two different ways of looking at the same problem. For example, consider an address

    book. It contains these fields: first name, last name, street address, city, state, zip code (postal code), and phone

    number. One person listed in it has three phone numbers: a personal cell phone, his office cell phone number, and

    his home phone number (land line). Another person listed has two addresses: where she lives and a post office box

    (for her snail mail). This table would not be first normal form. One row of data contains three entries in the phone number field, and another row contains two entries in the street address field.

    Table 9: Address Book not first normal form

    First Name  Last Name  Street Address    City     State  Zip Code  Phone no.
    Albert      James      345 First         Hart     IN     12345     (111) 111-1111
                                                                       (222) 222-2222
                                                                       (333) 333-3333
    Wanda       Anderson   111 Main Apt. 14  Amherst  MA     15115     (100) 100-1010
                           P O Box 1991

    We need to consider the primary key for the table because this is important. Originally, all the data referred to

    specific people and could be divided into rows accordingly: one row for each person. As long as there was no

    duplication of names, these fields, First Name and Last Name, can be the composite primary key. If duplication

    exists, an ID field can be used as the primary key. Then each individual's name is associated with a specific ID value.

    We still have two rows in which one field has multiple values.

    Table 10: Not first normal form

    ID (PK)  F N     L N       S A               City     State  Z C    P No.
    1        Albert  James     345 First         Hart     IN     12345  (111) 111-1111
                                                                        (222) 222-2222
                                                                        (333) 333-3333
    2        Wanda   Anderson  111 Main Apt. 14  Amherst  MA     15115  (100) 100-1010
                               P O Box 1991

    Another possibility is to associate each phone number with a specific ID value. In our case, we would have to also

    associate each street address with a specific ID value. While we can determine what each of the addresses is for,

    how do we know which phone number to use to reach a given phone? This possibility makes the situation too

    complex. The more fields that have multiple entries, the more complex this possibility becomes. Furthermore, a field is

    still needed to distinguish between multiple entries of each field that has them.

    Table 11: First normal form

    F N L N S A City State Z C P no. ID (PK)

    Tip

    The new composite primary key will create additional

    rows of data that contain repetitions, with the exception

    of the field containing multiple values and the field

    added because of these values.
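    The effect described in this tip can be sketched in plain Python. The sketch assumes the composite primary key is ID, Type of Phone, and Type of Address (the key shown later in the modified Address Book table), takes the type labels from the Phone and Address tables of this chapter, and leaves out City, State, and Zip Code for brevity. The function name is made up for the illustration.

```python
from itertools import product

# Address Book rows before normalization: two fields hold several values.
address_book = [
    {"ID": 1, "First Name": "Albert", "Last Name": "James",
     "phones": [("Cell", "(111) 111-1111"),
                ("Bus Cell", "(222) 222-2222"),
                ("Home", "(333) 333-3333")],
     "addresses": [("Home & mail", "345 First")]},
    {"ID": 2, "First Name": "Wanda", "Last Name": "Anderson",
     "phones": [("Home", "(100) 100-1010")],
     "addresses": [("Home", "111 Main Apt 14"), ("Mail", "P O 1991")]},
]


def to_first_normal_form(rows):
    """Emit one single-valued row per (phone, address) combination."""
    flat = []
    for row in rows:
        for (ptype, phone), (atype, street) in product(row["phones"],
                                                       row["addresses"]):
            flat.append({
                "ID": row["ID"],
                "Type of Phone": ptype,
                "Type of Address": atype,
                "First Name": row["First Name"],
                "Last Name": row["Last Name"],
                "Phone no.": phone,
                "Street Address": street,
            })
    return flat


flat_rows = to_first_normal_form(address_book)

# Every row now has single values, and the three key fields together
# identify each row uniquely.
keys = {(r["ID"], r["Type of Phone"], r["Type of Address"]) for r in flat_rows}
print(len(flat_rows), len(keys))
```

    The two people become five rows. The names are repeated in every row for the same ID, which is the repetition that second normal form removes.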

    Second normal form

    For a table to be in second normal form, it must be in first normal form and the fields must be dependent upon the

    entire primary key. Clearly, if the primary key consists of only one field, the table is in second normal form. So, look for tables that have composite primary keys. They are the only ones in first normal form that might not be in

    second normal form.

    The primary key determines the values for the rest of the fields in the table. With composite primary keys, one of the

    fields in it may determine the values of some of the other fields by itself. That is when problems can occur:

    repetitions in the rows of the table. Modifying the table to make it second normal form removes the repetitions.

    We continue to use the Address Book table, now in first normal form, as our example. It does have a composite primary key, so

    we will check the fields in it to see if the table is in second normal form.

    The field Phone no. is determined by a combination of two fields: Type of Phone and ID. We do not need the

    fields Type of Address, First Name, or Last Name to determine this field.

    Similarly, the fields Street Address, City, State, and Zip Code are determined by a combination of two other fields:

    ID and Type of Address. We do not need the other field in the composite primary key to determine the values of

    these four fields.

    Finally, the field ID determines First Name and Last Name by itself. So, this requires a third table.

    Determinant

    A field that can determine the values of another field in its table.

    Tip

    A primary key is a determinant since it determines

    the values of all the other fields. When we have a

    composite primary key, it may contain a field that is

    also a determinant that determines the values of

    some fields but not all of them. Having such a field

    in a table prevents it from being second normal

    form.

    This means that we have three fields in our table that are determinants. To make the table second normal form, we have to

    create three tables, one for each determinant. Each of these tables contains the determinant as its primary key. The fields that

    it determines are moved from the original table to this new one. The original table is modified. It still contains the complete composite primary key. It may contain other fields, but our example does not. When we describe the designing of the Budget

    database, this example will have fields containing more than the primary key.

    Table 13: Phone table

    ID(PK) Type of Phone (PK) Phone no.

    1 Cell (111) 111-1111

    1 Bus Cell (222) 222-2222

    1 Home (333) 333-3333

    2 Home (100) 100-1010

    Table 14: Address table

    ID (PK)  Type of Address (PK)  Street Address   City     State  Zip Code
    1        Home & mail           345 First        Hart     IN     12345
    2        Home                  111 Main Apt 14  Amherst  MA     15115
    2        Mail                  P O 1991         Amherst  MA     15116-1991

    Table 15: Name table

    ID(PK) First Name Last Name

    1 Albert James

    2 Wanda Anderson

    We have three new tables, so they should be examined to determine if they are first normal form. They are because

    all the fields contain single values. But what about second normal form? (Two tables contain a composite primary

    key.) For the Phone table, both fields of the primary key are required to determine the Phone no. field. For the

    Address table, the same thing is true. So, these tables are both second normal form. The third table, Name table, is

    second normal form because its primary key contains only one field.

    The modified Address Book table contains only the fields of its composite primary key, so no field can depend upon just part of the key. It is also second normal form.

    Perhaps some observations need to be made about the modified Address Book table. In all five rows, the First Name

    and Last Name field values are determined by the ID field (the primary key). The first two columns come directly

    from the first two columns of the Phone table. Similarly, the first and third columns come from the Address table.

    Table 16: Modified Address Book table that is now second normal form

    ID (PK& FK) Type of Phone (PK & FK) Type of Address (PK & FK)

    1 Cell Home & mail

    1 Bus Cell Home & mail

    1 Home Home & mail

    2 Home Home

    2 Home Mail

    To check for second normal form:

    Does the table have only one field upon which all the other fields depend?

    If yes, the table is second normal form.

    If a table has a composite primary key, can any one field of this key, by itself, determine the value of another field in the table?

    If no, the table is second normal form.

    Modifying a first normal form table to make it second normal form:

    1) What fields in the table depend upon just part of its composite primary key?

    2) Separate these fields into groups depending upon which part of the primary key they depend upon.

    3) Create a table for each group of these fields. Move each group to its own table. Add to each table a copy

    of the primary key components upon which its fields depend. (Each of these primary key components

    becomes the primary key for its table.)

    4) Modify the original table so that it contains its primary key and only the fields that depend only upon it.
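    The four steps above can be tried out on the sample data with a short Python sketch. The helper names determines and project are invented for this illustration, and the address fields are left out to keep it short. Note that the check only looks at the rows it is given: it can show that a dependency fails, but a dependency that holds in sample data is still a design decision, not a proof.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[f] for f in lhs)
        value = tuple(row[f] for f in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, fields):
    """Distinct combinations of the given fields, in first-seen order."""
    out = []
    for row in rows:
        item = tuple(row[f] for f in fields)
        if item not in out:
            out.append(item)
    return out


FIELDS = ("ID", "Type of Phone", "Type of Address",
          "First Name", "Last Name", "Phone no.")
DATA = [
    (1, "Cell",     "Home & mail", "Albert", "James",    "(111) 111-1111"),
    (1, "Bus Cell", "Home & mail", "Albert", "James",    "(222) 222-2222"),
    (1, "Home",     "Home & mail", "Albert", "James",    "(333) 333-3333"),
    (2, "Home",     "Home",        "Wanda",  "Anderson", "(100) 100-1010"),
    (2, "Home",     "Mail",        "Wanda",  "Anderson", "(100) 100-1010"),
]
rows = [dict(zip(FIELDS, d)) for d in DATA]

# Step 1: which fields depend upon just part of the composite primary key?
name_needs_only_id = determines(rows, ["ID"], ["First Name", "Last Name"])
phone_needs_only_id = determines(rows, ["ID"], ["Phone no."])
phone_needs_id_and_type = determines(rows, ["ID", "Type of Phone"], ["Phone no."])

# Steps 2 and 3: move each group to its own table, keyed by its determinant.
name_table = project(rows, ["ID", "First Name", "Last Name"])
phone_table = project(rows, ["ID", "Type of Phone", "Phone no."])

# Step 4: the original table keeps only its composite primary key.
address_book = project(rows, ["ID", "Type of Phone", "Type of Address"])

print(name_needs_only_id, phone_needs_only_id, phone_needs_id_and_type)
print(len(name_table), len(phone_table), len(address_book))
```

    The names need only ID, the phone number needs ID and Type of Phone together, and the projections reproduce the Name table, the Phone table, and the modified Address Book table.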

    Third normal form

    To be third normal form, a table must be second normal form and its fields must depend only upon the primary key. Sometimes a field depends upon another field that is neither the primary key nor part of a composite primary key.

    Check list for third normal form:

    Are there one or more fields that depend upon a field that is not the primary key?

    If Yes, the table is not third normal form.

    If No, the table is third normal form.

    Asking this question of the Phone and Address Book tables results in a No answer, so these tables are third normal

    form. However, the Address table will give a Yes answer to this question. It is not third normal form.

    Because the Zip Code field depends upon three non-key fields (Street Address, City, and State), we will create a new table using them as its

    composite primary key. We also move the Zip Code to the new table.

    Table 17: Zip Code table

    Street Address (PK) City (PK) State (PK) Zip Code

    345 First Hart IN 12345

    111 Main Apt 14 Amherst MA 15115

    P O 1991 Amherst MA 15116-1991

    Table 18: Modified Address table to make it third normal form

    ID (PK)  Type of Address (PK)  Street Address (FK)  City (FK)  State (FK)
    1        Home & mail           345 First            Hart       IN
    2        Home                  111 Main Apt 14      Amherst    MA
    2        Mail                  P O 1991             Amherst    MA

    At this point it pays to consider what tables you now have. We began with a single Address Book table. This has

    been modified twice: once to make it first normal form, and a second time to make it second normal form. We have also

    created the Phone and Address tables. The latter has been modified to make it third normal form.

    So we need to check two tables, Address Book and Phone, to make sure they are also third normal form. In both

    tables, the non-key fields depend only upon their respective primary key. So they are third normal form.

    Modifying a second normal form table to make it third normal form:

    1) Determine what fields depend upon one or more non-key fields.

    2) List each of these non-key fields or group of non-key fields.

    3) Add to this list what fields depend upon each of them.

    4) Create a table for each of the determinants you found.

    5) Move the fields depending upon each determinant to the new table containing it.
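    For the address example, the result of these steps might look like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for Base, the field lengths are guesses, and the extra address "9 Nowhere" is invented to show what the foreign key does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
    CREATE TABLE "Zip Code" (
        "Street Address" VARCHAR(50),
        "City"           VARCHAR(50),
        "State"          CHAR(2),
        "Zip Code"       VARCHAR(10) NOT NULL,
        PRIMARY KEY ("Street Address", "City", "State")
    );
    CREATE TABLE "Address" (
        "ID"              INTEGER,
        "Type of Address" VARCHAR(20),
        "Street Address"  VARCHAR(50),
        "City"            VARCHAR(50),
        "State"           CHAR(2),
        PRIMARY KEY ("ID", "Type of Address"),
        FOREIGN KEY ("Street Address", "City", "State")
            REFERENCES "Zip Code" ("Street Address", "City", "State")
    );
""")
conn.executemany('INSERT INTO "Zip Code" VALUES (?, ?, ?, ?)', [
    ("345 First", "Hart", "IN", "12345"),
    ("111 Main Apt 14", "Amherst", "MA", "15115"),
    ("P O 1991", "Amherst", "MA", "15116-1991"),
])
conn.executemany('INSERT INTO "Address" VALUES (?, ?, ?, ?, ?)', [
    (1, "Home & mail", "345 First", "Hart", "IN"),
    (2, "Home", "111 Main Apt 14", "Amherst", "MA"),
    (2, "Mail", "P O 1991", "Amherst", "MA"),
])

# An address with no matching row in the Zip Code table is refused.
try:
    conn.execute('INSERT INTO "Address" VALUES (?, ?, ?, ?, ?)',
                 (3, "Home", "9 Nowhere", "Hart", "IN"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The zip code is looked up through the three foreign key fields.
zips = conn.execute("""
    SELECT a."ID", a."Type of Address", z."Zip Code"
    FROM "Address" AS a
    JOIN "Zip Code" AS z
      ON  a."Street Address" = z."Street Address"
      AND a."City" = z."City"
      AND a."State" = z."State"
    ORDER BY a."ID", a."Type of Address"
""").fetchall()
print(orphan_rejected, zips)
```

    The determinant (Street Address, City, State) becomes the primary key of the new table, and the same three fields stay behind in the Address table as a foreign key.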

    Summary of our example table

    Table 19: Original Address Book table

    ID (PK)  F N     L N       S A               City     State  Z C    P No.
    1        Albert  James     345 First         Hart     IN     12345  (111) 111-1111
                                                                        (222) 222-2222
                                                                        (333) 333-3333
    2        Wanda   Anderson  111 Main Apt. 14  Amherst  MA     15115  (100) 100-1010
                               P O Box 1991

    Table 20: Modified Address Book table

    ID (PK& FK) Type of Phone (PK & FK) Type of Address (PK & FK)

    1 Cell Home & mail

    1 Bus Cell Home & mail

    1 Home Home & mail

    2 Home Home

    2 Home Mail

    Table 21: Modified Address table

    ID (PK)  Type of Address (PK)  Street Address (FK)  City (FK)  State (FK)
    1        Home & mail           345 First            Hart       IN
    2        Home                  111 Main Apt 14      Amherst    MA
    2        Mail                  P O 1991             Amherst    MA

    Table 22: Name table

    ID(PK) First Name Last Name

    1 Albert James

    2 Wanda Anderson

    Table 23: Phone table

    ID(PK) Type of Phone (PK) Phone no.

    1 Cell (111) 111-1111

    1 Bus Cell (222) 222-2222

    1 Home (333) 333-3333

    2 Home (100) 100-1010

    Table 24: Zip Code table

    Street Address (PK) City (PK) State (PK) Zip Code

    345 First Hart IN 12345

    111 Main Apt 14 Amherst MA 15115

    P O 1991 Amherst MA 15116-1991

    Table 19 is the original Address Book with all of its flaws. The five following it are the tables we created as we

    worked toward making them third normal form.

    We will study the six tables to see what we can discover comparing Table 19 with the five tables following it. The

    values for the First and Last Name fields are listed once in the Name table. The values for the Street Address, City,

    and State fields are listed once in the modified Address table. The values for the Zip Code field are listed once in the

    Zip Code table. And the values of the Phone no. field are listed once in the Phone table. As a result, fewer errors are

    likely to be made when entering the data.

    Table 20 is the modified Address Book table that looks much different from the original table. All the information in

    the latter can be retrieved from the modified table and the tables we created by using a query or a form that contains data from all the tables.
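    A query of that kind might look like the following sketch, which joins three of the tables to list each person's phone numbers. It again uses Python's built-in sqlite3 module as a stand-in for Base, declares every text field simply as TEXT, and leaves out the Address and Zip Code tables to stay short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Name" (
        "ID" INTEGER PRIMARY KEY, "First Name" TEXT, "Last Name" TEXT);
    CREATE TABLE "Phone" (
        "ID" INTEGER, "Type of Phone" TEXT, "Phone no." TEXT,
        PRIMARY KEY ("ID", "Type of Phone"));
    CREATE TABLE "Address Book" (
        "ID" INTEGER, "Type of Phone" TEXT, "Type of Address" TEXT,
        PRIMARY KEY ("ID", "Type of Phone", "Type of Address"));
""")
conn.executemany('INSERT INTO "Name" VALUES (?, ?, ?)', [
    (1, "Albert", "James"), (2, "Wanda", "Anderson")])
conn.executemany('INSERT INTO "Phone" VALUES (?, ?, ?)', [
    (1, "Cell", "(111) 111-1111"), (1, "Bus Cell", "(222) 222-2222"),
    (1, "Home", "(333) 333-3333"), (2, "Home", "(100) 100-1010")])
conn.executemany('INSERT INTO "Address Book" VALUES (?, ?, ?)', [
    (1, "Cell", "Home & mail"), (1, "Bus Cell", "Home & mail"),
    (1, "Home", "Home & mail"), (2, "Home", "Home"), (2, "Home", "Mail")])

# Follow the foreign keys of the Address Book table into Name and Phone.
# DISTINCT drops the repeat caused by Wanda's two address rows.
query = """
    SELECT DISTINCT n."First Name", n."Last Name", p."Type of Phone", p."Phone no."
    FROM "Address Book" AS b
    JOIN "Name"  AS n ON n."ID" = b."ID"
    JOIN "Phone" AS p ON p."ID" = b."ID"
                     AND p."Type of Phone" = b."Type of Phone"
    ORDER BY n."Last Name", p."Type of Phone"
"""
contacts = conn.execute(query).fetchall()
for contact in contacts:
    print(contact)
```

    Each name and each phone number is stored once, yet the query can still present them side by side as the original Address Book did.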

    Note

    As an introduction into the next normal form, I will

    state that these three tables above are also Boyce-Codd

    normal form. You can check for yourself if you want

    to do so.

    Boyce-Codd normal form (BCNF)

    There is another normal form that is similar to third normal form. It is known as the Boyce-Codd normal

    form (BCNF). Most third normal form tables are also BCNF. The only time when this statement may not be true is

    when the table contains composite candidate keys which overlap. When a candidate key contains one or more fields

    that are also part of another candidate key, the keys overlap. This potential problem very seldom occurs.

    Boyce-Codd normal form

    A relation (table) is in Boyce-Codd Normal Form (BCNF) if every determinant is a candidate key.

    Candidate key

    One or more fields in a table that uniquely determine the other fields.

    Tip

    A table may have more than one candidate key, and

    any one of these can become the designated primary

    key. But remember that a candidate key can consist

    of one or more fields of the table.

    An example of BCNF: Consider a table, Enrollment, with these fields: student number, student name, course number,

    course name, date-enrolled. We make these two assumptions: no two students have the same name, and no two

    courses have the same name.

    The candidate keys are these groups of fields: (student number, course number); (student name, course number); (student number,

    course name); and (student name, course name). Any one of these four composite candidate keys can be used as the primary key.

    Notice how these candidate keys overlap.

    Student number Student name Course number Course name Date-enrolled

    Student number determines Student name, Student name determines Student number, Course number determines

    Course name, and Course name determines Course number. But none of these fields taken by itself is a candidate

    key. So the table is not Boyce-Codd normal form.
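    This reasoning can be checked mechanically. The sketch below, in plain Python, lists the dependencies stated above and computes which fields each determinant can reach. The fifth dependency (student number and course number together determine date-enrolled) is not spelled out in the text; it is assumed here because that pair is given as a candidate key. The function name closure is a common textbook term, not something defined in this chapter.

```python
ATTRS = {"Student number", "Student name", "Course number",
         "Course name", "Date-enrolled"}

# (determinant, fields it determines)
FDS = [
    ({"Student number"}, {"Student name"}),
    ({"Student name"}, {"Student number"}),
    ({"Course number"}, {"Course name"}),
    ({"Course name"}, {"Course number"}),
    ({"Student number", "Course number"}, {"Date-enrolled"}),  # assumed
]


def closure(fields, fds):
    """All fields that can be determined starting from `fields`."""
    result = set(fields)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# The four composite candidate keys named in the text reach every field.
candidate_keys = [
    {"Student number", "Course number"},
    {"Student name", "Course number"},
    {"Student number", "Course name"},
    {"Student name", "Course name"},
]
all_keys_ok = all(closure(k, FDS) == ATTRS for k in candidate_keys)

# BCNF asks that every determinant reach every field.
# The determinants that fail are the violations.
violations = [lhs for lhs, _ in FDS if closure(lhs, FDS) != ATTRS]
print(all_keys_ok, len(violations))
```

    The four single-field determinants each reach only their partner field, so none of them is a candidate key, and the table fails the BCNF definition.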

    To correct this problem, we modify the original table and create two new tables: Student and Course.

    Student, Course, and the modified Enrollment tables (respectively):

    Student number(PK) Student name

    Course number(PK) Course name

    Student number(PK&FK) Course number(PK&FK) Date-enrolled

    To show that there is a problem and its correction, we add some data to the original Enrollment table and then to the

    tables correcting the problem. When a student takes more than one course, the Student number and Student name

    are repeated. Similarly, the Course number and Course name are repeated when more than one student takes the

    course. When this table is made third normal form, we eliminate this repetition.

    Table 25: Original Enrollment table

    Student number  Student name    Course number  Course name                        Date enrolled
    1001            Sam Livingston  CS101          Introduction to Computers          August 8, 2015
    1001            Sam Livingston  CS102          Introduction to Operating Systems  August 8, 2015
    1002            Max Caprilla    CS101          Introduction to Computers          January 4, 2015

    Table 26: Student table

    Column 1 Column 2

    1001 Sam Livingston

    1002 Max Caprilla

    Table 27: Course table

    Column 1 Column 2

    CS101 Introduction to Computers

    CS102 Introduction to Databases

    Table 28: modified Enrollment table

    Student number(PK & FK) Course number(PK & FK) Date enrolled

    1001 CS101 August 8, 2015

    1001 CS102 August 8, 2015

    1002 CS101 January 4, 2015

    We enter data in the Student table one time for each student and in the Course table for each course. In the

    Enrollment table, we enter data for the three fields for each course a student takes. When data is added or modified

    for these tables, it only has to be done one time. For example, Sam Livingston legally changes his name to Walley

    Habbernack. To modify our data, we only change his Student name field. When we do this, everywhere Student number, 1001, appears, the database correctly lists the Student name as Walley Habbernack. We do not have to

    worry about changing his name for every course he has taken.
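    The name change can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Base, stores the dates as ISO text (a choice made only for this illustration), and leaves out the Course table because it plays no part in the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Student" (
        "Student number" INTEGER PRIMARY KEY,
        "Student name"   TEXT NOT NULL
    );
    CREATE TABLE "Enrollment" (
        "Student number" INTEGER REFERENCES "Student" ("Student number"),
        "Course number"  TEXT,
        "Date enrolled"  TEXT,
        PRIMARY KEY ("Student number", "Course number")
    );
""")
conn.executemany('INSERT INTO "Student" VALUES (?, ?)', [
    (1001, "Sam Livingston"), (1002, "Max Caprilla")])
conn.executemany('INSERT INTO "Enrollment" VALUES (?, ?, ?)', [
    (1001, "CS101", "2015-08-08"),
    (1001, "CS102", "2015-08-08"),
    (1002, "CS101", "2015-01-04")])

# The name is stored once, so one row changes.
cursor = conn.execute(
    'UPDATE "Student" SET "Student name" = ? WHERE "Student number" = ?',
    ("Walley Habbernack", 1001))
rows_changed = cursor.rowcount

# Every enrollment of student 1001 now shows the new name.
courses = conn.execute("""
    SELECT e."Course number", s."Student name"
    FROM "Enrollment" AS e
    JOIN "Student" AS s ON s."Student number" = e."Student number"
    WHERE s."Student number" = 1001
    ORDER BY e."Course number"
""").fetchall()
print(rows_changed, courses)
```

    In the original Enrollment table the same change would have touched one row per course taken, with the risk of missing one.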

    From my studies, tables with this problem have fields that are similar in nature. This leads to candidate keys that

    overlap. In our example, Course name and Course number do this. Student number and Student name would seem

    to do it also. However, two or more students can have the exact same name. (I once had a class with three boys

    having identical names.) In this case, the field

