PlanningDesigningYourDatabase DEL 20121214

8/13/2019 PlanningDesigningYourDatabase DEL 20121214


Some keystrokes and menu items are different on a Mac from those used in Windows and Linux. The table below

gives some common substitutions for the instructions in this chapter. For a more detailed list, see the application

Help.

Windows or Linux                  Mac equivalent              Effect
Tools > Options menu selection    LibreOffice > Preferences   Access setup options
Right-click                       Control+click               Open a context menu
Ctrl (Control)                    ⌘ (Command)                 Used with other keys
F5                                Shift+⌘+F5                  Open the Navigator
F11                               ⌘+T                         Open the Styles and Formatting window

Contents

Copyright

Note for Mac users

Introduction

General information

Specific information about this chapter

Specific information about Base

Java

Memory cached tables

Why some knowledge of database theory is necessary

Problems when shutting down computer or a crash occurs

Useful commands

Goals: planning for the database

Database descriptions

Text databases

Using text data sources with Base

Flat database

Using flat databases with Base

Relational databases

Using a relational database with Base

Planning for a flat or relational database

The plan: the outline

Design based upon the plan

Part 1: Purpose of our database and a design for it

Part 2: Type of database and a design for it

Database data and a structure for it

Fields

Field names

Field types

Field properties

Tables

Tables for list boxes

Flat databases

Relational databases

First normal form

Second normal form

Third normal form

Boyce-Codd normal form (BCNF)

Relationships between tables

Relationships

View

Forms

Queries

Principle #1: Determine the tables and fields we need

Principle #2: Sorting the information in the query

Principle #3: Determine the search conditions

Principle #4: Set the type of output (list, calculations)

Principle #5: Using Aliases to shorten or clarify headings

Principle #6: Review the query design as to whether it meets our requirements or not

Principle #7: Name and save the query

Principle #8: Save the database file

Reports

Introduction

General information

The Base Guide describes what you need to know in order to use the Base component of LO effectively. It begins by

describing a simple flat database in chapter 1. Then, beginning with chapter 2, we consider a more complex database

type: relational databases. Some principles described from chapter 2 onwards also apply to flat databases.

Chapter 2, Planning/Designing your Database, looks at the planning required when creating a database. This is one

of the most important parts of the creation process: it is the outline to be used when the database is created.

Chapter 3, Data Input and Removal, describes the beginning of the creation process. This chapter centers upon the

parts of the database with which we input and remove data: tables and forms. Wizards used to create tables and

forms are examined to reveal their principles and how to use them. We also examine the design dialogs used to

create tables and forms. These dialogs give you much more control over the creation of tables and forms than the

Wizard does. These dialogs are also used to modify tables and forms.

Chapter 4, Data Output, continues the creation process. This chapter centers upon the parts of the database that

provide us with data output: the queries and reports. We describe the wizards used to create queries and reports.

The query section also contains the explanation of the query design dialog. This dialog contains two parts: Design

View and SQL View. The design view permits creation of more complex queries, and the SQL view permits an even

more complex query using the SQL language.

The report section shows how to use the report wizard and the report design view for creating reports. Use of the

report design view (Report Builder dialog) to create or edit a report is discussed as well.

Chapter 5, Exchanging Data, describes how to use Base with other components of LO: Calc, Impress, and Writer.

Base can provide data for documents created by these components, and Base can input data from documents from

other components of LO.

Chapter 6, Customization of your Database, describes the customizing of the database design. This is a more detailed

look at the design of the individual parts: tables, queries, forms, and reports.

Chapter 7, More customizations, describes even more customization including macros, validation of data, generation

of calculated data, working with multiple forms, and working with other LO documents.

Chapter 8 discusses using Base at work. This includes moving up to an external database server and using

databases from other sources, for example MySQL, Oracle, PostgreSQL, and MS Access.

Specific information about this chapter

How well a database works depends upon how well it is planned and designed. But the same can be said for creating

a drawing (Draw), presentation (Impress), a spreadsheet (Calc), or text document (Writer). Help from others and

one's own experiences provide information concerning how to do the planning and designing of any project.

This chapter is designed to give you a background in how to do these things if you need it. If you already

understand how to plan and design a database, you might not need to read this chapter. However, you might still

find the content a useful refresher course.

For those who are fairly new to databases, I recommend you follow the instructions of this chapter to plan a simple

database. Then use the next two chapters to create it. Practice using the database for a while. Then plan and design a

more complex database. This should give you even more confidence.

Planning and designing a database begins with seeking as much information about the proposed database as

possible. That means many, many questions need to be asked and answered. For more complex databases, the

answers will often lead to more questions.

This is only the beginning. Even as the database is constructed, surprises will occur. It is doubtful that any plan will

be perfect when it is first created. Changes will have to be made in the plan. As the changes are made, the database





data only requires a small fraction of that memory when the table is cached. This leads to a second advantage: when

the database file is first opened, there is less writing of data to RAM, so it takes less time to open the file. And there is

also a third advantage: the potential size of the database. You can work with databases whose size is hundreds of

megabytes. That is not always possible if all of the database's data is written into memory.

There is one drawback to cached tables: search time. When a query is run, the desired data has to be written into

memory first. Only then will the query run. The HSQLDB User Guide contains some directions on how to design a

query for quicker results. It still will not be as quick as when the data is already in memory before the query is

started.

Why some knowledge of database theory is necessary

Base has a wizard with which you can create a new database file for your database. As long as you are only creating

a simple embedded database, this wizard is fairly straightforward. However, if you want to connect to an external

data source, some knowledge is necessary before you can accomplish that. This includes connecting to a

spreadsheet.

Base has a wizard for creating tables. It includes several suggested tables along with the suggested fields to put in the

table. Then the wizard takes you through steps which determine what the field types and properties should be. You

can accept the suggested field types in the wizard. But what if you think you might want to change a field's field type

in some way? You could make some mistakes unless you know what you are doing. Otherwise you could just be

guessing. The same thing can be said about selecting the field properties. And you will be asked to select a Primary

Key. Do you know what this is, and why you need to have one?

Base also gives you the option to create a table using Design View. To use this, you must name your fields and

define their field types and properties. The Primary Key needs to be selected as well. Some knowledge of

database theory helps greatly when using this option.

Creating and working with Base databases is not hard if you know what you are doing. This is why we teach the

basics of database theory as we describe what to do while creating and working with these databases.

Note

The basics of database theory are the same for

different databases, but the application of the basics

is usually different. If you have a background in

these basics as applied in another database program,

I advise you to look for the differences between

how Base applies these basics and how the other

database program does. It might save you some time and

some headaches.

Problems when shutting down computer or a crash occurs

Caution

When Base is using the HSQLDB engine, some

special actions must be taken when Base is

shut down incorrectly. This includes a computer

crash or shutdown before Base is closed properly.

This caution should have caught your attention. I intended to do exactly that! If you follow the directions we give,

you will have a minimum of problems. If you do not follow them, you can have many more problems than you will

want.

The other components of LO have an auto recovery feature which you can configure in Tools > Options > Load/

Save > General. Base does not use this feature. Type something in Writer or enter data in Calc. If your computer

crashes after the length of time set by Autorecovery, the information you entered will be restored when LO has gone

through its recovery process.

During the recovery process LO will try to recover databases, but in most cases the changes created during your last

session cannot be recovered. The recovery process cannot recover opened forms and reports.

Backup options under Tools > Options > Load/Save > General can create backup files in the backup folder;

unfortunately, you need to create backup files manually. This requires using Tools > SQL to type commands. They

are mentioned in Useful commands.

The sage advice of "save and save often" applies especially to Base when working with an embedded database. We

explain the importance of doing this in Chapters 3 and 4 of the Base Guide: Data Input and Removal, and Data

Output respectively.

How often you save when using Base to connect to an external data source depends upon the source and what you


are doing. When creating or modifying a table, query, or report, you should follow the advice given for the data source

you are using.

Base will prompt you to save your data under specific circumstances. When you close a form after entering or

modifying data, this prompt appears unless the database has already saved this data.

Data can also be added to or modified in a table opened in the Data Sources window. This data also needs to be

saved. While there is a Save icon at the top of this window, it can be easy to forget to save. When you close the Data

Sources window, you will be prompted to save the data unless it has already been saved.

Tip

Ways to save data

Click the Save icon in the table or in the form you are

working in.

Move to another record of a simple form.

Move to another record of the main form or sub form of

a complex form.

Special care must be taken when saving any editing made to any table, query, form, or report. The same is the case

when creating these. Saving these changes to the database requires two steps. First you must save your changes in the

table, query, form, or report you are editing or creating. This writes what you have done to memory. Now these

changes need to be written to the database file itself. To do so, click the Save icon in the Standard toolbar of the

main database window. This window is labeled at the top. Its label contains the name of your database followed by

LibreOffice Base.

Any time you finish using a database, close it. If you have any data which has not been saved yet, you will be asked

if you want to save the changes. You will be asked the same question if you have not yet saved changes made to the

database structure. This means you may see two such dialog boxes in a row. In almost all circumstances, you will

want to answer yes in both boxes. The only circumstance I can think of in which you would not want to answer yes

is when you made a change that you do not want to keep and cannot reverse.

This is why you need to make sure you save your work on a regular basis. But what do you do if your computer

crashes, or the power goes off, or you shut down Base improperly, or someone shuts down the computer

improperly? You will lose any data or changes to the database since the last time you saved them.

If any of these problems happen while the database is being closed, you might lose the whole database. This is

especially true if Base is in the process of writing to the database when the power goes off.

Useful commands

Chapter 9 of the HSQLDB user guide contains the commands HSQLDB uses and their syntax. Some of these are

different from those used in Calc. For example, enter =NOW() into a Calc cell and you get today's date (date and

time if the cell is formatted for both). If you want to use today's date in a query, you must use CURDATE().

This is found in the Built-in Functions and Stored Procedures section of chapter 9.

These commands can be submitted directly to the database engine through the Tools > SQL... command window.

Two commands are useful when you want to force changes to be written to the file. Use these

commands when you have inserted a lot of data into tables and want to avoid losing it.

The first is CHECKPOINT; the second is SHUTDOWN.

CHECKPOINT closes the database cleanly, then reopens it.

When used with the DEFRAG option, CHECKPOINT also shrinks the data to its minimal size.

SHUTDOWN closes the database open in Base; after that you need to close the ... .odb file and reopen it, if you want

to continue your work. I mention only one option to this command here; others will be discussed in Chapter 9.

SHUTDOWN COMPACT rewrites the data and shrinks all files to the minimum size. This is a good time to create a

backup copy of your file. Never create a backup copy of an open .odb file; it will not contain all the data.

Goals: planning for the database

A database is a collection of data. It might be a text file, a flat database, or a relational database. Whatever type it is,

we must create a structure for the data before we can use any of it. This is a three step process. First a plan needs to

be made based upon the purpose of the database. Then a design is made based upon the plan. Finally, Base is used to

create the database structure as it was designed.

Begin with some general ideas of what you want to accomplish. Then break these goals down into smaller, more

specific goals. These specific goals may need to be even further divided. You continue doing this until you are


satisfied that the goals have become as specific as possible.

The first decision to make is what type of database should be used. Then, beginning with the stated, written goals,

determine the structure of that database. So we will begin by describing the three types of databases. Later we will

divide our goals into more specific goals until we have the structure we need.

Database descriptions

We will discuss three types of databases: text, flat, and relational. Each type serves its own purposes.

Text databases

Text database files are exactly that: files that contain text only. They have a defined structure so that their information

can be retrieved. These text files may be one of several different types. One common format is comma-separated

values (CSV). Another common type is vCard. (The latter is used in some address books.) LDIF is another common

format for an address book. (The vCard and LDIF formats are similar in structure.) Whether a given database engine

can access the data in the file depends upon its ability to use the built-in structure in the given format.

A variety of file extensions are used for text databases depending upon the format. For example CSV uses *.csv, and

LDIF uses *.ldif. When creating a text file for data using a word processor, any file extension can be used as long as

the file has a consistent structure that can be recognized by Base.

This structure must contain field separators and text separators. If the file contains decimals, decimal separators must

be present. (Although the database wizard includes a possible thousands separator, Base does not seem to recognize

thousands separators.)

Figure 1 is a sample CSV file showing its format. The first row contains the names of the fields for this data source,

and the rest of the rows contain the data for these fields. All text is enclosed with double quotes, and the fields are

separated by commas.

Figure 1: Data source, CSV file

When Base accesses the CSV file, it creates a table.

Figure 2: Table for the CSV file
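
The mapping from a CSV file to a table can be sketched in a few lines of Python. This is only an illustration of the idea, not how Base reads the file internally. The field names and sample values below are invented for the example; they are not the contents of Figure 1.

```python
import csv
import io

# Hypothetical sample data: the field names and values are invented for
# illustration and are not taken from Figure 1.
SAMPLE_CSV = '''"First Name","Last Name","Primary Email"
"John","Smith","john.smith@example.com"
"Mary","Jones","mary.jones@example.com"
'''

def read_text_table(text):
    """Return (field_names, records) parsed from CSV text.

    Fields are separated by commas and text is enclosed in double quotes.
    """
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    rows = list(reader)
    field_names = rows[0]  # the first row holds the names of the fields
    # Every remaining row is one record: pair each value with its field name.
    records = [dict(zip(field_names, row)) for row in rows[1:]]
    return field_names, records

field_names, records = read_text_table(SAMPLE_CSV)
for record in records:
    print(record)
```

Each dictionary printed corresponds to one row of the table, and each key corresponds to one column.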

Using text data sources with Base

Base can recognize and access many text data source files. Maintaining these data source files requires using a

separate program to enter, remove, or modify the text in the file or files. Base will create the queries and reports from

the text files but will not modify the file by adding, changing, or removing text.

Normally text data source files should be created and modified using the same program. This will limit the potential

for errors. Figure 1 is an example of this. This file could be modified using a word processor, but you might then

misplace the commas. This type of error might make the data unusable. However, using spreadsheet programs such

as LO Calc, you can make your modifications and then save the file using the CSV format. Then all the commas will

be placed right where they need to be.

A text data source (database) can contain more than one text file with the same file structure. This might be several

address books using the CSV format: one address book for general personal use, another for business use, and

another for a social organization. As long as they are all placed in the same folder, Base will access all the files with

the format you tell Base to use.

If you want to have more than one text database, you need to have separate folders for each one. Otherwise, you

may find yourself accessing text files from more than one database especially if the two text databases use the same

file format.
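
The "one folder, one text database" rule can be pictured with a short Python sketch. It is an assumption-laden illustration rather than a description of what Base does internally: the function name load_text_database and the file names are hypothetical. Every file in the folder with the chosen extension becomes one table, and files with other extensions are ignored.

```python
import csv
import tempfile
from pathlib import Path

def load_text_database(folder, extension=".csv"):
    """Treat every file in folder with the given extension as one table."""
    tables = {}
    for path in sorted(Path(folder).glob("*" + extension)):
        with path.open(newline="", encoding="utf-8") as handle:
            # The table is named after the file, without its extension.
            tables[path.stem] = list(csv.DictReader(handle))
    return tables

# Two hypothetical address books and one unrelated file in the same folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "personal.csv").write_text(
        '"Name","Email"\n"John Smith","john@example.com"\n', encoding="utf-8")
    Path(folder, "business.csv").write_text(
        '"Name","Email"\n"Acme Ltd","office@example.com"\n', encoding="utf-8")
    Path(folder, "notes.txt").write_text("not part of the data source\n",
                                         encoding="utf-8")
    tables = load_text_database(folder)

print(sorted(tables))
```

If a second text database using the same format were stored in the same folder, its files would be picked up as well, which is why each text database needs its own folder.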

Caution

When modifying a text data source file, I suggest you

use the program used to create the data source files.

Otherwise, you can make errors that will corrupt the data you

see when accessing the file using Base.

Flat database

A flat database contains data just as a text data source does, but the flat database is structured in a different way.

Figures 1 and 2 show the differences. The data is organized into one or more tables. (This example has only one

table shown in Figure 2.) Then the data is further organized into columns with each column containing the same


kind of data. (The first column contains only the first names of people, the second column contains only the last

names of people, the third column contains display names, the fourth column contains only nicknames of people,

the fifth column contains the primary email addresses, and the sixth column contains the secondary email addresses.)

These are referred to as fields.

The data is even further organized by the rows. All the data of a row refers to a specific person, place, or thing.

Together they describe the specific person, place, or thing. (The data in the first row refers to John Smith.) These are

referred to as records.

Tables in a flat database are independent of each other. You work with one table at a time. Think of a flat database

with more than one table as being a collection of tables that together serve a general purpose. Each table provides a

part of that purpose. Each table has its own queries and reports that gather information for that one table. Each table

can have its own form.

Computer address books are flat databases. You can divide your addresses into groups (tables) according to your

needs. You can have the same address in different groups if they meet the requirements for more than one group.

One of these may contain only the person's name and business email; another may contain family information,

personal email, and social network information.

Consider a library database that includes book titles, authors' names, information on each author, and publication

information. When the library contains two or more books by the same author, the author's name is a duplication, as

is the author information.

As an example of a database requiring a large number of fields, consider a family database. Some of the fields are

the family member's first and last names (total of 2 fields). Additional fields are the spouse's first and last names. But

then we would need the first and last names for all the spouses. One of my siblings has been married three times (six

fields). So the total number of fields has increased to 8. Then we need the first and last names of the children. Well,

one of my siblings has five children (10 fields). So the total number of fields has gone to 18. A relational database

providing the same information requires only 11 fields. And with these 11 fields, a relational database will also

identify the step children and to what parent they belonged before the marriage. Even more fields would be needed

to do this with a flat database.
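
The field arithmetic above can be checked with a small Python sketch. The function names are hypothetical. The relational field lists are taken from the three-table design (Family member, Spouse, Child) developed later in this chapter.

```python
FIELDS_PER_PERSON = 2  # one first-name field and one last-name field

def flat_field_count(max_spouses, max_children):
    """Fields a single flat table needs: the family member plus a pair of
    name fields for every spouse slot and every child slot."""
    people = 1 + max_spouses + max_children
    return FIELDS_PER_PERSON * people

# Fields of the relational design, one list per table.
RELATIONAL_TABLES = {
    "Family member": ["FM First name", "FM Last name", "Family member ID"],
    "Spouse": ["Spouse ID", "S First name", "S Last name", "Family member ID"],
    "Child": ["Child ID", "C First name", "C Last name", "Spouse ID"],
}

def relational_field_count(tables=RELATIONAL_TABLES):
    """Total number of fields across all tables of the relational design."""
    return sum(len(fields) for fields in tables.values())

print(flat_field_count(max_spouses=3, max_children=5))  # 18
print(relational_field_count())                         # 11
```

The flat count grows with the largest number of spouses and children any one family member has, while the relational count stays fixed no matter how many rows are added.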

Note

Base and Calc can create flat databases. Specifically, Calc

spreadsheets can be saved in dBase format. To save more

than one sheet in a spreadsheet, you have to access each

sheet and save it one sheet at a time. (If you have 3 sheets,

you have to save three times.) For more information on doing

this, see Chapter 5 of the Base Guide.

Using flat databases with Base

There is nothing really complex about planning for a flat database. You first determine the data to be used in the

database and what characteristics the data will have. The types of data and their characteristics will determine the

number of fields and the characteristics of each field. (Each field should have one characteristic that distinguishes it

from all the other fields.) You then name each field and define the characteristics for each field.

Determine if you will need only one or more than one table. Since tables in a flat database are independent from

each other, divide the fields into groups which have no defined relationships with the other groups.

Will the database be contained in one file (as in those databases created with Base) or will it be contained in two or

more files as in dBase or CSV files? If the latter, then all the files need to be placed in the same folder.

Some database programs will create only flat databases. Most can create relational databases as well as flat ones. A

relational database can relate data held in separate tables, while a flat one cannot. But unless the program

defines one or more relationships between data contained in separate tables of a database, it will be a flat one.

Relational databases

Like flat databases, relational databases contain one or more tables. They were originally designed to solve the

problems of duplication of data and multiplication of fields in a flat database.

A relational database has a more complex structure than does a flat one. Below are some of the general principles

that have been found to produce good design in relational databases.

Each table of the database must have one or more fields that uniquely identify each row of the table. For a

given value in one field or given sequence of values in two or more fields, there can only be one row

containing this value or sequence of values.

The tables of a database must meet specific criteria known as first, second, third, and fourth normal form. (These are

discussed in the design phase. See First Normal Form [reference will be placed here.])


Tip

value. A cell with a null value has no entry at all in

it. Anything entered into a cell is considered to be a

value. A cell is considered to be null until a value is

entered into it. Similarly, a cell that has had its

value deleted has gone from being not null to having

the null value.

2) The field or fields that uniquely determine a

row have a name: primary key. This term is

defined and explained in chapter 3 of the Base

Guide, "Data Input and Removal."

Relationships are defined by linking one or more fields of one table with one or more fields of another table.

The link is made by virtue of the fact that the two fields have identical values. The field or fields from one of

the tables will be the primary key for that table. The other fields from the second table may or may not

include that table's primary key. They are called foreign keys. Since sometimes we may want to use the

primary keys of two tables to define our relationship, it is possible that a primary key may also be a foreign

key.

Tip

Relationships in a relational database are defined by

primary-foreign key pairs.
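
To see what a primary-foreign key pair enforces, here is a minimal sketch using Python's built-in sqlite3 module. This is an assumption made for convenience: the embedded engine in Base is HSQLDB, not SQLite, and the author and book tables are hypothetical, loosely following the library example mentioned earlier. The principle shown is the same: a primary key value may occur only once, and a foreign key value must match an existing primary key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only checks foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE book ("
    " book_id INTEGER PRIMARY KEY,"
    " title TEXT,"
    " author_id INTEGER NOT NULL REFERENCES author(author_id))"
)
conn.execute("INSERT INTO author VALUES (0, 'First author')")
conn.execute("INSERT INTO book VALUES (0, 'First book', 0)")

def try_insert(sql):
    """Run an INSERT and report whether the engine accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as error:
        return "rejected: " + str(error)

# A second row with primary key 0 breaks the uniqueness rule.
duplicate_key = try_insert("INSERT INTO author VALUES (0, 'Same key again')")
# A book pointing at author 99 has no matching primary key to link to.
dangling_link = try_insert("INSERT INTO book VALUES (1, 'Orphan book', 99)")
print(duplicate_key)
print(dangling_link)
```

Both inserts are refused by the engine, so the link between the two tables cannot be broken by bad data entry.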

Example of a relational database and the restraints required of it

We have family data that we want to put into a relational database. It begins with one table: the Family member. It

contains the family member's first and last names, the spouse's first and last names, and the child's first and last names.

A sample table is shown below. (This table does not have a primary key yet.)

Table 1: Family member table

FM First name  FM Last name  S First name  S Last name  C First name  C Last name
Sam            Silver        Martha        Silver       Bob           Silver
Wendy          Whitter       Dave          Whitter      Alice         Whitter

Next we complicate things because Sam has since divorced Martha and married Wilma. Sam and Wilma now have

one daughter, Sandra. (Martha has kept her married name.)

Rather than just adding two more fields for the new spouse (her first and last name), we will create a second table

containing the first and last name of the spouses as well as the child's first and last name. Then we will add the fields

needed to link these two tables. By doing so, we will be able to associate each family member with all of the spouses

they have had.

The new table will be named Spouse for clarity. To identify each row of the Family member table, we have added the

field Family member ID (the primary key) to the Family member table and removed the four fields referring to the

spouse and child. The Family member ID field contains distinct values so we can identify each family member by

the value of the Family member ID field.

The Spouse table contains the Spouse ID field (the primary key), the four fields removed from the Family member

table, and the Family member ID field (foreign key). This last field is used to link the Spouse table to the Family

member table. The first two rows of the Spouse table are both linked with the first row of the Family member table.

For these rows, the value of the Family member ID field is the same in both tables: 0. At the same time, individual spouses are

identified by the Spouse ID fields. Using the Family member ID with the Spouse ID fields, we can determine who has

been married to whom and what children belong to each one of these marriages. (See Tables 2, 3,and 4)

With the design we have now, we can list as many spouses for any given family member as we need. We do not

have to add any fields. Furthermore, we could add one or more fields such as the wedding date and date of birth.

Table 2: Modified Family member table

FM First name   FM Last name   Family member ID
Sam             Silver         0
Wendy           Whitter        1

Table 3: Spouse table

Spouse ID   S First name   S Last name   C First name   C Last name   Family member ID
0           Martha         Silver        Bob            Silver        0
1           Wilma          Silver        Sandra         Silver        0
2           Dave           Whitter       Alice          Whitter       1

Wendy Whitter has now had a second child, David, so we need to apply the same principles to the Spouse table that we did to the Family member table. We will create a Child table containing the child's first and last names. We also need a field to identify each of the children listed. I have used Child ID (the primary key). The Child table also needs a field, Spouse ID (foreign key), to link the Child table to the Spouse table.

Table 4: Modified Spouse table

Spouse ID   S First name   S Last name   Family member ID
0           Martha         Silver        0
1           Wilma          Silver        0
2           Dave           Whitter       1

Table 5: Child table

Child ID   C First name   C Last name   Spouse ID
0          Bob            Silver        0
1          Sandra         Silver        1
2          Alice          Whitter       2
3          David          Whitter       2

We have three tables in our database with 11 fields. With this structure, we can list all the family members, their spouses, and their children regardless of how many times any of the family members have married, how many children were born in these marriages, and how many step-children there may be.
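The three-table structure can be sketched in code. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Base's own database engine, and the table and column names (family_member, spouse, child, and so on) are hypothetical renderings of the field names in Tables 2, 4, and 5.

```python
import sqlite3


def build_family_db():
    """Create the three related tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this setting is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE family_member (
            family_member_id INTEGER PRIMARY KEY,
            fm_first_name    TEXT NOT NULL,
            fm_last_name     TEXT NOT NULL
        );
        CREATE TABLE spouse (
            spouse_id        INTEGER PRIMARY KEY,
            s_first_name     TEXT NOT NULL,
            s_last_name      TEXT NOT NULL,
            family_member_id INTEGER NOT NULL
                REFERENCES family_member (family_member_id)
        );
        CREATE TABLE child (
            child_id     INTEGER PRIMARY KEY,
            c_first_name TEXT NOT NULL,
            c_last_name  TEXT NOT NULL,
            spouse_id    INTEGER NOT NULL REFERENCES spouse (spouse_id)
        );
    """)
    conn.executemany("INSERT INTO family_member VALUES (?, ?, ?)",
                     [(0, "Sam", "Silver"), (1, "Wendy", "Whitter")])
    conn.executemany("INSERT INTO spouse VALUES (?, ?, ?, ?)",
                     [(0, "Martha", "Silver", 0), (1, "Wilma", "Silver", 0),
                      (2, "Dave", "Whitter", 1)])
    conn.executemany("INSERT INTO child VALUES (?, ?, ?, ?)",
                     [(0, "Bob", "Silver", 0), (1, "Sandra", "Silver", 1),
                      (2, "Alice", "Whitter", 2), (3, "David", "Whitter", 2)])
    return conn


conn = build_family_db()

# Count the fields across all three tables.
tables = ("family_member", "spouse", "child")
total_fields = sum(
    len(conn.execute(f"PRAGMA table_info({t})").fetchall()) for t in tables
)

# Every spouse linked to family member 0 (Sam).
spouses_of_sam = [row[0] for row in conn.execute(
    "SELECT s_first_name FROM spouse "
    "WHERE family_member_id = 0 ORDER BY spouse_id")]

# A spouse row pointing at a family member that does not exist is refused.
try:
    conn.execute("INSERT INTO spouse VALUES (3, 'Nobody', 'Nowhere', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(total_fields, spouses_of_sam, orphan_rejected)
```

Adding another spouse or child is one more row in the corresponding table. No new field is needed, which is the point of the design.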

With a flat database, it would take 10 fields to list the family members, their one spouse, and one child. For each

additional spouse a family member may have, 2 more fields must be added. For each additional child, 2 more fields

must be added.

Figure 3 shows Sam Silver as the family member with two spouses, Martha and Wilma, and one child, Bob, who belongs to Sam and Martha. (The green arrow is on the left end of Martha's row.) Figure 4 shows that Sam and Wilma have had a child, Sandra, during their marriage. (The green arrow is on the left end of Wilma's row.)

Figure 3: Showing the child from one spouse for a family member

Figure 4: Showing a child from another spouse from the same family member

Figure 5 shows that Wendy has been married once and that two children have resulted from that marriage.

Figure 5: Showing two children from one marriage

Defined relationships in a relational database serve another purpose. They permit queries to be created using the

tables that have these relationships. So with this example, we can have a query that will use all three tables since all

three tables are related. This query would use the First name and Last name of the Family table, the First name and

Last name of the Spouse table, and the First name and Last name of the Child table. By specifying the values for the

names from the Family and Spouse tables, the query will tell us the names of the children who belong to this specific

couple. This will only work because of these two relationships: the Family member ID field pair, and the Spouse ID

field pair.
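A query of this kind might look like the following sketch. It again assumes Python's sqlite3 module rather than Base itself, and the table names, column names, and the helper children_of are hypothetical. The two JOIN conditions correspond to the two relationships: the Family member ID field pair and the Spouse ID field pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE family_member (family_member_id INTEGER PRIMARY KEY,
                                first_name TEXT, last_name TEXT);
    CREATE TABLE spouse (spouse_id INTEGER PRIMARY KEY,
                         first_name TEXT, last_name TEXT,
                         family_member_id INTEGER
                             REFERENCES family_member (family_member_id));
    CREATE TABLE child (child_id INTEGER PRIMARY KEY,
                        first_name TEXT, last_name TEXT,
                        spouse_id INTEGER REFERENCES spouse (spouse_id));
    INSERT INTO family_member VALUES (0, 'Sam', 'Silver'), (1, 'Wendy', 'Whitter');
    INSERT INTO spouse VALUES (0, 'Martha', 'Silver', 0), (1, 'Wilma', 'Silver', 0),
                              (2, 'Dave', 'Whitter', 1);
    INSERT INTO child VALUES (0, 'Bob', 'Silver', 0), (1, 'Sandra', 'Silver', 1),
                             (2, 'Alice', 'Whitter', 2), (3, 'David', 'Whitter', 2);
""")


def children_of(conn, member_first, spouse_first):
    """Return the children of one specific couple, using all three tables."""
    query = """
        SELECT c.first_name, c.last_name
        FROM family_member AS fm
        JOIN spouse AS s ON s.family_member_id = fm.family_member_id
        JOIN child  AS c ON c.spouse_id = s.spouse_id
        WHERE fm.first_name = ? AND s.first_name = ?
        ORDER BY c.child_id
    """
    return conn.execute(query, (member_first, spouse_first)).fetchall()


print(children_of(conn, "Sam", "Wilma"))   # [('Sandra', 'Silver')]
print(children_of(conn, "Wendy", "Dave"))  # [('Alice', 'Whitter'), ('David', 'Whitter')]
```

Without either join condition, the query would have no way to tell which child rows belong to which couple.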

Reports are based upon one table, view, or query, but in relational databases, a query can contain fields from more

than one table. In the above example, a report can be generated containing fields from all three tables: Family

members, Spouses, and Child. This is because the report can be based upon a query that contains the three tables.

Relational databases are the most common type of databases used today.

Using a relational database with Base

Planning for a relational database is very similar to planning for a flat database. You are still determining the tables,

fields, and the tables to which the fields belong. The main difference comes from the additional restrictions that a

relational database must have. They are necessary to define the relationships which exist among the fields of the

database.



Planning for a flat or relational database

Tip

Many of the principles taught here apply to both types of databases. So, rather than mentioning both all of the time, we will discuss flat databases. Then we will discuss the additional things that apply to a relational one.

Regardless of what type of database will be used, the planning for creating the database begins at the same point: you have some data that you want to use to provide you with some information. So, you begin with the question: "What do I want the database to do?" The answer to this question is the general goal for the database.

Planning for your database requires you to go from this general goal to more specific goals that will, when

completed, accomplish the general goal. In many cases, you will need to divide the specific goals into even more

specific goals. And you should continue this process until the specific goals cannot be further divided. When you

have finished your planning, you have an outline of what the database should do and what its structure should be.

This planning process involves asking and answering questions based upon what you already have. This is how you

divide your goals into smaller parts. This is true whether you are creating the plan by yourself or someone works

with you creating the plan.

Make sure you take enough time to ask and answer the questions thoroughly, or the plan might not be as complete

as it could have been. You might not notice this until later in the creation of the database. If this happens, you will need to come back to your plan to make changes and also change the things you have done since creating the

original plan. It is common for people not to ask enough questions to make their plan complete. (When I have had

to make changes when creating a database it was often because I did not even think of some possibilities.)

Your plan needs to be in outline form. This permits you to list your main goal in general terms. These general terms

will be enough for the top level of the outline. As you divide each of these general terms, you create a second level.

Further division gives you additional levels of the outline.

You should create your plan using a word processor (such as Writer). You can review the higher and lower levels of

the outline to see the logic you have been using. This also permits you to add lines anywhere in the outline where it

is needed at any time, even long after the planning is done if the plan needs to be updated.

The outline of the plan can become rather complex because you will need to use several levels to get to the specifics.

If you are comfortable working with an outline containing six or more levels, then use the complete outline.

However, for most people, I suggest you break the outline into smaller parts. It should make your task much easier.

The outline can be divided into three parts: the purposes for the database, the type of database you will use, and the data to be contained in the database. The first two parts are not very complex. However, describing the data to be

used can produce a very complex outline. In this part, you will be describing the properties of the data in detail.

The plan: the outline

Note

1) The outline for the database plan will contain comments inline. Each of the comments will precede the portion of the outline to which the comments apply.

2) Each part will be fully developed to the lowest level of the outline before the next part is begun.

The top level of the outline contains the most important question: the purpose of the database. The answer may contain only one item, or it may contain more than one (in which case you need a second level). On this level, you ask what you want each of these items to do. The individual answers to these questions may contain one item or a list. If an answer contains only one item, make sure the answer is detailed. If an answer contains a list, you go to a third level using the same basic question as on the second level.

There is a pattern contained in the above paragraph. It is a pattern that you continue until all the lines of questions

end in single item answers. I have listed only two levels here and listed three levels in the outline shown below.

Some database plans require only one level, some require two, and others require three or more.

The answer to Question 2 in Part 1 should be very general in nature since Part 3 asks you to describe the data in very specific terms. For example, data in a financial database will have to contain amounts, accounts, descriptions for transactions, dates, and perhaps more. A library database will contain information about the books, information about the people borrowing the books, and information about loans of books.

The answers to Question 3 in Part 1 should also be general in nature since Part 3 also asks you to describe the queries and reports in specific terms, including what fields will be used. The answers can be in the form of "I want the database to tell me ..." Your answers to Question 3 will fill in the ellipsis (...).

Tip

Remember that your database plan needs to answer questions that begin with the word "What." Your database design answers questions beginning with the word "How." It is not very hard while developing your plan to drift from asking "What" to asking "How."

Part 1: Purpose of the database

1) What is the purpose of the database?
   a) What do you want (item in the list) to do?
      i) What do you want (item in the list) to do?
         What do you want (item in the list) to do?
2) What general types of data will be in the database?
3) What information do you want from the database? (queries and reports)
   a) What information do you want to get from queries?
   b) What information do you want to get from printed reports?

Part 2: Type of database

Base works with two types of data sources: external ones that it accesses, and embedded databases that it has created.

Since text and spreadsheet data sources are somewhat different from others accessed by Base, we will consider them

separately from the others. (The Base wizard contains the list of data sources that Base can access as a drop-down list

on page 1.)

Question 1 contains mutually exclusive choices so you only have to answer the questions in the appropriate section.

Question 2 needs to be answered by people who use a network or are concerned with computer security.

1) What type of database file or files will you be using?
   a) Accessed database
      i) For text or spreadsheet data sources:
         What program will you use to create, modify, or delete fields and tables from your data source?
         Should a folder be created for the data source file or files? (A text data source will need to have a special folder.)
      ii) Other accessed databases:
         What type of database file or files will you be accessing? (For example: Access, MySQL, PostgreSQL, or Oracle)
         Should a folder be created for the data source file or files?
   b) Embedded databases
      i) Will you be using a flat or relational database?
      ii) Preference as to where to place the database file?
2) Will you be accessing the database file or files on your computer or over a network?
   a) Who should have access to the database?
   b) Will you want to have user names and passwords to control people's access?

Part 3: Data to be used in the database

This part of the outline is very important. This is where you define the attributes (characteristics) of your data. Each

attribute will be a field for the database. Once you have defined the attributes, you can separate them into tables

based on common relations that exist among the fields. If you are creating a relational database, the tables you create

in your plan will be modified and additional tables added as you design it based upon the plan outline you are

creating now.

Part 3a: Defining the attributes of the data



Tip

.

in the Table Design dialog. (These choices are discussed in more detail in the "Creating the table" section of Chapter 3 of the Base Guide.)

1) What data will you be using?
2) What names will you give to your data, the names of the fields in the database?
3) What are the characteristics of this data?
   a) Name the characteristic for each field. (Choices: tiny integer, big integer, image, binary (variable or fixed), memo, fixed length text, number, decimal, integer, small integer, floating decimal, real number, double accuracy, variable length text, variable length text (ignore case), Yes/No, date, time, date/time, other)
   b) What are the additional properties for each field?
      i) Will you require that an entry always be made for the field?
      ii) What is the length of the field? (The maximum number of characters that any entry for the given field can have.)
      iii) Do you want a default value for this field that will be entered for each record in the table? If so, what is it?
      iv) Is there a specific format that you want to use for this field? (For example, you can specify currency format when selecting number or decimal as the field type, the date format, time format, or the date stamp format.)

Part 3b: Planning the tables

Another thing that may determine some of the tables we will need: some fields will have a limited number of distinct

values. For each such field, we will create a table with a single field containing these distinct values. These will be

used to provide drop-down lists for these fields in the form(s) containing them. (We will use list boxes for the drop-

down lists.)

For example: A field for the days of a week. This has seven distinct values. Or, consider a field for the months of the

year. This will have 12 or 13 values. (Some non-Gregorian calendars have a leap month.)

It is possible that our initial list contains some "fields" that actually are not fields at all, but alternative values of a

single field. When this happens, list this field with the others. Name the table to be created for the field, and list the

distinct values with the name of the table.

For example: our budget database. The budget categories may seem like fields, but they are only distinct values for

specific budget levels. We will develop the plan for this database as our example after we define what a plan should

be.

The next step is to look at the fields to see if there are any relationships existing between them. We will be separating

the fields into tables based upon these relationships.

Note

A household inventory database might be a flat

database with one table. This permits queries to

provide detailed information needed for insurance

purposes and the value of the inventory. It might

also be a flat database with one table for each room

of the house. This permits queries or views that

display what is contained in each room. This might

be very helpful for a moving company. (A view

could also be created for each room displaying the

same information from a single table.)
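The single-table alternative mentioned in this note can be sketched as follows. This is only an illustration: it assumes Python's sqlite3 module instead of Base, and the inventory table, its fields, the sample rows, and the living_room view are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One flat table holds every item in the house.
    CREATE TABLE inventory (
        item_id INTEGER PRIMARY KEY,
        item    TEXT NOT NULL,
        room    TEXT NOT NULL,
        value   REAL NOT NULL
    );
    INSERT INTO inventory (item, room, value) VALUES
        ('Sofa', 'Living room', 800.0),
        ('Television', 'Living room', 450.0),
        ('Bed', 'Bedroom', 600.0);

    -- A view shows one room without copying any data.
    CREATE VIEW living_room AS
        SELECT item, value FROM inventory WHERE room = 'Living room';
""")

# What the moving company would see for one room.
living_room_items = conn.execute(
    "SELECT item FROM living_room ORDER BY item").fetchall()

# What the insurance agent would see for the whole house.
total_value = conn.execute("SELECT SUM(value) FROM inventory").fetchone()[0]

print(living_room_items, total_value)
```

With this layout the whole-house total stays a query on one table, while each room still has its own display.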

1) What fields have a limited number of distinct values?
   a) Name the table for each of these fields.
   b) For each field, list the distinct values it has. For each field, list the field type and properties.
2) What fields are really distinct values for a specific field?
   a) Name a table for each set of distinct values.
   b) Name a field for each set (if it does not already exist) and list it with the fields of the database.
3) What are the relations that exist among the fields?
4) Do these relations agree with the purposes of our database?
   a) Only if the tables suggested by the relations agree with the stated purposes of our database should we use these tables.



Part 1 of the plan usually provides little specific information that can be used in the design phase, but it does provide

the general structure needed. It therefore places limits upon the other two parts as we work through them.

For example, consider the flat database described in Chapter 1 of the Base Guide, Introducing Base. Part 1 of that plan requires that the database provide information that one's insurance agent could use to determine what the insurance policy should be for the contents of the house. This limits the number of tables to just one.

Note

Designing a database is an art. This means that more than one design may accomplish the goals listed in the database plan.

Tip

While designing a database, you should have a

separate copy of your plan that you can view. This

can be one or more sheets of paper or a text

document to be viewed on a monitor. If you are

designing it on a computer, you could have your

plan and design for your database on your monitor

at the same time (if it is big enough).

Part 1: Purpose of our database and a design for it

Part 1 provides the structure for the design's outline. So keep this part of the design in general terms. The more

specific parts of the design will be filled in in Parts 2 and 3.

Purpose of the database

The plan has a list of things to be obtained from the database. These will add structure to it as well as place

restrictions upon it. Design the general structure based on the answers to question 1 of Part 1 of the plan. Also design

the restrictions the same way.

1) What structure is required to accomplish each item listed in the answers to question 1a (including lower

levels) of Part 1 of the plan?

2) What restrictions are required because of these answers?

General types of data

These are described in greater detail in Part 3 of the design. In the plan, they served as an outline for all the data to

be used. Part 3 of the plan used this outline to develop all of the data needed. We will not have to include them in

Part 1 of the design.

General description of the information desired from the database

The general outline of the queries and reports is done here. In Part 3, these outlines are filled in based on the tables

created from the data.

Information from queries

This is the time to list the query or queries to be created, if any. Then we need to give each one of them some basic

structure: detailed or summary query. We should also list the general restrictions that apply to each query. Usually,

these restrictions will be in WHERE clauses, but summary queries can use the GROUP BY clause. A few of them use

the HAVING clause with GROUP BY. (This might not be obvious until you reach Part 3 of the design.)

1) Name the queries to be created.
2) Detailed queries: What restrictions are required using the WHERE clause?
3) Summary queries:
   a) What restrictions require the WHERE clause?
   b) What restrictions require the GROUP BY clause? Will the HAVING clause be required as well?
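The difference between the three clauses can be seen in a small sketch. It assumes Python's sqlite3 module as a stand-in for Base, and the expense table, its fields, and its sample rows are hypothetical. WHERE restricts individual rows, GROUP BY collects rows for a summary, and HAVING restricts the summarized groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE expense (
        expense_id INTEGER PRIMARY KEY,
        category   TEXT,
        amount     INTEGER,
        year       INTEGER
    );
    INSERT INTO expense (category, amount, year) VALUES
        ('Food', 120, 2012), ('Food', 80, 2012), ('Auto', 40, 2012),
        ('Auto', 30, 2011), ('Rent', 500, 2012);
""")

# Detailed query: WHERE picks the individual rows to show.
detailed = conn.execute("""
    SELECT category, amount FROM expense
    WHERE year = 2012 AND category = 'Food'
    ORDER BY amount
""").fetchall()

# Summary query: WHERE still filters rows, GROUP BY totals them per category.
summary = conn.execute("""
    SELECT category, SUM(amount) FROM expense
    WHERE year = 2012
    GROUP BY category
    ORDER BY category
""").fetchall()

# HAVING filters the groups after they have been totaled.
large_categories = conn.execute("""
    SELECT category, SUM(amount) FROM expense
    WHERE year = 2012
    GROUP BY category
    HAVING SUM(amount) > 100
    ORDER BY category
""").fetchall()

print(detailed)
print(summary)
print(large_categories)
```

A restriction on a total, such as "categories over 100", cannot go in WHERE because the total does not exist until the rows have been grouped.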

Information from reports

We should look carefully at the report descriptions in the plan. If there is not a query or table upon which the report

can be created, a query or view must be created before the report can exist.

For all the reports that have tables or queries from which to get the data, we need to determine what type of report it

should be: static, or dynamic. You also have the choice of using the Data window (F4) to insert data into a Writer

document or Calc spreadsheet. For example, the report could take the form of a letter describing the status of

investments for the previous year.

Now that LibreOffice also has the Report Builder extension installed, more elaborate reports can be created with it than with the report wizard. These reports can take the form of a text document or spreadsheet.

1) How will each report be generated? (Writer document, Calc spreadsheet, report wizard, or Report Builder?)
      Include a brief outline of the desired report.
      Report Builder: Will it be a spreadsheet or text document?
2) What query, table, or view will the report use? (It can only use one.)
   a) If no existing one will do, what is the basic structure of the needed query or view?
   b) Name the query, table, or view if there is one.
3) Answer for each report: will it be static or dynamic?

Part 2: Type of database and a design for it

The center of this part is the LO database document file (extension .odb). We may want the database to be embedded

within this file. Or we may want to use the file to connect to a database (an accessed database).

This raises even more questions to consider in the design. The type of database must be considered as well as the

location of the database when it is an accessed database. This is especially true when a local area network is involved.

Security also plays a part in the design. Base does not encrypt its file. But it can access databases which require a

password. If on a network, access can be controlled by use of passwords and access rights. All of these become a

part of the design.

Tip

When LO is on a network, the security issues belong to the IT people. They will tell you what you need to know when designing your database.

We must consider the input and output of the database as well. With embedded databases, LO can create both data

input (tables and forms) and data output (queries and reports). With external databases, LO cannot always add,

modify, or delete data. It can create data output. Text databases and spreadsheets are the two examples for which LO

cannot input or modify data. Neither can LO create a view for either. So, if you will be using either of these, you

must decide what program you will use to input data into the database.

1) Will the database be embedded within the database document file, or will this file be used to access the

database?

2) If the database is a text or spreadsheet, what will you use to add, modify, or remove data?

3) For other accessed databases:
   a) How do you access it?
      i) What database driver is needed?
      ii) What settings are required for the database driver?
   b) Where will the database be located? (same computer, network server)
      i) User and password required?
      ii) What is the path to the database on the network?
         What are the access rights to the database?

Database data and a structure for it

Part 1 of the design contains the general outline of the desired database while Part 2 of the design contains the outline of its type, its creation, and what is required to use it. Part 3 of the design describes it in very specific terms.

This includes defining how data will be added, modified, or removed (data input and removal). Tables, views, and

forms for it are defined where needed. And if a relational database will be used, the relationships between tables are

defined.

Part 3 also includes how the data will be used (data output). Queries and reports are defined including what fields and their tables are used, what functions are needed (if any), what restrictions are required, and how the output data

should be sorted.

We begin with Part 3 of the plan. The latter lists the specific fields and their possible tables for the database. Then we look at

how we can use what we have, and we begin to ask questions again. When planning it, we asked ourselves what we wanted.

Now we ask another question: how do we do it?

Fields

It may seem as if we have all the information we need about our fields in Part 3 of the plan section. But we must now

take the same information and look at it from a different perspective. For example, some of the fields may not really



be fields but only field values. (This will be more obvious when we begin our design of the tables.)

We need to create one or more three-column lists to contain our fields. The first column contains the field names, the

second column the field types, and the third column the field properties. Thus each row contains a field name, its

field type, and its field properties.

In the plan, we have already divided the fields into proposed tables for it. We should now create a list of fields for

each proposed table so that we can place fields that share a relationship in the same table. It is also easier to spot

errors or missing fields within a smaller list per table than might be possible when we put all of the fields into a single

large list.

What fields refer to the same thing?

Field names

We have already defined and named our fields when we created our plan. At this point we should review our field

names for their usefulness.

Do we want to change any of the field names?

Do we want to add or remove any fields?

Field types

As with the field names, we need to review the field types we have given to our fields. We might need to make some

changes. If we added a field when we reviewed the field names, we also need to assign a field type to the new field.

Have all of our fields been given an appropriate field type? If not, correct the field type.

Field properties

These are divided into four parts: AutoValue and Auto-increment statement, or Entry required; Length, or Length and Decimal; Default value; and Format example.

AutoValue and Auto-increment statement

The AutoValue and Auto-increment statement properties apply only to the primary key of a table. The AutoValue property determines whether this field's values are automatically generated by the table or not. The Auto-increment statement determines the value of the increments. (This statement is written in SQL.)

Entry required

For all fields of a table except for the primary key, you have a choice as to whether a value must be entered in every record of the table or not. (This is one of the questions in Part 3a of the plan.) You need to look again at each one of the fields to determine if the value for Entry required needs to be changed.

Length

Length describes the maximum number of characters contained in a field value. (Fields having a date field type do not

have this property.) Its purpose is to prevent unnecessary space requirements in the database. Yet you want to make

sure that each of your fields is long enough. For example, a field for six digit odometer readings does not need a

length more than 6 digits. But a database for a very large bank may require much longer fields. Use your own

judgment as to what is enough, and what is too much.

Associated with the Length property is the Decimal property. This one determines how many decimal places the field

value will have. This divides the length into two parts: the number of digits to the left of the decimal point, and the

number of digits to the right of it.

Default value

You may want some fields to have a specific value until you change it. You specify what value you want the field to have. This may be text, a number, or a date. Fields with the Yes/No field type permit three choices for the default value: None, Yes, and No.

Format example

Field formatting is the last field property. It is tempting to apply all of the formatting that you can here. In most

cases, this should not be done. Only use essential formatting unless you are going to view the data in its table.

Field data is usually only seen in forms, queries, reports, and views. It is in these places that you should format the field according to the appearance you want there.

1) Primary key:
   a) Is the AutoValue set properly for the primary key of each table?
      i) Is it set to Yes when the Field type is Integer [INTEGER]?
      ii) Is it set to No when the Field type is not Integer [INTEGER]?
   b) If an Auto-increment statement is desired, use SQL to write it.
2) Entry required: Should a field have an entry in every record?
   a) If so, select Yes.
   b) If in doubt, select No.
3) Length:
   a) For non-decimal fields, select a length that is appropriate for the data the field will contain.
   b) For fields that contain decimals, enter the appropriate number of decimal places.
4) Default value: Should the field have one?
      If the answer is yes, enter the value.
      If the field is Yes/No (Boolean), select the value you will use.
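The field properties in this checklist map roughly onto column definitions in SQL. The sketch below is an assumption-laden illustration, not the statement Base generates: it uses Python's sqlite3 module, a hypothetical budget table, and SQLite's own AUTOINCREMENT keyword. SQLite accepts the declared lengths but does not enforce them, and it numbers automatic keys from 1, so treat the lengths as documentation only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE budget (
        id        INTEGER PRIMARY KEY AUTOINCREMENT, -- AutoValue: Yes
        category  VARCHAR(50) NOT NULL,              -- Entry required, Length 50
        amount    DECIMAL(10, 2) NOT NULL,           -- Length 10, Decimal places 2
        reconcile BOOLEAN NOT NULL DEFAULT 0         -- Default value: No
    )
""")

# The id and reconcile values are filled in by the table, not by us.
conn.execute("INSERT INTO budget (category, amount) VALUES ('Food', '12.00')")
conn.execute("INSERT INTO budget (category, amount) VALUES ('Auto', '17.21')")
rows = conn.execute(
    "SELECT id, category, reconcile FROM budget ORDER BY id").fetchall()

# Leaving out a field marked Entry required is refused.
try:
    conn.execute("INSERT INTO budget (amount) VALUES ('5.00')")
    missing_entry_rejected = False
except sqlite3.IntegrityError:
    missing_entry_rejected = True

print(rows, missing_entry_rejected)
```

Choosing No for Entry required when in doubt, as the checklist suggests, simply means leaving NOT NULL off that column.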

Tables

In the planning phase, we placed the fields of the database into one or more tables based upon the relations that exist

between the fields. In the design phase, we will consider these more closely. It may require us to make some changes

to our tables: modifying some, and perhaps creating some.

Tables for list boxes

If you have identified one or more fields that have a limited number of distinct values, you have also named the

tables that will contain these values. We must design a table for each set of values. If you have not done so already,

for each field with distinct values, define its characteristics.

If when planning you listed one or more fields having a limited number of distinct values, you need to design a table

for each one of these fields. If so, you have a list of these fields, the table names associated with them, and their field

types and properties. All of these will be used to define the tables needed to create list boxes for these fields in the

forms.

List boxes are very useful for fields with a limited number of distinct values. We use them to create drop-down lists

in our forms. We also use them for consistency in our data entries for these fields.

List each field with distinct values, the table holding its values, the characteristics of the table's field, and its

distinct values.

Because of the structure we have placed upon each table used for a list box:

The table contains only one field.

This field is the primary key for the table.

You can only enter distinct values into this field.

Tip

In most cases, you will be using VARCHAR as the

field type for the field. Other field types can be

used, but this will be the most common one.
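A list-box table of this kind can be sketched as follows, assuming Python's sqlite3 module in place of Base. The weekday table and its name field are hypothetical, using the days-of-the-week example from the plan. Because the single field is the primary key, a repeated value is refused.

```python
import sqlite3

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

conn = sqlite3.connect(":memory:")
# One field only, and that field is the primary key.
conn.execute("CREATE TABLE weekday (name VARCHAR(9) PRIMARY KEY)")
conn.executemany("INSERT INTO weekday VALUES (?)", [(day,) for day in DAYS])

# Entering a value that is already present fails.
try:
    conn.execute("INSERT INTO weekday VALUES ('Monday')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The list a drop-down would offer, in the order the values were entered.
choices = [row[0] for row in
           conn.execute("SELECT name FROM weekday ORDER BY rowid")]

print(choices, duplicate_rejected)
```

A form's list box would then draw its entries from this table, so every record uses exactly one of these spellings.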

Flat databases

Flat databases contain tables that are independent of each other while relational databases have tables in which one

or more fields of a table are related to one or more fields of another table. In fact, a field of a table may be related to

another field of the same table. Because of this we design the tables of a flat database differently from those of a

relational database.

For a flat database, our table has already been designed when we designed the fields with their field type properties.

The only thing left to do is to create a table that implements these fields. But since a field is likely to have several

properties, we will divide these into individual properties as shown.

Before doing this, consider in what order you want each field to appear in its table. It may well be sensible to have some consistency between the order of fields in a table and the form that contains it.

So, create a list like the one below for each table of the database. Begin by listing the fields in the order you have

now given them. For each table of the database, enter the characteristics of its fields as you have already defined

them. You will be using these lists when you create the database.

As you fill out these lists of field characteristics, you should review the fields you have defined.

Are there any more fields that might need to be added?

What are the characteristics of these fields?

Tables 6 and 7 are really one table. Each one contains part of the characteristics for the four sample fields.



Table 6: Sample Characteristics of the fields of a database table

Field Name   Field type   AutoValue   Auto-Increment Statement   Entry Required
ID           [INTEGER]    Yes         (none)                     Yes
Budget 1     [VARCHAR]                                           Yes
Amount       [DECIMAL]                                           Yes
Reconcile    [BOOLEAN]                                           Yes

Table 7: Sample Characteristics of the fields of a database table

Field Name Length Decimal places Default value Format Example

ID

Budget 1 50

Amount 10 2

Reconcile 0 No

Table 8: Sample table with values.

ID   Budget 1   Amount   Reconcile
0    Auto       17.21    Yes
1    Food       12.00    No
2    Income     0.03     Yes

Relational databases

These require a different method of design. While flat database tables are well defined from the time they become

part of the plan, relational database tables must meet more stringent requirements in the design phase. Fields may be

moved from one table to another. Additional tables may be needed and, in technical terms, the tables have to be

normalized. Relationships existing between tables have to be defined using one or more fields of one table and one

or more fields of another.

Normalized table

One that has been shown or modified to meet the requirements for first through Boyce-Codd normal form.

When physically representing these tables, we will use the column headings (field names). For example, Table 8 would look

like:

ID Budget 1 Amount Reconcile

When modifying or creating a table, we will use this to indicate what fields belong to it. Sometimes we may need to

move one or more fields from one table to another. Or we may need to create a table with a field identical to one in

another table. We must then use the same name for the field in both tables.

One reason why fields have to be moved or duplicated and new tables created is to conform the tables to the rules required for relational databases. This is done over four levels: first, second, third, and Boyce-Codd normal form, which this chapter treats as the fourth level.

Note

While additional normal levels exist, relational databases that have fourth normal form will run very well. In many cases, adding another normal form level does not improve how well the database runs.

Microsoft, and perhaps some others, have combined the original fourth normal form with an additional requirement to form what they define as fourth normal form.

While there are precise definitions of the normal forms, they are very technical in nature. Unless you have a good mathematical background, these definitions are very difficult, if not impossible, to understand.

First normal form

Before making a table first normal form, we have to look for two possible characteristics in the columns and rows that must be removed if they exist. Does the table contain two or more columns that are identical in nature? Does the table contain two or more rows with the same value for one field but different values for another field? If the answer to both questions is no, the table is first normal form.

These two characteristics are two different ways of looking at the same problem. For example, consider an address book. It contains these fields: first name, last name, street address, city, state, zip code (postal code), and phone number. One person listed in it has three phone numbers: a personal cell phone, his office cell phone, and his home phone (land line). Another person listed has two addresses: where she lives and a post office box (for her snail mail). This table would not be first normal form. One row of data contains three entries in the phone number field, and another row contains two entries in the street address field.

Table 9: Address Book not first normal form

First Name   Last Name   Street Address     City      State   Zip Code   Phone no.
Albert       James       345 First          Hart      IN      12345      (111) 111-1111
                                                                         (222) 222-2222
                                                                         (333) 333-3333
Wanda        Anderson    111 Main Apt. 14   Amherst   MA      15115      (100) 100-1010
                         P O Box 1991

We need to consider the primary key for the table because this is important. Originally, all the data referred to specific people and could be divided into rows accordingly: one row for each person. As long as there is no duplication of names, the fields First Name and Last Name can be the composite primary key. If duplication exists, an ID field can be used as the primary key. Then each individual's name is associated with a specific ID value.

We still have two rows in which one field has multiple values.

Table 10: Not first normal form

ID (PK)   F N      L N        S A                City      State   Z C     P No.
1         Albert   James      345 First          Hart      IN      12345   (111) 111-1111
                                                                           (222) 222-2222
                                                                           (333) 333-3333
2         Wanda    Anderson   111 Main Apt. 14   Amherst   MA      15115   (100) 100-1010
                              P O Box 1991

Another possibility is to associate each phone number with a specific ID value. In our case, we would also have to associate each street address with a specific ID value. While we can determine what each of the addresses is for, how do we know which phone number to use to reach a given phone? This approach makes the situation too complex. The more fields have multiple entries, the more complex it becomes. Furthermore, a field is still needed to distinguish between the multiple entries of each field that has them.

Table 11: First normal form

F N L N S A City State Z C P no. ID (PK)



Tip

The new composite primary key will create additional rows of data that contain repetitions, with the exception of the field containing multiple values and the field added because of these values.
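To make the effect of the composite primary key concrete, here is a minimal Python sketch of the expansion. Python is used only for illustration, and the helper name to_first_normal_form is hypothetical. The labels for the type of phone and type of address are the ones used in the Phone and Address tables of this chapter. Every combination of one phone and one address for a person becomes its own single-valued row, so names and addresses repeat.

```python
from itertools import product

# Address book entries with multi-valued phone and address fields.
address_book = [
    {
        "ID": 1, "First Name": "Albert", "Last Name": "James",
        "phones": [("Cell", "(111) 111-1111"),
                   ("Bus Cell", "(222) 222-2222"),
                   ("Home", "(333) 333-3333")],
        "addresses": [("Home & mail", "345 First", "Hart", "IN", "12345")],
    },
    {
        "ID": 2, "First Name": "Wanda", "Last Name": "Anderson",
        "phones": [("Home", "(100) 100-1010")],
        "addresses": [("Home", "111 Main Apt 14", "Amherst", "MA", "15115"),
                      ("Mail", "P O 1991", "Amherst", "MA", "15116-1991")],
    },
]


def to_first_normal_form(entries):
    """Return one single-valued row per (phone, address) combination."""
    rows = []
    for entry in entries:
        for phone, address in product(entry["phones"], entry["addresses"]):
            phone_type, phone_no = phone
            address_type, street, city, state, zip_code = address
            rows.append({
                "ID": entry["ID"],
                "Type of Phone": phone_type,
                "Type of Address": address_type,
                "First Name": entry["First Name"],
                "Last Name": entry["Last Name"],
                "Street Address": street, "City": city,
                "State": state, "Zip Code": zip_code,
                "Phone no.": phone_no,
            })
    return rows


rows = to_first_normal_form(address_book)
for row in rows:
    print(row["ID"], row["Type of Phone"], row["Type of Address"], row["Phone no."])
```

The two original entries become five rows: three for the first person (three phones, one address) and two for the second (one phone, two addresses).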

Second normal form

For a table to be in second normal form, it must be in first normal form and the fields must be dependent upon the entire primary key. Clearly, if the primary key consists of only one field, the table is in second normal form. So, look for tables that have composite primary keys. They are the only ones in first normal form that might not be in second normal form.

The primary key determines the values for the rest of the fields in the table. With composite primary keys, one of the fields in it may determine the values of some of the other fields by itself. That is when problems can occur: repetitions in the rows of the table. Modifying the table to make it second normal form removes the repetitions.

We continue to use the address book, now in first normal form, as our example. It has a composite primary key, so we will check its fields to see if the table is in second normal form.

The field Phone no. is determined by a combination of two fields: Type of Phone and ID. We do not need the fields Type of Address, First Name, or Last Name to determine this field.

Similarly, the fields Street Address, City, State, and Zip Code are determined by a combination of two other fields: ID and Type of Address. We do not need the other field in the composite primary key to determine the values of these four fields.

Finally, the field, ID, determines First Name and Last Name by itself. So, this requires a third table.

Determinant

A field that can determine the values of another field in its table.

Tip

A primary key is a determinant since it determines

the values of all the other fields. When we have a

composite primary key, it may contain a field that is

also a determinant that determines the values of

some fields but not all of them. Having such a field

in a table prevents it from being second normal

form.

This means that we have three determinants in our table. To make the table second normal form, we have to create three tables, one for each determinant. Each of these tables contains the determinant as its primary key. The fields that it determines are moved from the original table to this new one. The original table is modified. It still contains the complete composite primary key. It may contain other fields, but our example does not. When we describe the designing of the Budget database, the example there will have a table containing fields in addition to the primary key.

Table 13: Phone table

ID(PK) Type of Phone (PK) Phone no.

1 Cell (111) 111-1111

1 Bus Cell (222) 222-2222

1 Home (333) 333-3333

2 Home (100) 100-1010

Table 14: Address table

ID (PK)   Type of Address (PK)   Street Address    City      State   Zip Code
1         Home & mail            345 First         Hart      IN      12345
2         Home                   111 Main Apt 14   Amherst   MA      15115
2         Mail                   P O 1991          Amherst   MA      15116-1991

Table 15: Name table

ID(PK) First Name Last Name

1 Albert James

2 Wanda Anderson



We have three new tables, so they should be examined to determine if they are first normal form. They are, because all the fields contain single values. But what about second normal form? (Two tables contain a composite primary key.) For the Phone table, both fields of the primary key are required to determine the Phone no. field. For the Address table, the same thing is true. So, these tables are both second normal form. The third table, the Name table, is second normal form because its primary key contains only one field.

The modified Address Book table consists only of its composite primary key, so it has no other fields that could depend upon just part of that key. So it is also second normal form.

Perhaps some observations need to be made about the modified Address Book table. In all five rows, the First Name

and Last Name field values are determined by the ID field (the primary key). The first two columns come directly

from the first two columns of the Phone table. Similarly, the first and third columns come from the Address table.

Table 16: Modified Address Book table that is now second normal form

ID (PK& FK) Type of Phone (PK & FK) Type of Address (PK & FK)

1 Cell Home & mail

1 Bus Cell Home & mail

1 Home Home & mail

2 Home Home

2 Home Mail

To check for second normal form:

Does the table have only one field upon which all the other fields depend?

If yes, the table is second normal form.

If a table has a composite primary key, can any of the fields of this key determine the values of a table field?

If no, the table is second normal form.
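These questions can also be tried out on sample data. The following Python sketch is only an illustration: the function name determines is hypothetical, and only some of the address book fields are included. Sample data can show that one field does not determine another, but it cannot prove that a dependency always holds. That judgment still rests on what the fields mean.

```python
def determines(rows, determinant, dependent):
    """True if equal determinant values always go with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in determinant)
        value = tuple(row[field] for field in dependent)
        # Remember the first value seen for this key and compare later ones to it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows of the first normal form address book (a subset of its fields).
FIELDS = ("ID", "Type of Phone", "Type of Address", "First Name", "Phone no.")
rows = [dict(zip(FIELDS, values)) for values in [
    (1, "Cell", "Home & mail", "Albert", "(111) 111-1111"),
    (1, "Bus Cell", "Home & mail", "Albert", "(222) 222-2222"),
    (1, "Home", "Home & mail", "Albert", "(333) 333-3333"),
    (2, "Home", "Home", "Wanda", "(100) 100-1010"),
    (2, "Home", "Mail", "Wanda", "(100) 100-1010"),
]]

# One field of the composite key determines First Name by itself:
print(determines(rows, ["ID"], ["First Name"]))                  # True
# ID alone is not enough for the phone number...
print(determines(rows, ["ID"], ["Phone no."]))                   # False
# ...but ID together with Type of Phone is.
print(determines(rows, ["ID", "Type of Phone"], ["Phone no."]))  # True
```

Because part of the composite primary key determines First Name by itself, the table these rows come from is not second normal form.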

Modifying a first normal form table to make it second normal form.

1) What fields in the table depend upon just part of its composite primary key?

2) Separate these fields into groups depending upon which part of the primary key they depend upon.

3) Create a table for each group of these fields. Move each group to its own table. Add to each table a copy of the primary key components upon which its fields depend. (These primary key components become the primary key for that table.)

4) Modify the original table so that it contains its primary key and only the fields that depend only upon it.
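The steps above can be sketched in Python as a series of projections: keep only certain fields of each row and drop the duplicate rows that result. This is an illustrative sketch, not a feature of Base; the helper name project is hypothetical.

```python
def project(rows, fields):
    """Keep only the given fields and drop duplicate rows, preserving order."""
    result = []
    for row in rows:
        values = tuple(row[field] for field in fields)
        if values not in result:
            result.append(values)
    return result


FIELDS = ("ID", "Type of Phone", "Type of Address", "First Name", "Last Name",
          "Phone no.", "Street Address", "City", "State", "Zip Code")
first_normal_form = [dict(zip(FIELDS, values)) for values in [
    (1, "Cell", "Home & mail", "Albert", "James",
     "(111) 111-1111", "345 First", "Hart", "IN", "12345"),
    (1, "Bus Cell", "Home & mail", "Albert", "James",
     "(222) 222-2222", "345 First", "Hart", "IN", "12345"),
    (1, "Home", "Home & mail", "Albert", "James",
     "(333) 333-3333", "345 First", "Hart", "IN", "12345"),
    (2, "Home", "Home", "Wanda", "Anderson",
     "(100) 100-1010", "111 Main Apt 14", "Amherst", "MA", "15115"),
    (2, "Home", "Mail", "Wanda", "Anderson",
     "(100) 100-1010", "P O 1991", "Amherst", "MA", "15116-1991"),
]]

# Step 3: one table per group, each with the key components its fields depend upon.
name_table = project(first_normal_form, ["ID", "First Name", "Last Name"])
phone_table = project(first_normal_form, ["ID", "Type of Phone", "Phone no."])
address_table = project(first_normal_form,
                        ["ID", "Type of Address", "Street Address",
                         "City", "State", "Zip Code"])

# Step 4: the original table keeps its complete composite primary key.
address_book = project(first_normal_form,
                       ["ID", "Type of Phone", "Type of Address"])

for table in (name_table, phone_table, address_table, address_book):
    print(len(table), "rows")
```

The repetitions disappear: two rows of names, four of phone numbers, and three of addresses, while the modified Address Book keeps all five key combinations.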

Third normal form

To be third normal form, a table must be second normal form and its fields must depend only upon the primary key. Sometimes a field depends upon another field that is neither the primary key nor part of a composite primary key.

Check list for third normal form:

Are there one or more fields that depend upon a field that is not the primary key?

If Yes, the table is not third normal form.

If No, the table is third normal form.

Asking this question of the Phone and Address Book tables results in a No answer, so these tables are third normal form. However, the Address table will give a Yes answer to this question: its Zip Code field depends upon the Street Address, City, and State fields, none of which is part of the primary key. It is not third normal form.

Because the Zip Code field depends upon these three fields, we will create a new table using them as its composite primary key. We also move the Zip Code field to the new table.

Table 17: Zip Code table

Street Address (PK)   City (PK)   State (PK)   Zip Code
345 First             Hart        IN           12345
111 Main Apt 14       Amherst     MA           15115
P O 1991              Amherst     MA           15116-1991

Table 18: Modified Address table to make it third normal form

ID (PK)   Type of Address (PK)   Street Address (FK)   City (FK)   State (FK)
1         Home & mail            345 First             Hart        IN
2         Home                   111 Main Apt 14       Amherst     MA
2         Mail                   P O 1991              Amherst     MA

At this point it pays to consider what tables you now have. We began with a single Address Book table. This has been modified twice: once to make it first normal form, and a second time to make it second normal form. We have also created the Name, Phone, and Address tables. The last of these has been modified to make it third normal form.

So we need to check two tables, Address Book and Phone, to make sure they are also third normal form. In both tables, the non-key fields depend only upon their respective primary key. So they are third normal form.

Modifying a second normal form table to make it third normal form:

1) Determine what fields depend upon one or more non-key fields.

2) List each of these non-key fields or groups of non-key fields.

3) Add to this list what fields depend upon each of them.

4) Create a table for each of the determinants you found.

5) Move the fields depending upon each determinant to the new table containing it.
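As a small Python sketch of these steps, applied to the Address table of our example: the Zip Code field is moved into a table keyed by the determinant (Street Address, City, State), and the two tables are then rejoined to confirm that no information was lost. The variable names are hypothetical and chosen only for illustration.

```python
# Second normal form Address table:
# (ID, Type of Address, Street Address, City, State, Zip Code)
address = [
    (1, "Home & mail", "345 First", "Hart", "IN", "12345"),
    (2, "Home", "111 Main Apt 14", "Amherst", "MA", "15115"),
    (2, "Mail", "P O 1991", "Amherst", "MA", "15116-1991"),
]

# Steps 1-3: Zip Code depends upon the non-key group (Street Address, City, State).
# Step 4: create a table whose primary key is that determinant.
zip_code_table = {}
for _id, _address_type, street, city, state, zip_code in address:
    zip_code_table[(street, city, state)] = zip_code

# Step 5: the dependent field leaves the original table; the determinant
# stays behind as a foreign key into the new table.
modified_address = [row[:5] for row in address]

# Rejoining the two tables reproduces the original rows exactly.
rejoined = [row + (zip_code_table[row[2:5]],) for row in modified_address]
print(rejoined == address)  # True
```

The rejoin works because every (Street Address, City, State) combination left in the modified table has exactly one matching row in the new one.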

Summary of our example table

Table 19: Original Address Book table

ID (PK)   F N      L N        S A                City      State   Z C     P No.
1         Albert   James      345 First          Hart      IN      12345   (111) 111-1111
                                                                           (222) 222-2222
                                                                           (333) 333-3333
2         Wanda    Anderson   111 Main Apt. 14   Amherst   MA      15115   (100) 100-1010
                              P O Box 1991

Table 20: Modified Address Book table

ID (PK& FK) Type of Phone (PK & FK) Type of Address (PK & FK)

1 Cell Home & mail

1 Bus Cell Home & mail

1 Home Home & mail

2 Home Home

2 Home Mail

Table 21: Modified Address table

ID (PK)   Type of Address (PK)   Street Address (FK)   City (FK)   State (FK)
1         Home & mail            345 First             Hart        IN
2         Home                   111 Main Apt 14       Amherst     MA
2         Mail                   P O 1991              Amherst     MA

Table 22: Name table

ID(PK) First Name Last Name

1 Albert James

2 Wanda Anderson

Table 23: Phone table



ID(PK) Type of Phone (PK) Phone no.

1 Cell (111) 111-1111

1 Bus Cell (222) 222-2222

1 Home (333) 333-3333

2 Home (100) 100-1010

Table 24: Zip Code table

Street Address (PK) City (PK) State (PK) Zip Code

345 First Hart IN 12345

111 Main Apt 14 Amherst MA 15115

P O 1991 Amherst MA 15116-1991

Table 19 is the original Address Book with all of its flaws. The five following it are the tables we created as we

worked toward making them third normal form.

We will study the six tables to see what we can discover comparing Table 19 with the five tables following it. The

values for the First and Last Name fields are listed once in the Name table. The values for the Street Address, City,

and State fields are listed once in the modified Address table. The values for the Zip Code field are listed once in the

Zip Code table. And the values of the Phone no. field are listed once in the Phone table. As a result, fewer errors are

likely to be made when entering the data.

Table 20 is the modified Address Book table, which looks much different from the original table. All the information in the original can be retrieved from the modified table and the tables we created, using a query or a form that contains data from all the tables.
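A query of this kind might look like the following sketch, which uses Python's built-in sqlite3 module rather than Base itself. The table and column names are shortened, hypothetical versions of the field names used in this chapter, and foreign key constraints are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Name        (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Phone       (ID INTEGER, PhoneType TEXT, PhoneNo TEXT,
                          PRIMARY KEY (ID, PhoneType));
CREATE TABLE ZipCode     (Street TEXT, City TEXT, State TEXT, Zip TEXT,
                          PRIMARY KEY (Street, City, State));
CREATE TABLE Address     (ID INTEGER, AddressType TEXT, Street TEXT, City TEXT,
                          State TEXT, PRIMARY KEY (ID, AddressType));
CREATE TABLE AddressBook (ID INTEGER, PhoneType TEXT, AddressType TEXT,
                          PRIMARY KEY (ID, PhoneType, AddressType));
""")

conn.executemany("INSERT INTO Name VALUES (?, ?, ?)",
                 [(1, "Albert", "James"), (2, "Wanda", "Anderson")])
conn.executemany("INSERT INTO Phone VALUES (?, ?, ?)",
                 [(1, "Cell", "(111) 111-1111"), (1, "Bus Cell", "(222) 222-2222"),
                  (1, "Home", "(333) 333-3333"), (2, "Home", "(100) 100-1010")])
conn.executemany("INSERT INTO ZipCode VALUES (?, ?, ?, ?)",
                 [("345 First", "Hart", "IN", "12345"),
                  ("111 Main Apt 14", "Amherst", "MA", "15115"),
                  ("P O 1991", "Amherst", "MA", "15116-1991")])
conn.executemany("INSERT INTO Address VALUES (?, ?, ?, ?, ?)",
                 [(1, "Home & mail", "345 First", "Hart", "IN"),
                  (2, "Home", "111 Main Apt 14", "Amherst", "MA"),
                  (2, "Mail", "P O 1991", "Amherst", "MA")])
conn.executemany("INSERT INTO AddressBook VALUES (?, ?, ?)",
                 [(1, "Cell", "Home & mail"), (1, "Bus Cell", "Home & mail"),
                  (1, "Home", "Home & mail"), (2, "Home", "Home"),
                  (2, "Home", "Mail")])

# Follow the keys in the modified Address Book table out to the other tables.
query = """
SELECT n.FirstName, n.LastName, a.Street, a.City, a.State, z.Zip, p.PhoneNo
FROM AddressBook AS ab
JOIN Name    AS n ON n.ID = ab.ID
JOIN Phone   AS p ON p.ID = ab.ID AND p.PhoneType = ab.PhoneType
JOIN Address AS a ON a.ID = ab.ID AND a.AddressType = ab.AddressType
JOIN ZipCode AS z ON z.Street = a.Street AND z.City = a.City AND z.State = a.State
ORDER BY ab.ID, ab.PhoneType, ab.AddressType
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each row of the modified Address Book table yields one combined row, so the query returns five rows in all.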

Note

As an introduction into the next normal form, I will state that these three tables above are also Boyce-Codd normal form. You can check for yourself if you want to do so.

Boyce-Codd normal form (BCNF)

There is another normal form that is similar to third normal form. It is known as the Boyce-Codd normal form (BCNF). Most third normal form tables are also BCNF. The only time when this statement may not be true is when the table contains composite candidate keys which overlap. When a candidate key contains one or more fields that are also part of another candidate key, the keys overlap. This potential problem very seldom occurs.

Boyce-Codd normal form

A relation (table) is in Boyce-Codd Normal Form (BCNF) if every determinant is a candidate key.

Candidate key

One or more fields in a table that uniquely determine the other fields.

Tip

A table may have more than one candidate key, and

any one of these can become the designated primary

key. But remember that a candidate key can consist

of one or more fields of the table.

An example of BCNF: Consider a table, Enrollment, with these fields: student number, student name, course number,

course name, date-enrolled. We make these two assumptions: no two students have the same name, and no two

courses have the same name.

The candidate keys are these groups of fields: (student number, course number); (student name, course number); (student number, course name); and (student name, course name). Any one of these four composite candidate keys can be used as the primary key. Notice how these candidate keys overlap.

Student number Student name Course number Course name Date-enrolled

Student number determines Student name, Student name determines Student number, Course number determines

Course name, and Course name determines Course number. But none of these fields taken by itself is a candidate

key. So the table is not Boyce-Codd normal form.

To correct this problem, we modify the original table and create two new tables: Student and Course.

Student, Course, and the modified Enrollment tables (respectively):

Student number(PK) Student name

Course number(PK) Course name



Student number(PK&FK) Course number(PK&FK) Date-enrolled

To show that there is a problem and its correction, we add some data to the original Enrollment table and then to the

tables correcting the problem. When a student takes more than one course, the Student number and Student name

are repeated. Similarly, the Course number and Course name are repeated when more than one student takes the

course. When this table is made third normal form, we eliminate this repetition.

Table 25: Original Enrollment table

Student number   Student name     Course number   Course name                         Date enrolled
1001             Sam Livingston   CS101           Introduction to Computers           August 8, 2015
1001             Sam Livingston   CS102           Introduction to Operating Systems   August 8, 2015
1002             Max Caprilla     CS101           Introduction to Computers           January 4, 2015
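The Boyce-Codd test can be tried on these rows with a short Python sketch. The function names are hypothetical, and the check only reflects the sample data: it can show that a field is not a candidate key, but it cannot prove that a dependency always holds. It also does not check that a key is as small as possible.

```python
def determines(rows, determinant, dependent):
    """True if equal determinant values always go with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in determinant)
        value = tuple(row[field] for field in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True


FIELDS = ("Student number", "Student name", "Course number",
          "Course name", "Date enrolled")
enrollment = [dict(zip(FIELDS, values)) for values in [
    (1001, "Sam Livingston", "CS101", "Introduction to Computers", "August 8, 2015"),
    (1001, "Sam Livingston", "CS102", "Introduction to Operating Systems", "August 8, 2015"),
    (1002, "Max Caprilla", "CS101", "Introduction to Computers", "January 4, 2015"),
]]


def is_candidate_key(rows, fields):
    """True if the given fields determine every other field in these rows."""
    others = [field for field in FIELDS if field not in fields]
    return determines(rows, fields, others)


# Student number is a determinant...
print(determines(enrollment, ["Student number"], ["Student name"]))       # True
# ...but it is not a candidate key, so the table is not BCNF.
print(is_candidate_key(enrollment, ["Student number"]))                   # False
# The composite key does determine everything else.
print(is_candidate_key(enrollment, ["Student number", "Course number"]))  # True
```

A determinant that is not a candidate key is exactly what the BCNF definition rules out.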

Table 26: Student table

Student number (PK)   Student name
1001                  Sam Livingston
1002                  Max Caprilla

Table 27: Course table

Course number (PK)   Course name
CS101                Introduction to Computers
CS102                Introduction to Operating Systems

Table 28: modified Enrollment table

Student number(PK & FK) Course number(PK & FK) Date enrolled

1001 CS101 August 8, 2015

1001 CS102 August 8, 2015

1002 CS101 January 4, 2015

We enter data in the Student table one time for each student and in the Course table for each course. In the

Enrollment table, we enter data for the three fields for each course a student takes. When data is added or modified

for these tables, it only has to be done one time. For example, Sam Livingston legally changes his name to Walley

Habbernack. To modify our data, we only change his Student name field. When we do this, everywhere Student number 1001 appears, the database correctly lists the Student name as Walley Habbernack. We do not have to

worry about changing his name for every course he has taken.
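The name change can be sketched with Python's built-in sqlite3 module. This is an illustration only: the table and column names are hypothetical spellings of the fields above, and the Course table is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (
    StudentNumber INTEGER PRIMARY KEY,
    StudentName   TEXT NOT NULL
);
CREATE TABLE Enrollment (
    StudentNumber INTEGER REFERENCES Student(StudentNumber),
    CourseNumber  TEXT,
    DateEnrolled  TEXT,
    PRIMARY KEY (StudentNumber, CourseNumber)
);
""")
conn.executemany("INSERT INTO Student VALUES (?, ?)",
                 [(1001, "Sam Livingston"), (1002, "Max Caprilla")])
conn.executemany("INSERT INTO Enrollment VALUES (?, ?, ?)",
                 [(1001, "CS101", "August 8, 2015"),
                  (1001, "CS102", "August 8, 2015"),
                  (1002, "CS101", "January 4, 2015")])

# The name is stored once, so a single change is enough.
conn.execute("UPDATE Student SET StudentName = ? WHERE StudentNumber = ?",
             ("Walley Habbernack", 1001))

# Every enrollment for student 1001 now shows the new name.
names = conn.execute("""
    SELECT e.CourseNumber, s.StudentName
    FROM Enrollment AS e
    JOIN Student AS s ON s.StudentNumber = e.StudentNumber
    WHERE e.StudentNumber = 1001
    ORDER BY e.CourseNumber
""").fetchall()
print(names)  # [('CS101', 'Walley Habbernack'), ('CS102', 'Walley Habbernack')]
```

Had the name been stored in every enrollment row, each of those rows would have needed the same change.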

From my studies, tables with this problem have fields that are similar in nature. This leads to candidate keys that

overlap. In our example, Course name and Course number do this. Student number and Student name would seem

to do it also. However, two or more students can have the exact same name. (I once had a class with three boys

having identical names.) In this case, the field
