
SQL and Modeling Word

Date post: 10-Apr-2018
Upload: kent-mabait

8/8/2019 SQL and Modeling Word

http://slidepdf.com/reader/full/sql-and-modeling-word 1/58

Tutorial on database modeling and the SQL language

About the Database and SQL tutorial

Building a database is like building a house - you have to get it right at the design stage (also called the modeling stage). Once all the walls are up and the plumbing is installed, it's a bit late to realize that the bathroom should have been three feet wider. It can be done, but it's gonna cost you, baby!

So, in these lessons we're going to spend some time on database modeling. By studying

several examples of database applications we'll define what the rules are for building

a model of a database that we can then expand to cover all kinds of situations. Modeling is

what we do when we are designing a database. It's sort of like building a model car except

that it's not 3D and it's not plastic. Our model is on paper, or in a modeling program's data

file, but it does have a lot of parts that must be assembled in a logical fashion and must all

be glued together to work properly.

About the SQL language

Once the modeling part is down pat, we will create the databases and learn how to work with them using the SQL language. SQL is the lingua franca of databases. It allows you to communicate between relational databases from Microsoft, MySQL and Oracle, to name just a few. We will look at all the important SQL commands that need to be mastered, again through the use of many examples and exercises.

Throughout this tutorial we will use Microsoft Access 2003 (from Microsoft Office 2003

Pro) as the source of most of our examples. Almost all the sample databases are built in

Microsoft Access.

We are gradually bringing MySQL into the tutorial. The MySQL database server is a powerful, very versatile server which is part of the whole open-source environment along with other tools such as the Apache Web server, the PHP language and the Open Office suite. All those tools are readily available free on the Internet. They are now a credible alternative to the proprietary software sold by Microsoft and Oracle. And since the recent announcement of a joint venture between Sun and Google, you can bet that we'll be seeing a lot of action in that area in the coming months and years!

In addition to the database server itself, MySQL has a whole series of tools that work with it. Most of the tools are developed by third-party vendors who usually sell them but there are always free versions of these clients that help to make the developer's life a lot easier. Since new tools are coming out all the time you may see different screen shots of the examples in different lessons as we update things. We'll try to make it not too confusing!

To highlight some of the uses of the SQL language we've developed a sample project that uses a Visual Basic 6 client to connect to our MySQL database through SQL commands.


Once you've mastered the SQL language you may want to look at the Visual Basic 6 ADO

database programming project which is an introduction to the use of a visual client in

extracting and manipulating data from a relational database via SQL.

Even if you are still a user of Sybase Powerbuilder (which is the subject of another tutorial, in French, at WebProfesseur database programming projects) you can run all the examples using SQL Anywhere which has all the tools necessary to create and manipulate databases. As an aside, I used to work with Sybase Powerbuilder and I taught courses on it. I loved it for development work. But for some reason it never caught on much here in Canada and now I don't know of anyone who actually uses it. It's too bad.

And if you're looking for some great information on Visual Basic 6.0, you have to check out this site: Visual Basic 6 programming tutorials.


Lesson 1 - Introduction to data modeling

A short history lesson

Once upon a time all files were stored on magnetic tape and all access was sequential. Then

came the disk drive and random access and there was joy in the land! But you know,

because your parents told you, that too much of a good thing is bad for you. And so it is

with random access - unless you organize your data it will be so randomized that you'll

never find it again.

So, somebody came up with a way to store large amounts of data in such a way that it

could be updated and retrieved when needed. They called the structure containing the data

a database and the programs doing all the administrative work of handling the data

a Database Management System (DBMS).

The first databases were modeled upon COBOL data structures (in those days every

programmer was a COBOL programmer) and were called hierarchical because of the way

in which the data are structured. Eventually they improved upon the first model and came

up with a network model, which has absolutely nothing to do with Novell or NT, but

describes the way the data elements relate to one another. Some of those large databases

are still in use today in legacy applications all over the place.

In the early 70's Dr. E.F. Codd, who happened to be a mathematician rather than a programmer, came up with a new model he called relational. This relational model, built on the mathematics of set theory, was powerful, flexible and easy to use. But it turned out to be such a hog for disk space and processing time that it wasn't really a viable alternative to the previous models. It wasn't until hardware performance improved in the late 70's that the model started gaining acceptance.

By the mid 80's, cheap PC's with ever-increasing capabilities made it possible to develop

small versions of relational databases. It was Oracle Corp. that really put relational

database development on the map and today Oracle is still the leader in the field.

Starting the design process

• Step 1: Feasibility study

o Is the project too small? Too big? Technically feasible?

o What are the costs involved - for development, for maintenance?

Is it cost-effective - are the savings greater than the costs?

o Is the timeframe realistic? Can it be done in the time allotted?

• Step 2: Detailed analysis


o Start with the output and work backward. What output does the client want to

see? What input will be required to produce that output?

Describe the screens, the reports, the results that will be produced.

o Study the existing system (there's always something, even if it's done with

quill and ink).

Determine what must be kept, what must be changed and what must be scrapped.

o Keep this in mind: Any system you deliver must perform at least as well as the system the client is using now. Although that sounds simple enough, I could come up with a pile of cases where an improved system took twice as long to do the work and cost twice as much to operate as the inadequate system it replaced.

• Step 3: Data modeling

o Create a model (drawing, graphic representation, schema) of the data. Use a pencil and paper if you have to or, preferably, a software modeling tool. This is the equivalent of blueprints for a house. It does not require much effort to add or remove things from a drawing. It is a lot harder to do once the house is built or the database is coded.

o The model is created with the help of the client. The client knows what needs

to be done although he may not know how it will be done - that's your job. Always keep the client involved at the design stage.

Data modeling

• Definitions

o Entity: an object, a thing in the system about which data is kept - equivalent to a file - it will be implemented as a table in the database.

o Attribute: an item of data referring to an entity - equivalent to a field - it will be implemented as a column in a table in the database.

o Primary key: the attribute (or combination of attributes) that uniquely identifies every occurrence of an entity.

o Relationship: the way entities link to one another

• Examples

DEFINITION     EXAMPLE

Entity         Student, Professor, Class, RegisteredStudents in "School" application
               Customer in "Billing" application
               Employee in "Payroll" application

Attribute      Student_Id, Student_Name, Student_Major for "Student" entity
               Cust_Number, Cust_Ship_to_Address for "Customer" entity
               Employee_Salary for "Employee" entity

Primary key    Student_Id for "Student" entity
               Class_Number for "Class" entity
               Student_Id + Class_Number for "RegisteredStudents" entity

Relationship   Professor teaches Class
               Student registered in Class
               Customer orders Product

• Graphical representation

o The main tool in the modeling process is called an Entity-Relationship diagram or E-R diagram for short. It shows all the components we have been discussing: Entities, Attributes of Entities, Key attributes of Entities and Relationships between Entities.

o (E-R diagram)

o There is one symbol that appears in the diagram that we haven't yet discussed: the line with a crow's foot at the end.

The line tells us that the entities are related and the ends of the line describe the degrees of relationship, also called cardinality. Degrees identify how many occurrences of one entity are related to how many occurrences of another entity. Degrees are expressed in one of 3 ways:

One-to-one

One-to-many

Many-to-many


For example: the Student <--> Class relationship is many to many - a given student (an occurrence of the Student entity) may take many classes (an occurrence of the Class entity) and each class may contain many students. That is shown on the diagram by a crow's foot at each end of the line. The o with the crow's foot says that a student may be signed up for no classes (a football player?) and a given class may have no students (not offered this term).

The Class <--> Professor relationship is one to many: a given professor may teach zero or many classes but each class must have one and only one professor.

If you were told that each Professor only teaches one Class and that each Class only has one Professor, you would be looking at a one to one relationship.

It is very important to describe the degrees of relationships accurately when you do the preliminary design. The client will not say: "There is a one to many relationship between Class and Professor". He'll tell you, if you bother to ask: "In this School, every class only has one Professor assigned to it". In another school, you may hear: "We're very proud of our team-teaching approach. A class may be taught by several Professors working together." There you're looking at a many to many relationship and you will have to implement the database accordingly.

Lesson 2 - Database design short case study

ezconsulting Inc. is a small consulting company offering database design and creation

services to a fairly wide range of customers. The company employs about 30 consultants,

analysts, programmers, network specialists, who will work in teams on projects for periods

of time ranging from a few days to several months.

At any given time there may be 10-12 different projects on the go. Because resources are scarce a specialist may be called upon to work on several projects simultaneously. In order to keep some control over scheduling and costing, every employee is assigned to a department and reports to only one manager, even when he's working on projects for other departments. Every week every employee must submit a timesheet showing the number of hours spent on each project.

As was the case for the shoemaker's children (they had no shoes because dad was too busy making shoes to sell in order to put bread on the table), this company has no Project Management database, simply because nobody has had the time to set one up. And this is typical in this kind of environment. Do you take an analyst who bills $800/day and put him to work on in-house maintenance? No, you don't. You wait until some bright college student shows up for a co-op work assignment and you give him/her the job. The company hopes that after the basic Project Management application is operational other modules such as Employee Skills Management and control of bids and RFP's can be integrated into the database.


Designing the Project Management application

Here is what our first draft of the E-R diagram should look like for the Project Management case:

 

Fig. 2-1

The diagram contains the information we would have gathered by talking to the client. Notice that the attributes for Employee represent the minimum amount of information we have to keep at this time. We haven't included things like "Home address", "Date of birth" and so on. When we start working with SQL later we will add more information to the table. The same applies for the other entities - we will add attributes as we develop the model later on.

It is important to make sure at this point that you understand the degrees of the relationships shown.

Department <--> Employee is a one-to-many relationship - a given employee is assigned to one and only one department and a given department contains zero or many employees. This means that every employee in the company will be assigned to a department, even the President who will be in Administration. A department may exist and have no employees assigned to it. For example, we could create a new department in the database and, until it is staffed, it will have zero employees assigned to it.

Employee <--> Project is a many-to-many relationship - a given employee works on one or many projects and a given project may have zero or many employees. In order to keep track of every employee's hours, all the work that is done will be billed to a project. However, projects could be things like "In-house systems development", "Professional development leave" or "Administrative duties". Any project may have no employees working on it.

Now, once you have the E-R diagram down, you go over it one more time with the client to make sure that you have the details down correctly and you are almost ready to start creating the actual database. Notice that I said "almost".

Normalization

It is possible to start creating the database at this point. It's just a question of creating a new table for every entity identified in the diagram. We'll be using MS-Access to do that shortly. But how do you code the relationships?

There is a formal process to do that in database modeling. It's called normalization. It means applying a set of rules to the data so that you group the attributes in such a way that the relationships work. It's not really that complicated but it is a formula approach. If you prefer to use that approach, get any good book on databases, look up "normalization" and follow the steps.

We'll do normalization using the intuitive approach - work with the data until it "feels" OK. This could also be called prototyping - create a working model of the database that is close to what you want and keep improving it until it works perfectly, then put it into production.

However, whatever the approach taken, there are some basic rules that have to be adhered to. The rules apply to any relational database and cannot be broken. They can't even be stretched. Think of them as the Prime directives. The rules are:

1. Every table must have a primary key - an attribute or combination of attributes that uniquely identifies every occurrence in the table.

2. The primary key can never contain an empty or Null value. That makes sense - if you had 2 that were empty, they wouldn't be unique anymore.

3. Every attribute of every occurrence in the table can contain only one value. Think of the Employee table as a grid. Every occurrence, or line, represents one employee and every column is an attribute. So, every employee can only have one ID and one First-name and one Last-name, and so on.
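To see rules 1 and 2 at work, here is a small sketch. It is not part of the Access material: it assumes Python with its built-in sqlite3 module standing in for the database, and the column names e_id, e_fname, e_lname and the sample people are made up for the illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the Access file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " e_id TEXT NOT NULL PRIMARY KEY,"   # rules 1 and 2: unique and never null
    " e_fname TEXT,"
    " e_lname TEXT)"
)
conn.execute("INSERT INTO employee VALUES ('101', 'Mike', 'Smith')")


def try_insert(row):
    """Return 'ok', or the database's error message if the row is refused."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


duplicate_result = try_insert(("101", "Anna", "Jones"))  # same key twice: breaks rule 1
null_result = try_insert((None, "Paul", "Brown"))        # empty key: breaks rule 2
valid_result = try_insert(("102", "Anna", "Jones"))      # new, non-null key: accepted

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_result, null_result, valid_result, employee_count, sep="\n")
```

The database itself refuses the two bad rows, so the rules do not depend on the person doing the data entry being careful.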

The one-to-many relationship

Let's start with the easiest relationship: Employee <--> Department. 

First we create a new database and call it ProjMgt.mdb. Then we create the first two tables in the database: call them Employee and Department, and put in the fields from the E-R diagram. Notice that the column-names in the tables are all coded with a prefix: e_ for Employee, d_ for Department and p_ for Project. This is a good habit to get into. It will make your life easier later on. This is what we now have:


Fig. 2-2

Remembering that this is a one-to-many relationship, how do we associate the employee with the department? There are 2 ways it could be done:

1. Add a column for Employee ID to the Department table. You get this:

Fig. 2-3

See the problem? When you start entering data, what do you put into the d_Employee column? Rule 3 says you can have only one value. What if there are 2 employees in the Department? You could try to add another column for d_Second_employee, but what if there are 20 employees, or 200? Obviously this is not going to work. So we scrap this brilliant idea.

2. Add a column for Department to the Employee table. You get this:


Fig. 2-4

Any problem with this? Doesn't seem to be. Since every employee is assigned to only

one department, I only have one value to put into the column: employee 101 works

for department 10, and that's all.

In summary, to normalize a one-to-many relationship you add a column to the table at the

"many" end of the relationship to refer to the primary key at the "one" end.

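The one-to-many rule can be sketched in a few lines. This is only an illustration under assumptions: Python's built-in sqlite3 module replaces Access, and the column names d_id, d_name and e_dept and the sample rows are invented here (the real names are in the figures). The column at the "many" end, e_dept, refers to the primary key at the "one" end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only checks references when asked to

conn.execute("CREATE TABLE department (d_id TEXT NOT NULL PRIMARY KEY, d_name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " e_id TEXT NOT NULL PRIMARY KEY,"
    " e_fname TEXT,"
    " e_dept TEXT NOT NULL REFERENCES department (d_id))"  # the 'many' end points to the 'one' end
)

conn.execute("INSERT INTO department VALUES ('10', 'Administration')")
conn.execute("INSERT INTO department VALUES ('20', 'Development')")  # not staffed yet
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("101", "Mike", "10"), ("102", "Anna", "10")],
)

# A department may have zero or many employees.
headcount = conn.execute(
    "SELECT d.d_id, COUNT(e.e_id) FROM department d"
    " LEFT JOIN employee e ON e.e_dept = d.d_id"
    " GROUP BY d.d_id ORDER BY d.d_id"
).fetchall()

# An employee cannot be assigned to a department that does not exist.
try:
    conn.execute("INSERT INTO employee VALUES ('103', 'Paul', '99')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(headcount, orphan_rejected)
```

Each employee row holds exactly one department value, so Rule 3 is respected no matter how many people a department has.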
The many-to-many relationship

The many-to-many Employee <--> Project relationship is a bit trickier.

In the end we want to associate projects and employees, to see who is working on what project. To see how it must not be done we'll go through the exercise of adding columns to the tables. So we add the Project table to the relationships:

Fig. 2-5

To create the relationship we could add a Project_Number column to the Employee table. When we try it we see that we come up with the same problem we had in the previous relationship: when we get to the e_Project column, what do we write? The employee could be working on 7 different projects. Rule 3 says we can only enter one value.

Fig. 2-6

So we try it the other way - add an Employee_Number column to the Project table. Again, when we get to the p_Employee column what do we write? There could be 25 employees working on this project.

Fig. 2-7

Since those two attempts obviously won't work, there has to be something else. It's called a link entity or link table. Most textbooks will just call that table Employee-Project or Project-Employee. But in real life the entity does exist in our system. What is it that links employees with projects? Right! It's the timesheet. The timesheet contains all the information we need. So we add the Timesheet entity to the mix and modify our E-R diagram:

 

Fig. 2-8

t_Employee is an employee-ID that refers to the Employee table, t_Project is a project_number that refers to the Project table, t_Date is the period_ending date for the timesheet and t_Hours is the number of hours the employee spent on that project. We also specify that every line in the Timesheet table must have one and only one employee_ID and must have one and only one project_number. In other words we cannot create a Timesheet for an employee who doesn't exist or charge for work on a project that doesn't exist. Who would ever think of doing such a thing anyway!

What is the primary key for Timesheet? To get a feel for the key, let's look at the data that will be input:

Fig. 2-9

It's clear that t_Employee or t_Project can't be the primary key because they both repeat; remember: every occurrence in a primary key column must be unique. How about a concatenation of t_Employee + t_Project? That looks good so we try it. It works fine for one week. The following week, employee 202 has worked on project S4440 again and we get a duplicate key error!

Fig. 2-10

So we add t_Date to the key and that solves the problem. Now, assuming that the client has said that if an employee works on a project twice in one week he adds up the hours, the combination of employee + project + date is truly unique.

Conclusion: there is only one way to normalize a many-to-many relationship and that is to create a link table. The link table must contain columns that refer back to the other tables so that the many-to-many relationship becomes two one-to-many relationships.
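The Timesheet link table and its three-part key can be sketched as follows. This is an illustration only, assuming Python's built-in sqlite3 module instead of Access. Employee 202 and project S4440 come from the text; the other employee, the other project, the dates and the hours are invented, and the references back to Employee and Project are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timesheet ("
    " t_employee TEXT NOT NULL,"
    " t_project TEXT NOT NULL,"
    " t_date TEXT NOT NULL,"      # period-ending date, stored as ISO text in this sketch
    " t_hours REAL,"
    " PRIMARY KEY (t_employee, t_project, t_date))"  # employee + project + date
)

rows = [
    ("202", "S4440", "2003-01-03", 12.0),
    ("202", "P1000", "2003-01-03", 25.5),
    ("303", "S4440", "2003-01-03", 40.0),
    ("202", "S4440", "2003-01-10", 8.0),   # same employee and project, following week: fine
]
conn.executemany("INSERT INTO timesheet VALUES (?, ?, ?, ?)", rows)

# A second line for the same employee, project and week is a duplicate key.
try:
    conn.execute("INSERT INTO timesheet VALUES ('202', 'S4440', '2003-01-10', 3.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The link table answers "who worked how much on what".
hours_per_project = conn.execute(
    "SELECT t_project, SUM(t_hours) FROM timesheet"
    " GROUP BY t_project ORDER BY t_project"
).fetchall()

print(duplicate_rejected, hours_per_project)
```

With only t_employee + t_project as the key, the fourth row would already have been refused, which is the duplicate key error described above.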

Lesson 3 - Introduction to the SQL language

The SQL language


SQL = Structured Query Language

Usually pronounced 'Sequel' or, sometimes, 'ess-queue-el'

Relational database manipulation language developed in the 1970's at IBM by Donald Chamberlin and Raymond Boyce, building on Dr. E.F. Codd's relational model.

Popularized by ORACLE

Advantage is that it allows databases that are not programmed the same to talk to each other - it is the basis of Client/Server architecture.

A Client application written in Visual Basic under Windows can communicate with a Server running Oracle - the Client sends the Server a SQL command which is interpreted and the result sent back to the Client.

Also, all DBMS's use SQL in their internal operations. Database Administrators (the people who build and maintain the database structure) need in-depth knowledge of the language.

To build the database and test our SQL commands we'll be using MS-Access and the Project Management database that we designed in previous lessons. If you haven't used Access before, take a look at our MS-Access tutorial to get the basics of the tool.

To create a SQL query with Access you simply go through the normal query procedure and select SQL View instead of the Query wizard; then specify that it is a new query (you don't have to identify the tables used):

Fig. 3-1

and then write the code and run it:


Fig. 3-2

SQL syntax in Access

SQL syntax is not very strict. A statement can be written over several lines but most implementations of SQL will insist on the semicolon (;) at the end.

Upper and lower cases don't matter but again, in most installations you will see common practices such as writing all command verbs and clauses in uppercase and table names, column names, etc. in lowercase.

Syntax errors will be flagged as soon as you try to execute the statement. Experience will show what the most common errors tend to be. Since SQL syntax is not very complicated to begin with, errors are usually easy to detect and to fix.

There is not a whole lot of punctuation involved. The ; at the end of the statement is important and, of course, parentheses have to be in the right places, like any other language. As for data types, string values are enclosed in single quotes, dates in pound signs (an Access convention) and numeric values in nothing.

For example,

e_salary = 55000
e_fname = 'Mike'
e_hiredate = #1995-10-10#

SQL INSTRUCTIONS

The SQL instruction set consists of only about 30 instructions. Although there are SQL instructions to create and manipulate tables and the data they contain, it is quite possible that all the maintenance functions will be done using the DBMS (Access in this case). If that is the way your system is set up your applications will end up using the SELECT instruction 95% of the time.


In case you missed it previously, a query is a question, an interrogation, a lookup. That is what SQL is built for - it exists to get information from databases.

In this tutorial, we will assume that the database itself is already created and named. Some of the tables may have been created in Access but we will use SQL statements to create the others, just to make sure that we know how to do it.

Table manipulation statements

There are SQL statements to create tables, modify them or remove them.

To create a new table in the current active database:

CREATE TABLE table_name (column1 datatype not null, column2 datatype, ...);

Example:

CREATE TABLE employee (e_id string(3) not null, e_fname string(20), e_salary single, e_hiredate date);

The usual datatypes are:

INTEGER      Integer values (a 4-byte long integer in Access SQL; SMALLINT is the 2-byte type covering -32K to +32K)

SINGLE       Single-precision floating point

DOUBLE       Double-precision floating point

DATE         Date/time

STRING(n)    Fixed-length string; n = number of characters

BOOLEAN      True/False

We add the not null clause to the statement to indicate that null is not allowed in the

column, if it is to be a primary key, for example.

Speaking of NULL

Everyone knows by now that when we speak of characters, we mean the letters of the alphabet and the numbers and punctuation signs and so on. The space ( ) is a character and so is the zero (0).

In SQL we will often have to refer to the NULL value. NULL is not a character; it is the absence of a character. In books they say that NULL means that the value is undetermined. In fact, it means that there is no value assigned to the field, it is completely empty. NULL is not numeric, nor string, nor date. Any type field can be NULL.

When the quantity-on-hand of an item in stock becomes 0, it is not null; it contains the numeric value 0. When I assign a 0 grade to an assignment (which happens all too frequently), that grade is included in the class average. If there is no grade assigned because the student was ill, that field is null and therefore it is not computed as part of the class average.

SQL aggregate functions such as COUNT(column), SUM and AVG skip nulls when they count or compute data. In some cases it is necessary to test if a field contains a value or not by using the clauses: IS NULL or IS NOT NULL in a statement.

In the example above, when entering data in the Employee table, you could theoretically have one employee with spaces as an Id but you are not allowed to have one with an empty Id.
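The grade example can be tried out in a few lines. This is a sketch under assumptions: it uses Python's built-in sqlite3 module rather than Access, and the grade table, its columns and the three students are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade (student TEXT NOT NULL, mark REAL)")
conn.executemany(
    "INSERT INTO grade VALUES (?, ?)",
    [
        ("A", 80.0),
        ("B", 0.0),    # a real grade of zero: it counts
        ("C", None),   # no grade assigned (student was ill): NULL
    ],
)

# COUNT(*) counts rows; COUNT(mark) and AVG(mark) skip the NULL.
row_count, mark_count, average = conn.execute(
    "SELECT COUNT(*), COUNT(mark), AVG(mark) FROM grade"
).fetchone()

# IS NULL finds the fields that hold no value at all.
missing = conn.execute("SELECT student FROM grade WHERE mark IS NULL").fetchall()

print(row_count, mark_count, average, missing)
```

The average is computed over two grades, not three: the zero pulls it down, the NULL is simply left out.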

To change the structure of a table:

ALTER TABLE table_name ADD (column datatype);

Example:

ALTER TABLE Employee ADD (e_Address string(30));

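ALTER TABLE Project ADD (p_Country string(20));

Note that the ADD form shown here only adds a column. Most SQL implementations also provide ALTER TABLE ... DROP COLUMN to remove a column, and the syntax for changing an existing column varies from one product to another.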
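Here is a sketch of adding a column and checking the result. It assumes Python's built-in sqlite3 module rather than Access, so the spelling is SQLite's (ADD COLUMN, and TEXT/REAL instead of string/single); PRAGMA table_info is a SQLite command, not standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " e_id TEXT NOT NULL,"
    " e_fname TEXT,"
    " e_salary REAL,"
    " e_hiredate TEXT)"
)

# Change the structure of the existing table: one more column at the end.
conn.execute("ALTER TABLE employee ADD COLUMN e_address TEXT")

# PRAGMA table_info returns one row per column; the name is the second item.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
print(columns)
```

Rows that already exist in the table get NULL in the new column until somebody fills it in.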

To delete a table from the database:

DROP TABLE table_name;

Example:

DROP TABLE Employee;

Data manipulation statements

Data manipulation statements are used to work on the data contained in the tables.

To create a new record, a new row, in a table:

INSERT INTO table_name VALUES (value1, value2, ...);

Assuming that we have executed the CREATE TABLE and the ALTER TABLE statements from above (and not the DROP statement), the Employee table now contains 5 columns: Id, First name, Salary, Hire date and Address.

The INSERT statement will create a new employee record; it will add a row to the table.


The number of data items must correspond to the number of columns and the type of data must correspond to the datatype of each column.

Example:
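A minimal sketch of an INSERT into the 5-column Employee table, assuming Python's built-in sqlite3 module in place of Access (so the date is plain text rather than a #...# literal). The employee values and the address are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " e_id TEXT NOT NULL, e_fname TEXT, e_salary REAL,"
    " e_hiredate TEXT, e_address TEXT)"
)

# Five columns, five values, in the same order as the columns.
conn.execute(
    "INSERT INTO employee VALUES ('222', 'Mike', 55000, '1995-10-10', '12 Main Street')"
)
stored = conn.execute("SELECT * FROM employee").fetchone()

# Two values for five columns: the statement is refused.
try:
    conn.execute("INSERT INTO employee VALUES ('333', 'Anna')")
    short_row_error = None
except sqlite3.Error as exc:
    short_row_error = str(exc)

print(stored)
print(short_row_error)
```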

To change data in an existing record:

UPDATE table_name SET column1 = value1, column2=value2, ...

WHERE condition;

Example:

UPDATE Employee SET e_salary = 30000
  WHERE e_Id = '222';

UPDATE Employee SET e_Salary = e_Salary * 1.1
  WHERE e_Departement = '101';

The WHERE clause is SQL's IF statement. The update is done only if the condition in the WHERE clause is true.

In the first example, the update is performed only for the employee whose Id is '222'. His salary is set at 30000.

In the second example, the update is performed for every employee whose department is '101'.

In example 2, the command from the boss to launch the SQL statement would have been: "Give everybody in department 101 a 10% raise".

UPDATE Employee SET e_Salary = 100000;

If there is no WHERE clause in a statement, the update is performed on all the records in

the table.
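The effect of the WHERE clause is easy to measure by asking how many rows each UPDATE touched. This sketch assumes Python's built-in sqlite3 module instead of Access; the column name e_dept and the three sample employees are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (e_id TEXT NOT NULL PRIMARY KEY, e_salary REAL, e_dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("222", 28000, "101"), ("223", 40000, "101"), ("224", 50000, "102")],
)

# rowcount tells us how many rows the statement changed.
one_employee = conn.execute(
    "UPDATE employee SET e_salary = 30000 WHERE e_id = '222'"
).rowcount
one_department = conn.execute(
    "UPDATE employee SET e_salary = e_salary * 1.1 WHERE e_dept = '101'"
).rowcount
salaries = conn.execute(
    "SELECT e_id, e_salary FROM employee ORDER BY e_id"
).fetchall()

# No WHERE clause: every row in the table is updated.
everyone = conn.execute("UPDATE employee SET e_salary = 100000").rowcount

print(one_employee, one_department, everyone)
print(salaries)
```

Employee 224 is in another department, so the 10% raise passes him by; the last statement, with no WHERE, catches all three.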

To remove records or rows from a table:

DELETE FROM table_name WHERE condition;

Example:

DELETE FROM Employee
  WHERE e_Id = '222';

DELETE FROM Employee
  WHERE e_Salary > 100000;

In the first case, delete employee '222'. In the second case, delete everybody earning more than 100K. Hey! What you gonna do? Times are tough all over!

DELETE FROM Employee;

If there is no WHERE clause, every record in the table is deleted! And it won't even ask "Are you sure?".
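The three DELETE statements can be tried safely on a throwaway table. This is only a sketch: SQLite stands in for Access, and the helper name remaining() and the sample rows are assumptions.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (e_Id TEXT, e_Salary REAL)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("111", 40000), ("222", 25000), ("333", 150000)],
)

def remaining():
    """Hypothetical helper: number of rows still in the table."""
    return conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]

conn.execute("DELETE FROM Employee WHERE e_Id = '222'")
after_one = remaining()

conn.execute("DELETE FROM Employee WHERE e_Salary > 100000")
after_high = remaining()

# No WHERE clause: the table is emptied without any warning.
conn.execute("DELETE FROM Employee")
after_all = remaining()

print(after_one, after_high, after_all)
```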

Database servers

In the previous lesson we started looking at the SQL language syntax. I used an Access database to illustrate how to test our SQL commands against a real database.

You have to understand that Access is not typical of the database environment you will probably work with in the real world.

For one thing, it's meant to work in standalone mode (with one user) or, at most, shared among 4-5 users. Access is not a database server.

Also, in Access you don't get to see the SQL code very much. It's always there but it's behind the scenes, written by the various wizards and hidden behind forms or reports or QBE queries.

For real databases (and by that I mean really big or with many users) you will normally work with a database server. A database server is software that resides on a large computer with good communications capabilities. It stores the data, handles the maintenance of the tables and responds to the demands of clients who want to manipulate the data.

The best-known server is Oracle. It can handle any database from a few dozen users to thousands of users.

Oracle's biggest competitor in the really big applications is SAP.

Then, for small to medium jobs you've got SQL Server from Microsoft, which is sort of the big brother to Access.

All these products are based on what we call the relational model. They all store the data in tables, they have primary keys and they build relationships between the tables.

And they all use the SQL language to communicate between the server and the clients. Some of the servers have modified SQL somewhat for their own use but it's essentially the same language for everyone.

Lesson 4 – Queries

The database

Before going any further, please make sure that the ProjectMgt database you are working with matches the model we created initially. You may have experimented with the tables and the columns in the previous lesson and that is perfectly OK! But before going on to the query statements, it is recommended that you consult Fig. 2-8 in Lesson 2 and match your database to that model. Don't worry about primary keys and relationships and so on at this point. We'll take care of that later. But do enter some meaningful data in the tables so that your queries will have something to display when you run them (15 or 20 records in each table should be enough).

When you input data into the tables, if you haven't created the relationships in Access, try to maintain referential integrity. That is: when you assign a department number to an employee, that department number should already exist in the Department table. When you create a Timesheet record, the employee number should exist in the Employee table and the project number should exist in the Project table.

You may use SQL statements to change the database or you may do it with Access. If you're really lazy...er, sorry, really busy, you can download the database from the Download area after the last lesson.

Import from Access

Since we've already built the Project management database in Access, it seems a shame to waste all that work.

Fortunately, DBManager has a Wizard to convert the Access database into MySQL.

Creating Queries

As we mentioned earlier, it is quite probable that 95% of your work with SQL will consist of questions to the database. If the database structure is well-built and the information has been input, any question can be answered, no matter how tricky. "How many widgets were bought by women aged between 25 and 30 on Tuesdays in months ending in R over the past 5 years?" Give us 5 minutes and we'll build a query that will answer that for you. That's called an ad hoc query, which means "as needed" rather than one which has to be planned and programmed in advance. It can impress the hell out of the Boss or the Sales Manager!

Hey! Before you know it we'll be as good at this stuff as the guys who do the baseball stats on TV. "Yes Frank, it's amazing that this guy hit 255 when batting left-handed against right-handed pitchers in night games when the moon was full and the temperature was over 75 degrees and there was a light breeze from the west!"

The only statement needed to build a query is the SELECT statement.

The basic syntax of the SELECT statement is:

SELECT column1, column2, ...

FROM table_name1, table_name2, ...

WHERE condition;

The SELECT clause lets you specify which columns to display (they may be table columns or they may be calculated from the data in other columns). The FROM clause lets you specify the table or tables from which the data will be obtained. Note that the standard SELECT statement allows you to get the data from as many tables as you need. If you have to access the Employee table and the Timesheet table to build the query, you can do it. If you have to access 15 tables, you can do it. But that's a lot more involved and we'll leave it for another day, more specifically, Lesson 7. For the next few lessons we'll master the SELECT statement to access any information we need in one table at a time. Finally, the WHERE clause (see below) will determine which records, also referred to as rows, will be selected.

Here are some examples of the SELECT in action:

  Fig. 4-1

Instead of listing the columns, use the * to mean 'All the columns'

And the result is:

  Fig. 4-2

Or display only certain columns:

  Fig. 4-3

and the result is:

  Fig. 4-4

To get data from the Employee table:

  Fig. 4-5

from which we get:

  Fig. 4-6

Same query, different look:

  Fig. 4-7

The 'AS' clause allows you to display a column heading that is more representative than the field name usually displayed by the query. Compare with Fig. 4-6.

  Fig. 4-8

This is what it looks like with the Query Editor.
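The figures show these queries as screenshots. As a runnable sketch of the same ideas (SELECT *, a column list, and AS headings), the following assumes SQLite as a stand-in for Access and a made-up Employee table. Note that SQLite quotes a heading containing spaces with double quotes, where Access uses [square brackets].

```python
import sqlite3

# Assumption: SQLite stands in for Access; table contents are sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (e_Id TEXT, e_Fname TEXT, e_Lname TEXT, e_Salary REAL)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [("111", "Ann", "Lee", 40000), ("222", "Bob", "Roy", 25000)],
)

# The * means "all the columns".
all_columns = conn.execute("SELECT * FROM Employee").fetchall()

# Only certain columns, each with a friendlier heading, for some of the rows.
cur = conn.execute(
    'SELECT e_Fname AS "First name", e_Lname AS "Last name" '
    "FROM Employee WHERE e_Salary > 30000"
)
headings = [column[0] for column in cur.description]
selected = cur.fetchall()

print(headings, selected)
```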

THE WHERE ... CLAUSE

As stated previously, the WHERE clause is in fact an IF statement. If a record returns TRUE to the WHERE clause, it is selected to be displayed.

If the table contains 10,000 records, or rows, you may wish to see only a few or even only one. In that case you would specify the condition as "... WHERE primary_key_column = 'value' ...".

The WHERE clause uses the usual operators to build the condition:

= > < >= <= <> or !=

and a few you may not be as familiar with but which we'll see in the examples:

BETWEEN LIKE IN NOT

For the next examples, suppose we have a new table called Products. Note that we can create this table in the ProjectMgt database, even though it has absolutely nothing to do with the application. We have to put the table somewhere and that's as good a place as any. It's important to understand that the tables have no relationships between each other until we define those relationships. If we want to create a table to be used on its own and then drop it when we're done, there is no problem with that.

PRODUCTS

ProdNum

ProdName

SellPrice

Cost

Fig. 4-9

EXAMPLES:

SELECT * FROM Products

WHERE ProdNum = 'A1234';

SELECT ProdNum, ProdName, SellPrice

FROM Products

WHERE SellPrice > 50;

SELECT ProdNum, SellPrice, (SellPrice * 1.1)

FROM Products;

Here we display a calculated column: in this case, what a 10% price increase would look like. Calculated columns can use the usual arithmetic operators:

+ - * / ^ ( )

There is a common misconception about calculated columns in the SELECT statement - people think that the calculation will somehow change the data in the table. That is impossible. The SELECT statement is strictly a display statement. Any calculations done are read-only. There is no way that a SELECT can modify a table. The only statements that can do that are the ones we looked at in the previous lesson: INSERT, UPDATE and DELETE.
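If you want to convince yourself that a calculated column is read-only, here is a minimal sketch. It assumes SQLite in place of Access and a single made-up product.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the product row is sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (ProdNum TEXT, ProdName TEXT, SellPrice REAL, Cost REAL)"
)
conn.execute("INSERT INTO Products VALUES ('A1234', 'General widget', 100.0, 60.0)")

# The third column is calculated on the fly for display only.
increased = conn.execute(
    "SELECT ProdNum, SellPrice, SellPrice * 1.1 FROM Products"
).fetchone()

# Reading the table again shows that the stored price was never touched.
stored = conn.execute("SELECT SellPrice FROM Products").fetchone()[0]

print(increased, stored)
```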

SELECT ProdNum, SellPrice, Cost, (SellPrice - Cost) AS [Profit]

FROM Products

WHERE ProdNum LIKE 'A*';

Here we display the calculated column with an appropriate title, for all products whose number starts with 'A'.

* and ? are the wildcard characters:

* = character string (any number of characters)

? = 1 character

SELECT ProdNum, ProdName

FROM Products

WHERE ProdNum LIKE "A?5??";
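The * and ? wildcards are the Access spelling; most other servers (MySQL and SQLite, for example) use % and _ for the same two jobs. The sketch below assumes SQLite as the test database, and the function access_to_ansi() is a hypothetical helper written for this illustration, not part of any library.

```python
import sqlite3

def access_to_ansi(pattern: str) -> str:
    """Hypothetical helper: translate Access wildcards (* ?) to % and _."""
    return pattern.replace("*", "%").replace("?", "_")

# Assumption: SQLite stands in for Access; product numbers are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProdNum TEXT, ProdName TEXT)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("A1500", "Saw"), ("A2567", "Drill"), ("A1234", "Hammer"), ("B1500", "Nail")],
)

pattern = access_to_ansi("A?5??")
matches = [
    row[0]
    for row in conn.execute(
        "SELECT ProdNum FROM Products WHERE ProdNum LIKE ? ORDER BY ProdNum",
        (pattern,),
    )
]
print(pattern, matches)
```

Only the product numbers with an 'A' in first position and a '5' in third position are returned.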

SELECT ProdNum, ProdName, SellPrice

FROM Products

WHERE SellPrice BETWEEN 50 AND 150;

The condition could also be written as SellPrice >= 50 AND SellPrice <= 150.
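A quick check that BETWEEN is inclusive at both ends and matches the two-comparison form. This is a sketch that assumes SQLite in place of Access and five made-up prices.

```python
import sqlite3

# Assumption: SQLite stands in for Access; prices are sample data around the limits.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProdNum TEXT, SellPrice REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("P1", 49.99), ("P2", 50), ("P3", 100), ("P4", 150), ("P5", 150.01)],
)

with_between = conn.execute(
    "SELECT ProdNum FROM Products WHERE SellPrice BETWEEN 50 AND 150 ORDER BY ProdNum"
).fetchall()

with_comparisons = conn.execute(
    "SELECT ProdNum FROM Products "
    "WHERE SellPrice >= 50 AND SellPrice <= 150 ORDER BY ProdNum"
).fetchall()

print(with_between)
```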

SELECT ProdNum, ProdName

FROM Products

WHERE ProdName LIKE "*general*";

display if the name contains the string 'general'

SELECT ProdNum, ProdName

FROM Products

WHERE ProdNum IN ('A100', 'A200', 'B500', 'D800');

if the product number is one of those named

AND and OR are used like in all other languages:

SELECT * FROM Products

WHERE ProdName LIKE "A*" AND SellPrice > 500;

SELECT * FROM Products

WHERE (SellPrice - Cost) < 10 OR (SellPrice - Cost) > 500; 

display the low-profit and the high-profit items

Working with dates

Whenever you develop a commercial application, there is absolutely no way that you can get by without using date fields. There are Birth dates, Hire dates, Delivery dates, Order dates, and so on, and so on ....

In ancient times, like 20 years ago, dates were stored as strings and we all remember what that brought about in 1999. Now all DBMSs handle dates in a Date/Time format, which makes our lives a lot simpler, but we have to be aware of the particular properties of Date formats.

To begin with, know that you can do calculations with dates as you do with numbers.

#2001-01-31# - #2001-01-01# will return 30, the number of days between the 2 dates.

#2001-01-01# + 3 will return #2001-01-04# because a numeric constant is always taken to mean days.
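The same two calculations can be checked outside the database. This sketch uses Python's standard datetime module rather than Access, so it only illustrates the arithmetic, not the # ... # syntax.

```python
from datetime import date, timedelta

# Subtracting two dates gives the number of days between them.
days_between = (date(2001, 1, 31) - date(2001, 1, 1)).days

# Adding a number to a date is taken to mean adding days.
three_days_later = date(2001, 1, 1) + timedelta(days=3)

print(days_between, three_days_later.isoformat())
```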

When using the comparison operators, > #date1# is taken to mean later than or after and < #date1# is taken to mean earlier than or before.

In the WHERE ... clause, ... BETWEEN #date1# AND #date2# selects the dates between date1 and date2, inclusive.

To work with date fields in SQL, we'll use the Date and Time functions that Access supplies. Note that comparable functions are available in just about every environment that supports SQL, although their names and syntax vary from one product to another.

The main functions: NOW( ) and DATE( ) return the current date. The difference between the two is that NOW( ) returns date and time, at this moment, and DATE( ) returns only the current date.

In Access, a date or time constant must be identified with # ... #, as in:

... WHERE p_startdate = #2001-01-01#;

Date formats

If you intend to do e-commerce in the global village, you have to understand that different folks have different ways of doing things.

For example, if you are American and you tell your French girlfriend, the love of your life, that you'll meet her under the Eiffel tower on 01/02/03, there is a good chance that you'll never see her again. To you it is obvious that you specified the date as January 2nd, 2003. In France, as in other French areas, like Quebec, the date is understood to be the 1st of February, 2003. In your case, it may work out. If you straighten out the misunderstanding in time, you go back a month later and she's waiting for you. Good luck!

To avoid problems, get used to using the international standard date format (ISO 8601): yyyy-mm-dd, as: 2003-01-02. Note the use of the 4-digit year. Remember all that anguish we went through in 1999 with the 2-digit 00 year? We don't want that to happen again. Also, note that the separator is the dash character - , and not the slash /.

To set the date format, go through the Windows Control Panel, Regional settings. Since SQL and Access get their formatting from Windows, the format will be selected automatically.
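The Eiffel tower mix-up is easy to reproduce. The sketch below uses Python's standard datetime module (an assumption made for illustration; it is not how Access parses dates) to read the same string both ways and then print each result in the unambiguous yyyy-mm-dd form.

```python
from datetime import datetime

ambiguous = "01/02/03"

# American reading: month/day/year.
american = datetime.strptime(ambiguous, "%m/%d/%y").date()

# French reading: day/month/year.
french = datetime.strptime(ambiguous, "%d/%m/%y").date()

# isoformat() always writes yyyy-mm-dd, so nobody can misread it.
print(american.isoformat(), french.isoformat())
```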

In Access and SQL, one of the most useful functions is called: DateDiff( )

DateDiff('interval', #date1#, #date2#) returns the time difference between date1 and date2, expressed in interval units which could be: days, months, years, weeks or hours.

The interval is specified as: 'd' for days, 'w' for weeks, 'm' for months, 'yyyy' for years and 'h' for hours.

For example:

Datediff('d', #2001-01-01#, now()) returns the number of days between January 1st, 2001 and today.

Datediff('m', p_StartDate, p_EndDate) returns the length of the project, in months.

If the result displays too many numbers after the decimal, use the ROUND(number, digits) function to display the number rounded to 'digits' positions after the decimal:

ROUND(Datediff('m', p_StartDate, p_EndDate), 2).

In theory, Datediff('yyyy', e_BirthDate, now()) returns the employee's age, expressed in years. In practice, however, it simply counts the calendar-year boundaries crossed, so the result is right or one year too high depending on whether the employee has had his birthday yet this year or not.

To calculate the age more reliably, use the following formula:

INT(Datediff('d', e_BirthDate, now())/365.25)

Calculate the number of days and divide by the average number of days in a year, which, as you know, is 365.25 and not 365. That takes leap years into account, although the result can still be off by a day right around the birthday.

The INT( ) function truncates the result so that 25.9 becomes 25, for example; the employee is 25 years old until the day she turns 26; after the age of 5, you rarely hear people say that they are 25 and a half years old.
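To compare the approaches side by side, here is a sketch in plain Python (standard datetime module, no database). The three function names are hypothetical: the first mimics what Datediff('yyyy', ...) does, the second is the days / 365.25 formula, and the third is a birthday-aware calculation added for comparison.

```python
from datetime import date

def age_by_year_difference(birth: date, today: date) -> int:
    # Mimics Datediff('yyyy', ...): subtracts the calendar years only.
    return today.year - birth.year

def age_by_days(birth: date, today: date) -> int:
    # The formula from the text: INT(days / 365.25).
    return int((today - birth).days / 365.25)

def age_exact(birth: date, today: date) -> int:
    # Take one year off if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)

born = date(1980, 12, 31)
new_years_day = date(2005, 1, 1)
print(
    age_by_year_difference(born, new_years_day),
    age_by_days(born, new_years_day),
    age_exact(born, new_years_day),
)
```

For someone born on December 31st, 1980, the year difference on January 1st, 2005 says 25 while the other two say 24. The days formula is a good approximation, but it can miss by a day on the birthday itself: from 2001-01-01 to 2002-01-01 is 365 days, and 365 / 365.25 truncates to 0.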

When working with age, remember that you can often use Date-of-birth directly, without doing the age calculation. Don't forget that the smallest date refers to the oldest person.

Eliminating duplicates

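To close out this section on SELECTs, we'll look at how to eliminate duplicate lines from query results.

For example, suppose we want to see the list of countries where we have projects. If we simply list the p_Country column from the Project table, a country appears once for every project located there; adding the DISTINCT keyword after SELECT returns each country only once.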
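A minimal sketch of the duplicate problem and its fix, assuming SQLite in place of Access, a Project table with the p_Country column added earlier, and five made-up projects:

```python
import sqlite3

# Assumption: SQLite stands in for Access; projects and countries are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Project (p_Number TEXT, p_Country TEXT)")
conn.executemany(
    "INSERT INTO Project VALUES (?, ?)",
    [
        ("P1", "Canada"),
        ("P2", "France"),
        ("P3", "Canada"),
        ("P4", "USA"),
        ("P5", "France"),
    ],
)

# One line per project: countries are repeated.
with_duplicates = conn.execute("SELECT p_Country FROM Project").fetchall()

# DISTINCT keeps one line per different value.
without_duplicates = conn.execute(
    "SELECT DISTINCT p_Country FROM Project ORDER BY p_Country"
).fetchall()

print(len(with_duplicates), without_duplicates)
```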

The database is created in MySQL and MySQL Query Browser is used to execute the managers' requests.

Which companies bid on project 05-7777?

SELECT WITH THE AGGREGATE FUNCTIONS

You use aggregates in the SELECT statement when you want to get summary information (statistics) on sets of data.

Here we assume that you've done enough programming to know that a function is a system-defined program that accepts a parameter from the user and returns an answer. A function is always composed of a keyword followed by parentheses ( ).

Aggregate functions

SUM (expression) total values in a numeric expression

AVG (expression) average values in a numeric expression

COUNT (expression) the number of non-null values

COUNT (*) the number of selected rows

MAX (expression) the highest value in the expression

MIN (expression) the lowest value in the expression

First, note that an aggregate function will always return only one row. That's because it answers a question referring to a group or set of data. You can find the biggest value in the set but you can't know what item that biggest value refers to. Same with smallest value or average.

It looks like a good idea to write something like:

SELECT ProdNum, ProdName, MIN(Cost) FROM Products;

to get the name of the lowest-cost item. But it won't work because the aggregates don't work on individual rows.

EXAMPLES:

To obtain the biggest SellPrice in the Products table:

SELECT MAX(SellPrice) FROM Products;

To obtain the number of rows in the Products table, in fact the number of products carried:

SELECT COUNT(*) FROM Products; 

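Now, the previous statement will count the number of products based on the number of primary keys or Product numbers entered.

If you thought that there might be many duplicates in the items carried, you could assume that the duplicates would have the same ProdName; so by counting ProdName and the distinct ProdName values you would get an idea of how many duplicates there are, although you cannot establish what they are:

SELECT COUNT(ProdName) FROM Products;

SELECT COUNT(*) FROM (SELECT DISTINCT ProdName FROM Products) AS UniqueNames;

Note that writing SELECT DISTINCT COUNT(ProdName) would not help: DISTINCT would be applied to the single row returned by COUNT, not to the names being counted. Where it is supported (MySQL, for example), COUNT(DISTINCT ProdName) gives the same result more directly.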
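Here is a sketch of the duplicate check, assuming SQLite in place of Access and five made-up product names. It counts all the names, then counts the different names through a subquery in the FROM clause, and finally shows that putting DISTINCT in front of COUNT changes nothing.

```python
import sqlite3

# Assumption: SQLite stands in for Access; product names are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProdNum TEXT, ProdName TEXT)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("A1", "Hammer"), ("A2", "Hammer"), ("A3", "Saw"), ("A4", "Drill"), ("A5", "Saw")],
)

def single_value(sql):
    """Hypothetical helper: run a query that returns one row with one column."""
    return conn.execute(sql).fetchone()[0]

all_names = single_value("SELECT COUNT(ProdName) FROM Products")
different_names = single_value(
    "SELECT COUNT(*) FROM (SELECT DISTINCT ProdName FROM Products) AS UniqueNames"
)
# DISTINCT here applies to the one-row result, so it still counts every name.
distinct_in_front = single_value("SELECT DISTINCT COUNT(ProdName) FROM Products")

print(all_names, different_names, distinct_in_front)
```

The difference between the first two numbers is the number of duplicated entries.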

The WHERE clause can also be used with aggregates to define the set of data to be calculated.

To find out how many big-profit items you have (assuming that big means more than $500), you do this:

SELECT COUNT(*) AS [Number of big-profit items]

FROM Products

WHERE (SellPrice - Cost) > 500;

Or, in this case, to get the average cost of sporting goods, assuming that items in the Sports department all have a number starting with 'S':

SELECT AVG(Cost) AS [Average cost of Sports]

FROM Products

WHERE ProdNum LIKE "S*";

The AVG function will return the average of a set of numerical values and SUM will return a total:

SELECT AVG(SellPrice) FROM Products;

SELECT SUM(Cost) FROM Products;

Although you are not allowed to work on individual rows, you are allowed to use several aggregates in the same statement:

SELECT SUM(Cost), COUNT(Cost), AVG(Cost), AVG(SellPrice)

FROM Products;
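The following sketch runs several aggregates in one statement and shows the difference between COUNT(expression) and COUNT(*). It assumes SQLite in place of Access and four made-up products, one of which has no Cost entered (NULL).

```python
import sqlite3

# Assumption: SQLite stands in for Access; one product deliberately has no Cost.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProdNum TEXT, SellPrice REAL, Cost REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [("A1", 100, 60), ("B1", 250, 200), ("C1", 250, None), ("D1", 40, 10)],
)

result = conn.execute(
    "SELECT SUM(Cost), COUNT(Cost), COUNT(*), AVG(Cost), MAX(SellPrice) FROM Products"
).fetchall()

# However many rows the table holds, the aggregates come back as a single row.
total_cost, costs_entered, row_count, average_cost, top_price = result[0]
print(len(result), total_cost, costs_entered, row_count, average_cost, top_price)
```

COUNT(Cost) skips the product without a cost, and AVG(Cost) divides by 3 rather than 4 for the same reason.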

USING SUBQUERIES

We said earlier that you cannot use an aggregate function together with individual rows. You can find what the biggest SellPrice is but you can't find what that Product is. Although that's true with the normal SELECT statement, there is a way to work around it. It's called a subquery and it relies on what we call the priority of operators in programming - the fact that any operation in parentheses, ( ), is executed first in a statement because ( ) is the operator with the highest priority.

When we do this:

SELECT MAX(SellPrice) FROM Products;

the query returns the value of the biggest SellPrice.

Now, if we enclose that statement in parentheses and use it as a subquery in another statement like this:

SELECT ProdNum, ProdName, SellPrice FROM Products

WHERE SellPrice = (SELECT MAX(SellPrice) FROM Products);

the subquery returns a single value which is then used in the WHERE clause of the main statement to display the number and name of the product having the biggest SellPrice. If more than one product has the max price, several rows will be displayed.

What products cost more than the average cost of products?

First, calculate the average cost in a subquery and then compare the table with that value:

SELECT ProdNum, ProdName, Cost FROM Products

WHERE Cost > (SELECT AVG(Cost) FROM Products);
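Both subquery examples can be tried on a small table. This is a sketch under the usual assumptions: SQLite stands in for Access, and the four products are made up so that two of them share the biggest SellPrice.

```python
import sqlite3

# Assumption: SQLite stands in for Access; B1 and C1 share the top price on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProdNum TEXT, SellPrice REAL, Cost REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [("A1", 100, 60), ("B1", 250, 200), ("C1", 250, 90), ("D1", 40, 10)],
)

# The subquery in parentheses produces one value: the biggest SellPrice.
top_priced = conn.execute(
    "SELECT ProdNum FROM Products "
    "WHERE SellPrice = (SELECT MAX(SellPrice) FROM Products) ORDER BY ProdNum"
).fetchall()

# Same idea with the average cost (90 for this sample data).
above_average_cost = conn.execute(
    "SELECT ProdNum FROM Products "
    "WHERE Cost > (SELECT AVG(Cost) FROM Products) ORDER BY ProdNum"
).fetchall()

print(top_priced, above_average_cost)
```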

The subquery can also be used to answer questions where you have to compare data with other rows from the same table.

Getting back to our ProjectMgt example, we'll use the Employee table.

How would you answer this: "Which employees live in the same city as employee '1234'?".

You could do it in steps.

First you have to find the employee's city:

SELECT e_city FROM Employee

WHERE e_Id = '1234';

and, if it is, let's say, 'Boston', use that in the next statement:

SELECT e_Id, e_Fname, e_Lname FROM Employee

WHERE e_City = 'Boston';

Or, you could decide to do it efficiently and use the subquery technique:

SELECT e_Id, e_Fname, e_Lname FROM Employee

WHERE e_City =

(SELECT e_city FROM Employee WHERE e_Id = '1234');

Which employees are older than John Smith?

SELECT e_Fname, e_Lname, e_BirthDate FROM Employee

WHERE e_BirthDate <

(SELECT e_BirthDate FROM Employee

WHERE e_Fname = 'John' AND e_Lname = 'Smith' );

Note that the subquery must return one and only one value. The WHERE clause in the main query can only compare to a single value and that means one column from one row. In the previous statement, if there is more than one 'John Smith' in the company, we've got a problem. In that case we would have to use e_Id instead of name to identify the person.

You have to recognize that the following 2 statements don't make any kind of sense:

SELECT e_Id, e_Fname, e_Lname FROM Employee

WHERE e_City =

(SELECT * FROM Employee WHERE e_Id = '1234');

SELECT e_Id, e_Fname, e_Lname FROM Employee

WHERE e_City =

(SELECT e_city FROM Employee );

Lesson 6 - Sorting and grouping

DISPLAYING RESULTS IN ORDER 

It may not have been mentioned specifically yet: in a database table, there is no way to input data in a given order. In other words, if you add a row to a table you cannot insert it between other rows so that it comes out automatically in alphabetical order. Whenever a new row is added, it is simply appended to the end of the table.

If you want the rows to come out in a given order you have to sort them. To sort rows you use the ORDER BY clause in the SELECT statement.

The syntax of the ORDER BY clause is:

SELECT select_list

FROM table

ORDER BY expression [ASC | DESC];

ASC stands for Ascending order and it is the default value

DESC is used to sort in Descending order 

To list all projects in order of their StartDate, with the oldest first (smallest date):

SELECT p_Number, p_Title, p_StartDate, p_EndDate

FROM Project

ORDER BY p_StartDate; 

To get the list in reverse order, with the most recent at the beginning, use the same SELECT but add the DESC option:

SELECT p_Number, p_Title, p_StartDate, p_EndDate

FROM Project

ORDER BY p_StartDate DESC; 

You can also have a sort within a sort by specifying 2 sort fields. The most common example of that is sorting in First name order within Last name - all Smiths will be in First name order, etc.

Note that the main sort field is named first, the secondary sort field is second and so on. In this case Last name is the main sort order:

SELECT e_Id, e_Fname, e_Lname

FROM Employee

ORDER BY e_Lname, e_Fname; 

What if you have a calculated expression that you want to sort on?

If you need to list the length of all projects and sort on that expression:

SELECT p_Number, p_Title, p_StartDate, p_EndDate,

Datediff('m', p_StartDate, p_EndDate) AS [Project length]

FROM Project

ORDER BY 5;

There are 5 elements in the select_list. 'ORDER BY 5' specifies to sort on the fifth element, the calculated field. You could do the same for any other sort specification. For example, 'ORDER BY 2' in this example will sort in order of p_Title.
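The sort options can be tried out in a few lines. This sketch assumes SQLite in place of Access and three made-up employees; it shows a sort within a sort and a sort by column position with DESC.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the employees are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (e_Id TEXT, e_Fname TEXT, e_Lname TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [("1", "John", "Smith"), ("2", "Alice", "Smith"), ("3", "Bob", "Adams")],
)

# Main sort on last name, secondary sort on first name.
by_name = conn.execute(
    "SELECT e_Fname, e_Lname FROM Employee ORDER BY e_Lname, e_Fname"
).fetchall()

# Sort on the 2nd element of the select list (e_Fname), in descending order.
by_position = conn.execute(
    "SELECT e_Id, e_Fname FROM Employee ORDER BY 2 DESC"
).fetchall()

print(by_name)
print(by_position)
```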

GROUPING DATA

For this next section we are going to use a new database called BookStor. You can download it now from the Download area. Select the version that you need. For this lesson we'll use the Authors table only. But, keep the database handy. Later, when we get to Lesson 9, we'll use the other tables and Queries in BookStor to look at more advanced concepts such as Crosstab queries and Union queries, etc.

In the previous lesson, we learned how to use the aggregate functions to produce summary information on sets of data.

From my table of authors, I want to know how many authors produce 'Romance' novels. I use a simple query with the aggregate function:

SELECT COUNT(au_id) AS [Number of authors]

FROM Authors

WHERE au_subject = 'Romance';

But, what if I want to know how many authors I have in each category, even if I don't know what the categories are? It can't be done with a simple Select with aggregate.

The answer is a new clause called: GROUP BY which is used with the Select.

The syntax of the GROUP BY clause is:

SELECT select-list

FROM table

GROUP BY group_by_list;
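As a sketch of the syntax in action, the following counts the authors in each category without naming the categories in advance. It assumes SQLite in place of Access or MySQL, and a cut-down Authors table with five made-up rows (the real BookStor table has more columns).

```python
import sqlite3

# Assumption: SQLite stands in for Access/MySQL; the authors are sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Authors (au_id TEXT, au_subject TEXT, au_state TEXT, au_salary REAL)"
)
conn.executemany(
    "INSERT INTO Authors VALUES (?, ?, ?, ?)",
    [
        ("A1", "Romance", "UT", 30000),
        ("A2", "Romance", "KS", 40000),
        ("A3", "Mystery", "UT", 50000),
        ("A4", "Cooking", "CA", 20000),
        ("A5", "Romance", "CA", 35000),
    ],
)

# One result row per different value of au_subject, each with its own count.
per_subject = conn.execute(
    "SELECT au_subject, COUNT(au_id) FROM Authors "
    "GROUP BY au_subject ORDER BY au_subject"
).fetchall()

print(per_subject)
```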

Now, let's say you don't want to see details of all the states in the Authors table, you just want to see authors from Utah and Kansas.

There is another clause to select groups, the same way as the WHERE clause selects rows. That clause is called HAVING and is used the same as the WHERE but on groups.

SELECT au_state AS [State],

COUNT(au_id) AS [Number of authors],

AVG(au_salary) AS [Average salary]

FROM Authors

GROUP BY au_state

HAVING au_state IN ('UT', 'KS');

Note the use of the IN operator to mean 'If the author's state is in this list, the row is selected'. The clause could also have been written as: HAVING au_state = 'KS' OR au_state = 'UT'.
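Here is a sketch of HAVING at work, assuming SQLite in place of Access or MySQL and five made-up authors. The last query is an extra illustration that goes slightly beyond the example above: HAVING can also test an aggregate such as COUNT, which a WHERE clause cannot do.

```python
import sqlite3

# Assumption: SQLite stands in for Access/MySQL; the authors are sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Authors (au_id TEXT, au_subject TEXT, au_state TEXT, au_salary REAL)"
)
conn.executemany(
    "INSERT INTO Authors VALUES (?, ?, ?, ?)",
    [
        ("A1", "Romance", "UT", 30000),
        ("A2", "Romance", "KS", 40000),
        ("A3", "Mystery", "UT", 50000),
        ("A4", "Cooking", "CA", 20000),
        ("A5", "Romance", "CA", 35000),
    ],
)

# Keep only the groups for Utah and Kansas.
utah_kansas = conn.execute(
    "SELECT au_state, COUNT(au_id), AVG(au_salary) FROM Authors "
    "GROUP BY au_state HAVING au_state IN ('UT', 'KS') ORDER BY au_state"
).fetchall()

# Keep only the groups that contain more than one author.
busy_states = conn.execute(
    "SELECT au_state, COUNT(au_id) FROM Authors "
    "GROUP BY au_state HAVING COUNT(au_id) > 1 ORDER BY au_state"
).fetchall()

print(utah_kansas)
print(busy_states)
```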

You can also group data on more than one column.

How many authors of each category are there in each state?

SELECT au_state, au_subject, COUNT(au_subject) AS [Number]

FROM authors

GROUP BY au_state, au_subject

ORDER BY 2;

If you want to see it displayed differently, just change the sort order:

SELECT au_state, au_subject, COUNT(au_subject) AS [Number]

FROM authors

GROUP BY au_state, au_subject

ORDER BY 1;

Lesson 7 - Joining tables

A new case study - The Editor Project

This is one that's been around for many years. It's been used to teach SQL for the last 25 years, at least. I converted it to run in MySQL.

You've got Authors. Authors write books. Sometimes, a book is written by several authors, each of whom will receive a percentage of the royalties. Some authors have written several books; others have yet to write any (we may have a file on them because they're in the process of writing and we gave them an advance). So, the relationship between Authors and Books is many-to-many.

The BookAuthor table is the linking table between authors and books.

Publishers are companies that print and distribute the books. We'll get to that relationship later.

P.S. You'll notice that the last column in most tables has a funny sign at the end of the data. That's a carry-over of a CRLF because the table was imported from Access. It would be better to modify those because some queries will not work properly.

As an exercise in SQL, use an Update command to change them, as in this example:

If you're not sure how that works, look up the Update syntax in the Query browser.

The % sign is the wildcard character in MySQL, like the * in Access.

A FEW MORE WORDS ON MODELING

If you've been following this from the beginning, you've been playing around with the ProjectMgt database. It's perfectly OK to have run tests on it, to have added or changed data or to have changed the structure of the tables themselves. Before continuing, however, we should standardize the database so that we're all on the same wavelength for the rest of the lessons.

Let's review. We started out with this model:

Then we added a few columns to different tables: e_BirthDate in Employee, p_Country in Project and maybe a few more.

But there is still one problem with the design. In fact, the problem is that the database is not normalized to the Third Normal Form (3NF). Uh? Let's look at it in practical terms.

If you have one table for Timesheets, you get one row for each timesheet entry: on a given Friday, an employee who has worked on 2 projects submits his timesheet. You input the timesheet date, the employee-Id, the project number and the hours for the first project, creating one row in the table and then you repeat for the second project, creating another row in the table. Now, if you have an application (in VB, Powerbuilder or Access) that wants to print a timesheet report, it will probably print something like this:

IMPORTANT: THE MASTER/DETAIL FORM

This is a very standard form format. It's called a Master/Detail form.

In business applications you will use dozens of these: Orders, Invoices, Purchase Orders, PO Requisitions, etc.

What they all have in common is that there is a Master section which contains information on the transaction as a whole, and a Detail section which contains information on the details of the transaction.

In an Invoice, for example, the invoice date, customer name and address, shipping date are in the Master while items purchased, quantities, prices are in the Detail section.

It is very difficult to produce a Master/Detail form from a single table.

Therefore, what we will do in our ProjectMgt database is normalize the Timesheet table into aTimesheet-Master table and a Timesheet-Detail table. Master will contain the Timesheet number as Pk, the Employee-Id and the Timesheet date (all the information common to all transactions).

8/8/2019 SQL and Modeling Word

http://slidepdf.com/reader/full/sql-and-modeling-word 47/58

Detail will contain the Timesheet number, Project number and Hours-worked for each project.

Since there may be several Project numbers associated with one Timesheet number in Detail, we will assign Timesheet number + Project number as Pk for the Detail table.
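To make the Master/Detail split concrete, here is a minimal sketch of the two tables. It is run through Python's built-in sqlite3 module rather than Access, purely so that it can be executed anywhere. The column names follow the TS_Master and TS_Detail queries used later in this tutorial; the datatypes and the sample rows are assumptions made for the illustration, not the definitions in the downloadable database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TS_Master (
    tm_Num   INTEGER PRIMARY KEY,   -- timesheet number (Pk)
    tm_EmpID TEXT NOT NULL,         -- employee who submitted the timesheet
    tm_Date  TEXT NOT NULL          -- timesheet date
);
CREATE TABLE TS_Detail (
    td_Num     INTEGER NOT NULL REFERENCES TS_Master(tm_Num),
    td_ProjNum TEXT NOT NULL,
    td_Hours   REAL NOT NULL,
    PRIMARY KEY (td_Num, td_ProjNum)  -- Timesheet number + Project number
);
""")

# One timesheet (Master row) covering two projects (two Detail rows).
conn.execute("INSERT INTO TS_Master VALUES (1, 'A1111', '2003-05-02')")
conn.executemany("INSERT INTO TS_Detail VALUES (?, ?, ?)",
                 [(1, 'C33333', 16.0), (1, 'C44444', 24.0)])

# The composite Pk rejects a second row for the same timesheet and project.
try:
    conn.execute("INSERT INTO TS_Detail VALUES (1, 'C33333', 8.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

detail_rows = conn.execute("SELECT COUNT(*) FROM TS_Detail").fetchone()[0]
print(duplicate_rejected, detail_rows)
```

The employee and the date are stored once, in the Master row, no matter how many projects appear on the timesheet.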

You may feel a bit overwhelmed at this point. Take your time.

You should download a new copy of the Project Management database now. If you prefer to work with the 97 version, go to the Downloads area - it has several versions of the database. Study it carefully and try to relate the design of the database to the Timesheet form shown above. Remember that a database is not a theoretical concept - it has to be applied to real-life applications.

USING MULTIPLE TABLES IN A SELECT

Let's go back to the ProjectMgt example.

When you have to look at Employee data you do a SELECT from the Employee table. Remember that the Employee table contains the employee's department but as a number only.

When you run the SELECT you can't tell what the department's name is from the output.

SELECT e_Id, e_Fname, e_Lname, e_Dept

FROM Employee;

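When you look at data from 2 or more tables in a SQL statement, the operation is called a JOIN. You are in fact joining 2 tables to provide the result needed. In this lesson we won't need a JOIN keyword, however - everything is done with the FROM and WHERE clauses of the SELECT statement. (SQL does have an explicit JOIN clause; we will use it for outer joins in Lesson 8.)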

In the example above, you want to see the department's name instead of its number when you look at an employee record. Since the department name is in the Department table and all the other fields are in the Employee table, it is fairly obvious that you will have to open 2 tables in the SELECT. Let's try it:

SELECT e_Id, e_Fname, e_Lname,

e_Dept, d_DeptNum, d_DeptName

FROM Employee, Department;

It should be immediately obvious to you that although the query worked, it produced way, way too much data.

And that brings us to talk about how a Join operation works.

When you tell SQL to join 2 tables, it really joins them! In fact, it joins every row in the first table with every row in the second table. If the first table, Employee, contains 5 rows and the second table, Department, contains 3 rows, the result displays 5 x 3 = 15 rows. Which is what happened in the example above. However, since there are only 5 employees, it means that 10 of those rows are meaningless.
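The 5 x 3 arithmetic is easy to check for yourself. The sketch below uses Python's built-in sqlite3 module instead of Access so that it runs anywhere; the employee and department rows are invented for the illustration, and only the column names come from the ProjectMgt example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (d_DeptNum INTEGER PRIMARY KEY, d_DeptName TEXT);
CREATE TABLE Employee (e_Id TEXT PRIMARY KEY, e_Lname TEXT, e_Dept INTEGER);
""")
# Hypothetical sample data: 3 departments, 5 employees.
conn.executemany("INSERT INTO Department VALUES (?, ?)",
                 [(1, 'Sales'), (2, 'Engineering'), (3, 'Accounting')])
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [('A1111', 'Adams', 1), ('A2222', 'Baker', 2),
                  ('A3333', 'Clark', 2), ('A4444', 'Davis', 3),
                  ('A5555', 'Evans', 1)])

# No WHERE clause: every employee row is paired with every department row.
all_pairs = conn.execute(
    "SELECT e_Id, e_Dept, d_DeptNum, d_DeptName FROM Employee, Department"
).fetchall()

# Only the pairs where the two department numbers agree mean anything.
matching = [row for row in all_pairs if row[1] == row[2]]
print(len(all_pairs), len(matching))
```

With this data the unrestricted query returns 15 rows, of which 5 (one per employee) pair an employee with his or her own department.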

The trick to know about joining tables is fairly simple yet absolutely crucial:

The tables you are joining must have common columns. Those columns don't have to have the same name, but they must contain the same kind of data: same datatype and size.

e_Dept and d_DeptNum are both Numeric, Long integer, and the Dept. numbers assigned to employees exist in the Department table.

The only meaningful information in a JOIN operation is that which occurs when data in the two common columns is the same.

In database jargon, the field that is used as a reference from one table is called a foreign key (Fk) and it must correspond to another field which is a primary key (Pk) in its table. In our example, e_Dept is a Fk in the Employee table and d_DeptNum is a Pk in the Department table.

The thing to recognize about the result of the query above is that the only good results are the ones where e_Dept and d_DeptNum are the same.

So, we implement the JOIN with a WHERE clause:

SELECT e_Id, e_Fname, e_Lname,

e_Dept, d_DeptNum, d_DeptName

FROM Employee, Department

WHERE e_Dept = d_DeptNum;

Let's look at more examples:

List all the timesheets, showing the employee's name and phone.

SELECT tm_Num, tm_Date, tm_EmpID,

e_Fname, e_Lname, e_Tel

FROM Employee, TS_Master

WHERE tm_EmpID = e_ID;

List all the timesheets, showing project titles, start and end dates.

SELECT td_Num, td_ProjNum, td_Hours,

p_Title, p_StartDate, p_EndDate

FROM Project, TS_Detail

WHERE td_ProjNum = p_Number;

To obtain a particular employee's timesheets, add the condition to the WHERE clause:

SELECT tm_Num, tm_Date, tm_EmpID,

e_Fname, e_Lname, e_Tel

FROM Employee, TS_Master

WHERE tm_EmpID = e_ID

AND tm_EmpID = 'A1111';

You may be able to guess from the previous examples that joining 3 or 4 tables requires that all tables have pairs of common columns.

To obtain data from the Department, Employee and TS_Master tables we have to know that Dept. Number exists in both Employee and Department and that Employee ID exists in both Employee and Timesheet Master:

SELECT tm_Num, tm_Date, tm_EmpID,

e_Fname, e_Lname, d_DeptName

FROM Employee, TS_Master, Department

WHERE tm_EmpID = e_ID

AND e_Dept = d_DeptNum;

To list all timesheets, with employee names and project titles, we know that Timesheet Number exists in both Timesheet Master and Timesheet Detail, that Employee Id exists in both Employee and Timesheet Master and finally, that Project Number exists in both Timesheet Detail and Project:

SELECT tm_Num, tm_Date, tm_EmpID, td_ProjNum,

e_Fname, e_Lname, p_Title

FROM Employee, TS_Master, TS_Detail, Project

WHERE tm_Num = td_Num

AND tm_EmpID = e_ID

AND td_ProjNum = p_Number;

OK, so it doesn't look all that great! But it works. All you have to do is arrange the column names and use the ORDER BY clause to sort it in proper order. And again, if you want to see the timesheets relating to a particular project, modify the WHERE clause:

SELECT tm_Num, tm_Date, tm_EmpID, td_ProjNum,

e_Fname, e_Lname, p_Title

FROM Employee, TS_Master, TS_Detail, Project

WHERE tm_Num = td_Num

AND tm_EmpID = e_ID

AND td_ProjNum = p_Number

AND td_ProjNum = 'C33333';

The 'JOIN' Formula

Joining multiple tables is not difficult as long as the database is designed properly: tables that are to be joined must have columns in common.

The formula is applied in the WHERE clause:

WHERE table1_column_w = table2_column_x AND table2_column_y = table3_column_z AND ...

If 2 tables have no common columns they cannot be joined in any meaningful way. For example, if we still had the Products table in our database, we couldn't join Products and Employee or Products and Project because there is no common data in those tables.
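The same formula can also be written with the explicit JOIN ... ON syntax that SQL offers as an alternative to listing the tables in FROM and pairing them up in WHERE. The sketch below runs both forms and compares the results. It uses Python's built-in sqlite3 module rather than Access, and the sample rows are invented for the illustration; Access has its own rules for writing several JOINs in one statement, so treat the second query as standard-style SQL, not as something to paste into Access unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (d_DeptNum INTEGER PRIMARY KEY, d_DeptName TEXT);
CREATE TABLE Employee (e_ID TEXT PRIMARY KEY, e_Lname TEXT, e_Dept INTEGER);
CREATE TABLE TS_Master (tm_Num INTEGER PRIMARY KEY, tm_Date TEXT, tm_EmpID TEXT);
""")
# Hypothetical sample data.
conn.executemany("INSERT INTO Department VALUES (?, ?)",
                 [(1, 'Sales'), (2, 'Engineering')])
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [('A1111', 'Adams', 1), ('A2222', 'Baker', 2)])
conn.executemany("INSERT INTO TS_Master VALUES (?, ?, ?)",
                 [(1, '2003-05-02', 'A1111'), (2, '2003-05-02', 'A2222'),
                  (3, '2003-05-09', 'A1111')])

# The 'JOIN' formula as used in this lesson: joins live in the WHERE clause.
where_style = """
SELECT tm_Num, e_Lname, d_DeptName
FROM Employee, TS_Master, Department
WHERE tm_EmpID = e_ID
  AND e_Dept = d_DeptNum
ORDER BY tm_Num
"""

# The same two joins written with explicit JOIN ... ON clauses.
join_style = """
SELECT tm_Num, e_Lname, d_DeptName
FROM TS_Master
JOIN Employee   ON tm_EmpID = e_ID
JOIN Department ON e_Dept = d_DeptNum
ORDER BY tm_Num
"""

rows_where = conn.execute(where_style).fetchall()
rows_join = conn.execute(join_style).fetchall()
print(rows_where == rows_join, rows_where)
```

Either way, 3 tables need 2 join conditions; the only difference is where you write them.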

The great thing about JOINS is that once you've mastered the technique you can obtain information from anywhere in the database. It may involve 4 or 5 or 10 joins, but so what!

The Boss wants to know which Departments are involved in projects in Germany at the moment.

Follow the joins:

SELECT DISTINCT d_DeptName, td_ProjNum, p_Title, p_Country

FROM Department, Employee, TS_Master, TS_Detail, Project

WHERE td_ProjNum = p_Number

AND tm_Num = td_Num

AND tm_EmpID = e_ID

AND e_Dept = d_DeptNum

AND p_Country LIKE 'Germany*'

AND DATE( ) BETWEEN p_StartDate AND p_EndDate;

There are several points that should be noted about this query:

• If a project in Germany has many timesheets submitted on it from one department, each occurrence will generate one row - we only want to know the name of the department, not how many times it shows up, so we use the DISTINCT clause.

• In the WHERE clause, always do the joins first - there are 5 tables involved and therefore, there are 4 joins.

• Whenever you are comparing to a string or text field, use the LIKE operator with a wildcard - the country could have been mistakenly entered as "Germany " in the project data - the strings "Germany" and "Germany " do not match. (Access uses * as the wildcard; most other databases, including MySQL, use %.)

• The Boss said "...in Germany at the moment". Listen to the question. That means currently active. You don't want projects that are already over or that haven't started yet. If today is between the start and end dates, the project is currently active.

• If there is an active project in Germany but it hasn't had timesheets submitted for it yet, it won't show up in the list. There is a way to list it in a query and we'll cover that in the next Lesson.
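Two of the points above, DISTINCT and LIKE, can be tried out on a cut-down version of the query. The sketch below uses Python's built-in sqlite3 module rather than Access, so the wildcard is % instead of *. The project titles and timesheet rows are invented, and one country has deliberately been entered with a trailing space. Whether a plain = comparison ignores trailing spaces varies from one database to another; in SQLite, as used here, it does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (p_Number TEXT PRIMARY KEY, p_Title TEXT, p_Country TEXT);
CREATE TABLE TS_Detail (td_Num INTEGER, td_ProjNum TEXT, td_Hours REAL);
""")
# Hypothetical data: note the trailing space in 'Germany '.
conn.executemany("INSERT INTO Project VALUES (?, ?, ?)",
                 [('C33333', 'Plant upgrade', 'Germany '),
                  ('C44444', 'Web shop', 'France')])
conn.executemany("INSERT INTO TS_Detail VALUES (?, ?, ?)",
                 [(1, 'C33333', 8.0), (2, 'C33333', 6.0), (3, 'C44444', 4.0)])

base = """
SELECT {distinct} p_Title
FROM Project, TS_Detail
WHERE td_ProjNum = p_Number
  AND p_Country {condition}
"""

# Exact match misses the row that was typed with a trailing space.
exact = conn.execute(
    base.format(distinct="", condition="= 'Germany'")).fetchall()

# LIKE with a wildcard finds it, once per timesheet line.
like_all = conn.execute(
    base.format(distinct="", condition="LIKE 'Germany%'")).fetchall()

# DISTINCT collapses the repeated rows into one.
like_distinct = conn.execute(
    base.format(distinct="DISTINCT", condition="LIKE 'Germany%'")).fetchall()

print(exact, like_all, like_distinct)
```

The project shows up twice without DISTINCT because two timesheet lines were booked against it.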

Lesson 8 - Specialized Joins

USING ALIASES AKA NICKNAMES

An alias is a nickname, a second name given to an object. In SQL there are a few occasions where you may want to use aliases for tables.

Case 1: you have 2 tables with the same field names. It happens frequently and there is no particular problem with it. We have been using different names, with prefixes, for all columns but not everyone does that.

Suppose that you used EmpID in both the TS_Master and Employee tables. Now, when you do a join on the tables, like this:

SELECT tm_Num, tm_Date, EmpID,

e_Fname, e_Lname, e_Tel

FROM Employee, TS_Master

WHERE EmpID = EmpID

AND EmpID = 'A1111';

you soon run into all kinds of problems and all of them will mention something about ...ambiguous reference... which means that the system doesn't know what the heck you're talking about - it can't figure out which EmpID you are referring to because there are 2 of them.

The solution is to add the table names, with dot notation, to the fields which are ambiguous:

SELECT tm_Num, tm_Date, TS_Master.EmpID,

e_Fname, e_Lname, e_Tel

FROM Employee, TS_Master

WHERE TS_Master.EmpID = Employee.EmpID

AND TS_Master.EmpID = 'A1111';

Now, adding Table_name. to all fields is standard SQL syntax. If you look at SQL code generated by the Access query wizard, you will see that it is done all the time, regardless of whether the name is ambiguous or not. But in most applications where field names are all different you don't bother doing it because it's just too much extra work.

Which brings me to my next point. If you're a typical programmer, you will want to save every keystroke you can. Typing table names all over the place is a pain. To avoid it, use aliases for the tables - the alias is a name that you give the table in the FROM clause and then you use it everywhere else in the statement:

SELECT tm_Num, tm_Date, T.EmpID,

e_Fname, e_Lname, e_Tel

FROM Employee E, TS_Master T

WHERE T.EmpID = E.EmpID

AND T.EmpID = 'A1111';

OK, so it may not be an earth-shattering improvement but it is an improvement.

The standard syntax for using an alias is:

SELECT Alias1.field, Alias2.field, .....

FROM Table_name AS Alias1, Table_name AS Alias2 ...;

The 'AS' keyword is optional and most people don't use it.

Don't worry about using the alias in the SELECT clause before it's been named in the FROM clause - that's the way it works.
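Here is the whole of Case 1 in one runnable sketch: first the ambiguous query, then the version with aliases. It uses Python's built-in sqlite3 module rather than Access, so the wording of the error message is SQLite's and will differ from what Access shows. The two tables are cut down to a few columns and the rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpID TEXT PRIMARY KEY, e_Lname TEXT);
CREATE TABLE TS_Master (tm_Num INTEGER PRIMARY KEY, tm_Date TEXT, EmpID TEXT);
""")
# Hypothetical sample data.
conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                 [('A1111', 'Adams'), ('A2222', 'Baker')])
conn.executemany("INSERT INTO TS_Master VALUES (?, ?, ?)",
                 [(1, '2003-05-02', 'A1111'), (2, '2003-05-02', 'A2222'),
                  (3, '2003-05-09', 'A1111')])

# Both tables have an EmpID column, so the bare name cannot be resolved.
try:
    conn.execute(
        "SELECT tm_Num, EmpID FROM Employee, TS_Master WHERE EmpID = EmpID")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Aliases E and T say which table each EmpID belongs to.
aliased = conn.execute("""
SELECT T.tm_Num, T.EmpID, E.e_Lname
FROM Employee AS E, TS_Master AS T
WHERE T.EmpID = E.EmpID
  AND T.EmpID = 'A1111'
ORDER BY T.tm_Num
""").fetchall()

print(error_message)
print(aliased)
```

The AS keyword is written out here only to make the aliases easy to spot; the query works the same without it in SQLite.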

Case 2: You have to join 2 tables to one primary key:

In our application, Department Head, the column d_Head, contains an employee_ID. If you want to create a query to display both the employee's name and the department head's name, you will have to join 2 Fk to the Employee Pk. But you can't do both joins against a single copy of the Employee table: the two conditions would apply to the same row, so you would only see employees who are their own department head. The solution is to use an alias for the Employee table - in fact, consider the Employee as if it were 2 tables, one for the employee's name and the other one for the department head's name.

SELECT tm_Num, tm_empID, E1.e_Lname, E1.e_Fname,

d_deptname, d_Head, E2.e_Lname, E2.e_fName

FROM TS_Master, Department, Employee E1, Employee E2

WHERE tm_EmpID = E1.e_ID

AND d_Head = E2.e_ID

AND E1.e_Dept = d_DeptNum;
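A cut-down version of this query, without the timesheet table, shows the two aliases at work. The sketch uses Python's built-in sqlite3 module rather than Access; the departments, employees and department heads are invented for the illustration, and the output column names employee and head are added only to tell the two e_Lname columns apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (d_DeptNum INTEGER PRIMARY KEY, d_DeptName TEXT, d_Head TEXT);
CREATE TABLE Employee (e_ID TEXT PRIMARY KEY, e_Lname TEXT, e_Dept INTEGER);
""")
# Hypothetical data: Adams heads Sales, Clark heads Engineering.
conn.executemany("INSERT INTO Department VALUES (?, ?, ?)",
                 [(1, 'Sales', 'A1111'), (2, 'Engineering', 'A3333')])
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [('A1111', 'Adams', 1), ('A2222', 'Baker', 1),
                  ('A3333', 'Clark', 2)])

# E1 plays the role of "the employee", E2 the role of "the department head".
rows = conn.execute("""
SELECT E1.e_Lname AS employee, d_DeptName, E2.e_Lname AS head
FROM Department, Employee E1, Employee E2
WHERE E1.e_Dept = d_DeptNum
  AND d_Head = E2.e_ID
ORDER BY E1.e_ID
""").fetchall()

for row in rows:
    print(row)
```

Every employee is listed, each with the name of the person heading his or her department, including the heads themselves.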

The same technique applies when you must show an employee's name and a project leader's name from timesheet information. In the application, project leaders are only identified in the Project table by their employee ID. As in the previous example, you have to use an alias for the Employee table to get both employee name and project leader name in the query.

SELECT tm_Num, tm_empID, E1.e_Lname, E1.e_Fname, td_ProjNum,

p_Title, p_Leader, E2.e_Lname, E2.e_fName

FROM TS_Master, TS_Detail, Project, Employee E1, Employee E2

WHERE tm_Num = td_Num

AND p_Number = td_ProjNum

AND tm_EmpID = E1.e_ID

AND p_Leader = E2.e_ID;

OUTER JOINS

The joins we have been doing so far have all been Inner Joins. That is the kind of join that is done by default when you join two or more tables. It means that the only rows displayed by the query are those for which a matching row exists in every table being joined. But what if a row in one table has no match in the other and you still want to see it? There is another form of join for those cases and it's called an Outer Join.

In the previous lesson we had a case where we wanted to see all the departments involved in projects in Germany. The only way to get that is by joining all the tables involved in providing the chain from 'Project' to 'Department'. Since all the joins are inner joins by default, only those rows where the values in both joined columns match will be displayed.

Let's look at the example again, but simplified.

Query: List all countries for which there are timesheets.

SELECT p_Number, p_Title, p_Country, td_Num, td_Hours

FROM Project, TS_Detail

WHERE p_Number = td_ProjNum;

You get the list of all countries for which timesheets have been submitted:

However, if you want to see all countries for which there are projects, including those with no timesheets, you must do an outer join which, in Access, is written as LEFT JOIN because you will see all the rows from the table named on the left in the JOIN clause:

SELECT p_Number, p_Title, p_Country, td_Num, td_Hours

FROM Project LEFT JOIN TS_Detail

ON Project.p_Number = TS_Detail.td_ProjNum;

For some reason Access insists on having the table_name included in the ON clause.

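You get the list of all countries even if timesheets have not been submitted. For a project with no timesheets, the timesheet columns (td_Num and td_Hours) simply come back empty (null).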
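The difference between the two queries is easiest to see side by side. The sketch below uses Python's built-in sqlite3 module rather than Access, and three invented projects, one of which has no timesheet lines yet. A null comes back to Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (p_Number TEXT PRIMARY KEY, p_Title TEXT, p_Country TEXT);
CREATE TABLE TS_Detail (td_Num INTEGER, td_ProjNum TEXT, td_Hours REAL);
""")
# Hypothetical data: project C55555 has no timesheets yet.
conn.executemany("INSERT INTO Project VALUES (?, ?, ?)",
                 [('C33333', 'Plant upgrade', 'Germany'),
                  ('C44444', 'Web shop', 'France'),
                  ('C55555', 'Market survey', 'Spain')])
conn.executemany("INSERT INTO TS_Detail VALUES (?, ?, ?)",
                 [(1, 'C33333', 8.0), (1, 'C44444', 4.0)])

# Inner join: only projects that have at least one timesheet line.
inner_rows = conn.execute("""
SELECT p_Number, p_Country, td_Hours
FROM Project, TS_Detail
WHERE p_Number = td_ProjNum
ORDER BY p_Number
""").fetchall()

# Outer join: every project, with nulls where there is no timesheet line.
outer_rows = conn.execute("""
SELECT p_Number, p_Country, td_Hours
FROM Project LEFT JOIN TS_Detail
  ON Project.p_Number = TS_Detail.td_ProjNum
ORDER BY p_Number
""").fetchall()

print(inner_rows)
print(outer_rows)
```

The project in Spain appears only in the second result, with no hours against it.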

Outer joins are not used a lot in SQL but they do have a few crucial applications.

If you ever have to design any kind of scheduling or reservations system, you will have to use Outer Joins.

For example, in a Doctor's office where you have an Appointments table joined to a Patients table, the only way you can see both the booked slots and the empty slots in a query is to use an Outer Join.

For Hotel reservations where you have to see the rooms which are occupied as well as those that are free, you will have to Outer Join the tables Rooms and Customers.

