Date post: | 20-Dec-2015 |
Category: |
Documents |
View: | 223 times |
Download: | 3 times |
AUAdvanced Database
Topics
Professor J. Alberto Espinosa
Business Analysis and Data Design ITEC-630 Fall 2008
2
Agenda
• Client-server computing and database servers
• Connecting databases to the web
• Other advanced database topics:– Database administration– Transactions– Concurrency– Distributed databases– Data warehouses
4
Client-Server Computing
• A key technological development in the 90’s
• A form of “distributed computing”
• Most predominant computing architecture today
• Software application (i.e., processing) is split into tasks
• These tasks are distributed among computers
• Depending where it is more efficient to do the processing
5
Clients and Servers
Clients– Request specialized services from servers, and – Perform other tasks for users (e.g., screen displays)
Servers– Acknowledge service requests from clients, and – Provide requested services (i.e., tasks, processes)– Via responses to clients
• Servers and clients connect via networks
6
Response (e.g., query
results)
Client-Server Computing
Network
DatabaseClient
Service Request(e.g., a web page)
Response(e.g., a web page)
OtherClient
Web Server
DatabaseServer
Browser OtherServer
Request (e.g., SQL query)
7
Examples of Servers
• A server can be hardware, software or both• File Server central file storage, process file requests
(ex. Novell’s NetWare, Windows NT)• Database Server back-end DBMS functions
(ex. MS SQL Server, Oracle Server, Lotus Notes Server)• Web Server store and fetch web files on request
(ex. Apache, Microsoft IIS)• Print Server print job queuing for central printers• Mail Server routes mail to users and other mail servers
8
Examples of Clients
• A client can be hardware, software or both
• Networked PCs request files and other services from file servers (Windows 2000, XP)
• Database Clients request records from database server, process data locally, screen formatting, etc. (Lotus Notes client, MS Access)
• Web Browsers request web files from web servers, translate HTML code into formatted screen displays(Internet Explorer, Netscape)
• Mail Client Send/retrieve mail to/from mail servers, organize and display user mail(Outlook Express; Lotus Notes mail client)
9
Generic Client-Server Architecture
Client Server
Request
Network
Hardware Platform
Client Operating System
Application Software (Client Portion)
Client Communication Software
Hardware Platform
Server Operating System
Application Software (Server Portion)
Server Communication Software
Response
Communication Protocols
10
Ex.: “Thin Client” or “Fat Server” model Most of the processing is done by the server
Thin Client Fat Server
Network
Hardware Platform
Client Operating System
Client Communication Software
Hardware Platform
Server Operating System
Application Software
Server Communication Software
Presentation Software
DBMS
Example:
•Web Server-Browser Applications
•Easy to deploy applications
•Great for electronic commerce
•Easy to support and upgrade applications for distributed use
11
Fat Client Thin Server
Network
Hardware Platform
Client Operating System
Client Communication Software
Hardware Platform
Server Operating System(incl. file management)
Server Communication Software
Presentation Software
Application Software
Example:
•File Servers (Novell’s NetWare—your G drive, Windows NT) – file servers do very little – e.g., giving you access to shared drives and folders
Ex.: “Fat Client” or “Thin Server” model Most of the processing is done by the client
13
Example: Client-Server Database Management Systems (DBMS) (fat client)
Client Computer Server Computer
Request
Network
Hardware Platform
Client Operating System
Front-EndDatabase Client SW
Client Communication Software
Presentation Software
Hardware Platform
Server Operating System
Back-EndDatabase Server SW
Server Communication Software
Response
Database Application
Databases
14
Ex.: Web Client-Server
Client Server
Network
Hardware Platform
Client Operating System
Client Communication Software
Hardware Platform
Server Operating System
HTTP
Server Communication Software
Browser Web Server
HTTP
TCP/IP TCP/IP
Web pages (HTML and other files)
15
SQL Queries
Ex.: Web Client-Server + Database Server (thin client)
ClientServer
Network
Hardware Platform
Client Operating System
Client Communication Software
Hardware Platform
Server Operating System
HTTP
Server Communication Software
Browser
Web Server
HTTP
TCP/IP TCP/IP
Web pages (HTML and other files)
DBMS Server
Databases
HTML Form
HTML Response
AUDynamic Web Pages:
Connecting Web Pages to Databases
Request (ex. get a price quote, place an order)
Response (ex. query results with HTML-formatted product price or order confirmation notice)
18
Client / Server Computing:BrowserClient / Web + Database Servers
= Dynamic Web Pages
Network Service Request
Response:DynamicallyFormatted
HTML Pagew/Results
Client
WebServer
Browser Server
Database Server
(usually runs in the same
computer as the web server)
DatabaseQuery String Query
ResultsResults
ClickSubmit
19
Setting Up Your Own Site With Dynamic Web Capabilities
Steps:
1. Register your own domain name (e.g., my domain is www.Jibe4Fun.com) – there are hundreds of domain registration services ($20 to $40 per year to keep your domain name active) (through a service like http://DomainName.com)
2. Contract web hosting services with a company to hold your web pages – there are hundreds of web hosting services ranging from ($100 per year for a few MB of storage to highly priced commercial-strength e-commerce services (through a service like http://www.Alentus.com)
3. Map your domain name to your web hosting service (5 minutes)4. Design, normalize and populate your database(s)5. Design and develop your HTML files and related scripts6. Upload your HTML files, scripts and databases to your assigned
web space with your web hosting service
21
HTTP and Static HTML
HTTP = a document fetching protocol:
1. User clicks on URL with HTTP protocol2. Client requests connection to server, server connects3. Browser requests HTML page to web site4. Server finds/sends HTML page to client “AS IS”5. Client’s browser interprets and displays HTML doc6. Server disconnects from client
HTML is static: text (info) and tags (formatting), ex.:
<FONT SIZE=2><BOLD>Hello!! </BOLD><U>there</U></FONT>
Displays as: Hello!! there
22
Client BrowserInternet Explorer
Netscape Navigator
Static HTML via HTTP
Web ServerMicrosoft Internet Information Server (IIS)
Apache
Requestconnection
to server andfile.html
Open connection and
findHTML docfile.html
Send HTML doc and close
connection
23
Static HTML:HTTP Shortcomings
• Corporate information is dynamic As corporate information changes, Database contents change too Web pages need to change too By hand? Or, do we link to databases?
• How to customize displays for different users?
24
How to make web pages Dynamic?
2 generic solutions (workarounds) to static HTML:
1. Client-side scripting• Scripts that are processed by the browser in
the local machine
2. Server-side scripting• Scripts that are processed by the web server
25
Client-Side Scripting
• Script commands embedded in HTML file• Browsers need capability to process scripts• Processing is done by browser AFTER page is
fetched from server• Useful for interactive and visual effects• Browser must support scripting language• Most popular: JavaScript, VB Script
26
HTML lines
<SCRIPT LANGUAGE = “JavaScript”> script lines</SCRIPT>
More HTML lines
<SCRIPT LANGUAGE = “Perl”> script lines</SCRIPT>
More HTML lines …………
Client-Side ScriptingEmbedding Client-Side Scripts in HTML
29
Other examples: http://auapps.american.edu/~alberto/images/BouncingDots.html http://auapps.american.edu/~alberto/images/BouncingHearts.html
Example 3
30
Server-Side Scripting
• Script commands embedded in HTML file• The server must have capability to process scripts• Processing is done by web server BEFORE page is sent to
browser• Useful to customize pages based on data stored on the server
(databases, images, etc.)• And for centralized processing (at the server)• Web sever must support the scripting language
• For example: – Microsoft’s Active Server Pages (ASP)– Which is a web scripting environment– It runs on Microsoft IIS (Internet Info Server) Web Servers– Supports VB Script or JScript (MS version of JavaScript)
• Other scripting languages– PHP: Like ASP, Open Source for Apache servers– Perl: used with CGI scripts (Unix servers)
31
• Embedded scripts in HTML page
HTML code (i.e., tags and text)<% ‘ Everything after <% is an ASP script
‘ Note: use quote for comments ASP script code (using VB Script as default or other as declared) ………..………. ………………....
%> ‘ ASP script ends with %>More HTML code
<% more ASP %>Etc.
Server-Side Scripting with Microsoft’s ASP
32
How ASP Works:
1. Web file needs to be named .asp (instead of .html) User clicks on URL with .asp file Browser sends request for .asp file to server
2. Web server notices file extension .asp and Loads a program (DLL) called ASP.DLL Which processes this and other .asp files Server generates a “new” web file Contains all original HTML stuff Plus processing results from ASP scripts These are dynamically formatted w/HTML tags
3. Server sends the “new” web file to the browser Not the “original” ASP file!!
33
How ASP Works
Microsoft’s Web Server(ASP + MS Access or SQL Server)
Client BrowserInternet Explorer
Netscape Navigator
Requestfile.html
Requestfile.asp
Query Results
(recordset)
Databases
SQL Query (if any)
Process Scripts
HTMLfile.aspasp.dll
HTML docFetched
(+ Client-Side scripts, if any)
Response HTML doc Generated On-the-Fly
file.asp file.asp
=file.html file.html
34
Dynamic HTML with ASP
<H3>Welcome to my page</H3>
<H2>Here is my product list</H2>
<% ‘Start ASP script
Open a database connection
SQL queries to database
Copy results to a record set
Display records one at a time
Close database connection
%> ‘End ASP script
<P>Thank you very much for inquiring about our products
ASP file on web server (file.asp) HTML file sent to browser (file.asp)
<H3>Welcome to my page</H3><H2>Here is my product list</H2>
<P>Thank you very much for inquiring about our products
HTML
HTML
Dynamically generated HTML lines
by ASP
<P><B>Product Price</B><HR><P>Hammer ……... $8.50<P>Pliers ……….… $7.79<P>Screwdriver ..… $4.50<P>Power Drill ….. $49.99<P>Chainsaw …… $95.95<P>Wrench ……….. $6.50
35
Common Uses of ASP with Databases
• Register a client (add record in database)
• List products & services (query database)
• Place orders (add records in database)[Illustrations: Database Design Shopping Cart Order Entry]
• Track order status (query database)
• Tech support (query a knowledge database)
• Fill out a survey (add records in database)
38
Example: Query Results Sent to Browser(HTML dynamically generated by previous ASP script)
<IMG SRC="music22.gif"><B>Alberto's Music Instruments, Inc.<p>
<TABLE BORDER="0"><B>Customer List</B>
<TR><TH>ClientID</TH> <TH>Client Name</TH>
<TH>Shipping Address</TH> <TH>Telephone</TH> </TR>
<TR><TD>josee</TD>
<TD>Alberto Espinosa</TD>
<TD>Schenley Park, GSIA Building, #20</TD>
<TD>412-268-3681<BR></TD> </TR>
<TR><TD>sandy</TD>
<TD>Sandra Slaughter</TD>
<TD>5000 Forbes Avenue, Pittsburgh PA 15213</TD>
<TD>412-268-3681<BR></TD> </TR>
etc.
</TABLE></BODY></HTML>
40
Using Forms with ASP, HTML and Databases
• Capture data from user using HTML forms• Feed form data to an ASP script • Which is what the “Submit” button does• HTML forms contain data items with field names• Which are passed to ASP scripts for processing• Often used to embed an SQL command• To query a database (product list, etc.)• Or to insert records in a database (orders, etc.)
41
Example: HTML Form (Data Input)Doesn’t have to be ASP, can be plain HTML
<B>Customer Registration</B><P>
<FORM ACTION= “http://softrade-11.gsia.cmu.edu/data/customerSubmit.asp” METHOD=“POST”>
<TABLE><TR><TD>Please enter a customer ID (4 to 16 characters)</TD> <TD><INPUT TYPE=“text” SIZE=“35” NAME="CustomerID"> </TD></TR><TR><TD>Please enter your name</TD> <TD><INPUT TYPE=“text” SIZE=“35” NAME="CustName"> </TD></TR>etc.</TABLE>
<INPUT TYPE="submit" VALUE=“Submit”></TD></TR>
</TABLE></FORM>
On submit,Pass on to
Form Object
42
See: http://www.jibe4fun.com/scripts/orders/Customer_Input.htmlhttp://www.jibe4fun.com/scripts/orders/
43
Example: ASP Processing Data from Forms<!-- customerSubmit.asp -->
Request From Form
Object
Add record in database
44
ASP Resources• A periodic publication on ASP. It contains articles with ASP issues
as well as some tips and tricks: http://www.asptoday.com
• Nice introductory book with examples and a web site where you can download running code: Beginning Active Server Pages 3.0 http://www.wrox.com/books/0764543636.shtml
• A more advanced book. Probably one of the most useful reference books to have for people who are doing serious ASP coding: Professional ASP.NET 1.0. http://www.wrox.com/books/0764543962.shtml
• A useful book if you need more help with Visual Basic Script: VB Script Programmer's Reference. http://www.wrox.com/books/0764543679.shtml
• A good and concise reference of ASP objects for those who already know ASP fairly well: ASP in a Nutshell, Weissinger and Petrusha, O'Reilly series. http://www.amazon.com/exec/obidos/ASIN/1565924908/
45
Server-Side Processing:• JSP (Java Server Pages): Sun's version of ASP (*.jsp files)• ColdFusion (*.cfm files), Dreamweaver (Macromedia)
http://www.macromedia.com/
• (Like ASP but) Open Source – PHP (*.php files)• Lotus Notes & Domino
IBM, http://lotus.com/home.nsf/welcome/domino
Other Related Technologies
46
Extensible Markup Language (XML)
• Standard for inter and intra-organizational data exchange• Very important for B2B e-commerce applications• Like HTML, but used to fetch data, not documents• Each tag is defined data, not formats, ex.:
<LastName>Espinosa</LastName><FirstName>Alberto</FirstName><U>josee</U> (not underline, just a variable called U)
• Data defined in "Document Type Definition" files (DTD)• Data itself in XML file• Need and XML processor to process XML data
(a browser is an HTML processor)
Other Related Technologies (cont'd.)
47
<?xml version="1.0" encoding="UTF-8"?> <Recipe name="bread" prep_time="5 mins" cook_time="3 hours">
<title>Basic bread</title> <ingredient amount="3" unit="cups">Flour</ingredient> <ingredient amount="0.25" unit="ounce">Yeast</ingredient> <ingredient amount="1.5" unit="cups“ state="warm">Water</ingredient> <ingredient amount="1" unit="teaspoon">Salt</ingredient>
<Instructions> <step>Mix all ingredients together, and knead thoroughly.</step> <step>Cover with a cloth, and leave for one hour in warm room.</step> <step>Knead again, place in a tin, and then bake in the oven.</step>
</Instructions> </Recipe>
Example: baking bread with XML
48
<RECORD>1</RECORD><FIRSTNAME>Alberto</FIRSTNAME><LASTNAME>Espinosa</LASTNAME><EMAIL>[email protected]</EMAIL><PROFESSION>Professor</PROFESSION><SCHOOL>American University</SCHOOL><DEPARTMENT>Information Technology</DEPARTMENT><REMARKS>Looks tired, needs vacation</REMARKS>
<RECORD>2</RECORD><FIRSTNAME>Gwanhoo</FIRSTNAME><LASTNAME>Lee</LASTNAME><EMAIL>[email protected]</EMAIL><PROFESSION>Professor</PROFESSION><SCHOOL>American University</SCHOOL><DEPARTMENT>Information Technology</DEPARTMENT><REMARKS>He teaches the other 2 MIS sections</REMARKS>
<RECORD>3</RECORD><FIRSTNAME>Jill</FIRSTNAME><LASTNAME>Klein</LASTNAME><EMAIL>[email protected]</EMAIL><PROFESSION>Professor</PROFESSION><SCHOOL>American University</SCHOOL><DEPARTMENT>Information Technology</DEPARTMENT><REMARKS>She teaches the MBA MIS course</REMARKS>
AnotherXML
ExampleCan you draw a
table that contains the following data?
50
Business to Business E-Commerce Example
using XML
Internet
Supplier
Buyer DBMS(e.g., Oracle)SELECT
query
XML Processor
XML Document (e.g., Purchase
Order)
DBMS(e.g., MS
SQL Server)
INSERT query
XML Processor
XML Document (e.g., Purchase
Order)
Query results
52
Scale Issues
• Scale Issue #1: Large databases need to be managed Database Administration
• Scale Issue #2: Large database applications need to update multiple tables simultaneously Transactions
• Scale Issue #3: Multiple users using, updating and querying the database Concurrency
• Scale Issue #4: Large database application with wide geographic scope Distributed databases
• Scale Issue #5: Multiple data sources needed for decision making Data warehouses
54
Database Administration
“A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery.” -- Hoffer et. al.
55
Database Administration Functions
• Data policies (e.g. every user must have a password, authorizations, access to data), procedures (e.g., data must be backed up daily) and standards
• Develop the organization’s information architecture(i.e., understand the org’s information requirements)
• Data ownership conflict resolution• Manage data repositories (metadata, data dictionary)
56
Database Administration Functions (cont’d.)
• Hardware and software selection• Install and upgrade the DBMS• Tune database and query processing performance• Manage data security (threats to security, user views,
access/authorization rules for users and applications), privacy (encryption) and integrity
• Data backup and recovery (audit trails, transaction logs, change logs, transaction integrity, recovery management)
57
Metadata and Data Dictionaries• Metadata = data about the data• Data dictionary = a form of metadata• Relational data dictionary = tables with data about the
database Tables(TableName, TableDescription) Fields (TableName, FieldName, FieldType, Length) Forms (FormName, FormDescription) Etc.
• Data dictionaries can be passive or active, depending on the DBMS
• Passive data dictionary: only used to document the data• Active data dictionary: database design, access and update is
all done through the data dictionary
58
Data Dictionary Example
http://auapps.american.edu/~alberto/itec630/DBLab&DataDictionary.mdb [local copy]
60
What is a Transaction?• A transaction is a logical unit of work• No portion of a transaction stands by itself• It represents a real-world event• For example, a product sale has an effect on
accounting records, inventory records, customer transaction files, cash register balances, etc.
• A transaction must take a database from one consistent state (i.e., one in which all data integrity constraints are satisfied – e.g., entity integrity, referential integrity, etc.) to another consistent state
• So, all portions of the transaction must execute as a whole, or none at all
61
Preventing an inconsistent database state
• Acceptance of an incomplete transaction will yield an inconsistent database state
• To avoid such a state, the DBMS ensures that all of a transaction's database operations are completed before they are committed to the database
• A transaction begins with a database in consistent state A• All table updates in a transaction are then executed• When/if completed, the database ends in consistent state
B• When/if this happens, the DBMS COMMITS the transaction• When/if the transaction is interrupted before full completion,
the transaction is aborted and DBMS ROLLS BACK the database to its previous consistent state A
62
Transaction support• SQL in advanced DMBS’s provides transaction• This is supported with the COMMIT and ROLLBACK
statements
• COMMIT: Permanently saves changes to disk• ROLLBACK: Restores the database to its previous
consistent state before the transaction started
63
Example of a Transaction•For example, to process an order:
1.Add a record in the Orders table2.Add a record in the LineItems table for each product
ordered3.Update client history file
Program (a “stored procedure”):
BEGIN TRANSACTION ON TRANSACTION INCOMPLETE ROLLBACK (often implicit, thus omitted) INSERT (“19944”, “alberto”, 12/12/2003, “Top Priority”) INTO Orders INSERT (“19944”, 1, “comp”, 24) INTO LineItems INSERT (“19944”, 2, “keybd”, 14) INTO LineItems INSERT (“19944”, 3, “mouse”, 22) INTO LineItems UPDATE Clients SET ClientAmt = ClientAmt + $12,340 WHERE ClientID =
“alberto” END TRANSACTION COMMIT
64
The four transaction properties are:
• Atomicity requires that all parts of a transaction must be completed or the transaction is aborted. This property ensures that the database will remain in a consistent state.
• Durability indicates that the database will be in a permanent consistent state after the execution of a transaction. In other words, once a consistent state is reached, it cannot be lost.
• Serializability means that a series of concurrent transactions will yield the same result as if they were executed one after another.
• Isolation means that the data required by an executing transaction cannot be accessed by any other transaction until the first transaction finishes. This property ensures data consistency for concurrently executing transactions.
65
Transaction Log• Is a special DBMS table that contains a description of
all the database transactions executed by the DBMS
• It plays a key role in maintaining database concurrency control and integrity.
• The information stored in the transaction log is used by the DBMS to ROLLBACK the database after a transaction is aborted or after a system failure.
• The transaction log is often stored in a different hard disk or in a different media (tape) to prevent the failure caused by a media error.
67
What is concurrency?• A common problem in computer systems occurs when a
shared resource (e.g., screen display, hard disk, data file, database record) need to be used simultaneously by more than one device, application or user.
• In database, concurrency refers to the management of simultaneous access to a shared table, record or data element by more than one person, application or transaction in a multi-user environment (e.g., 100 data entry clerks entering data in the same table)
• This is managed by “locking” tables, records or individual data elements when necessary
68
What is a lock?• Mechanism used in concurrency control to guarantee the
exclusive use of a data element to the transaction that “owns the lock”.
• For example, if the data element X is locked by transaction T1, transaction T2 will not have access to the data element X until T2 releases the lock.
• Generally speaking, a data item can be in only two states: locked (being used by some transaction) or unlocked (not in use by any transaction).
• To access a data element X, a transaction T1 must request a lock to the DBMS. If the data element is not in use, the DBMS will lock X to be used by T1 exclusively. No other transaction will have access to X while T1 is executed
• Soft lock: locked element can be read (queried) but not modified
• Hard lock: locked element cannot be accessed at all
70
Definition
• Distributed Database: “a single logical database that is spread physically across computers in multiple locations that are connected by a data communications link.” -- Hoffer et al.
71
Why distributed databases?
• Distributed autonomous business units• Data sharing across business units• Database recovery and redundant systems• Risk: eliminate single points of failure• Efficiency: locate data where it is needed the most
72
Distributed Database Options
• Homogeneous:Same DBMS at each node
• Heterogeneous:Different DBMSs at different nodes.
73
Objectives of Distributed DBMS
• Location Transparency: – The user doesn’t need to know the location
of the data– The location of the data is stored in the data
dictionary so that the DBMS can find it
• Local Autonomy: – Local site can operate with its database
when other sites are down
74
Trade-offs in distributed databases
• Synchronous Distributed Database:– All copies of the database data are always identical –
i.e., synchronized
• Asynchronous Distributed Database:– Data may be temporarily unsynchronized– Data is replicated and synchronized with delay
75
Options forDistributing a Database
• Data replication: keeping separate copies of the same data, replicated/synchronized periodically
• Horizontal partitioning: store some table rows (i.e., records) in one location and some in another
• Vertical partitioning: store some table columns (i.e., fields) in one location and some in another
• Distributed tables: store some tables in one location and some in another location
• Combinations of the above• The (distributed) data dictionary contains information about the
physical location of all tables, columns and rows
77
Definition• Data Warehouse: “a subject-oriented,
integrated, time-variant, non-updatable, organized collection of data gathered from a variety of sources to support management decisions”
-- Hoffer et al.
78
Data Warehouse Architectures
• Two-level – Data is extracted from various internal and sources– Then transformed and integrated in data warehouse
• Independent Data Mart Warehousing Environment– Data is extracted from various internal and sources– Then transformed and exported to independent data marts– A data mart is a smaller data warehouse of limited scope– Customized for decision making of different groups
• Dependent Data Mart with EDW (three-level)– Combines the two methods above– Data is integrated into an Enterprise Data Warehouse (EDW)– Which are used to load the dependent data marts
79
Two-Level Architecture
DataSource
DataSource
DataSource
Transformand
Integrate
DataWarehouse
Decision SupportEnvironment
OperationalData
80
Independent Data Mart
DataSource
DataSource
DataSource
Transformand
Integrate
Decision SupportEnvironment
DataMart
DataMart
DataMart
OperationalData
81
DataSource
DataSource
DataSource
Transformand
Integrate
EnterpriseData
Warehouse
Decision SupportEnvironment
Dependent Data Mart & EDW
DataMart
DataMart
DataMart
OperationalData
82
Star Schema• Also called the dimensional model.• Fact and dimension tables.
– Fact table: consists of factual or quantitative data about the business
– Dimension table: hold descriptive data
• Grain of a fact table - time period for each record (e.g. Monthly, weekly, every transaction).
85
Size of the fact table• Total number of stores: 1,000
• Total number of products: 10,000
• Total number of periods: 24
• Total rows: 1000 * 10,000 * 24 = 240,000,000
• On average 50% items record sales,– no of rows = 120,000,000
86
Data Warehouse“A database that stores and consolidates current and historical data from various systems (internal and external) with tools for
management reporting and sophisticated analysis—i.e., Datamining”