Page 1: Data Management Task Force Final Report (166163745)

7/29/2019 Data Management Task Force Final Report (166163745)

http://slidepdf.com/reader/full/data-management-task-force-final-report-166163745 1/25

Data Management Task Force Final Report

CAUSE INFORMATION RESOURCES LIBRARY

The attached document is provided through the CAUSE Information Resources Library.

As part of the CAUSE Information Resources Program, the Library provides CAUSE members access to a collection of information related to the development, use, management, and evaluation of information resources--technology, services, and information--in higher education. Most of the documents have not been formally published and thus are not in general distribution.

Statements of fact or opinion in the attached document are made on the responsibility of the author(s) alone and do not imply an opinion on the part of the CAUSE Board of Directors, officers, staff, or membership.

This document was contributed by the named organization to the CAUSE Information Resources Library. It is the intellectual property of the author(s). Permission to copy or disseminate all or part of this material is granted provided that the copies are not made or distributed for commercial advantage, that the title and organization that submitted the document appear, and that notice is given that this document was obtained from the CAUSE Information Resources Library. To copy or disseminate otherwise, or to republish in any form, requires written permission from the contributing organization. For further information: CAUSE, 4840 Pearl East Circle, Suite 302E, Boulder, CO 80301; 303-449-4430; e-mail [email protected].

To order a hard copy of this document contact CAUSE or send e-mail to [email protected].

Administrative Information Services (AIS)
Academic Information Systems (AcIS)

 Data Management Task Force: Final Report

31 January 1992

SECTION 1: OVERVIEW AND BACKGROUND

OBJECTIVES

The task force on Data Management was commissioned by Michael Marinaccio, Deputy Vice President for Administrative Information Services, and by Vace Kundakci, Deputy Vice President for Academic Information Systems. The group was charged with the two-fold objective of recommending (1) a technical environment to support data demands, and (2) an organizational structure that can support data ownership, custodianship, and administration. The orientation of the task force was to be toward future projects and technology rather than the current reengineering of existing systems.


 The task force members included:

Alan Crosswell
Terry Davidson
Bob Juckiewicz (Chair)
Ken Lee
David Millman
Lou Proyect
Steve Rosenthal
Fred Trickey

 Interviews and presentations were held with:

David Bloom, AMS
Joe Judenberg, Computer Assistants
Mike Titmus, AMS

 CURRENT ENVIRONMENT IS FUNCTIONALLY ORIENTED

Our current portfolio of applications was designed over many years with differing technologies, making it difficult to meet new requirements or to answer managerial questions. Today we maintain redundant data in functionally oriented systems, making it difficult to link data between applications.

THE TIME IS RIGHT

AIS is presented with the task of reengineering its application portfolio within a relatively short period of five years. This provides us with an opportunity to put into place a strong organization and process so that data requirements become the basis of its architecture. As a beginning step, AIS is developing a high-level entity model of the University's administrative data needs. This model will be the starting point for integrating data across new application systems. As the University implements these systems, AIS will work with the vendor(s)--most likely AMS--to ensure that the new systems correlate to the model and that data redundancy is eliminated.

GUIDING PRINCIPLES 

The following major principles guided the task force as it explored the issues of data management:

- Data is a University resource; for decision-making, it will be made available to all those with a need to view the information.

- The architecture for administrative information systems will be driven by the data model. A data architecture is the foundation by which the University can quickly respond to changing regulatory and University requirements. New applications must follow the model, with every attempt to eliminate data redundancy. (The task force, however, realizes that there may be a need for "planned redundancy" of data, particularly in decision support systems.)

- There are three classes of data--University, departmental, and personal--that must be considered when defining new systems and access privileges.

- The data model and associated tools will be made available to all, with appropriate safeguards and procedures to protect the integrity of the model.

- Standard access methods will be employed for connectivity and interoperability.

SECTION 2: TECHNICAL ISSUES

RELATIONAL DATABASE MANAGEMENT SOFTWARE ON A HOST COMPUTER 

For the next couple of years AIS will be implementing new systems by acquiring packages. American Management Systems (AMS), the leading application software vendor, has designed computerized systems for large, complex universities, primarily on IBM mainframe platforms. With the recent installation of the ES/9000 computer, Columbia will, for the next several years, be an IBM mainframe-based shop. Higher education is coming under increasing financial constraints and additional regulatory requirements. This environment requires AIS to respond quickly to the demands for management information for decision making. We believe that a relational database management system will provide the underlying technology to permit quicker response to these changing needs. In our IBM environment, DB2 is the logical relational database of choice, and the task force worked under this assumption. DB2 has won wide acceptance in the marketplace, and many third-party vendors have developed tools to support the product.

A major attraction of a relational database is that users will find it easier to understand their data and reports. Users can easily grasp the idea of a table made up of rows and columns. For the technical staff we expect to see increased productivity, as the staff will be able to alter the system, to add fields, and to define new relations without affecting production programs.
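A small sketch can make the rows-and-columns idea concrete. The example below uses SQLite via Python purely as a stand-in for DB2; the table, columns, and data are invented for illustration. Note how a field can be added without disturbing an existing query:

```python
import sqlite3

# An in-memory relational database; all names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany("INSERT INTO student (id, name, dept) VALUES (?, ?, ?)",
                 [(1, "Ada", "CS"), (2, "Grace", "MATH")])

# A relation is simply a table of rows and columns, queried declaratively.
rows = conn.execute("SELECT name FROM student WHERE dept = 'CS'").fetchall()
print(rows)  # [('Ada',)]

# Adding a field does not affect programs that never reference it.
conn.execute("ALTER TABLE student ADD COLUMN advisor TEXT")
rows = conn.execute("SELECT name FROM student WHERE dept = 'CS'").fetchall()
print(rows)  # still [('Ada',)]
```

The same schema change against a non-relational file structure would typically force every program reading the record layout to be recompiled.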

 

A drawback to relational systems is that they require more CPU resources than VSAM file structures or an IMS database. Relational systems, because of their nature, will always require more resources than other file structures, and therefore DB2 may not be suitable for high-volume transaction systems. However, IBM has made important strides to address this issue, and each new release has seen an improvement in performance. In Columbia's environment, there is the luxury of not having a tremendous need for high-performance, high-volume transaction processing systems. An occasional high volume, for example during peak processing of registration, can be accommodated through prudent machine tuning and timing of other work.

 

The use of a relational database on a mainframe will not necessarily marry the University to this solution forever. Through AMS's layered approach to software development, it is possible to migrate from this environment to another. The current idea of distributing data across processors is an attractive one because it offers the prospect of exploiting cheap MIPS available on workstations and UNIX hardware. However, the task force's research has found that the distributed database software to do large, complex systems will probably not be available until 1993-94 at the earliest (see the section on distributed database management systems).


 OLTP vs. DSS: HOW MANY SUBSYSTEMS?

It is necessary to isolate operational DB2 for On-line Transaction Processing (OLTP) in its own subsystem. If a Decision Support System (DSS) or any other SQL application is to be developed, a separate DB2 development subsystem is required. Each subsystem requires at least 15M of real storage for the DB2 address spaces, and additional real storage for the "allied agent" address space for the host language program. An additional CICS region would require approximately 5M of real storage.

Two DB2 runtime environments should be established, controlled by separate catalogs. One can be used for development and test, and perhaps as a DSS region for a pilot application. The production environment would be tuned for OLTP exclusively. We will need procedures, and probably tools, to facilitate migration of application plan packages between the test and production DB2 subsystems. This raises the question of the relationship between database administration and tech support, which will have to be empowered (and trained) to perform this task.

It would seem that the concurrency effects described below might require yet another DB2 subsystem for a DSS to be usable by its customer base. The package plan feature of DB2 V2.3 makes the plan rebind (in 3, below) unnecessary, so that application developers hold exclusive locks on the catalog for shorter periods. Of course, this allows more development to take place. OLTP, DSS, and application development make different and contradictory demands of DB2 subsystems. We recommend that for DSS, AIS explore distributing data to a functional server. However, we do not know the capacity constraints of such a distributed system. A pilot project supporting actual DSS requirements should be started with the particular task of exploring the performance envelope of different approaches to DSS.

What about standard reports? Reporting can be handled as it is today: it is a matter of scheduling reports to run at a time when the concurrency effects noted below are not a problem. There is no DB2 batch mode as such, although DB2 can be configured for batch processing (it needs to be brought down to accomplish this, or else an expensive tool will change the ZPARMs on the fly). Reports that reference the same file may run at the same time: the buffer pool will then contain a large number of the file's pages, which are a shared resource in the context of batch reporting.

The following cases illustrate concurrency effects within a DB2 subsystem:

1) It would be unusual to allow DB2 data that is participating in OLTP to be referenced by queries from outside the transaction processing system. An OLTP system supports quick access to a large database by a large population of concurrent users. The transactions often include updates. A user process that wishes to update a database must first lock some portion of the database so that no one else may update, or even read, the data that is about to be changed. In DB2 the granule of locking is typically the 4K page containing the data, although locks can be obtained (and sometimes promoted dynamically) to the table or tablespace. If the column to be updated is an indexed field, then the index page(s) to be changed must be locked. In this way, updates are not lost, and the integrity of the data is ensured.

2) To satisfy concurrent requests for access to shared data resources, application programs must be coded to hold locks for the shortest time possible. While many readers share access to data, a reader will prevent a writer from acquiring the exclusive lock she needs to update a page. If the DB2 optimizer detects that a user process is reading sequentially through a table, as would often be the case for a batch report, or even that the predicate in a DSS query uses an index that is not selective enough, it (the optimizer) determines that it is wasteful to lock each page in turn, scan the page, and release the lock. Instead, an S (share) lock is obtained on the entire table or tablespace (depending on the storage characteristics of the tablespace). Now no further updates are allowed to any page of the table or tablespace until the sequential scan is complete.

3) An application developer binding a new version of an embedded SQL Data Base Request Module (DBRM) causes the plan catalog (SYSIBM.SYSPLAN) to be X (exclusive for update) locked, so that the new plan can be written to the catalog. Under the current version of DB2, V2.2, all of the SQL associated with the CICS transaction is automatically rebound. While this process is going on, the plan table is locked, so that queries cannot read their plans and cannot execute.

4) A DSS might support dynamic SQL, so that the query needs to be optimized in a "mini-bind" before it can execute. The optimization process reads certain catalog tables to obtain the statistics it needs to generate an application plan for the query to navigate through the database. In addition to the S locks on these tables, the optimizer obtains IX (intend to get exclusive) locks on the plan table (SYSIBM.SYSPLAN), which has the effect of making it impossible for OLTP processes to read their plans.

5) Finally, a static SQL application plan that does not lock the catalog and merely seeks to scan a table that is of no interest to anyone else will flood the buffer pool with its own data. DB2 assumes that data is required as a shared resource. Table or tablespace scans should not be allowed in OLTP environments.
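The S (share) and X (exclusive) compatibility rule that underlies cases 1 and 2 above can be sketched in a few lines. This is not DB2's actual lock manager, only the basic rule: any number of readers may share a page, but a single reader is enough to block a writer.

```python
# Minimal S/X page-lock sketch (illustrative; not DB2's lock manager).
class PageLock:
    def __init__(self):
        self.readers = 0      # current holders of S (share) locks
        self.writer = False   # holder of the X (exclusive) lock, if any

    def acquire_s(self):
        if self.writer:
            return False      # a writer excludes all readers
        self.readers += 1
        return True

    def acquire_x(self):
        if self.writer or self.readers:
            return False      # any reader blocks the exclusive lock
        self.writer = True
        return True

    def release_s(self):
        self.readers -= 1

page = PageLock()
assert page.acquire_s()        # reader 1 obtains an S lock
assert page.acquire_s()        # reader 2 shares the page
assert not page.acquire_x()    # the writer must wait for both readers
page.release_s()
page.release_s()
assert page.acquire_x()        # with no readers left, the X lock succeeds
```

Lock escalation (case 2) simply moves this same rule from the page to the whole table or tablespace, which is why a long sequential scan freezes out all updaters.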

 STRUCTURED QUERY LANGUAGE (SQL)

 STANDARD ACCESS THROUGH SQL

Use of a standard dialect of SQL will enable applications and databases to be transported from one system type to another, or to participate in distributed databases among heterogeneous database server and client platforms. In an ideal world, this would allow a database residing on DB2, for example, to be moved (or extracted) to a Unix or OS/2 SQL server with no changes required to the SQL statements embedded within applications.

 STATUS OF SQL STANDARDIZATION

 


IBM invented the SQL language for a research predecessor to SQL/DS and DB2. ANSI adopted an early version of SQL as an American National Standard, warts and all. The ANSI SQL standard has a number of failings, including not covering enough ground.

ANSI SQL also has some mistakes based on the IBM version which IBM has since fixed. For the past several years, ANSI has been working on a SQL2 standard. However, this is still in the draft stage. SQL2 addresses many of the shortcomings of the original standard.

Many database products do provide ANSI SQL compatibility, but always as a subset of the product's full capabilities. Many experts say that ANSI SQL is simply too limited a subset to be useful in real applications.

IBM has an SAA standard for SQL. None of IBM's SQL products currently implements it fully. Each version (DB2, SQL/DS, AS/400, OS/2 EE) has its own idiosyncrasies.

The SQL Access Group (SAG) is a consortium of about 30 major SQL vendors. SAG has been working in conjunction with X/Open, an open systems vendor consortium, to come up with its own standard for SQL, as well as a standard for distributed SQL based on the OSI Remote Data Access protocol (RDA). This standard is expected to appear in the next release (4) of the X/Open Portability Guide, due out in mid-January 1992.

SAG's distributed SQL standard is a competitor to IBM's Distributed Relational Database Access (DRDA). DRDA is an IBM-only product, designed for compatibility across IBM's several SQL platforms. One drawback of DRDA is that it requires the application to know which SQL server platform it is talking to so that it can take advantage of that particular server's flavor of SQL. The SAG-X/Open solution is to use a common SQL dialect but to allow explicit extensions to the dialect when needed.

RECOMMENDATIONS

While SQL standardization is still in a state of flux, database and SQL application designers should take into account the following:

- Plan for the possibility that the database and/or application may be re-homed to a different SQL platform in the future.

- Keep abreast of standardization efforts and vendor compliance. The SQL Access Group and X/Open, especially, should be tracked, as together they represent just about every SQL vendor other than IBM. The ANSI SQL2 draft should also be reviewed.

- When defining database schemas, portable types should be used, not vendor extension data types. If a vendor extension data type is used, it should be justified, and how that data will be transported to a different SQL platform should be documented.

 


- When writing SQL data manipulation statements, standard constructs should be adhered to. For example, while an outer-join operator is handy, there may be problems transporting code to a new platform, since many SQL servers don't implement it.

- SQL software vendors should be asked what their approach to SQL standards and portability is.
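To illustrate the outer-join caveat noted above: the same result can usually be obtained with standard constructs alone, at the cost of some verbosity. The sketch below uses SQLite via Python as a stand-in for an arbitrary SQL server; the tables and data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
    CREATE TABLE emp  (emp_name TEXT, dept_id INTEGER);
    INSERT INTO dept VALUES (1, 'Registrar'), (2, 'Bursar');
    INSERT INTO emp  VALUES ('Lee', 1);
""")

# Instead of a vendor-specific outer-join operator, combine a standard
# inner join with an anti-join; portable SQL servers all accept this.
portable_outer_join = """
    SELECT d.dept_name, e.emp_name
      FROM dept d JOIN emp e ON d.dept_id = e.dept_id
    UNION ALL
    SELECT d.dept_name, NULL
      FROM dept d
     WHERE NOT EXISTS (SELECT 1 FROM emp e WHERE e.dept_id = d.dept_id)
"""
rows = conn.execute(portable_outer_join).fetchall()
# rows contains ('Registrar', 'Lee') and ('Bursar', None)
```

The portable form is wordier than a native outer-join operator, which is exactly the trade-off the recommendation asks designers to weigh and document.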

DISTRIBUTED DATABASE MANAGEMENT SYSTEMS 

INTRODUCTION: WHY WE CARE ABOUT DISTRIBUTED DBMS 

Ideally a distributed DBMS is a set of independent database management systems, connected via a network, on different hardware and software platforms, perhaps also in physically disparate places, which appears to any client as a single DBMS.

 Benefits achieved from such a configuration include:

(1) Conformance to a data model which assumes that individual units of the enterprise have distinct data interests which they do not necessarily share with all other units.

 

(2) Savings in communications costs, especially for units which have a high transaction volume on specific data. In such cases network traffic remains localized within the unit physically possessing either the "primary" copy of the data (OLTP) or the copy representing the particular research interests of the unit (DSS, or Decision Support Systems).

(3) Savings in processing costs. The cost per unit of processing (MIPS, CPU cycles, etc.) is lower for decentralized computing. Incremental costs can be more accurately estimated. Particular departments may independently adjust their processing requirements.

 

(4) Increased reliability. Decentralization limits the effects of a single point of failure, either in the communications network or in a central point of processing.

(5) Encourages managerial research (DSS). By providing the mechanisms for separate summary information to be removed from the more performance-sensitive daily transactions, a distributed system suggests taking advantage of these mechanisms, which are often targeted to common desktop applications (e.g., Paradox, DBase-XBase) familiar to management and sufficient for their processing requirements.

 

(6) Conformance to the strategic directions of both IBM and everyone else who sells us things. The IBM DRDA specification, its SAA strategy, the ability of DB2 (release 2.2) to cooperate as a unified data base with other peers, the plans for DB2 (2.3) to cooperate in that way with SQL/DS, and the future plans to include OS/2 and AS/400 platforms in the same scheme all indicate that IBM has taken this architecture quite seriously. That their plans include only their own products is, as usual, simply a bargaining chip against the rest of the market, which will permit them a certain edge for a limited amount of time. Most other vendors have also undertaken a strategy towards interoperability between data base systems, but they have generally decided to cooperate with each other as well. IBM sends a double message: cooperation among disparate platforms--but just our disparate platforms. This kind of unfortunate hypocrisy is marketing alone; it will solve few of the real problems we face.

(7) Vendor independence. Despite the IBM problem above, distributed data base systems are, by nature, designed to permit an individual department to select the most appropriate data base software for its particular needs, while also maintaining accurate copies of "other people's" data. Several quite active standards bodies, which include almost all vendors, are involved in ensuring this capacity (OSI, ANSI, X/Open, SQL Access Group).

 MARKET SURVEY: WHY WE CAN'T HAVE DISTRIBUTED DBMS NOW

At the risk of this entire endeavor, it remains very important to note that there are currently no commercial products capable of satisfying the desires above. Surprisingly enough, we find ourselves at the very edge of technology. While, as mentioned, virtually all data base vendors are actively attempting to provide the above benefits, there have been, as yet, no real products. A "real" product is lately measured by its ability to meet the now-famous standards set out by Chris Date in "Twelve Rules for a Distributed Data Base" (Computerworld, June 8, 1987, p. 75). These standards mostly specify the independence and the mutual transparency of the various data base software participants over the network.

A few reasons why there are no commercial products available right now:

(1) Protocol Standardization. In an arena where high-level cooperation will be the key to a distributed DBMS, standardization has not been finalized. Many vendors appear to be poised for the outcome of that process, and it appears worthwhile to permit them the time to make their best efforts to work with each other. While an IBM-only alternative is perhaps more readily promised, it requires substantial reinvestment in hardware, which is perhaps inappropriate and is certainly premature at this time.

(2) Two-Phase Commit. The "two-phase commit" is a network protocol enabling cooperating data base management systems to perform distributed "update" transactions. (An "update" is the ability to change information, rather than just look at it.) While this process is well understood and perfectly practical within the data base research community, it does not adequately address the performance of a production network, nor does it comprehensively suggest behavior on the part of the cooperating systems (especially if some are down).
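The shape of the protocol itself is simple; the difficulty lies entirely in failures and performance. The sketch below shows the two phases in a few lines of Python (the coordinator, site names, and "healthy" flag are hypothetical, and timeouts, logging, and crash recovery are omitted):

```python
# Bare-bones two-phase commit: prepare (vote), then commit or roll back.
class Site:
    """A hypothetical participant database; 'healthy' stands in for its
    ability to prepare (write its log records and vote yes)."""
    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.state = name, healthy, "pending"
    def prepare(self):
        return self.healthy
    def commit(self):
        self.state = "committed"
    def rollback(self):
        self.state = "rolled back"

def two_phase_commit(participants):
    # Phase 1 (prepare): every site votes on whether it can commit.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (commit): a unanimous yes means everyone commits.
        for p in participants:
            p.commit()
        return "committed"
    # Any no-vote aborts the distributed transaction everywhere.
    for p in participants:
        p.rollback()
    return "aborted"

a, b = Site("payroll"), Site("registrar", healthy=False)
print(two_phase_commit([a, b]))  # aborted: one site could not prepare
```

The hard cases the text alludes to are exactly the ones this sketch ignores: a site that votes yes and then crashes before phase 2 must hold its locks until the coordinator's decision is known, which is where the production performance problems come from.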

(3) Network and Processor Reliability. While this is a special focus of the latter problem (in 2 above), it has the potential to solve it. Server redundancy and the guarantee of network performance could make the two-phase commit a realistic possibility, therefore making the distributed DBMS happen. Large technological strides have recently been made in these areas, which may well be viewed as the current bottom-line bottlenecks to the solution.

(4) Design. Many of our current development and design methods retain built-in assumptions of a centralized data base. If our modeling and "CASE" efforts can establish a broader constituency, for client enterprise units and for computer products respectively, then we will be in a better position to accommodate changes in either.

 CRITERIA SUMMARY: WE WOULD LIKE A SINGLE DBMS

We believe we want to realize the model, variously described above, of a single DBMS. It should be distributed across a network so that it is optimized for transaction performance in the units which perform the largest number of transactions, and for reporting performance in the units which do the most reporting--quite independently of each other. A case has perhaps been made that a distributed DBMS is the best possible solution. Without currently reasonable choices, we should remain "shoppers" in this market while making note of the following issues:

(1) Protocol standardization. Our vendors need to know that we are interested in this, and, further, that we will not be willing to work with vendors who aren't actively pursuing standardization. A couple of levels should be watched carefully: the high-level syntax, e.g., SQL; and the lower-level transport mechanisms, like IBM DRDA and OSI RDA. Conformance, commitments, and gateways between different protocols should be part of any vendor strategy.

(2) Query Optimization. A principal design issue in the distributed DBMS products now available is the method of query optimization. Vendors' strategies must account for network traffic incurred by their DBMS query optimization methods.

(3) Recovery Control. The two-phase commit, above, may not be the most effective way of ensuring data integrity. Vendors must demonstrate rollback procedures, including ones that cooperate across standard protocols.

(4) Security. How is authorization information passed across the network between applications? Thus far, this work has been quite primitive. We are analyzing this question, and it also needs to be answered by our vendors. Vendors should never require the "manual" maintenance of user authentication information; rather, they should be in a position to batch-load this data from whatever central security database is adopted here.

(5) Design Methodology. How does the distributed data base product work with CASE tools? Products should offer distributed data dictionaries to properly support distributed application development. And in what ways do our own DBMS modeling efforts include assumptions about data distribution? Do our "CASE" data management tools assume a single, centralized data base? Will they generate data base dictionaries which can be used across the potentially distributed environment described above?

 RECOMMENDATIONS: WHAT WE CAN DO IN THE MEANTIME

While waiting for a reliable set of products to implement the ideal distributed DBMS, we can certainly take some proactive steps now. We can immediately benefit from a distributed data methodology by:

(1) Ensuring that our data-modeling and data-management efforts account for potential distribution of the data. For example, our CASE tools should support a wide variety of DBMSs and offer distributed data dictionaries.

(2) Designing replicated-data applications. As an intermediate step, physical copies of data may be used to achieve most of the benefits of a truly distributed system. This experience is also necessary for us to explore the implications of the more widely distributed systems we will be required to support in the future, and it will enable us to evaluate network performance and to recommend network strategy.

(3) Continuing our pursuit of a network-based security system. Such a system must be in stable production before we can embark on any real distributed DBMS.
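The replicated-data step in (2) above can be sketched as a periodic snapshot refresh: extract from the operational store, then replace the reporting copy. The example uses SQLite via Python as a stand-in for the production and DSS databases; table and column names are invented for illustration.

```python
import sqlite3

# "Production" (OLTP) store and a separate "reporting" (DSS) store.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE enrollment (student_id INTEGER, course TEXT)")
prod.executemany("INSERT INTO enrollment VALUES (?, ?)",
                 [(1, "C1001"), (2, "W3101")])

dss = sqlite3.connect(":memory:")
dss.execute("CREATE TABLE enrollment (student_id INTEGER, course TEXT)")

def refresh_snapshot(source, target, table):
    """Replace the reporting copy with the current operational contents."""
    rows = source.execute(f"SELECT * FROM {table}").fetchall()
    target.execute(f"DELETE FROM {table}")
    target.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    target.commit()

refresh_snapshot(prod, dss, "enrollment")
# DSS queries now run against the copy, without touching production pages.
count = dss.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(count)  # 2
```

The copy is stale between refreshes, which is acceptable for decision support; the experience of scheduling and sizing such refreshes is precisely the network-performance evidence recommendation (2) calls for.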

 SECTION 3: ORGANIZATIONAL ISSUES 

A STRONG ORGANIZATION NEEDS TO BE IN PLACE 

In a previous recommendation, the task force stated that the architecture for administrative information systems will be driven by the data requirements of the University. To ensure that our data can answer the required managerial questions, a strong, diverse organization with formalized responsibilities needs to be in place.

The purpose of this section is to cover the fundamental issues, control objectives, and techniques related to the functions of data administration, database administration, data ownership, and data custodianship, which are vital to the implementation and maintenance of a database at Columbia University.

To adequately support a database environment at Columbia, an organizational infrastructure must be created where none currently exists.

 DATA MANAGEMENT RESPONSIBILITY HIERARCHY

The recommended organizational responsibilities include the following:

- Data Administration: Based in the Provost's Office, the data administrator defines data requirements and definitions at the highest level and maintains data integrity across organizational lines. Data administration's objectives are strategic and organization-wide.

- Database Administration: Based in AIS, the database administrator is responsible for the tactical implementation of the corporate database model developed within data administration. He or she is also responsible for the operational integrity of the physical database.

- Data Ownership: The owner of data is a senior manager within the University who is ultimately responsible for the data created and maintained within his or her department.

- Data Custodianship: The data custodian is a liaison between the data owner and the database administrator and is responsible for authorizing access to data.

ORGANIZATIONAL RESPONSIBILITY

Level         Department                                  Responsibility

STRATEGIC     Provost                                     Data Administrator

TACTICAL      AIS                                         Database Administrator
              Personnel Management & Human Resources      Data Ownership
              Student Financial & Information Services
              University Development & Alumni Relations
              Treasurer & Controller

OPERATIONAL   AIS                                         Database Administrator
              Personnel Management & Human Resources      Data Custodian
              Student Financial & Information Services
              University Development & Alumni Relations
              Treasurer & Controller

These responsibilities only describe general functional requirements. We do not indicate whether each responsibility is to be carried out on a part-time basis by one individual or by a staff of people on a full-time basis.

 DATA MANAGEMENT RESPONSIBILITIES AND FUNCTIONS

 DATA ADMINISTRATION

In organizations that have adopted a corporate database model--as Columbia has done explicitly through the high-level information architecture project carried out in conjunction with AMS--the responsibility of data administration becomes critical. If the model is to be successful in helping to implement mission-critical systems, the data administrator must be the ultimate guarantor of the integrity and reliability of the database model. The model is a definition of data; it is metadata.

Not only must the data administrator ensure the accuracy of the data definition, he or she must understand how to make the definition of the data available to the broad user and developer community. The data definition is like a road-map. Without such a map, it becomes difficult to exploit the data contained in DASD or other media, or to get from one point to another. The data administrator is responsible for the accuracy of the map and its dissemination. Based on these considerations, we recommend that the responsibility of the data administrator be defined for Columbia University.

The data administrator will set policies and plans for the definition, organization, protection, and efficient utilization of University-wide data. The data administrator will function at a high level, since he or she must have a corporate-wide view of the data. Not only must this person be in a position to determine the logical view of the corporate data, the data administrator must also be in a position to arbitrate between different functional areas of the University whenever conflicts over ownership or interpretation arise. Consequently, we recommend that the responsibility for the data administrator be lodged in the Provost's office of the University.

 The tasks of the data administrator include the following:

- To determine the scope of data to be contained in the database (i.e., administrative, Health Sciences, academic, etc.). This task generally is associated with strategic planning and should be carried out prior to any full-scale systems implementation.

- To create an entity-relationship data model of the organization's data, ideally using an automated modelling tool such as Excelerator or Bachman DBA. The model must be verified against the users' view of the business and modified when the business changes. The E-R model supports creation of the logical schema within the DBMS.

- To define data elements and their synonyms, preferably using an automated data dictionary or repository (Excelerator, DB Excel, etc.). Data elements must be reviewed to ensure that no redundant definitions have been entered. Standards must be established to ensure that element names follow certain conventions (e.g., accounts must be prefixed with acct, dates with dte, etc.).

- To develop standard reports from the dictionary or repository showing relationships between various entities (programs, data elements, records, etc.). Reports should show cross-referencing between various entities in order to reveal the impact of proposed database modifications. Reports can be available on-line through QMF, SPUFI, or other facilities, and through batch with embedded SQL statements.

- To coordinate and plan for compatibility between DBMS's and existing data structures. Sets plans for conversion of existing data structures to the DBMS.
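The element-naming standards described above (accounts prefixed with acct, dates with dte) lend themselves to mechanical checking against the data dictionary. A minimal sketch in Python, assuming a hypothetical prefix table and hypothetical element names:

```python
# Sketch of a data-dictionary naming-convention check.
# The prefix table and element names below are hypothetical illustrations,
# not the University's actual standards.
PREFIXES = ("acct", "dte")

def violations(element_names, prefixes=PREFIXES):
    """Return the element names that carry none of the standard prefixes."""
    return [name for name in element_names
            if not any(name.startswith(p) for p in prefixes)]

print(violations(["acct_gl_number", "dte_enrolled", "student_name"]))
# → ['student_name']
```

A review pass of this sort could be run over the dictionary export before new elements are accepted, so that redundant or non-conforming names are caught early.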

 MAINTAINING THE DATA MODEL

The data model is an evolving one. As new initiatives are undertaken, AIS must examine new data requirements and make sure that there is congruency with the model. If there is not, decisions must be made to determine whether the data entities are required for University, departmental, or personal needs. If an entity is required for University needs, then the model must reflect this and appropriate changes must be made to the systems.

During the requirements phase of implementation, departmental and personal data needs will be examined. During this phase of the project, the DA will be responsible for deciding whether or not to include these specific data needs in the model.

 DATABASE ADMINISTRATION

In contrast to the data administrator, who is concerned with the strategic direction of how data is defined and utilized, the database administrator is more concerned with the day-to-day tactical implementation and maintenance of physical databases. The database administrator must work closely with the data administrator to ensure that the high-level logical model of data is accurately reflected in the schemas created by the specific database management system. By implication, the database administrator must have technical mastery of the DBMS being used, whether it is DB2, Ingres, or some other package.

The DBA is also concerned primarily with the operational aspects of the database. Both the users and the application developers must have access to databases on a continuous basis if mission-critical activities are to be successful. When an organization relies on project-oriented flat-file systems, it may be possible to allow a certain amount of latitude. In a full-scale database environment such as the one we are projecting, there is little room for latitude. Correct tuning parameters for memory or DASD allocations can make the difference between 3-second and 30-second response times.

 The tasks of the database administrator include the following:

- To be responsible for database design. The database administrator analyzes the entity-relationship diagram and constructs a schema for the DBMS. The schema takes one-to-many and many-to-many relationships into account; it defines primary, secondary, and foreign keys. It defines record/record-element relationships in detail. The resulting logical structure should be reviewed with the application team.

- To be responsible for physical design and access methods (DB2, VSAM, etc.). This includes specification of database size (number of tracks, blocks, etc.), amount of free space, indexes, data clustering, data compression, controlled redundancy, distribution of files across volumes, etc.

 


- To assist application staff in using the database (SQL syntax, etc.). Ensures that they make the most efficient use of database resources (record locking vs. table locking, etc.). Controls access to the database to prevent excessively time-consuming search operations through QMF, SPUFI, or other uncontrolled on-line operations.

- To establish restart and recovery procedures. This includes timing and scope of periodic backups, methods of partial recovery, cold-start vs. warm-start procedures, etc. The task must include a plan for recovery of data in the event of catastrophic destruction of major portions of the database and backup media.

- To monitor database performance through on-line and batch tools. Recommends database or application modifications to enhance performance. The database administrator should primarily be familiar with tools that concentrate on DBMS statistics (e.g., average I/O per execution of a transaction), but should have some knowledge of MVS and CICS performance monitors. This facilitates a holistic view of system performance and makes performance tuning more effective.

- To monitor space utilization. Ensures that the database is expanded in a timely fashion so that applications can operate without interruption. Must have expertise with various methods of expanding available space (reorganizing the database, increasing the size of blocks, etc.).

- To determine how the database is to be distributed, if need be. Determines how deadlocking and integrity problems can be avoided within the distribution architecture. This responsibility would entail some expertise with non-mainframe products and technologies such as LAN's, client-server DBMS's, micro DBMS's, etc.
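As an illustration of the schema-design task listed first above, the sketch below shows the usual resolution of a many-to-many E-R relationship into a junction table with a composite primary key and two foreign keys. Python is used for illustration, and the entity names are hypothetical:

```python
def junction_table_ddl(left, right):
    """Resolve a many-to-many relationship between two entities into DDL
    for a junction table: composite primary key plus two foreign keys."""
    return (
        f"CREATE TABLE {left}_{right} (\n"
        f"    {left}_id  INTEGER NOT NULL,\n"
        f"    {right}_id INTEGER NOT NULL,\n"
        f"    PRIMARY KEY ({left}_id, {right}_id),\n"
        f"    FOREIGN KEY ({left}_id) REFERENCES {left},\n"
        f"    FOREIGN KEY ({right}_id) REFERENCES {right}\n"
        f")"
    )

# A student enrolls in many courses; a course enrolls many students:
print(junction_table_ddl("student", "course"))
```

The composite primary key enforces that each pairing appears once, and the two foreign keys preserve referential integrity back to the parent entities.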

 DATA OWNER AND DATA CUSTODIAN

In a distributed processing environment, data will be maintained on various platforms and in various technologies. Whether maintained in a mainframe DBMS or on a local platform, however, responsibility for the data must be clearly defined. Responsibility for the data in University administrative systems should reside with the appropriate administrative division of the University, not with AIS. We recommend that the Data Owner and Data Custodian functions described below be clearly defined and assigned within the client community.

Both of these functions, in fact, already exist and have been in practice within the University for some time, although not specifically defined or spelled out. AIS has always operated on the assumption that each functional user area owns the data for applications for that area. Individuals within client areas (e.g., Don Burd for Student Systems, Nick Goudoras for Financial Accounting Systems, etc.) have served as Data Custodians. We recommend that these responsibilities be formally assigned to each functional area.

 DATA OWNER FUNCTION

 


All administrative information is a University corporate resource and as such is owned by the University.

Data (the representation of that information) should be owned at the highest appropriate administrative level. Assuming that the proposed data model is implemented (dividing data into three categories: University-wide, departmental, and personal), ownership of the last two categories is self-defining. Personal data will be owned by the individual creating and maintaining such data. Departmental data will be owned by the director, department chairman, or other University officer responsible for the department under which the data is created and maintained. For University-wide data, ownership resides with the senior University officer responsible for the functional area which that data primarily serves.

 - Financial data is owned by the Controller of the University

- Alumni data is owned by the Vice President for University Development and Alumni Relations

- Student data is owned by the Director of Student Financial and Information Services

 

- Human Resources data is owned by the Vice President for Personnel Services

(Note: The University Data Administrator would be responsible for arbitrating any disagreements about who "owns" specific data or data elements.)

"Data" in this context includes both the data itself (e.g., name and address information for students) and application programs, data dictionaries, etc., which are created and used to maintain such data.

Operating systems software is owned by the Data Center (that is, by the DVP for Administrative Information Services), but application software is not.

Specific responsibilities of the Data Owner: 

- To approve data elements included in the application and their classification.

- To sign off on access and security policies for the data.

- To approve general-use policy for the data (including cross-application access).

- To be responsible for identification and enforcement of statutory and other external controls on use and maintenance of the data.

The Data Owner's responsibilities are at the level of approving and/or determining general policies. For example, the Controller of the University would not be expected to identify and define every data element to be included in financial systems, but would be responsible for reviewing any proposed scheme of data elements to ensure its completeness and for approving it.

DATA CUSTODIAN FUNCTION 

The Data Custodian is essential to the accurate and timely maintenance of University data. Responsibility for custodianship of specific data should be delegated by the owner of the data. The data owner is ultimately responsible for setting or approving policies for the definition and classification of data elements, for authorization for use of the data, and for authorization for access to the data. The data custodian is responsible for the implementation and administration of those policies to ensure validity, consistency, and accuracy of data.

 Specific responsibilities of the Data Custodian:

 - To assign data classification to specific data elements.

 - To authorize cross-application use of data.

- To participate in establishment of general policies on access to the data for creation/modification/retrieval.

- To participate in design of access security profiles.

- To approve specific access requests for individuals or functional groups of individuals.

- To determine policies for retention, deletion, and archiving of data.

The Data Custodian will be the primary liaison among the Data Owner, the Database Administrator, the Security Officer, and other Data Center staff supporting the application and tasked with the day-to-day maintenance and administration of the database. Many of the above policies, and the procedures to enforce them, will be developed in cooperation with the Database Administrator, the Information Security Officer, and other Data Center staff. The Data Custodian will provide client-based knowledge of business requirements for the data, and Data Center staff will provide technical knowledge of how best to meet those business requirements.

 TOOLS ARE AN IMPORTANT FACTOR FOR SUCCESS

As part of the original charge to the task force, data management tools were to be examined and recommended. The task force agrees that automated tools are an important ingredient in the success of data administration. Our data models, even at this early stage, are simply too complex to design and administer by hand.

Some members of the task force have spent considerable time examining tools. It became obvious, however, that it would be impossible to arrive at specific recommendations within the timeframe that the Task Force members had to complete their tasks. Therefore, the Task Force has two recommendations:

1) The data model currently being developed should be placed into Excelerator. AMS is currently using this product for their SIS and CUFS products.

2) One of the first tasks for the Data Administrator and the Database Administrator should be to examine their requirements carefully and to decide on the exact toolset that will be required to manage the University's data.

As a starting point for the second recommendation, the Task Force has developed guidelines and a matrix for selecting data management tools (see appendix).

 REQUIREMENTS TO INSTALL DB2 V2.2.

 For each DB2 subsystem, IBM recommends:

9M + (1.1M * tmax) of real memory, where tmax corresponds to the maximum allowable number of CICS transactions per second. Tmax should be construed as the anticipated number of transactions per second. This storage is for the DBMS nucleus and associated data structures.
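As a worked example of the formula above (the rate of 10 transactions per second is a hypothetical load, not an AIS figure):

```python
def db2_nucleus_storage_mb(tmax):
    """IBM's rule of thumb: 9M plus 1.1M per CICS transaction per second.
    Computed in tenths of a megabyte to keep the arithmetic exact."""
    return (90 + 11 * tmax) / 10

print(db2_nucleus_storage_mb(10))  # → 20.0 megabytes for the DBMS nucleus
```

Note that this covers only the nucleus and associated data structures; the buffer pools and EDM pool listed below are additional.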

4-6M of real storage for buffer pools (they recommend that BP0 be used exclusively).

Additional real storage for the EDM pool (where plans go), if possible.

The default amount of real storage is 15,575K for a test DB2 subsystem.

IBM estimates 43M of real storage for a production system. The estimates given above are for a "medium-sized" site. A "small" site has 100 application plans, 50 application databases, and 500 tables. Note that a CICS transaction is associated with exactly one (large) plan. A database, in the sense of DB2, is a collection of tables related in an application. In the near term, AIS would have perhaps 1 application database (for AMS CUFS), as many (large) plans as there are CICS transactions in CUFS, and perhaps 50 tables. So the estimate of required space is generous for the establishment of a test DB2 subsystem. However, it is clear that a production DB2 subsystem will require more real storage than is currently available in the ES/9120. Additional memory will require doubling the installed real storage from 256M to 512M.

700 cylinders of 3380 DASD for DB2 libraries, internal work files (the DSNDB07 database, where result tables are materialized), as well as 2 logs. The DASD should be distributed across 2 actuators for a test DB2 subsystem; 4-6 actuators for production. In addition, cache control for the existing control unit is considered a requirement. Application databases should reside on other DASD.

IBM recommends that libraries not be shared across subsystems; DB2 is considered a fairly volatile piece of software, and a test and production subsystem might be at different maintenance levels, or even different releases. Should there be multiple production subsystems (say, an operational OLTP system and a reporting system running at the same level of maintenance), DB2 could share the same libraries. We do not know at this time to what extent DB2 is itself reentrant, so that real storage could be shared across subsystems.

The requirement for dual logging in a DB2 development subsystem seems excessive. However, if there is to be only one DB2 subsystem, then dual logging and all recovery procedures should be fully implemented as part of the development of an operational system.

DB2 V2.2 (the current release) and V2.3 (which is scheduled to be fully supported on March 27, 1992) both require CICS 2.1.1 (or higher). They are incompatible with CICS 1.7, which we are currently running. It is not anticipated that existing CICS 1.7 production applications at AIS will be migrated to CICS 2.1.1 in the near term. We will be running multiple versions of CICS, which requires approximately 5M of additional real storage.

In order to access DB2, the TSO, batch, and CICS attachments must be installed. The batch interface can be invoked by submitting a PL/I program provided with the product, with dynamic SQL in stream. Should a more convenient batch interface be required, it would have to be constructed on the basis of this utility. The AIS PL/I compiler and macro preprocessor may require maintenance.

TSO access needs to be provided at least to database administrators, and possibly to developers. TSO security profiles need to be created (in TSS) to restrict staff involved with DB2 to just those TSO resources necessary to connect to DB2 (and any associated software tools). This is just the beginning of the security issue.

 SECURITY ISSUES

The TSS/DB2 interface is now in Beta test. It requires an upgrade to TSS/MVS 4.3, which is now in ESP. The latter is only compatible with version 2.2 of TSS/VM, which is in Beta. Assuming that AIS will not acquire these products until they become generally available, DB2 will have to be maintained by way of its own internal security (Grant/Revoke). This should be a function of central database administration, authorized by the central security office.

When the TSS/DB2 interface (and prerequisite products) are generally available, the DB2 user authorization exit for TSS/DB2 will need to be defined, and TSS tables populated from the DB2 catalog. The administration of DB2 security should still be carried out by central database administration: effective administration of external DB2 security requires some understanding and manipulation of the DB2 environment. Authority to create DB2 objects needs to be set up, and a local database administrator identified for each application.

 CUFS IMPLEMENTATION

Assuming the University purchases the AMS CUFS DB2 application package, AMS should be consulted as to the appropriate settings for DB2 install-time parameters (Z parms), as well as operational issues and scripts for the creation and recreation of CUFS DB2 objects. This needn't wait for DB2 to be installed. AIS may wish to change these settings in due course, but a good place to start making DB2 operational is to understand the AMS parameter settings and their rationale.

We are told by AMS that implementation of CUFS at Columbia will not require the development of additional SQL. All of the SQL that is required comes with the DB2 flavor of a Core Foundation Software application package, embedded in external COBOL subroutines. The embedding programs need to be preprocessed, and the resultant Data Base Request Module needs to be bound exactly once. There is no need for dynamic SQL access to DB2, and so the requirement for TSO connectivity is exactly as it is now, with the exception of TSO authorization for a database administrator local to the application system.

Thereafter, access to DB2 by CUFS developers is by way of plan execution in the context of CICS transaction processing. CUFS developers do not require dynamic SQL access to DB2.

There are at least two exceptions to this scenario. First, any bridgeback conversion system, in which changes to data in the new system must be applied back to the old, would require the use of SQL. In that case developers would require dynamic SQL and TSO connectivity. Second, a new, native SQL development project, such as a Decision Support System (DSS), would also require that the programmer workbench connect to the DB2 development environment.

 UPGRADING TO VERSION 2.3

Assuming that DB2 V2.2 is to be installed this Spring, AIS should set aside time during the Summer for the migration to DB2 V2.3. The new version is easier to administer, due to the package bind feature. It is also considerably more difficult to understand, thanks to optimizer enhancements. Without package binds, any change to the SQL of a CICS/DB2 application in development requires that all of the SQL associated with a CICS transaction be rebound. Any rational version control of CICS/DB2 development requires package binds. AIS should migrate to DB2 V2.3 and require that CUFS development at Columbia use package binds.

 INSTALLATION SCHEDULE

The amount of time required to install DB2, as estimated by IBM, is as follows:

2 days to plan installation parameters.
1 to 5 days to install via IPO tape (customized by IBM).

After the product is installed in the MVS/ESA test system (which at the time of this writing does not exist), the DB2 subsystem needs to be migrated to the live MVS/ESA system. This involves "flipping" the system residence packs, which takes 1 to 2 days, and the scheduling of systems time to verify. An IPL of the live MVS/ESA system is required. With the change control process, DB2 can be migrated from MVS/ESA test to the live system in about one week.

 


IBM estimates for setting up a DB2 environment include the following:

Establish Secure Environment                 3 days
Operations and Recovery:
  Startup/Shutdown procedures                2-3 days
  Backup/Recovery Strategy                   2-3 weeks
  Monitoring/Control procedures              1-2 weeks
  Recovery Administrative procedures         1 week
  System Backup/Recovery procedures          4 weeks
  Application Backup/Recovery procedures     2-3 weeks
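Summed, and assuming 5-day working weeks, the IBM estimates above come to roughly the following total effort (a back-of-the-envelope calculation, not an IBM figure):

```python
# (min, max) effort per task, converted to working days (1 week = 5 days).
tasks = {
    "Establish Secure Environment":           (3, 3),
    "Startup/Shutdown procedures":            (2, 3),
    "Backup/Recovery Strategy":               (10, 15),
    "Monitoring/Control procedures":          (5, 10),
    "Recovery Administrative procedures":     (5, 5),
    "System Backup/Recovery procedures":      (20, 20),
    "Application Backup/Recovery procedures": (10, 15),
}

low = sum(lo for lo, _ in tasks.values())
high = sum(hi for _, hi in tasks.values())
print(low, high)  # → 55 71 (working days, i.e. roughly 11 to 14 weeks)
```

Some of these tasks could proceed in parallel, so elapsed calendar time may be shorter than the summed effort.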

 Installation time training:

DB2 System Programming Workshop (D2SP)
Vendor: Amdahl
Class location: Columbia, MD
Time: 4 days; Feb 11, 1992; Mar 10, 1992
Tuition: $1,260
Audience: Systems personnel assigned to install the product; database administrative personnel assigned to assist in (know about) the initial configuration.
- installation and management of DB2 as an MVS subsystem.
- connecting to DB2 via TSO, batch.
- CICS Call Attach Facility.
- establish DBA access to DB2, presumably using TSO.
- establish procedures for operation and recovery.
- students perform an installation.

Issues to resolve: determination of the division of responsibility between systems and dba staff (e.g., who has the power to determine MVS console activity as related to the DB2 subsystem?).

Conformance of the AIS DB2 environment with requirements for CUFS; establishment of the number of DB2 subsystems, considering that real storage may be unavailable.

 DB2 AND ASSOCIATED RESOURCE COSTS

The cost for DB2 itself is as follows, including the 15% educational institution discount:

DB2 V2.2
  Monthly Licence, Group 40 processor        $3,506.00
DB2 Performance Monitor - Online Monitor
  Monthly Licence, Group 38 processor        $  368.05
DB2 Performance Batch Monitor
  Monthly Licence, Group 38 processor        $  952.00

THE COST OF ASSOCIATED HARDWARE: 

256M memory $3390.00 

DB2 DATABASE ADMINISTRATION TOOLS. 

There are a number of contending products to assist in DBA tasks. Some give improved performance over the utilities that come with DB2, and we should not acquire these products until we have evaluated the DB2 utilities in V2.3 and determined that we need the enhanced performance of the utilities in question.

 


DB2PM uses the same statistics that other monitors use; unlike other monitors, DB2PM can be leased. We can use DB2PM until we find out why we need something better.

Other DBA tools assist in crucial tasks for which there is no utility provided with DB2. The core example is dropping and recreating an object:

For example, say a table has a clustering index (so that the data in the table is in the same order as the index), and after a number of insertions the table goes out of cluster. Existing application plans will continue to use the index, although not as efficiently. New plans, including any subsequent rebinds, might not use the index at all. Now the table needs to be reorganized.

To reorganize the table, it must be unloaded, dropped, recreated, and reloaded. When it is dropped, all entries in SYSIBM.SYSCOLUMNS for the table's columns are dropped; all views based (at all) on the table are dropped, as are the corresponding columns; all views based on those views are dropped, etc. All catalog records of permissions granted with respect to any of these objects are dropped. Any plan that refers to any of these objects is invalidated (but not dropped). To recreate this state of affairs requires constructing the Data Definition Language necessary to recreate the objects, including the Data Manipulation Language clauses included in the view definitions, and the Data Control Language to restore the grants.

Similarly, if an authid is removed from the system, then any privileges granted by that authid (possibly with grant option) are removed, which may not be what is wanted.

It is not difficult to write a program to record these dependencies. Different products (like Platinum's RC/Update) will generate the necessary code. It is possible to do without a product, but not without some utility to perform this common task. Again, consultation with AMS as to their methodology for reorganization of DB2 tables should prove instructive.
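The dependency-recording program suggested above is essentially a transitive closure over the catalog's object-dependency graph. A minimal sketch in Python; the graph and object names here are hypothetical, and in practice the edges would be read from SYSIBM.SYSCOLUMNS and the related catalog tables:

```python
# Hypothetical dependency graph: each object maps to the objects defined
# directly on it (views on a table, views on views, etc.).
depends_on = {
    "TBL_ACCOUNT": ["V_ACCOUNT_SUMMARY"],
    "V_ACCOUNT_SUMMARY": ["V_DEPT_ROLLUP"],
    "V_DEPT_ROLLUP": [],
}

def drop_cascade(obj, graph):
    """Return every object implicitly dropped when `obj` is dropped."""
    dropped, stack = [], [obj]
    while stack:
        current = stack.pop()
        if current not in dropped:
            dropped.append(current)
            stack.extend(graph.get(current, []))
    return dropped

print(drop_cascade("TBL_ACCOUNT", depends_on))
# → ['TBL_ACCOUNT', 'V_ACCOUNT_SUMMARY', 'V_DEPT_ROLLUP']
```

Recording this list before the DROP is what makes it possible to regenerate the DDL, the view definitions, and the grants afterward.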

These products are bundles of functions, some of which we will not need. For example, generating DCL to recover from the cascading REVOKE just described is not a problem if we use external TSS/DB2 security. Yet we shall have to pay for this function if we purchase RC/Update or something like it. The other side of this story is that the products are not extensible to include user-written functions or other products from other vendors. RC/Update lists for $43,125.

 

One criterion for the acquisition of database administration tools is that a single functional area in AIS should not have too many different products. In this sense, the Candle OMEGAMON for DB2 monitor ($23,000) has the inside track in the systems area, and we could consider acquiring the suite of Candle products ($90,000, list).

Some tools can be used by systems programmers, central database administrators, local database administrators, and application developers. In that case, it is necessary that the utilities be securable so that users are restricted to the correct scope of application: a local database administrator should only have access to her own database. Notice that if we use external (TSS/DB2) security, the DB2 catalog may lack the information these tools need to distinguish among users.

In the area of application development, a debugging tool, something that can explain the DB2 EXPLAIN tables to the developer, may serve as an adequate mode of online access to dynamic SQL.

There is a remarkable difference in price among tools that advertise similar functionality. Before purchasing a product, we should show that we need the functionality provided. Besides, none of this stuff is magic and we could always write something.

 DB2 DATABASE ADMINISTRATION TRAINING.

DB2 Database Administration Workshop (U4066)
Vendor: IBM
Location: NYC
Time: 4 1/2 days; Mar 2, 1992; May 4, 1992
Tuition: $1,620
- Audience: all central database administrative staff.
- If there is a particular application in question, then the dba local to that application should also receive this training.
- Creation of DB2 objects: databases and other data objects, plan packages, authid's.
- Internal DB2 security.
- Issues to resolve: custodianship of data and plan packages; relationship of the dba to the Security Office.
- Determination of division of responsibility among central and local dba.
- Relation between the data administrative function (including data modeling) and database administration (implementation of a model in a specific software environment).
- Note that the March 2 class is in conflict with the CICS/DB2 course, below. All IBM DB2 classes have now been upgraded to include DB2 V2.3.

 TRAINING IN DB2 APPLICATION PROGRAMMING AND SUPPORT

 SQL Application Programming for DB2Vendor: PlatinimumLocation: Shearson Lehman /390 Greenwich/NYCTime: February 24-28, 1992 (5 days)Tuition: $1250; $950 if > 2 students.

Audience: DBA local to the application. For other SQL development, we may want to bring training in house.

CICS/DB2
Vendor: Platinum
Location: Shearson Lehman, 390 Greenwich, NYC
Time: March 2-6, 1992 (5 days)
Tuition: $1,250; $950 if more than 2 students
Audience: systems staff & DBAs

 


- CICS will be the preferred access method for all DB2-attached users.

- Database Administrative personnel must understand the proper design of CICS transactions for use with DB2. Remote access to DB2 may use CICS threads. Sybase accesses the DB2 "server" in this way.

- Note that Platinum will not upgrade their courses for V2.3 until June, at the earliest.

 COST TO INSTALL DB2 IN PRODUCTION ENVIRONMENT (ESTIMATES)

ES/9121 Enhancements

DB2-associated hardware costs:
256M memory for ES/9121                  $537,600
less 15% educational discount            $456,960
co-terminous lease: 42 months left on ES/9121
lease @ 10% interest:                    $11,969/month

3990-002 control unit w/o cache          $ 77,700
less 26% discount on state contract      $ 57,498

upgrade of 3990-002 to 3990-G03
with 32M cache                           $126,200
less 15% educational discount            $107,270

One of the following 3390 DASD units is needed:
3390 A18                                 $156,750
less 26% discount on state contract      $115,954*
  -or-
3390 A28                                 $154,660
  -or-
3390 A28                                 $185,999

 SOFTWARE

 

- DB2                                    $3,506 monthly
- Online monitor                         $  368 monthly
- Batch monitor                          $  952 monthly
                                         $4,826 monthly   $57,912 annual

TOOLS 

- OMEGAMON for DB2                       $23,000
  -or-
- Candle DB2 products (list price, one-time charge; $90,000 includes OMEGAMON)

 TRAINING

- Installation Training                  2 @ $1,260    $2,520
- DBA Training                           3 @ $1,620    $4,860
- SQL Application Programming for DB2    3 @ $  950    $2,850
- CICS/DB2                               3 @ $  950    $2,850
                                         Total        $13,080
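The subtotals above can be verified by simple arithmetic; note that the monthly software figure works out to $4,826, which is consistent with the $57,912 annual figure:

```python
# Check the cost subtotals quoted above, using the figures from the report.
software_monthly = 3506 + 368 + 952  # DB2 + online monitor + batch monitor
software_annual = software_monthly * 12

# training: installation, DBA, SQL application programming, CICS/DB2
training_total = 2 * 1260 + 3 * 1620 + 3 * 950 + 3 * 950

print(software_monthly, software_annual, training_total)
```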

GUIDELINES FOR SELECTION OF DATA/DATABASE ADMINISTRATION TOOLS: 

"Reverse engineering" capabilities: 

* Must provide a capability to reverse engineer existing DB2 and VSAM data structures into a conceptual model.

* Must be able to use a conceptual model in order to forward engineer into relational and physical DB2 designs.

* Does the product support development of a relational database model based on existing VSAM or non-relational technology (Model 204, FOCUS, etc.)? (While the underlying assumption is that databases will be designed from a "forward engineering" perspective--building a database based on newly defined requirements--there may be some use in having a reverse engineering capability in reserve.)

* Can the product scan COBOL copybooks or copylibs and generate DB2 table definitions?
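As a sketch of what such scanning involves, the fragment below maps a few simple PIC clauses to DB2 column types. The copybook fields are invented, and a real product would also handle OCCURS, REDEFINES, COMP-3, and so on:

```python
import re

# Hypothetical sketch: map simple COBOL PIC clauses from a copybook to DB2
# column types. The field names and the STUDENT table are illustrative.

def pic_to_db2(name, pic):
    col = name.replace("-", "_")            # COBOL hyphens -> SQL underscores
    m = re.fullmatch(r"X\((\d+)\)", pic)    # alphanumeric -> CHAR(n)
    if m:
        return f"{col} CHAR({m.group(1)})"
    m = re.fullmatch(r"S?9\((\d+)\)(?:V9\((\d+)\))?", pic)  # numeric -> DECIMAL
    if m:
        digits, scale = int(m.group(1)), int(m.group(2) or 0)
        return f"{col} DECIMAL({digits + scale},{scale})"
    raise ValueError(f"unsupported PIC clause: {pic}")

copybook = [("STUDENT-ID", "9(9)"), ("LAST-NAME", "X(30)"), ("GPA", "S9(1)V9(2)")]
columns = ",\n  ".join(pic_to_db2(n, p) for n, p in copybook)
print(f"CREATE TABLE STUDENT (\n  {columns}\n);")
```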

GRAPHICAL USER INTERFACE FEATURES: 

* Does the product support verification of the structure of diagrams?

* Will it support decomposition of a diagram into subordinate views?

* Does it include a palette of tools for drawing diagrams?

* Does the product include object-oriented pop-up menus that are related to the respective modelling objects in a diagram? For example, will double-clicking a relationship object pop up a menu that allows modification of mandatory/optional properties, etc.?

* Do object-oriented pop-up menus allow entry of a text description? This is critical for populating the repository with significant descriptions of database entities.

 

* Are multiple windows allowed? Can a user view different aspects of the model simultaneously?

RELATIONAL DESIGN TOOLS: 

* Can primary and foreign keys be identified separately? Potential primary key identification: check for a unique clustering index, check for a unique index that corresponds to a foreign key, and check for keys that subsume unique keys. Potential foreign key identification: identify candidates based on exact name matches to primary keys, use non-identical name-matching algorithms to find additional primary key match-ups, and use a number-of-distinct-values test to further qualify potential candidates.
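The foreign-key heuristics can be sketched roughly as follows. The catalog shapes and table names are invented, and the distinct-values test is rendered here as a simple containment check against the referenced key's values:

```python
# Hypothetical sketch of foreign-key candidate identification: match column
# names against known primary keys (with a crude non-identical name match),
# then qualify candidates by checking their values against the primary key.

def normalize(name):
    # crude non-identical name matching: strip separators, uppercase
    return name.replace("_", "").replace("-", "").upper()

def fk_candidates(tables, primary_keys):
    """tables: {table: {column: list of values}}
    primary_keys: {table: (pk_column, pk_values)}"""
    pk_index = {normalize(c): (t, c, set(v)) for t, (c, v) in primary_keys.items()}
    out = []
    for tbl, cols in tables.items():
        for col, values in cols.items():
            hit = pk_index.get(normalize(col))
            if hit and hit[0] != tbl:
                ref_tbl, ref_col, ref_vals = hit
                # qualify the candidate: every candidate FK value should
                # already appear among the referenced primary key's values
                if set(values) <= ref_vals:
                    out.append((tbl, col, ref_tbl, ref_col))
    return out

primary_keys = {"STUDENT": ("STUDENT_ID", [1, 2, 3])}
tables = {"ENROLLMENT": {"STUDENT_ID": [1, 1, 2], "GRADE": ["A", "B", "A"]}}
print(fk_candidates(tables, primary_keys))
```

A real tool would read index and column statistics from the DB2 catalog rather than raw value lists.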

* Does the product define referential integrity constraints? Does it identify:
  * Tables without primary keys
  * Tables that are delete-connected to themselves
  * Tables that are delete-connected through multiple paths
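These three checks can be sketched over a toy schema description. The schema shape is invented, and the delete-rule details that determine which paths actually count as delete-connected in DB2 are ignored for brevity:

```python
# Hypothetical sketch of the three referential-integrity checks above, over
# an invented schema shape: {table: {"pk": bool, "fks": [referenced tables]}}.

def tables_without_pk(schema):
    return [t for t, info in schema.items() if not info["pk"]]

def self_delete_connected(schema):
    # a table whose foreign key references the table itself
    return [t for t, info in schema.items() if t in info["fks"]]

def count_paths(schema, src, dst):
    # number of distinct FK paths from ancestor src down to dst;
    # more than one means dst is delete-connected through multiple paths
    children = {t: [] for t in schema}
    for t, info in schema.items():
        for parent in info["fks"]:
            if parent != t:                 # self-references handled above
                children[parent].append(t)
    def dfs(node):
        if node == dst:
            return 1
        return sum(dfs(child) for child in children[node])
    return dfs(src)

schema = {
    "DEPT":   {"pk": True,  "fks": []},
    "EMP":    {"pk": True,  "fks": ["DEPT"]},
    "PROJ":   {"pk": True,  "fks": ["DEPT"]},
    "ASSIGN": {"pk": False, "fks": ["EMP", "PROJ"]},
    "AUDIT":  {"pk": True,  "fks": ["AUDIT"]},
}
print(tables_without_pk(schema))              # -> ['ASSIGN']
print(self_delete_connected(schema))          # -> ['AUDIT']
print(count_paths(schema, "DEPT", "ASSIGN"))  # -> 2
```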

 REFERENCES


Arnold, Dean, et al. "SQL Access: An Implementation of the ISO Remote Database Access Standard." Computer: 74-78, December 1991.

Balboni, Jeff. "SQL Access: A Cure for the Nonstandard Standard." Data Communications 20(3): 89, March 1991.

Date, C.J. A Guide to the SQL Standard. Addison-Wesley, Reading, Mass., 1989.

Newman, Scott, and Jim Gray. "Which Way to Remote SQL?" Database Programming and Design: 46-54, December 1991.

Gartner Group, 1991 Symposium on Information Technology.

There are three: Data Manager, Relational Data System, and IMS Resource Lock Manager.

