Information Resources Management
April 24, 2001
Agenda
  Administrivia
  Object-Oriented & Databases
  Data Warehousing
  Data Mining
  SQL Extensions
  XML
Administrivia
  Homework #8
  Homework #9
  Current Scores
  Final Review Session?
OODBMS vs. ORDBMS
  OODBMS - Object-Oriented
  ORDBMS - Object-Relational
  Appendix A
OODBMS
  Persistent Objects
    By class
    By creation
    By marking
    By reference
  Storage/Retrieval Methods
OODBMS - Benefits
  Match
    Programming Methodology
    Data types & structures
  Ease of programming
  Inheritance
OODBMS - Challenges
  Standards
    ODMG - Object Database Management Group
  Performance
  Database vs. persistent language
    Loss of integrity, queries
  Storage Space
  Maturity
ORDBMS
  Extensions to relational model
    Complex data types
    Inheritance
    References
  Migration path
    Use existing applications and knowledge base
ORDBMS - Benefits
  SQL
  Existing Systems
  Vendors
ORDBMS - Challenges
  Standards
  “Fit” with the development language
  Programming Complexity
Using a relational database to store data from an object-oriented system has been likened to parking your car in your garage. With an OODBMS you park the car in the garage. If an (O)RDBMS is used, to park your car in the garage, you must first completely disassemble it and put each part in its specific location on a shelf. This process must then be reversed the next time you want to go for a drive.
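The garage analogy can be made concrete with a small sketch. Everything here is illustrative: the `Car` and `Wheel` classes, the table layout, and the `garage` dict (a stand-in for an object database, not a real OODBMS API) are assumptions, not part of the lecture.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Wheel:
    position: str


@dataclass
class Car:
    vin: str
    wheels: list = field(default_factory=list)


car = Car("VIN123", [Wheel("front-left"), Wheel("front-right")])

# Object-style storage: the car goes in and comes out in one piece.
# (A plain dict plays the role of the object database here.)
garage = {}
garage[car.vin] = car
parked = garage["VIN123"]

# Relational-style storage: take the car apart, one table per kind of part.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE car (vin TEXT PRIMARY KEY)")
db.execute("CREATE TABLE wheel (vin TEXT REFERENCES car(vin), position TEXT)")
db.execute("INSERT INTO car VALUES (?)", (car.vin,))
db.executemany(
    "INSERT INTO wheel VALUES (?, ?)",
    [(car.vin, w.position) for w in car.wheels],
)

# Reassemble the car from its rows before the next drive.
rows = db.execute(
    "SELECT position FROM wheel WHERE vin = ? ORDER BY rowid", (car.vin,)
).fetchall()
rebuilt = Car(car.vin, [Wheel(position) for (position,) in rows])
```

The relational half needs a mapping step in each direction; that mapping code is the "disassembly" the analogy refers to.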
OODBMS/ORDBMS Products
  Vendor               Product                        URL
  Computer Associates  Jasmine                        www.cai.com/products/jasmine
  Franz                AllegroStore                   www.franz.com
  Fujitsu Software     Jasmine                        www.fsc.fujitsu.com
  Gemstone Systems     GemStone/S                     www.gemstone.com
  Matisse Software     ADB                            www.matisse.com
  O2 Technology        O2                             www.o2tech.com
  Object Design        ObjectStore                    www.odi.com
OODBMS/ORDBMS Products
  Vendor               Product                        URL
  Objectivity          Objectivity/DB                 www.objectivity.com
  Object Systems       ITASCA                         www.iprolink.ch/ibex.com
  Ontos                Ontos Integrator               www.ontos.com
  Persistence          Persistence LiveObject Server  www.persistence.com
  Poet Software        Poet Object Server             www.poet.com
  Unisys               Osmos                          www.osmos.com
  Versant              Versant ODBMS                  www.versant.com
Other Links
  Object Database Management Group
    www.odmg.org
  Object Database Newsgroup
    comp.databases.object
Data Mining
  Corporations have colossal amounts of data
    Usually only used for very specific purposes (operations)
  Automated attempt to learn from the data
  Find statistical rules and patterns in the data
  Example: Giant Eagle Advantage Card
Goals of Data Mining
  Explanatory - Why?
  Confirmatory - Is it?
  Exploratory - ???
Approaches to Data Mining
  Classification
    identify rules that create groups
  Association
    find related conditions or events
  Correlation
    relationships between values
  User Guided
    hypothesis driven
  Automatic
    data driven - AI based
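As an illustration of the association approach, here is a minimal sketch that scores one candidate rule by support and confidence. The baskets are made-up data and the function names are hypothetical; real tools search many candidate rules automatically.

```python
from fractions import Fraction

# Made-up shopping baskets, one set of items per loyalty-card transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return Fraction(sum(items <= basket for basket in baskets), len(baskets))


def confidence(lhs, rhs):
    """Of the baskets containing `lhs`, the fraction that also contain `rhs`."""
    return support(set(lhs) | set(rhs)) / support(lhs)


rule_support = support({"diapers", "beer"})
rule_confidence = confidence({"diapers"}, {"beer"})
```

With this data, "diapers implies beer" holds in 3 of the 4 baskets that contain diapers. Whether such a rule is meaningful or spurious still needs a human judgment.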
Data Warehouse
  A subject-oriented, integrated, time-variant, nonvolatile collection of data
  Usually all data for a corporation
  Multidimensional database
Data Warehousing
  Single location
  Long-term storage
  Greater availability
  Separate “data” processing from day-to-day operations (performance)
  All data is historical
  Support data mining, et al.
Data Warehousing Questions
  What data needs to be kept?
  Where is it from?
  How good is it?
  How long should it be kept?
  Can it be summarized? When?
  Will it make sense? What is the schema?
  When is it updated?
Data Warehousing - Benefits
  Support for decision making tools
    DSS, EIS, Data Mining
  Separation of information and day-to-day processing
  Unification - Centralization
    Improved quality and consistency
Data Warehousing - Challenges
  Costs: Storage, Setup, Maintenance
  Historical data issues
  Defining the warehouse schema
  Doing the conversion
    Implementation & every time
  Keeping up with operational system changes
  Answering the questions
Multidimensional Databases
  Two views
    Multidimensional tables
    Star schema
  Multidimensional table
    each cell is attribute
    dimensions are “interesting” categories
Multidimensional Table
  Cell - sales
  Dimensions
    day
    person
    store
    item
Star Schema
  Multiple tables
  Central table - data item (cell)
  Surrounding tables - information about each category (dimensions)
Star Schema
  [Diagram: central Sales table surrounded by the Day, Person, Store, and Item tables]
Star Schema
  Sales (Day, Person, Store, Item, sales)
  Day (Day, day info)
  Person (Person, person info)
  Store (Store, store info)
  Item (Item, item info)
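The five relations above can be sketched as an in-memory SQLite database. The "info" columns (`weekday`, `role`, `city`, `category`) and all rows are invented for illustration; the slides only say each dimension table carries descriptive information.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE day    (day    TEXT PRIMARY KEY, weekday  TEXT);
CREATE TABLE person (person TEXT PRIMARY KEY, role     TEXT);
CREATE TABLE store  (store  TEXT PRIMARY KEY, city     TEXT);
CREATE TABLE item   (item   TEXT PRIMARY KEY, category TEXT);

-- Central (fact) table: one row per cell of the multidimensional table.
CREATE TABLE sales (
    day    TEXT REFERENCES day(day),
    person TEXT REFERENCES person(person),
    store  TEXT REFERENCES store(store),
    item   TEXT REFERENCES item(item),
    sales  REAL,
    PRIMARY KEY (day, person, store, item)
);

INSERT INTO day    VALUES ('2001-04-23', 'Mon'), ('2001-04-24', 'Tue');
INSERT INTO person VALUES ('P1', 'cashier'), ('P2', 'manager');
INSERT INTO store  VALUES ('S1', 'Pittsburgh'), ('S2', 'Erie');
INSERT INTO item   VALUES ('I1', 'dairy'), ('I2', 'bakery');
INSERT INTO sales  VALUES
    ('2001-04-23', 'P1', 'S1', 'I1', 10.0),
    ('2001-04-24', 'P1', 'S1', 'I2', 5.0),
    ('2001-04-24', 'P2', 'S2', 'I1', 7.5);
""")

# Summarize the fact table along one dimension by joining to its dimension table.
by_city = db.execute("""
    SELECT store.city, SUM(sales.sales)
    FROM sales JOIN store ON sales.store = store.store
    GROUP BY store.city
    ORDER BY store.city
""").fetchall()
```

Each query against the star joins the central table to only the dimensions it needs, which is what keeps the schema understandable as dimensions are added.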
Building/Maintaining a Data Warehouse
  1. Capture
  2. Scrub
  3. Transform
  4. Load and Index
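The four steps can be sketched as a tiny pipeline. The record layout, the cleaning rules, and the function names are assumptions chosen for illustration; a real warehouse load would read from the operational systems rather than a hard-coded list.

```python
import sqlite3

# Stand-in for rows pulled from an operational system (made-up data).
operational_rows = [
    {"store": " s1 ", "item": "I1", "amount": "10.00"},
    {"store": "S1", "item": "I2", "amount": "n/a"},
    {"store": "s2", "item": "I1", "amount": "7.50"},
]


def capture():
    """1. Capture: copy the data out of the operational system."""
    return list(operational_rows)


def scrub(rows):
    """2. Scrub: drop rows with unusable amounts, normalize store codes."""
    clean = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue
        clean.append({**row, "store": row["store"].strip().upper()})
    return clean


def transform(rows):
    """3. Transform: convert to the warehouse layout (amounts in cents)."""
    return [(r["store"], r["item"], round(float(r["amount"]) * 100)) for r in rows]


def load_and_index(rows):
    """4. Load and Index: insert the rows, then build an index for queries."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (store TEXT, item TEXT, cents INTEGER)")
    db.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    db.execute("CREATE INDEX sales_by_store ON sales (store)")
    return db


warehouse = load_and_index(transform(scrub(capture())))
loaded = warehouse.execute(
    "SELECT store, item, cents FROM sales ORDER BY store"
).fetchall()
```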
Data Marts
  Making specific data available
  Different ones for different needs
  [Diagram: Operational Systems, data warehouse (DW), data marts DM1 and DM2]
Data Mining - Benefits
  Use data
  Learn new things
  Improve decision making
Data Mining - Challenges
  Time (human and/or computer)
  Spurious results
    Separating the wheat from the chaff
  Availability of data
  Amount of data
  Changes in tools and technologies
  Validity over time
Enhanced Data Analysis
  Beyond SUM, COUNT, and AVG
  SQL extensions (suggested)
    GROUP BY … AS PERCENTILE
      Specific percentiles
    GROUP BY … WITH CUBE
      Cross-tabulations
  Statistical package interface
    SAS, S++, others
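To show what a CUBE-style grouping computes, here is a sketch in plain Python rather than SQL, since the exact SQL syntax was only a suggestion at the time. The rows are made up, and the `"ALL"` marker for a rolled-up dimension is an assumption of this sketch.

```python
from collections import defaultdict
from itertools import combinations

# Made-up fact rows: (store, item, sales).
rows = [
    ("S1", "I1", 10),
    ("S1", "I2", 5),
    ("S2", "I1", 7),
]


def cube(rows, n_dims):
    """Sum the measure for every combination of kept and rolled-up dimensions."""
    totals = defaultdict(int)
    for row in rows:
        keys, measure = row[:n_dims], row[n_dims]
        for k in range(n_dims + 1):
            for kept in combinations(range(n_dims), k):
                group = tuple(
                    keys[i] if i in kept else "ALL" for i in range(n_dims)
                )
                totals[group] += measure
    return dict(totals)


crosstab = cube(rows, n_dims=2)
```

The result holds the detail cells, the row and column totals, and the grand total of a cross-tabulation in one pass. With d dimensions each row contributes to 2^d groups, which is one source of the processing cost listed under the challenges.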
Enhanced Data Analysis - Benefits
  Greater functionality
  Improved decision making
Enhanced Data Analysis - Challenges
  Lack of standards
  Understandability
  Processing requirements
  Cost of poorly written queries
    “ad hoc” queries aren’t reviewed
Extending Relational DBs
  Spatial and Geographic Databases
  Multimedia Databases
  Changing the data stored while retaining the benefits of relational databases
Spatial & Geographic DBs
  Spatial - CAD
  Geographic - GIS
  Similar issue
    How to store and retrieve such data
Spatial Databases
  Geometric objects (2 or 3 dimensions)
  Locations
  Connections
  Nonspatial information about each object
  Substructures
  Spatial integrity constraints
    Two things can’t occupy the same space
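A spatial integrity constraint such as "two things can't occupy the same space" can be sketched for axis-aligned 2-D rectangles. The `Rect` and `Layout` classes are hypothetical; a spatial DBMS would enforce this inside the database and for more general shapes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    name: str
    x1: float
    y1: float
    x2: float
    y2: float


def overlaps(a, b):
    """True if the interiors of two axis-aligned rectangles intersect."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


class Layout:
    """A collection of objects that refuses any insert violating the constraint."""

    def __init__(self):
        self.objects = []

    def add(self, rect):
        for existing in self.objects:
            if overlaps(rect, existing):
                raise ValueError(f"{rect.name} overlaps {existing.name}")
        self.objects.append(rect)


layout = Layout()
layout.add(Rect("desk", 0, 0, 2, 1))
layout.add(Rect("chair", 2, 0, 3, 1))  # shares an edge with the desk: allowed

try:
    layout.add(Rect("cabinet", 1, 0.5, 4, 2))  # intrudes on the desk: rejected
    rejected = False
except ValueError:
    rejected = True
```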
GIS Databases
  Raster Data (fractal data)
    Pictures - possibly over time
    Maps
  Vector Data
    Locations
    Connections
  Nongeographic information
Spatial & Geographic DB - Benefits
  DBMS
  Specialized queries
    Spatial & Geographic Data
    “Standard” Data
    Mix of the two
  Integrity constraints
Spatial & Geographic DB - Challenges
  Space requirements
  Level of detail
  Understandability - Complexity
  Processing requirements
  Compatibility between systems
  Lack of standards
Multimedia Databases
  Images, Audio, Video
  Nonmultimedia data (text) about each
  Database Enhancements
    BLOBs (Binary Large Objects)
    Similarity-based queries
    Guaranteed steady rate
    Synchronization of audio and video
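A BLOB column stored next to ordinary text can be sketched with SQLite. The `media` table, its columns, and the byte content are invented for illustration; this shows only storage and retrieval, not similarity queries or steady-rate delivery.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, caption TEXT, content BLOB)"
)

# Stand-in for an audio or image file: 1024 arbitrary bytes.
clip = bytes(range(256)) * 4

# The text caption and the binary object live in the same row.
db.execute("INSERT INTO media (caption, content) VALUES (?, ?)", ("test tone", clip))

caption, stored = db.execute(
    "SELECT caption, content FROM media WHERE id = 1"
).fetchone()
```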
Multimedia Databases - Benefits
  DBMS
  Greater compression may be possible
  “Paperless” office - document imaging
  Workflow redesign - improvements
  Greater availability
Multimedia Databases - Challenges
  S T O R A G E
  Specialized DBMS
  Unity of database and network
    Usually requires ATM
  Specialized hardware
    “juke boxes”
    optical disks
XML
  What is it?
  What isn’t it?
  What are the goals?
  Who controls it?
  Who’s using it?
  Beyond XML
What is XML?
  eXtensible Markup Language
  Markup language for “structured information”
    “structured” - content & role of that content
    markup - identify structures
  “meta language for describing markup languages”
Huh?
  Storing structured data in a text file
    spreadsheet, address book, transactions (think EDI)
  Looks like HTML, <tags>, but isn’t
  Text is universal, but not efficient
    Does disk space matter?
    What about network capacity?
  XML is license-free & platform-independent
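An address book stored as XML text might look like the sketch below. The tag names and contacts are made up; the comparison with a comma-separated layout is only meant to illustrate "universal, but not efficient".

```python
import xml.etree.ElementTree as ET

# Structured data as plain text: tags name the role of each piece of content.
document = """<addressbook>
  <contact id="1">
    <name>Ada Example</name>
    <email>ada@example.org</email>
  </contact>
  <contact id="2">
    <name>Bob Example</name>
    <email>bob@example.org</email>
  </contact>
</addressbook>"""

root = ET.fromstring(document)
contacts = [
    (c.get("id"), c.findtext("name"), c.findtext("email"))
    for c in root.findall("contact")
]

# The same content as bare comma-separated text, for a size comparison.
as_csv = "\n".join(",".join(fields) for fields in contacts)
size_xml = len(document.encode("utf-8"))
size_csv = len(as_csv.encode("utf-8"))
```

The XML version is several times larger than the comma-separated one, but it describes itself: a program that has never seen this file can still tell a name from an email address.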
What XML isn’t
  HTML
  SGML - Standard Generalized Markup Language - printing
  Limited to current definitions (tags)
    XML is the way to add new definitions
  A relational database management system
  A database, or is it?
Goals of XML
  Easy to use over Internet
  Wide variety of applications
  Compatible with SGML (subset)
  Easy to write programs that use XML documents
  No (or few) optional features
  Human-legible if necessary
Goals of XML (2)
  Standards developed quickly
  Formal and concise
  Easy to create documents
  No need for “shortcuts”
Who Controls XML?
  W3 Consortium
  www.w3.org/XML
  XML 1.0 specification
Who’s Using XML?
  Financial Products Markup Language
    FpML
    FpML.org
    “A standard for financial derivatives business-to-business e-Commerce”
  Others?
Beyond XML
  XLink - hyperlinks in XML
  XPointer & XFragments - point to parts of an XML document
  CSS - style sheet language
    XML and HTML
  XSL - advanced language for style sheets
  XSLT - XSL transformation language
Beyond XML (2)
  DOM - standard function calls for manipulating XML (and HTML) from programs
  XML Namespaces - link a URL with every tag and attribute
  XML Schemas 1 & 2 - help in precisely developing own XML-based formats
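DOM calls and namespaces can be tried together with Python's standard `xml.dom.minidom`. The document, the `example.org` namespace URLs, and the values are invented for illustration and are not taken from FpML or any real format.

```python
from xml.dom import minidom

# Two namespaces: a default one for <order> and a prefixed one for <fin:amount>.
doc = minidom.parseString(
    '<order xmlns="http://example.org/orders" '
    'xmlns:fin="http://example.org/finance">'
    '<fin:amount currency="USD">100</fin:amount>'
    "</order>"
)

# Look the element up by namespace URL and local name, not by its prefix.
amount = doc.getElementsByTagNameNS("http://example.org/finance", "amount")[0]

# Standard DOM calls for changing an attribute and the element's text.
amount.setAttribute("currency", "EUR")
amount.firstChild.data = "92"

updated = doc.documentElement.toxml()
```

Because the lookup uses the namespace URL, the code keeps working if another document chooses a different prefix for the same vocabulary.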
Homework #10
  Last One! (No HW #11)
  Research and evaluate products
  100 points
Final
  Next Tuesday, 5/1
  Approximately 1/3 from 4/3 - 4/24
  Remainder - comprehensive
Thank You