
Monica OLAP

Date post: 24-Nov-2015
Upload: poojamittal26
Transcript

Slide 1

OLTP

Online Transaction Processing System (OLTP)

An online operational database system that performs online transaction and query processing is called an On-Line Transaction Processing (OLTP) system. Examples are the day-to-day operations of organizations, such as purchasing, inventory, manufacturing, banking, payroll, registration, and accounting.

An OLTP system deals with operational data, that is, the data involved in the operation of a particular system. Example: in a banking system, when you withdraw an amount from your account, the account number, withdrawal amount, available amount, balance amount, transaction number, and so on are operational data elements.

In an OLTP system, data are frequently updated and queried, so a quick response to each request is expected. Because OLTP systems involve a large number of update queries, the database tables are optimized for write operations. To prevent data redundancy and update anomalies, the tables are normalized. Normalization also makes write operations more efficient, because each fact is stored in only one place.

Operational data are usually of local relevance. The queries involved access individual tuples (individual records); queries of this type are termed point queries.

Examples for OLTP Queries:

What is the salary of Mr. John?
Withdraw money from a bank account (this performs an update operation when money is withdrawn from the account).
What are the address and email ID of the person who heads the maths department?
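The first two example queries can be sketched with Python's built-in sqlite3 module. The table names (employee, account), the column names, and the sample values are assumptions made for illustration; they do not come from the slides.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate OLTP-style access.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    CREATE TABLE account  (acc_no INTEGER PRIMARY KEY, balance REAL);
    INSERT INTO employee VALUES (1, 'John', 52000.0);
    INSERT INTO account  VALUES (1001, 800.0);
""")

def salary_of(name):
    """Point query: reads a single tuple."""
    row = conn.execute(
        "SELECT salary FROM employee WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

def withdraw(acc_no, amount):
    """Short update transaction: either fully applied or rolled back."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? "
            "WHERE acc_no = ? AND balance >= ?",
            (amount, acc_no, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("unknown account or insufficient funds")
    return conn.execute(
        "SELECT balance FROM account WHERE acc_no = ?", (acc_no,)
    ).fetchone()[0]
```

Both functions touch one row and return immediately, which is the access pattern the slides call a point query.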

What is OLAP

Basic idea: converting data into the information that decision makers need.
Concept: analyzing data by multiple dimensions in a structure called a data cube.

OLAP

OLAP designates a category of applications and technologies that allow the collection, storage, manipulation, and reproduction of multidimensional data, with the goal of analysis.

History

In 1993, E. F. Codd coined the term online analytical processing (OLAP) in his paper "Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate". The term describes databases designed to facilitate decision making (analysis) in an organization.

Purpose of OLAP

To derive summarized information from large-volume databases.
To generate automated reports for human viewing.

Examples for OLAP Queries

How is the profit changing over the years across different regions?
Is it financially viable to continue the production unit at location X?
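The first question is an aggregation over two dimensions (year and region). A minimal sketch with sqlite3 follows; the profit table, its columns, and the figures in it are invented for illustration.

```python
import sqlite3

# Hypothetical profit records: (year, region, amount).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profit (year INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO profit VALUES (?, ?, ?)",
    [
        (2013, "North", 120.0), (2013, "North", 80.0), (2013, "South", 90.0),
        (2014, "North", 150.0), (2014, "South", 70.0), (2014, "South", 40.0),
    ],
)

def profit_by_year_and_region():
    """OLAP-style query: scans many rows and returns a small summary."""
    return conn.execute(
        "SELECT year, region, SUM(amount) FROM profit "
        "GROUP BY year, region ORDER BY year, region"
    ).fetchall()
```

Unlike a point query, this reads every row of the table and summarizes it, which is why such queries are usually run against a warehouse rather than the operational database.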

OLAP, by Dr. Khalil

What and Why OLAP?

OLAP enables users to gain a deeper understanding and knowledge about various aspects of their corporate data through fast, consistent, interactive access to a variety of possible views of data.

While OLAP systems can easily answer "who?" and "what?" questions, it is their ability to answer "what if?" and "why?" questions that distinguishes them from general-purpose query tools.

The types of analysis available from OLAP range from basic navigation and browsing (referred to as slicing and dicing), to calculations, to more complex analyses such as time series and complex modeling.

OLAP Applications

Finance: Budgeting, activity-based costing, financial performance analysis, and financial modeling.

Sales: Sales analysis and sales forecasting.

Marketing: Market research analysis, sales forecasting, promotions analysis, customer analysis, and market/customer segmentation.

Manufacturing: Production planning and defect analysis.

OLAP Benefits

Increased productivity of business end-users, IT developers, and consequently the entire organization.
Reduced backlog of application development for IT staff, by making end-users self-sufficient enough to make their own schema changes and build their own models.
Retention of organizational control over the integrity of corporate data, as OLAP applications depend on data warehouses and OLTP systems to refresh their source-level data.
Improved potential revenue and profitability, by enabling the organization to respond more quickly to market demands.

OLTP System vs. OLAP System

OLTP System: Online Transaction Processing (Operational System)
OLAP System: Online Analytical Processing (Data Warehouse)

Source of data
  OLTP: Operational data; OLTPs are the original source of the data.
  OLAP: Consolidated data; OLAP data comes from the various OLTP databases.
Purpose of data
  OLTP: To control and run fundamental business tasks.
  OLAP: To help with planning, problem solving, and decision support.
What the data reveals
  OLTP: A snapshot of ongoing business processes.
  OLAP: Multi-dimensional views of various kinds of business activities.
Inserts and updates
  OLTP: Short and fast inserts and updates initiated by end users.
  OLAP: Periodic long-running batch jobs refresh the data.
Queries
  OLTP: Relatively standardized and simple queries returning relatively few records.
  OLAP: Often complex queries involving aggregations.
Processing speed
  OLTP: Typically very fast.
  OLAP: Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes.
Space requirements
  OLTP: Can be relatively small if historical data is archived.
  OLAP: Larger, due to the existence of aggregation structures and history data; requires more indexes than OLTP.
Database design
  OLTP: Highly normalized with many tables.
  OLAP: Typically de-normalized with fewer tables; use of star and/or snowflake schemas.
Backup and recovery
  OLTP: Back up religiously; operational data is critical to run the business, and data loss is likely to entail significant monetary loss and legal liability.
  OLAP: Instead of regular backups, some environments may consider simply reloading the OLTP data as a recovery method.

Schema

Pronounced "skee-ma": the structure of a database system, described in a formal language supported by the database management system (DBMS). In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables.

Schemas are generally stored in a data dictionary. Although a schema is defined in a textual database language, the term is often used to refer to a graphical depiction of the database structure.

Types of Schemas

In databases:
  Hierarchical model
  Network model
  Relational model (RDBMS)
In data warehouses:
  Star schema
  Snowflake schema

Star schema

The star schema architecture is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of the fact table, and the points of the star are the dimension tables. Usually the fact tables in a star schema are in third normal form (3NF), whereas the dimension tables are de-normalized. Although the star schema is the simplest architecture, it is the most commonly used nowadays and is recommended by Oracle.

In a relational database, denormalization is an approach to speeding up read performance (data retrieval) in which the administrator selectively adds back specific instances of redundant data after the data structure has been normalized. A denormalized database should not be confused with a database that has never been normalized.

Star Schema: Fact Tables

A fact table typically has two types of columns: foreign keys to dimension tables, and measures, which contain the raw numeric items that represent relevant business facts. A fact table can hold data at the detail level or at an aggregated level, so it tends to be very large.

Star Schema: Dimension Tables

A dimension table is a structure, usually composed of one or more hierarchies, that categorizes data. A dimension with no hierarchies or levels is called a flat dimension or list. Dimension tables are joined to the fact table using foreign key references, and they are generally much smaller than the fact table.
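A small star schema can be sketched with sqlite3 as follows. The table names (fact_sale, dim_product, dim_store), the brand and region values, and the rows are assumptions chosen for illustration, not a schema given in the slides.

```python
import sqlite3

# Hypothetical star schema: one fact table with foreign keys to two
# de-normalized dimension tables, plus two measures (qty, amt).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (prod_id TEXT PRIMARY KEY, name TEXT, brand TEXT);
    CREATE TABLE dim_store   (store_id TEXT PRIMARY KEY, city TEXT, region TEXT);
    CREATE TABLE fact_sale (
        prod_id  TEXT REFERENCES dim_product(prod_id),
        store_id TEXT REFERENCES dim_store(store_id),
        qty      INTEGER,
        amt      REAL
    );
    INSERT INTO dim_product VALUES ('p1', 'bolt', 'Acme'), ('p2', 'nut', 'Acme');
    INSERT INTO dim_store   VALUES ('s1', 'NY', 'East'), ('s2', 'LA', 'West');
    INSERT INTO fact_sale   VALUES ('p1', 's1', 1, 12.0),
                                   ('p2', 's1', 2, 11.0),
                                   ('p1', 's2', 5, 50.0);
""")

def sales_by_region():
    """One join from the fact table to a dimension, then aggregate."""
    return conn.execute(
        "SELECT s.region, SUM(f.amt) "
        "FROM fact_sale f JOIN dim_store s ON f.store_id = s.store_id "
        "GROUP BY s.region ORDER BY s.region"
    ).fetchall()
```

Because region is stored directly in dim_store, the query needs only a single join to reach it.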

Typical fact tables store data about sales, while dimension tables store data about geographic regions (markets, cities), customers, products, and time.

Characteristics of star schema:
Simple structure -> easy-to-understand schema
Great query effectiveness -> small number of tables to join
Relatively long time to load data into dimension tables -> de-normalized data
Most commonly used in data warehouse implementations -> widely supported by a large number of business intelligence tools

Snowflake schema

It is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake shape. The snowflake schema is represented by centralized fact tables that are connected to multiple dimensions. "Snowflaking" is a method of normalising the dimension tables in a star schema. When all the dimension tables are completely normalised, the resulting structure resembles a snowflake with the fact table in the middle.
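To illustrate snowflaking, the sketch below normalises a store dimension into separate store, city, and region tables. All table names and rows are hypothetical; the point is only that reaching the region now takes three joins instead of one.

```python
import sqlite3

# Hypothetical snowflaked store dimension: store -> city -> region.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE region (region_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city   (city_id INTEGER PRIMARY KEY, name TEXT,
                         region_id INTEGER REFERENCES region(region_id));
    CREATE TABLE store  (store_id TEXT PRIMARY KEY,
                         city_id INTEGER REFERENCES city(city_id));
    CREATE TABLE fact_sale (store_id TEXT REFERENCES store(store_id), amt REAL);
    INSERT INTO region VALUES (1, 'East'), (2, 'West');
    INSERT INTO city   VALUES (10, 'NY', 1), (20, 'LA', 2), (21, 'SF', 2);
    INSERT INTO store  VALUES ('s1', 10), ('s2', 20), ('s3', 21);
    INSERT INTO fact_sale VALUES ('s1', 12.0), ('s2', 50.0),
                                 ('s3', 8.0),  ('s1', 11.0);
""")

def sales_by_region():
    """Each normalised level of the dimension adds one more join."""
    return conn.execute(
        "SELECT r.name, SUM(f.amt) FROM fact_sale f "
        "JOIN store s  ON f.store_id = s.store_id "
        "JOIN city c   ON s.city_id = c.city_id "
        "JOIN region r ON c.region_id = r.region_id "
        "GROUP BY r.name ORDER BY r.name"
    ).fetchall()
```

Each region name is stored once, which removes redundancy, at the cost of the longer join chain.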

Snowflake Schema vs. Star Schema

Ease of maintenance / change
  Snowflake: No redundancy, hence easier to maintain and change.
  Star: Has redundant data, hence harder to maintain and change.
Ease of use
  Snowflake: More complex queries, hence harder to understand.
  Star: Less complex queries, easy to understand.
Query performance
  Snowflake: More foreign keys, hence longer query execution time.
  Star: Fewer foreign keys, hence shorter query execution time.
Type of data warehouse
  Snowflake: Good for the data warehouse core, to simplify complex relationships (many:many).
  Star: Good for data marts with simple relationships (1:1 or 1:many).
Joins
  Snowflake: Higher number of joins.
  Star: Fewer joins.
Dimension table
  Snowflake: May have more than one dimension table for each dimension.
  Star: Contains only a single dimension table for each dimension.
When to use
  Snowflake: When the dimension table is relatively big; snowflaking is better because it reduces space.
  Star: When the dimension table contains a small number of rows.
Normalization / de-normalization
  Snowflake: Dimension tables are in normalized form, but the fact table is still in de-normalized form.
  Star: Both dimension and fact tables are in de-normalized form.
Data model
  Snowflake: Bottom-up approach.
  Star: Top-down approach.

Cube

A cube is a multidimensional structure that contains information for analytical purposes; the main constituents of a cube are dimensions and measures. Dimensions define the structure of the cube that you use to slice and dice over, and measures provide aggregated numerical values of interest to the end user. As a logical structure, a cube allows a client application to retrieve values of measures as if they were contained in cells in the cube; cells are defined for every possible summarized value. A cell in the cube is defined by the intersection of dimension members and contains the aggregated values of the measures at that specific intersection.

Benefit of Using Cubes

A cube provides a single place where all related data for analysis is stored.
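The idea that a cell exists for every possible summarized value can be sketched in plain Python. The ALL marker, the fact rows, and the function name build_cube are assumptions for illustration; real OLAP servers store cubes far more compactly.

```python
from itertools import product as cartesian

ALL = "ALL"  # hypothetical marker meaning "summed over this dimension"

# Hypothetical facts: (product, store, day, units sold).
facts = [
    ("bread", "LA", "Mon", 56),
    ("bread", "NY", "Mon", 10),
    ("milk",  "LA", "Tue", 34),
]

def build_cube(rows):
    """Map every intersection of dimension members to an aggregated measure."""
    cube = {}
    for prod, store, day, units in rows:
        # A fact contributes to 2**3 cells: its own coordinates plus every
        # combination in which a member is replaced by ALL.
        for key in cartesian((prod, ALL), (store, ALL), (day, ALL)):
            cube[key] = cube.get(key, 0) + units
    return cube

cube = build_cube(facts)
```

A client can then read cube[("bread", "LA", "Mon")] for a detail cell or cube[("bread", ALL, ALL)] for a summary cell with the same lookup.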

[Figure: 3-D cube (dimensions = 3), shown as a multi-dimensional cube and as a fact table view. Axes: Store (NY, SF, LA), Product (Juice, Milk, Coke, Cream, Soap, Bread), Time (M T W Th F S S). Example cell: 56 units of bread sold in LA on M.]

Dimensions: Time, Product, Store
Attributes: Product (upc, price, ...), Store (...)
Hierarchies: Product -> Brand; Day -> Week -> Quarter; Store -> Region -> Country
Roll-ups shown: roll-up to week, roll-up to brand, roll-up to region

Representation of Multi-Dimensional Data

OLAP database servers use multi-dimensional structures to store data and relationships between data. Multi-dimensional structures are best visualized as cubes of data, and cubes within cubes of data. Each side of a cube is a dimension.

Multi-dimensional databases are a compact and easy-to-understand way of visualizing and manipulating data elements that have many inter-relationships. The cube can be expanded to include another dimension, for example, the number of sales staff in each city.

The response time of a multi-dimensional query depends on how many cells have to be added on the fly. As the number of dimensions increases, the number of cube cells increases exponentially.
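The growth in cell count can be made concrete: the number of detail cells is the product of the member counts of all dimensions. In the sketch below, the counts 6, 3, and 7 match the products, stores, and days listed for the example cube; the fourth dimension with 4 members is a made-up addition.

```python
from math import prod

def cell_count(cardinalities):
    """Number of detail cells in a cube with the given members per dimension."""
    return prod(cardinalities)

three_dims = cell_count([6, 3, 7])     # products x stores x days
four_dims = cell_count([6, 3, 7, 4])   # hypothetical extra dimension of 4 members
```

Every added dimension multiplies the cell count by its number of members, so a cube with d dimensions of n members each has n**d cells.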

Multi-dimensional OLAP supports common analytical operations, such as:

Consolidation: involves the aggregation of data, such as roll-ups or complex expressions involving interrelated data. For example, branch offices can be rolled up to cities and then rolled up to countries.
Drill-down: the reverse of consolidation; it involves displaying the detailed data that makes up the consolidated data.
Slicing and dicing: refers to the ability to look at the data from different viewpoints. Slicing and dicing is often performed along a time axis in order to analyze trends and find patterns.
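These operations can be sketched on a small in-memory cube. The cell values, the city-to-region mapping, and the function names are assumptions for illustration. Here "slice" fixes one dimension to a single member and "dice" selects a sub-cube, which is the common textbook reading of the terms.

```python
from collections import defaultdict

# Hypothetical cube cells: (product, city, day) -> units sold.
sales = {
    ("bread", "LA", "Mon"): 56,
    ("bread", "SF", "Mon"): 12,
    ("milk",  "LA", "Tue"): 34,
    ("milk",  "NY", "Mon"): 10,
}
REGION = {"LA": "West", "SF": "West", "NY": "East"}  # assumed hierarchy

def roll_up_city_to_region(cells):
    """Consolidation: aggregate the city level up to the region level."""
    out = defaultdict(int)
    for (prod, city, day), units in cells.items():
        out[(prod, REGION[city], day)] += units
    return dict(out)

def drill_down(cells, prod, region, day):
    """Reverse of roll-up: show the detail cells behind one regional total."""
    return {k: v for k, v in cells.items()
            if k[0] == prod and REGION[k[1]] == region and k[2] == day}

def slice_by_day(cells, day):
    """Slice: fix the time dimension to one member, leaving a 2-D view."""
    return {(p, c): u for (p, c, d), u in cells.items() if d == day}

def dice(cells, products, cities):
    """Dice: keep the sub-cube for the chosen products and cities."""
    return {k: v for k, v in cells.items()
            if k[0] in products and k[1] in cities}
```

For example, roll_up_city_to_region(sales) merges the LA and SF bread cells for Monday into a single West cell, and drill_down(sales, "bread", "West", "Mon") returns those two cells again.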

OLAP cube basics

Measures
Dimensions
Hierarchies
Levels

OLAP Implementation

Multidimensional OLAP (MOLAP)
Relational OLAP (ROLAP)
Hybrid OLAP (HOLAP)

Multi-dimensional OLAP (MOLAP)

MOLAP tools use specialized data structures and multi-dimensional database management systems (MDDBMS) to organize, navigate, and analyze data.
To enhance query performance, the data is typically aggregated and stored according to predicted usage.
MOLAP data structures use array technology and efficient storage techniques that minimize disk space requirements through sparse data management.

The development issues associated with MOLAP:
Only a limited amount of data can be efficiently stored and analyzed.
Navigation and analysis of data are limited because the data is designed according to previously determined requirements.
MOLAP products require a different set of skills and tools to build and maintain the database.
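One way to picture "array technology" combined with "sparse data management" is an array addressed by a linear offset in which only non-empty cells are stored. The class below is a simplified sketch under that assumption; the class and method names are invented, and real MOLAP engines use more elaborate compression schemes.

```python
class SparseCube:
    """Array-style cube that stores only the non-empty cells."""

    def __init__(self, dims):
        self.dims = [list(members) for members in dims]
        # Position of each member within its dimension.
        self.index = [{m: i for i, m in enumerate(d)} for d in self.dims]
        self.cells = {}  # linear offset -> value; empty cells take no space

    def _offset(self, coords):
        """Row-major offset, as in a dense multi-dimensional array."""
        off = 0
        for idx, size, member in zip(self.index, map(len, self.dims), coords):
            off = off * size + idx[member]
        return off

    def set(self, coords, value):
        self.cells[self._offset(coords)] = value

    def get(self, coords):
        return self.cells.get(self._offset(coords), 0)

    def density(self):
        """Fraction of the logical cells that actually hold data."""
        total = 1
        for d in self.dims:
            total *= len(d)
        return len(self.cells) / total
```

A cube over 2 products, 3 stores, and 2 days has 12 logical cells; storing a single sale uses one entry, so density() is 1/12.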

Relational OLAP (ROLAP)

ROLAP is the fastest-growing type of OLAP tool.
ROLAP supports RDBMS products through the use of a metadata layer, thus avoiding the requirement to create a static multi-dimensional data structure.
This facilitates the creation of multiple multi-dimensional views of the two-dimensional relation.
To improve performance, some ROLAP products have enhanced SQL engines to support the complexity of multi-dimensional analysis, while others recommend, or require, the use of highly denormalized database designs such as the star schema.

The development issues associated with ROLAP technology:
Performance problems associated with the processing of complex queries that require multiple passes through the relational data.
Development of middleware to facilitate the development of multi-dimensional applications.
Development of an option to create persistent multi-dimensional structures, together with facilities to assist in the administration of these structures.
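The role of a metadata layer can be sketched as a mapping from logical dimension and measure names to SQL fragments, from which a query is generated on demand. The dictionaries, the star-schema tables, and the function names below are all hypothetical; they do not reflect any particular ROLAP product.

```python
import sqlite3

# Hypothetical metadata layer: logical names -> SQL expressions.
DIMENSIONS = {"region": "s.region", "product": "p.name"}
MEASURES = {"amount": "SUM(f.amt)", "units": "SUM(f.qty)"}

def build_query(dims, measure):
    """Generate SQL for one multi-dimensional view of the relational data."""
    cols = [DIMENSIONS[d] for d in dims]  # only known names are accepted
    sql = ("SELECT " + ", ".join(cols + [MEASURES[measure]]) +
           " FROM fact_sale f"
           " JOIN dim_store s ON f.store_id = s.store_id"
           " JOIN dim_product p ON f.prod_id = p.prod_id")
    if cols:
        sql += " GROUP BY " + ", ".join(cols) + " ORDER BY " + ", ".join(cols)
    return sql

# Hypothetical star schema to run the generated queries against.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (prod_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE dim_store   (store_id TEXT PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sale   (prod_id TEXT, store_id TEXT, qty INTEGER, amt REAL);
    INSERT INTO dim_product VALUES ('p1', 'bolt'), ('p2', 'nut');
    INSERT INTO dim_store   VALUES ('s1', 'East'), ('s2', 'West');
    INSERT INTO fact_sale   VALUES ('p1', 's1', 1, 12.0),
                                   ('p2', 's1', 2, 11.0),
                                   ('p1', 's2', 5, 50.0);
""")

def view(dims, measure):
    return conn.execute(build_query(dims, measure)).fetchall()
```

Calling view(["region"], "amount") and view(["product", "region"], "units") yields two different multi-dimensional views of the same relational tables without building any static cube.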

HOLAP

A hybrid of ROLAP and MOLAP.
Can be thought of as a virtual database in which the higher levels of the database are implemented as MOLAP and the lower levels as ROLAP.

Hybrid OLAP (HOLAP)

HOLAP tools provide limited analysis capability, either directly against RDBMS products or by using an intermediate MOLAP server.
HOLAP tools deliver selected data directly from the DBMS or via a MOLAP server to the desktop (or local server) in the form of a data cube, where it is stored, analyzed, and maintained locally.

The issues associated with HOLAP tools:
The architecture results in significant data redundancy and may cause problems for networks that support many users.
The ability of each user to build a custom data cube may cause a lack of data consistency among users.
Only a limited amount of data can be efficiently maintained.

MOLAP, ROLAP, and HOLAP Storage Modes

MOLAP (Multidimensional Online Analytical Processing)
The MOLAP storage mode causes the aggregations of the partition and a copy of its source data to be stored in a multidimensional structure in Analysis Services when the partition is processed.
This MOLAP structure is highly optimized to maximize query performance. The storage location can be on the computer where the partition is defined or on another computer running Analysis Services. Because a copy of the source data resides in the multidimensional structure, queries can be resolved without accessing the partition's source data.
Query response times can be decreased substantially by using aggregations. The data in the partition's MOLAP structure is only as current as the most recent processing of the partition.

ROLAP (Relational Online Analytical Processing)
The ROLAP storage mode causes the aggregations of the partition to be stored in indexed views in the relational database that was specified in the partition's data source.
Unlike the MOLAP storage mode, ROLAP does not cause a copy of the source data to be stored in the Analysis Services data folders. Instead, when results cannot be derived from the query cache, the indexed views in the data source are accessed to answer queries.
Query response is generally slower with ROLAP storage than with the MOLAP or HOLAP storage modes. Processing time is also typically slower with ROLAP. However, ROLAP enables users to view data in real time and can save storage space when you are working with large datasets that are infrequently queried, such as purely historical data.

HOLAP (Hybrid Online Analytical Processing)
The HOLAP storage mode combines attributes of both MOLAP and ROLAP. Like MOLAP, HOLAP causes the aggregations of the partition to be stored in a multidimensional structure in an SQL Server Analysis Services instance.
HOLAP does not cause a copy of the source data to be stored. For queries that access only summary data in the aggregations of a partition, HOLAP is the equivalent of MOLAP.
Queries that access source data (for example, if you want to drill down to an atomic cube cell for which there is no aggregation data) must retrieve data from the relational database and will not be as fast as they would be if the source data were stored in the MOLAP structure. With HOLAP storage mode, users will typically experience substantial differences in query times depending upon whether the query can be resolved from cache or aggregations versus from the source data itself.

[Embedded spreadsheet: example tables customer (id, name, address, city), product (id, name, price), store (code, city), and sale (orderId, date, custId, prodId, storeId, qty, amt).]
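The HOLAP behaviour described above (summary queries answered from stored aggregations, detail queries sent to the source, and aggregations only as current as the last processing) can be mimicked with a toy class. This is a conceptual sketch only: the class, its methods, and the data are invented and bear no relation to the actual Analysis Services API.

```python
class HolapPartition:
    """Toy model: aggregations kept locally, detail data left in the source."""

    def __init__(self, source_rows):
        self.source = source_rows   # stands in for the relational database
        self.aggregations = {}      # stands in for the multidimensional structure
        self.source_reads = 0       # counts queries that had to hit the source

    def process(self):
        """Rebuild the region-level aggregations from the source data."""
        self.aggregations.clear()
        for region, city, amt in self.source:
            self.aggregations[region] = self.aggregations.get(region, 0) + amt

    def total_for_region(self, region):
        """Summary query: answered from the stored aggregations."""
        return self.aggregations.get(region, 0)

    def total_for_city(self, city):
        """Detail query: no aggregation at this level, so read the source."""
        self.source_reads += 1
        return sum(amt for _, c, amt in self.source if c == city)

# Hypothetical source rows: (region, city, amount).
partition = HolapPartition([("West", "LA", 50), ("West", "SF", 8), ("East", "NY", 23)])
partition.process()
```

If a new row is appended to partition.source, total_for_city sees it immediately, while total_for_region keeps returning the old figure until process() is run again.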
