
WIPRO – APPLYING THOUGHT



DATA WAREHOUSING AND BUSINESS INTELLIGENCE USING QLIKVIEW

A PROJECT TRAINING REPORT Submitted By

RAHUL DUBEY

In partial fulfilment for the award of the degree of Bachelor of Technology In COMPUTER SCIENCE AND ENGINEERING


ACKNOWLEDGEMENT

I, RAHUL DUBEY, am grateful to Wipro InfoTech for providing me the opportunity to work with them and complete my 6 weeks of project training as a part of my B-Tech (Comp. Sc.) curriculum.

Also, I would like to express my deep sense of gratitude to my project training guide, Mr. Nitish Vij, Solution Architect (DWH Practice), for his invaluable guidance and suggestions during my training tenure. His experience has been of immense help, as were his efforts in making us understand all the aspects of the project in a small frame of time and showing us the right way.

It has been a great learning experience for me, as I got a chance to apply my knowledge in a practical domain. This training and experience has not only enriched me with technical knowledge but has also instilled the maturity of thought and vision, the attributes required to be a successful software professional.

Last but not the least, I would like to sincerely thank Mr. Ivor Egbert, Sr. Executive (TED), for offering us this training, and also my teammates for their support and assistance throughout this period.


DECLARATION BY THE CANDIDATE

I hereby declare that the work which is being presented in the dissertation entitled “Data Warehouse and BI using QlikView”, in partial fulfilment of the requirement for the award of the degree of B-Tech in Computer Science and Engineering at Jaypee University of Engg. & Tech., is an authentic record of my work carried out during the period from 31st May 2010 till 9th July under the supervision of Mr. Nitish Vij, Wipro InfoTech.

Place: Wipro InfoTech, Gurgaon Signature of the Candidate

Date :


BONAFIDE CERTIFICATE

This is to certify that the above statements made by the candidate are true to the best of our knowledge and belief.

Place: Wipro InfoTech, Gurgaon Signature

Date:


ABOUT THE COMPANY

Wipro InfoTech is the leading strategic IT partner for companies across India, the Middle East and Asia-Pacific, offering integrated IT solutions. It plans, deploys, sustains and maintains the IT lifecycle through its total outsourcing, consulting services, business solutions and professional services. Wipro InfoTech helps you drive momentum in your organisation, no matter what domain you are in.

Backed by strong quality processes and rich experience managing global clients across various business verticals, it aligns IT strategies to your business goals. Along with its best-of-breed technology partners, Wipro InfoTech also helps you with hardware and IT infrastructure needs.

Wipro InfoTech is a part of the USD 5 billion Wipro Limited (NYSE: WIT), with a market capitalization of USD 24 billion. The various accreditations it has achieved for every service it offers reflect its commitment to quality assurance. Wipro InfoTech was the first global software company to achieve Level 5 SEI-CMM, the world's first IT company to achieve Six Sigma, as well as the world's first company to attain Level 5 PCMM. Currently, its presence extends to 9 regional offices in India, besides offices in the KSA, UAE, Taiwan, Malaysia, Singapore, Australia, and other regions in Asia-Pacific and the Middle East.


THE SERVICES OFFERED BY THE COMPANY

In today’s world, where IT infrastructure plays a key role in determining the success of a business organisation, Wipro InfoTech helps clients derive maximum value from their IT investments. They offer their clients the full array of IT lifecycle services. From technology optimisation to mitigating risks, there is a constant demand to evaluate, deploy and manage flexible, responsive and economical solutions. Outsourcing non-core operations can help to transform the business into a leaner and smarter organisation with greater adaptability to changing economic and business trends.

In a maturing outsourcing market, where both clients and vendors are becoming increasingly adept at understanding the fundamentals needed to develop a lasting relationship, Wipro InfoTech offers a partnership that goes beyond merely providing a solution. Spurred on by the goal of creating new business processes and innovative models to help customers gain new levels of efficiency, differentiation, and flexibility, Wipro InfoTech offers Total Outsourcing Services (TOS).

This powerful service offering ensures dynamic solutions that offer total process visibility resulting in pre-emptive solving of problems or issues even before they can manifest and affect the business performance.

Their solutions eschew the immature model of ad hoc offerings that dwell on pricing, labour arbitrage and granular-level contracts within tower-group solutions, and instead tend towards strategic corporate initiatives. This ensures delivery of results against service levels, larger-scope relationships that enable the service provider to respond quickly and flexibly, and transfer of day-to-day responsibilities.

At Wipro InfoTech, they also offer consulting services as part of their advisory expertise across various domains. Their various consulting practices enable you to achieve execution excellence and help drive your business momentum despite challenges arising from globalisation and the dynamics of customer loyalty. By optimising IT resources through their services, they build a strong base to empower your technology operations. This includes identifying pain areas, deploying the right resources to upgrade or solve them, implementing strategic business and IT tools, as well as managing the project lifecycle. All of these are achieved through their focused quality processes, which comply with ISO 9000, Six Sigma, SEI CMM and PCMM Level 5 standards.

With over two decades of experience, Wipro InfoTech has a commanding lead in leveraging critical IT services for clients in India, the Middle East and Asia-Pacific. Their services are further backed by strategic partnerships with some of the top global technology corporations, including Oracle, Microsoft, SAP and IBM, among others. Their service offerings include:


Consulting : Strategic Cost Reduction, Business Transformation, Security Governance, Strategy, E-Governance.

Business Solutions: Enterprise Applications, Solutions for Fast Emerging Businesses, Application Development and Portals, Applications Maintenance, Third Party Testing, Data Warehouse / Business Intelligence, Point Solutions.

Professional Services: System Integration, Availability Services, Managed Services.

Total Outsourcing.

DATA WAREHOUSING AND BI PRACTICES AT WIPRO INFOTECH

Data warehouses are an organization’s corporate memory, containing critical information that is the basis for their business management and operations. Organizations therefore require their data warehouse to be scalable, secure and stable, with the ability to optimize storage and retrieval of complex sets of data. Business intelligence systems transform an organization’s ability to convert raw data into information, making online multidimensional transaction and analytical processing possible. Data warehouse (DW) and business intelligence (BI) operations together enable organizations to base crucial business decisions on actual data analyses.

At Wipro InfoTech, the DW/BI offerings provide an organization with direct access to information analytics that will help it respond quickly to emergent business opportunities and rapidly changing market trends. With India’s largest dedicated DW/BI team of 2050+ consultants who bring 4350+ person years of experience, the Wipro InfoTech DW/BI solutions framework can be customized to address domain-specific requirements.

They have extensive experience in the finance and insurance, retail, manufacturing, energy and utilities, telecom, healthcare and government sectors. Such varied domain experience, along with alliances with global vendors in the field and cross-technology competencies, drives BI operations from a departmental to an enterprise-wide initiative. As an end-to-end service provider, they consult, architect, integrate and manage customers’ DW/BI operations to ensure that they stay ahead in today’s competitive business environment.

Wipro’s DW/BI solutions framework includes:

DW/BI consulting

Their consultants work with you to define your specific DW/BI requirements through a comprehensive examination of your focus areas. They work to derive a solution that factors in your investment plans and balances cost-efficiency with required business benefits. The key modules are:

Preparing business cases for BI/DW
Business & information analysis
Preparing BI & DW solution framework
Arriving at roadmap for implementation
BI & DW project management

DW/BI architecture

They formulate a design of the proposed DW/BI solutions by aligning requirements analyses with your goals and existing infrastructure. Their key offerings include:

Data acquisition from different legacy systems on various platforms, including mainframes, AS/400, Unix and digital OLTP platforms, with sources such as DB2, IMS, IDMS, VSAM, Oracle legacy applications, Sybase, Informix, and ERP packages such as PeopleSoft.

Data modelling
ETL architecture
Metadata architecture and management
Security architecture

DW/BI integration

As a part of the integration phase, the Wipro InfoTech DW/BI team designs and builds physical databases, ensuring that appropriate disaster recovery plans are in place. The data mining implementation includes data cleaning, ETL, visualization and enabling data access. The data mining tool selection and creation of reporting environments are domain-specific and fulfil operational requirements such as customer profiling, target marketing, campaign effectiveness analysis, and fraud detection and management. The reporting environments that they have developed and deployed are feature-rich and make multi-dimensional analyses possible across various types of data warehouses.

DW/BI management

To ensure consistent performance as data warehouses scale in volume and usage, and to ensure maximum benefits, their DW/BI management offering includes:

Data warehouse administration, maintenance and support activities
Capacity planning
Data warehousing audit
Performance tuning


ABSTRACT

Organizations are all looking to increase revenue, lower expenses, and improve profitability by improving efficiency and effectiveness in their business processes and overall performance. Business Intelligence (BI) software vendors claim that they have the technology that can provide this improvement. Vendors concentrate on selling products or tools that can be used to build these solutions, but rarely concentrate on the problems the customer is trying to solve. As new requirements are realized, new vendors are brought in, new tools are purchased and new consultants arrive to make it work. Eventually, the corporate BI initiative becomes a collection of disjointed point solutions using a combination of expensive monolithic commercial applications and difficult-to-maintain custom code. Using this approach, where each tool is designed for a specific task, problems must be broken into pieces and segregated into tasks like Reporting, Analysis, Data Mining, Workflow etc. There is no application responsible for initiating, managing, verifying or coordinating results. People and procedures are called upon to make up for these deficiencies.

This report describes the QlikView Business Intelligence platform. QlikView is a suite of powerful and rapidly deployable business intelligence software that enables enterprises and their management to effectively and proactively monitor, manage and optimize their business. QlikView lets companies analyze all their data quickly and efficiently. QlikView eliminates the need for data warehouses, data marts, and OLAP cubes; instead it gives users rapid access to data from multiple sources in an intuitive, dashboard-style interface. With QlikView, companies can turn data into information, and information into better decisions.


INTRODUCTION

Businesses have begun to exploit the burgeoning data available online to make better decisions about their activities. Many of their activities are rather complicated, however, and certain types of information cannot be extracted using SQL.

DATABASE APPLICATIONS

Database applications can be broadly classified into two categories.

1. TRANSACTION PROCESSING SYSTEMS: These are the systems that record information about transactions, such as product sales information for companies.

2. DECISION SUPPORT SYSTEMS: These aim to get high-level information out of the detailed information stored in transaction processing systems and to use the high-level information to make a variety of decisions.

ISSUES INVOLVED

The storage and retrieval of data for decision support raises several issues:

Although many decision-support queries can be written in SQL, others either cannot be expressed in SQL or cannot be easily expressed in SQL.

Database query languages are not suited to the performance of detailed statistical analyses of data.

Large companies have diverse sources of data that they need to use for making business decisions. The sources may store the data in different schemas. For performance reasons, the data sources usually will not permit other parts of the company to retrieve data on demand.


INTRODUCTION TO DATA WAREHOUSING

A data warehouse is a repository of information gathered from multiple sources, stored under a unified schema, at a single site. Once gathered, the data are stored for a long time, permitting access to historical data. Thus data warehouses provide the user a single consolidated interface to data, making decision-support queries easy to write. Moreover, by accessing information for decision support from the data warehouse, the decision maker ensures that online transaction processing is not affected by the decision-support workload.

According to Inmon, a data warehouse is a powerful database model that significantly enhances the user’s ability to quickly analyze large, multidimensional data sets. It cleanses and organizes data to allow users to make business decisions based on facts. Hence, the data in a data warehouse must have strong analytical characteristics. Creating data to be analytical requires that it be:

1. Subject oriented. 2. Integrated. 3. Time referenced. 4. Non-volatile.

1. Subject oriented – Data warehouses group data by subject rather than by activity. In contrast, transaction systems organize data by activities – payroll processing, shipping products, loan processing. Data organized around activities cannot easily answer questions like “How many salaried employees are there who have a tax deduction of Rs. “X” from their account across all branches of the company?” This would require heavy searching and aggregation of the employee and account records of all branches. In a data warehouse, information is organized around subjects such as employee, account, sales etc. (see the SQL sketch after this list).

2. Integrated data – This refers to de-duplicating information and merging it from many sources into one consistent location.

3. Time referenced – The most important characteristic of analytical data is its prior state of being, that is, its time-valued characteristic. For example, the user may ask “What were the total sales of product “A” for the past three years on New Year’s Day across region “Y”?” So we should know the sales figures of the product on New Year’s Day in all the branches of that particular region.

4. Non-volatile – Data being non-volatile helps users to dig deep into the history and to arrive at specific decisions based on facts.
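As a purely illustrative sketch of point 1, the question above becomes a single aggregate query once employee data from all branches sits under one subject area. The table and column names here (employee_dim, payroll_fact, employment_type, tax_deduction) are assumptions made for this example, not taken from any real system in this report.

-- Hypothetical subject-oriented warehouse tables: one employee dimension
-- covering all branches, one payroll fact table keyed on the employee.
SELECT COUNT(DISTINCT e.employee_id) AS salaried_employees
FROM   employee_dim e
JOIN   payroll_fact p ON p.employee_key = e.employee_key
WHERE  e.employment_type = 'SALARIED'
  AND  p.tax_deduction   = :x;   -- the "Rs. X" amount from the question, left as a parameter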

Example: In order to store data over the years, many application designers in each branch have made their individual decisions as to how an application and database should be built, so source systems differ in naming conventions, variable measurements, encoding structures, and physical attributes of data. Consider a bank that has several branches in several countries, has millions of customers, and whose lines of business are savings and loans. The following example explains how the data is integrated from source systems to target systems.


Example of Source Data

System Name | Attribute Name | Column Name | Datatype | Values
Source System 1 | Customer Application Date | CUSTOMER_APPLICATION_DATE | NUMERIC(8,0) | 11012005
Source System 2 | Customer Application Date | CUST_APPLICATION_DATE | DATE | 11012005
Source System 3 | Application Date | APPLICATION_DATE | DATE | 01NOV2005

In the aforementioned example, attribute name, column name, datatype and values are entirely different from one source system to another. This inconsistency in data can be avoided by integrating the data into a data warehouse with good standards.

Example of Target Data (Data Warehouse)

Target System | Attribute Name | Column Name | Datatype | Values
Record #1 | Customer Application Date | CUSTOMER_APPLICATION_DATE | DATE | 01112005
Record #2 | Customer Application Date | CUSTOMER_APPLICATION_DATE | DATE | 01112005
Record #3 | Customer Application Date | CUSTOMER_APPLICATION_DATE | DATE | 01112005

In the above example of target data, attribute names, column names, and data types are consistent throughout the target system. This is how data from various source systems is integrated and accurately stored into the data warehouse.
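As a hedged illustration of that integration step, the conversion could be written in Oracle-style SQL roughly as follows; the table names and the assumed MMDDYYYY layout of the numeric source value are illustrative guesses based only on the example values above.

-- Source 1 stores the date as NUMERIC(8,0), so it is converted to DATE;
-- sources 2 and 3 already use the DATE type and are loaded as-is.
INSERT INTO dw_customer (customer_application_date)
SELECT TO_DATE(TO_CHAR(customer_application_date), 'MMDDYYYY') FROM source1_customer
UNION ALL
SELECT cust_application_date FROM source2_customer
UNION ALL
SELECT application_date FROM source3_application;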

However, the means to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. Thus, an expanded definition for data warehousing includes business intelligence tools, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata.


Data warehousing arises from an organisation's need for reliable, consolidated, unique and integrated reporting and analysis of its data, at different levels of aggregation.

The practical reality of most organisations is that their data infrastructure is made up of a collection of heterogeneous systems. For example, an organisation might have one system that handles customer relationships, a system that handles employees, systems that handle sales data or production data, yet another system for finance and budgeting data, etc. In practice, these systems are often poorly integrated, or not integrated at all, and simple questions like "How much time did salesperson A spend on customer C, how much did we sell to customer C, was customer C happy with the provided service, did customer C pay his bills?" can be very hard to answer, even though the information is available "somewhere" in the different data systems.

Yet another problem might be that the organisation is, internally, in disagreement about which data is correct. For example, the sales department might have one view of its costs, while the finance department has another view of that cost. In such cases the organisation can spend unlimited time discussing who's got the correct view of the data.

It is partly the purpose of data warehousing to bridge such problems. It is important to note that in data warehousing the source data systems are considered as given: Even though the data source system might have been made in such a manner that it's difficult to extract integrated information, the "data warehousing answer" is not to redesign the data source systems but rather to make the data appear consistent, integrated and consolidated despite the problems in the underlying source systems. Data warehousing achieves this by employing different data warehousing techniques, creating one or more new data repositories (i.e. the data warehouse) whose data model(s) support the needed reporting and analysis.

There are three types of data warehouses:

1. Enterprise Data Warehouse - An enterprise data warehouse provides a central database for decision support throughout the enterprise.

2. ODS (Operational Data Store) - This has a broad, enterprise-wide scope, but unlike the real enterprise data warehouse, data is refreshed in near real time and used for routine business activity. One of the typical applications of the ODS is to hold recent data before migration to the data warehouse. Typically, the ODS is not conceptually equivalent to the data warehouse, although it does store data with a deeper level of history than the OLTP data.

3. Data Mart - A data mart is a subset of the data warehouse that supports a particular region, business unit or business function. It basically describes an approach in which each individual department implements its own management information system, often based on a relational database, a smaller multidimensional system or a spreadsheet-like system. However, once in production these systems are difficult to extend for use by other departments, because there are inherent design limitations in building for a single set of business needs, and because expansion may lead to disruption of existing users.


COMPONENTS OF DATA WAREHOUSE

DWH architecture is a way of representing the data, communication, processing and presentation that exist for end-user computing within the enterprise. The architecture of a typical data warehouse consists of parts performing the following functions:

Gathering of data. Storage of data. Querying and data analysis.

Architecture, in the context of an organization’s data warehouse efforts, is a conceptualization of how the data warehouse is built. There is no right or wrong architecture; rather, multiple architectures are used to support various environments and situations. The worthiness of an architecture can be judged by how the conceptualization aids the building, maintenance, and usage of the data warehouse.

One possible simple conceptualization of data warehouse architecture consists of the following interconnected parts.

Source system – The goal of data warehousing is to free the information locked up in the operational systems and to combine it with information from other, often external, sources of data. Increasingly, large organizations are acquiring additional data from outside databases. It is very essential to identify the right data sources and determine an efficient process to collect facts.

Source data transport layer – This layer largely handles data trafficking; in particular it represents the tools and processes involved in transporting data from the source systems to the enterprise warehouse system. Since the data is huge, the interfaces with the source systems have to be robust and scalable enough to manage secured data transmission.

Data quality control and data profiling layer – Often, data quality causes the most concern in any data warehousing solution. Incomplete and inaccurate data will jeopardize the success of the data warehouse. It is very essential to measure the quality of the source data and take corrective action even before the information is processed and loaded into the target warehouse.

Metadata management layer – Metadata is information about data within the enterprise; record descriptions in a COBOL program are metadata, as are CREATE statements in SQL. So, in order to have a fully functional warehouse, it is necessary to have a variety of metadata available.

Data integration layer – This layer is involved in scheduling the various tasks that must be accomplished to integrate data acquired from various sources. A lot of formatting and cleansing activity happens in this layer so that the data is consistent.

Data processing layer – The warehouse is where the dimensionally modelled data resides. In some cases one can think of the warehouse simply as a transformed view of the operational data, but modelled for analytical purposes.


End-user reporting layer – Success of a data warehouse implementation largely depends upon ease of access to valuable information. In that sense, the end-user reporting layer is a very critical component.

DATA WAREHOUSE SCHEMAS

Data warehouses typically have schemas that are designed for data analysis using tools such as OLAP tools.

Thus the data are usually multidimensional data, with two types of attributes:

Measure attributes – Given a relation used for data analysis, some of the attributes are identified as measure attributes, since they measure some value. For example, the attribute "NUMBER" of the sales relation is a measure attribute because it measures the number of units sold.

Dimension attributes – Some or all of the attributes of the relation are identified as dimension attributes, since they define the dimensions on which measure attributes are viewed.

Tables containing multidimensional data are called fact tables and are usually very large. To minimize the storage requirement, dimension attributes are usually short identifiers that are foreign keys into other tables called dimension tables.

TYPES OF SCHEMA


STAR SCHEMA

The star schema (sometimes referenced as star join schema) is the simplest style of data warehouse schema. The star schema consists of a few fact tables (possibly only one, justifying the name) referencing any number of dimension tables. The star schema is considered an important special case of the snowflake schema.
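A minimal star-schema sketch in SQL, using a made-up retail sales example (none of these table or column names come from the systems described in this report): one fact table holds the measures, and its foreign keys reference flat, de-normalized dimension tables.

CREATE TABLE date_dim    (date_key    INT PRIMARY KEY, calendar_date DATE, month_name VARCHAR(10), year_no INT);
CREATE TABLE product_dim (product_key INT PRIMARY KEY, product_name VARCHAR(50), category_name VARCHAR(30));
CREATE TABLE store_dim   (store_key   INT PRIMARY KEY, store_name VARCHAR(50), region VARCHAR(30));

CREATE TABLE sales_fact (
    date_key    INT REFERENCES date_dim(date_key),       -- dimension attributes (foreign keys)
    product_key INT REFERENCES product_dim(product_key),
    store_key   INT REFERENCES store_dim(store_key),
    quantity    INT,                                      -- measure attribute
    sale_amount DECIMAL(12,2)                             -- measure attribute
);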

SNOWFLAKE SCHEMA


A snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake in shape. Closely related to the star schema, the snowflake schema is represented by centralized fact tables which are connected to multiple dimensions. In the snowflake schema, however, dimensions are normalized into multiple related tables whereas the star schema's dimensions are de-normalized with each dimension being represented by a single table. When the dimensions of a snowflake schema are elaborate, having multiple levels of relationships, and where child tables have multiple parent tables ("forks in the road"), a complex snowflake shape starts to emerge. The "snow-flaking" effect only affects the dimension tables and not the fact tables.
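Continuing the same hypothetical example from the star-schema sketch above, snowflaking the product dimension simply normalizes it: the category attributes move into their own table and the dimension keeps only a foreign key, which adds one more join compared with the flat star version.

CREATE TABLE category_dim (
    category_key  INT PRIMARY KEY,
    category_name VARCHAR(30)
);

CREATE TABLE product_dim (
    product_key  INT PRIMARY KEY,
    product_name VARCHAR(50),
    category_key INT REFERENCES category_dim(category_key)   -- normalized out of the dimension
);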

CONSTELLATION SCHEMA

For each star schema or snowflake schema it is possible to construct a fact constellation schema. This schema is more complex than the star or snowflake architecture because it contains multiple fact tables. This allows dimension tables to be shared amongst many fact tables. That solution is very flexible; however, it may be hard to manage and support.

The main disadvantage of the fact constellation schema is a more complicated design because many variants of aggregation must be considered.


BENEFITS OF DATA WAREHOUSE

Some of the benefits that a data warehouse provides are as follows:

A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc.

Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.

Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time.

Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.

Data warehouses can work in conjunction with and, hence, enhance the value of operational business applications, notably customer relationship management (CRM) systems.

Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals.


DISADVANTAGES OF DATA WAREHOUSE

There are also disadvantages to using a data warehouse. Some of them are:

Data warehouses are not the optimal environment for unstructured data.

Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.

Over their life, data warehouses can have high costs.

Data warehouses can get outdated relatively quickly. There is a cost of delivering suboptimal information to the organisation.

There is often a fine line between data warehouses and operational systems. Duplicate, expensive functionality may be developed. Or, functionality may be developed in the data warehouse that, in retrospect, should have been developed in the operational systems.

ETL TOOL IN DATA WAREHOUSE

Extract, transform, and load (ETL) is a process in database usage and especially in data warehousing that involves:

Extracting data from outside sources
Transforming it to fit operational needs (which can include quality levels)
Loading it into the end target (database or data warehouse)

Extract

The first part of an ETL process involves extracting the data from the source systems. Most data warehousing projects consolidate data from different source systems. Each separate system may also use a different data organization format. Common data source formats are relational databases and flat files, but may include non-relational database structures such as Information Management System (IMS) or other data structures such as Virtual Storage Access Method (VSAM) or Indexed Sequential Access Method (ISAM), or even fetching from outside sources such as through web spidering or screen-scraping. Extraction converts the data into a format for transformation processing.

Transform

The transform stage applies a series of rules or functions to the extracted data from the source to derive the data for loading into the end target. Some data sources will require very little or even no manipulation of data. In other cases, one or more of the following transformation types may be required to meet the business and technical needs of the target database (a short SQL sketch follows the list):

Selecting only certain columns to load (or selecting null columns not to load). For example, if source data has three columns (also called attributes), say roll-no, age and salary, then the extraction may take only roll-no and salary. Similarly, the extraction mechanism may ignore all those records where salary is not present (salary = null).

Translating coded values (e.g., if the source system stores 1 for male and 2 for female, but the warehouse stores M for male and F for female), this calls for automated data cleansing; no manual cleansing occurs during ETL

Encoding free-form values (e.g., mapping "Male" to "1" and "Mr" to M)

Deriving a new calculated value (e.g., sale_amount = qty * unit_price)

Filtering

Sorting

Joining data from multiple sources (e.g., lookup, merge)

Aggregation (for example, rollup: summarizing multiple rows of data, such as total sales for each store, and for each region, etc.)

Transposing or pivoting (turning multiple columns into multiple rows or vice versa)

Splitting a column into multiple columns (e.g., putting a comma-separated list specified as a string in one column as individual values in different columns)

Disaggregation of repeating columns into a separate detail table (e.g., moving a series of addresses in one record into single addresses in a set of records in a linked address table)

Lookup and validate the relevant data from tables or referential files for slowly changing dimensions.

Applying any form of simple or complex data validation. If validation fails, it may result in a full, partial or no rejection of the data, and thus none, some or all the data is handed over to the next step, depending on the rule design and exception handling. Many of the above transformations may result in exceptions, for example, when a code translation parses an unknown code in the extracted data.
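As a hedged sketch only, a few of the transformation types above can be expressed as a single SELECT over a hypothetical staging table; every name used here (stg_sales, gender_code, qty, unit_price) is an assumption made for illustration.

SELECT order_id,                                                        -- selecting only certain columns
       CASE gender_code WHEN 1 THEN 'M' WHEN 2 THEN 'F' END AS gender,  -- translating coded values
       qty * unit_price AS sale_amount,                                 -- deriving a new calculated value
       store_id
FROM   stg_sales
WHERE  qty IS NOT NULL                                                  -- filtering out incomplete records
ORDER  BY store_id, order_id;                                           -- sorting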

Load

The load phase loads the data into the end target, usually the data warehouse (DW). Depending on the requirements of the organization, this process varies widely. Some data warehouses may overwrite existing information with cumulative data, with updating extracts run daily, weekly or monthly, while other DWs (or even other parts of the same DW) may add new data in a historicized form, for example hourly. To understand this, consider a DW that is required to maintain the sales records of the last year. This DW will overwrite any data that is older than a year with newer data; however, the entries within the one-year window are made in a historicized manner. The timing and scope to replace or append are strategic design choices that depend on the time available and the business needs. More complex systems can maintain a history and audit trail of all changes to the data loaded in the DW.
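One hedged way to express the "keep a one-year window, overwrite older data, append the new extract" policy described above; the table names and the exact date arithmetic (which differs between databases) are assumptions for this sketch.

-- Drop rows that have fallen outside the one-year window.
DELETE FROM sales_history
WHERE  sale_date < CURRENT_DATE - INTERVAL '1' YEAR;

-- Append the newly extracted rows in a historicized form (no updates in place).
INSERT INTO sales_history (sale_date, store_id, product_id, quantity, sale_amount)
SELECT sale_date, store_id, product_id, quantity, sale_amount
FROM   staging_sales;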

Examples

For example, a financial institution might have information on a customer in several departments and each department might have that customer's information listed in a different way. The membership department might list the customer by name, whereas the accounting department might list the customer by number. ETL can bundle all this data and consolidate it into a uniform presentation, such as for storing in a database or data warehouse.

ETL Tools

At present the most popular and widely used ETL tools and applications on the market are:

IBM WebSphere DataStage (formerly known as Ascential DataStage and Ardent DataStage)
Informatica PowerCenter
Oracle Warehouse Builder
Ab Initio
Pentaho Data Integration - Kettle Project (open source ETL)
SAS ETL Studio
Cognos DecisionStream
Business Objects Data Integrator (BODI)
Microsoft SQL Server Integration Services (SSIS)

OLTP (Online Transaction Processing)

Definition: Databases must often allow the real-time processing of SQL transactions to support e-commerce and other time-critical applications. This type of processing is known as online transaction processing (OLTP).

Online transaction processing, or OLTP, refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing. The term is somewhat ambiguous; some understand a "transaction" in the context of computer or database transactions, while others (such as the Transaction Processing Performance Council) define it in terms of business or commercial transactions. OLTP has also been used to refer to processing in which the system responds immediately to user requests. An automatic teller machine (ATM) for a bank is an example of a commercial transaction processing application.

The technology is used in a number of industries, including banking, airlines, mail order, supermarkets, and manufacturing. Applications include electronic banking, order processing, employee time clock systems, e-commerce. The most widely used OLTP system is probably IBM's CICS.

Benefits

Online Transaction Processing has two key benefits: simplicity and efficiency. Reduced paper trails and the faster, more accurate forecasts for revenues and expenses are both examples of how OLTP makes things simpler for businesses.

Disadvantages

As with any information processing system, security and reliability are considerations. Online transaction systems are generally more susceptible to direct attack and abuse than their offline counterparts. When organizations choose to rely on OLTP, operations can be severely impacted if the transaction system or database is unavailable due to data corruption, systems failure, or network availability issues. Additionally, like many modern online information technology solutions, some systems require offline maintenance, which further affects the cost-benefit analysis.

Contrasting Data warehouse and OLTP

One major difference between the types of system is that data warehouses are not usually in third normal form (3NF), a type of data normalization common in OLTP environments.

Data warehouses and OLTP systems have very different requirements. Here are some examples of differences between typical data warehouses and OLTP systems:

Workload

Data warehouses are designed to accommodate ad hoc queries. You might not know the workload of your data warehouse in advance, so a data warehouse should be optimized to perform well for a wide variety of possible query operations.

OLTP systems support only predefined operations. Your applications might be specifically tuned or designed to support only these operations.

Data modifications

A data warehouse is updated on a regular basis by the ETL process (run nightly or weekly) using bulk data modification techniques. The end users of a data warehouse do not directly update the data warehouse.


In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date, and reflects the current state of each business transaction.

Schema design

Data warehouses often use de-normalized or partially de-normalized schemas (such as a star schema) to optimize query performance.

OLTP systems often use fully normalized schemas to optimize update/insert/delete performance, and to guarantee data consistency.

Typical operations

A typical data warehouse query scans thousands or millions of rows. For example, "Find the total sales for all customers last month."

A typical OLTP operation accesses only a handful of records. For example, "Retrieve the current order for this customer."
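The two typical operations above, written as hedged SQL sketches over hypothetical tables (sales_fact and date_dim for the warehouse, orders for the OLTP system):

-- Warehouse query: scans and aggregates a large number of fact rows.
SELECT SUM(f.sale_amount) AS total_sales
FROM   sales_fact f
JOIN   date_dim d ON d.date_key = f.date_key
WHERE  d.month_name = 'June' AND d.year_no = 2010;

-- OLTP operation: touches only a handful of rows for one customer.
SELECT *
FROM   orders
WHERE  customer_id = 1001
  AND  order_status = 'OPEN';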

Historical data

Data warehouses usually store many months or years of data. This is to support historical analysis.

OLTP systems usually store data from only a few weeks or months. The OLTP system stores only historical data as needed to successfully meet the requirements of the current transaction.

OLAP (Online Analytical Processing)

Online analytical processing, or OLAP, is an approach to swiftly answering multi-dimensional analytical queries. OLAP is part of the broader category of business intelligence, which also encompasses relational reporting and data mining. The typical applications of OLAP are in business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas. The term OLAP was created as a slight modification of the traditional database term OLTP (Online Transaction Processing).

Databases configured for OLAP use a multidimensional data model, allowing for complex analytical and ad-hoc queries with a rapid execution time. They borrow aspects of navigational databases and hierarchical databases that are faster than relational databases.

The output of an OLAP query is typically displayed in a matrix (or pivot) format. The dimensions form the rows and columns of the matrix; the measures form the values.


At the core of any OLAP system is the concept of an OLAP cube (also called a multidimensional cube or a hypercube). It consists of numeric facts called measures which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.

Each measure can be thought of as having a set of labels, or meta-data associated with it. A dimension is what describes these labels; it provides information about the measure.

A simple example would be a cube that contains a store's sales as a measure, and Date/Time as a dimension. Each Sale has a Date/Time label that describes more about that sale.

Any number of dimensions can be added to the structure such as Store, Cashier, or Customer by adding a column to the fact table. This allows an analyst to view the measures along any combination of the dimensions.

For Example:

Sales Fact Table
+-------------+---------+
| sale_amount | time_id |
+-------------+---------+
|     2008.08 |    1234 |
+-------------+---------+

Time Dimension
+---------+-------------------+
| time_id | timestamp         |
+---------+-------------------+
|    1234 | 20080902 12:35:43 |
+---------+-------------------+
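Using the two example tables above, grouping the measure by an attribute of the Time dimension gives one slice of the cube. A hedged relational sketch:

-- Join the fact table to the Time dimension and aggregate the measure.
SELECT d.timestamp        AS sale_time,
       SUM(f.sale_amount) AS total_sales
FROM   sales_fact f
JOIN   time_dimension d ON d.time_id = f.time_id
GROUP  BY d.timestamp;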

MOLAP(MULTIDIMENSIONAL OLAP)

MOLAP stands for Multidimensional Online Analytical Processing.

MOLAP is an alternative to the ROLAP (Relational OLAP) technology. While both ROLAP and MOLAP analytic tools are designed to allow analysis of data through the use of a multidimensional data model, MOLAP differs significantly in that it requires the pre-computation and storage of information in the cube — the operation known as processing. MOLAP stores this data in an optimized multidimensional array storage, rather than in a relational database (i.e. in ROLAP).

Advantages of MOLAP

Fast query performance due to optimized storage, multidimensional indexing and caching.

Smaller on-disk size of data compared to data stored in relational database due to compression techniques.

Automated computation of higher level aggregates of the data.

It is very compact for low dimension data sets.

Array model provides natural indexing.

Effective data extraction is achieved through the pre-structuring of aggregated data.

Disadvantages of MOLAP

The processing step (data load) can be quite lengthy, especially on large data volumes. This is usually remedied by doing only incremental processing, i.e., processing only the data which has changed (usually new data) instead of reprocessing the entire data set.

MOLAP tools traditionally have difficulty querying models with dimensions with very high cardinality (i.e., millions of members).

Some MOLAP products have difficulty updating and querying models with more than ten dimensions. This limit differs depending on the complexity and cardinality of the dimensions in question. It also depends on the number of facts or measures stored. Other MOLAP products can handle hundreds of dimensions.

MOLAP approach introduces data redundancy.

ROLAP(RELATIONAL OLAP)

ROLAP stands for Relational Online Analytical Processing.

ROLAP is an alternative to the MOLAP (Multidimensional OLAP) technology. While both ROLAP and MOLAP analytic tools are designed to allow analysis of data through the use of a multidimensional data model, ROLAP differs significantly in that it does not require the pre-computation and storage of information. Instead, ROLAP tools access the data in a relational database and generate SQL queries to calculate information at the appropriate level when an end user requests it. With ROLAP, it is possible to create additional database tables (summary tables or aggregations) which summarize the data at any desired combination of dimensions.
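A hedged example of the kind of summary table a ROLAP deployment might add: a hypothetical sales fact pre-aggregated to the store and month level, built directly in the relational database (all names are assumed). Queries that only need monthly totals can then read this much smaller table instead of the detailed fact table.

CREATE TABLE sales_by_store_month AS
SELECT s.store_key,
       d.year_no,
       d.month_name,
       SUM(f.sale_amount) AS total_sales
FROM   sales_fact f
JOIN   store_dim s ON s.store_key = f.store_key
JOIN   date_dim  d ON d.date_key  = f.date_key
GROUP  BY s.store_key, d.year_no, d.month_name;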

While ROLAP uses a relational database source, generally the database must be carefully designed for ROLAP use. A database which was designed for OLTP will not function well as a ROLAP database. Therefore, ROLAP still involves creating an additional copy of the data. However, since it is a database, a variety of technologies can be used to populate the database.

Advantages of ROLAP

ROLAP is considered to be more scalable in handling large data volumes, especially models with dimensions with very high cardinality (i.e. millions of members).

With a variety of data loading tools available, and the ability to fine tune the ETL code to the particular data model, load times are generally much shorter than with the automated MOLAP loads.

The data is stored in a standard relational database and can be accessed by any SQL reporting tool (the tool does not have to be an OLAP tool).


ROLAP tools are better at handling non-aggregatable facts (e.g. textual descriptions). MOLAP tools tend to suffer from slow performance when querying these elements.

By decoupling the data storage from the multi-dimensional model, it is possible to successfully model data that would not otherwise fit into a strict dimensional model.

The ROLAP approach can leverage database authorization controls such as row-level security, whereby the query results are filtered depending on preset criteria applied, for example, to a given user or group of users (SQL WHERE clause).

Disadvantages of ROLAP

There is a consensus in the industry that ROLAP tools have slower performance than MOLAP tools. However, see the discussion below about ROLAP performance.

The loading of aggregate tables must be managed by custom ETL code. The ROLAP tools do not help with this task. This means additional development time and more code to support.

When the step of creating aggregate tables is skipped, the query performance then suffers because the larger detailed tables must be queried. This can be partially remedied by adding additional aggregate tables, however it is still not practical to create aggregate tables for all combinations of dimensions/attributes.

ROLAP relies on the general purpose database for querying and caching, and therefore several special techniques employed by MOLAP tools are not available (such as special hierarchical indexing). However, modern ROLAP tools take advantage of latest improvements in SQL language such as CUBE and ROLLUP operators, DB2 Cube Views, as well as other SQL OLAP extensions. These SQL improvements can mitigate the benefits of the MOLAP tools (a brief ROLLUP example follows this list).

Since ROLAP tools rely on SQL for all of the computations, they are not suitable when the model is heavy on calculations which don't translate well into SQL. Examples of such models include budgeting, allocations, financial reporting and other scenarios.
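A brief, hedged sketch of the ROLLUP extension mentioned above, again over hypothetical sales tables; a single statement returns per-store totals, per-region subtotals and a grand total.

SELECT s.region,
       s.store_name,
       SUM(f.sale_amount) AS total_sales
FROM   sales_fact f
JOIN   store_dim s ON s.store_key = f.store_key
GROUP  BY ROLLUP (s.region, s.store_name);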


DIFFERENCE BETWEEN OLTP AND OLAP

OLTP | OLAP
Current data; short database transactions | Current as well as historical data; long database transactions
Online update/insert/delete | Batch update/insert/delete
Normalization is promoted | De-normalization is promoted
High volume transactions | Low volume transactions
Transaction recovery is necessary | Transaction recovery is not necessary
Database size: 100 MB to GB | Database size: 100 GB to TB

BUSINESS INTELLIGENCE


Business Intelligence (BI) refers to computer-based techniques used in spotting, digging out, and analyzing business data, such as sales revenue by products and/or departments or associated costs and incomes.

BI technologies provide historical, current, and predictive views of business operations. Common functions of Business Intelligence technologies are reporting, online analytical processing, analytics, data mining, business performance management, benchmarking, text mining, and predictive analytics.

Business Intelligence often aims to support better business decision-making; thus a BI system can be called a decision support system (DSS). Though the term business intelligence is often used as a synonym for competitive intelligence, because both support decision making, BI uses technologies, processes, and applications to analyze mostly internal, structured data and business processes, while competitive intelligence is done by gathering, analyzing and disseminating information, with or without support from technology and applications, and focuses on all-source information and data (unstructured or structured), mostly external to, but also internal to, a company, to support decision making.

The five key stages of Business Intelligence:

1. Data Sourcing

2. Data Analysis

3. Situation Awareness

4. Risk Assessment

5. Decision Support

Data sourcing

Business Intelligence is about extracting information from multiple sources of data. The data might be: text documents - e.g. memos or reports or email messages; photographs and images; sounds; formatted tables; web pages and URL lists. The key to data sourcing is to obtain the information in electronic form. So typical sources of data might include: scanners; digital cameras; database queries; web searches; computer file access; etcetera.

Data analysis

Business Intelligence is about synthesizing useful knowledge from collections of data. It is about estimating current trends, integrating and summarising disparate information, validating models of understanding, and predicting missing information or future trends. This process of data analysis is also called data mining or knowledge discovery. Typical analysis tools might use:-


Probability theory - e.g. classification, clustering and Bayesian networks.
Statistical methods - e.g. regression.
Operations research - e.g. queuing and scheduling.
Artificial intelligence - e.g. neural networks and fuzzy logic.

Situation awareness

Business Intelligence is about filtering out irrelevant information, and setting the remaining information in the context of the business and its environment. The user needs the key items of information relevant to his or her needs, and summaries that are syntheses of all the relevant data (market forces, government policy etc.). Situation awareness is the grasp of the context in which to understand and make decisions. Algorithms for situation assessment provide such syntheses automatically.

Risk assessment

Business Intelligence is about discovering what plausible actions might be taken, or decisions made, at different times. It is about helping you weigh up the current and future risk, cost or benefit of taking one action over another, or making one decision versus another. It is about inferring and summarising your best options or choices.

Decision support

Business Intelligence is about using information wisely. It aims to warn you of important events, such as takeovers, market changes, and poor staff performance, so that you can take preventative steps. It seeks to help you analyse and make better business decisions, to improve sales or customer satisfaction or staff morale. It presents the information you need, when you need it.

MODELLING TECHNIQUES IN DATA WAREHOUSE

CONCEPTUAL DATA MODEL

A conceptual data model identifies the highest-level relationships between the different entities. Features of conceptual data model include:

* Includes the important entities and the relationships among them.
* No attribute is specified.
* No primary key is specified.

The figure below is an example of a conceptual data model.

From the figure above, we can see that the only information shown via the conceptual data model is the entities that describe the data and the relationships between those entities. No other information is shown through the conceptual data model.

LOGICAL DATA MODEL

A logical data model describes the data in as much detail as possible, without regard to how they will be physically implemented in the database. Features of a logical data model include:

* Includes all entities and relationships among them.
* All attributes for each entity are specified.
* The primary key for each entity is specified.
* Foreign keys (keys identifying the relationship between different entities) are specified.
* Normalization occurs at this level.


The steps for designing the logical data model are as follows:

1. Specify primary keys for all entities. 2. Find the relationships between different entities. 3. Find all attributes for each entity. 4. Resolve many-to-many relationships. 5. Normalization.

The figure below is an example of a logical data model.

Comparing the logical data model shown above with the conceptual data model diagram, we see the main differences between the two:

* In a logical data model, primary keys are present, whereas in a conceptual data model, no primary key is present.
* In a logical data model, all attributes are specified within an entity. No attributes are specified in a conceptual data model.
* Relationships between entities are specified using primary keys and foreign keys in a logical data model. In a conceptual data model, the relationships are simply stated, not specified, so we simply know that two entities are related, but we do not specify what attributes are used for this relationship.

PHYSICAL DATA MODEL

A physical data model represents how the model will be built in the database. A physical database model shows all table structures, including column name, column data type, column constraints, primary key, foreign key, and relationships between tables. Features of a physical data model include:

* Specification of all tables and columns.
* Foreign keys are used to identify relationships between tables.
* De-normalization may occur based on user requirements.
* Physical considerations may cause the physical data model to be quite different from the logical data model.
* The physical data model will be different for different RDBMS. For example, the data type for a column may be different between MySQL and SQL Server.

The steps for physical data model design are as follows:

1. Convert entities into tables. 2. Convert relationships into foreign keys. 3. Convert attributes into columns. 4. Modify the physical data model based on physical constraints / requirements.

The figure below is an example of a physical data model

Comparing the physical data model shown above with the logical data model diagram, we see the main differences between the two:

* Entity names are now table names.
* Attributes are now column names.
* Data type for each column is specified. Data types can be different depending on the actual database being used.
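A hedged DDL sketch of the four design steps listed above, using two made-up entities (customer and account); as noted, the data types shown would differ between, say, MySQL and SQL Server.

-- Steps 1 and 3: entities become tables, attributes become typed columns.
CREATE TABLE customer (
    customer_id   INT PRIMARY KEY,
    customer_name VARCHAR(100) NOT NULL
);

CREATE TABLE account (
    account_id  INT PRIMARY KEY,
    customer_id INT NOT NULL,
    balance     DECIMAL(12,2),
    -- Step 2: the customer-account relationship becomes a foreign key.
    CONSTRAINT fk_account_customer
        FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
);
-- Step 4 (not shown): adjust for physical constraints, e.g. indexes or de-normalization.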


QLIKVIEW – A BUSINESS INTELLIGENCE TOOL

What is QlikView?

QlikView is the flagship product of the QlikTech company and can be classified in the category of Business Intelligence tools of the future. During 2007 QlikTech gained the title of the "coolest" vendor in BI and has been growing faster than any other BI vendor. It was recognized as a 'Visionary' in Gartner Group's annual Magic Quadrant (2007). According to Gartner's prediction, "By 2012, 70% of Global 1000 organizations will load detailed data into memory as the primary method to optimize BI application performance (0.7 probability)", and QlikView is one of the leaders in this market. QlikView creates endless possibilities for making ad hoc queries in a non-hierarchical data structure. This is possible thanks to AQL (Associative Query Logic), which automatically associates values in the internal QlikView database. QlikView simplifies analysis for everyone. It makes it possible for anybody to create very useful, accurate KPI and measurement reports and performance dashboards, and to make accurate, strategic decisions.

QlikView has over 4,500 customers in 58 countries, and adds 11 new customers each day. In addition to hundreds of small and midsized companies, QlikTech's customers include large corporations such as Pfizer, AstraZeneca, The Campbell Soup Company, Top Flite, and 3M. QlikTech is privately held and venture backed by Accel Partners, Jerusalem Venture Partners, and Industrifonden.

Qlikview is an easy to use and flexible business intelligence software that has been around since 1993. It allows for interactive analysis and is easy to develop, implement and train users on.

Other QlikTech products are QlikView Server, which provides analysis of QlikView data over the web, and QlikView Publisher, which helps control the distribution of QlikView applications.

QlikView is a suite of powerful and rapidly deployable business intelligence software that enables enterprises and their management to effectively and proactively monitor, manage and optimize their business. QlikView lets companies analyze all their data quickly and efficiently. QlikView eliminates the need for data warehouses, data marts, and OLAP cubes; instead it gives users rapid access to data from multiple sources in an intuitive, dashboard-style interface. With QlikView, companies can turn data into information, and information into better decisions. QlikView is quick to implement, flexible and powerful to use, and easy to learn. It provides a rapid return on investment and a low total cost of ownership compared to traditional OLAP and reporting tools.


QlikView helps California Casualty improve efficiencies

25% improvement in sales conversions, 60% improvement in compliance response time.

The Business Failure of Traditional Business Intelligence

A recent article in Intelligent Enterprise magazine captured the three major issues of today's traditional BI solutions: 1) Data reporting has been an afterthought to using core business applications on a day-to-day basis, breaking the "link between insight and action." 2) Consolidating disparate tools into a suite isn't enough to give business users the information they need; search and semantics need to be integrated. 3) "Build it and they will come" doesn't provide insight, just more technology. BI needs to answer the needs of decision makers, and needs to deliver incremental successes along the way.

These factors, among others, have undoubtedly led to the dismal performance of BI initiatives. According to a study published in DM Review, a leading business intelligence publication, the average total implementation time for BI initiatives is 17 months, with five months to deploy the first usable analytic application. The average total cost of implementation is a staggering $12.8 million. And at best, according to the survey, internally built BI/DW systems achieve only a 35% success rate, while purchased operational analytic applications are considered successful a mere 13% of the time.

How long did it take to implement your business intelligence initiative?


These failed initiatives cost more than money and resources – they hamper business performance in nearly every way. A summer 2005 survey of 385 finance and IT executives by CFO Research Services asked respondents to identify the drivers of poor information quality (IQ). Nearly half the survey respondents – 45 percent – cite disparate, non-integrated IT systems and the variability of business processes as an acute problem that constrains management's ability to work effectively and focus on high-value activities. Approximately the same number agree that finance and business units alike spend too much time developing supplemental reports and analysis.

A Revolution in Business Intelligence

The OLAP Tradition

Twenty years ago memory was expensive and processors were slow. Faced with these constraints, developers at the time devised an architecture for delivering the results of multi-dimensional analysis which relied on pre-calculating fixed analyses. Simply put, they pre-calculated all measures across every possible combination of dimensions. For example, for total sales by sales person and region, the system would calculate total sales for each sales person for each region, and for every combination of sales person and region. The results of these calculations were stored and retrieved when an end user requested a particular "analysis."

This is what is traditionally referred to as "calculating the cube", and the "cube" is the mechanism which organizes and stores the results. Because the results were pre-calculated, regardless of how long it took to calculate the results, the response time from the perspective of the end user was instantaneous.

The Enabling Technology or Change Agent

Today, we have a fundamentally different technology platform available to us on which to build business intelligence. Specifically, three things have happened:

First, Moore's Law has relentlessly beaten its drum – resulting in processors which are significantly faster today than they were twenty years ago and memory which is significantly less expensive. Price/performance for both factors has improved by well over a factor of 1,000 since then.

Second, the mainstream availability of 64-bit processors raises the amount of memory a computer can utilize. A 32-bit processor can use four gigabytes of memory at a maximum, and a portion of that must be devoted to the operating system. A 64-bit processor can address 17,179,869,184 gigabytes, or 16 exabytes, of RAM – a factor of four billion more. Of course, the practical limitation of computers available today is much lower, but machines with 40, 80, or even 120 gigabytes of memory are readily available for less than $30,000.

Third, hardware manufacturers have shifted from computers with a few fast processors to computers with multiple lower-power, lower-speed processors. The challenge today is keeping computers operating at a reasonable temperature. Intel's and AMD's stated strategy for achieving this goal is to equip computers with many lower-power processors working in parallel. Today it is common to find computers with 2, 4, 16, 32 or even 128 processors. In addition, newer processors have multiple "cores" bundled on a single chip.

QlikView’s Premise: In Memory BI

QlikView was built with a simple architectural premise – all data should be held in memory, and all calculations should be performed when requested and not prior. Twenty years ago this would have been impossible. In 1993, when QlikTech was founded, it was still a pretty crazy idea. But now, the trends in the underlying platform (referenced in the previous section) have lifted the constraints so that organizations of all sizes can now benefit.


QlikView’s patented technology is based on an extremely efficient, in memory data model. High-speed associations occur as the user clicks in the applications and the display is updated immediately, allowing users to work with millions of cells of data and still respond to queries in less than a second.

As a result of this design, QlikView removes the need to pre-aggregate data, define complex dimensional hierarchies and generate cubes. QlikView performs calculations on the fly, giving the power of multidimensional analysis to every user, not just the highly trained few. By taking full advantage of 64-bit technology’s memory capacity QlikView can provide summary level metrics and record level detail on the same architecture. Companies gain infinitely scalable business analysis solutions that provide summary KPIs as well as highly granular, detailed analyses.

In its recent Research Note, leading analysts at Gartner reported on the value of this approach: “Our research indicates that query performance using this in-memory method is often just as fast as or faster than traditional aggregate-based architectures. In-memory technology not only retrieves the data faster, but it also performs calculations on the query results much faster than disk-based approaches…Therefore, with in-memory technology, users can freely explore detailed data in an unfettered manner without the limitations of a cube or aggregate table to receive good performance.”

A Revolution in Benefits


The QlikView solution, because of its unique integrated components and because it operates entirely in memory, offers some unique advantages over traditional OLAP:

Fast Time-to-Value: With traditional OLAP, constructing cubes is time consuming and requires expert skills. This process can take months, and sometimes over a year. In addition, the cube must be constructed before it can be calculated, a process which itself can take hours. And all this must occur before analysis or reporting can be performed – before the user even sees answers to his questions. Because the data is loaded in memory, creating analysis in QlikView takes seconds. There is no pre-definition of what is a dimension – any data is available as a dimension and any data is available as a measure. The time implementing QlikView is spent locating data and deciding what analysis is interesting or relevant to solving the business question. Typically, this process takes only a week or two.

Easy to Use: The entire end user experience in QlikView is driven by the "click." End users enjoy using QlikView because it works the way their mind does. Each time they want to review the data sliced a new way, they simply click on the data they want to evaluate. Because QlikView operates in memory, with each click all data and measures are recalculated to reflect the selection. Users can go from high level aggregates (e.g., roll-up of margin on all products in a specific line) to individual records (e.g., which order was that?) in a click – without pre-defining the path to the individual record. The QlikView UI uses color coding which provides instant feedback to queries.

Powerful: Because queries and calculations are performed in memory, they are extremely quick. In addition, QlikView is not constrained by the speed of the underlying source. Even if the underlying data is stored in a system which has poor query performance (for instance, a text file), the performance is always optimal because the data is loaded in memory.

QlikView also compresses data as it is stored in memory, allowing large amounts of data to be stored. Typically, there is a 10X reduction in the size of the data once it's in memory.

Flexible: One of the major issues with traditional OLAP is that modifying an analysis requires changing the cube, a process which can take a very long time. In addition, this process is typically controlled by IT. With QlikView, viewing analysis by a new dimension or changing a measure can be performed by business professionals in seconds. Standard interfaces, including ODBC and Web Services, mean that any data source can be analyzed in QlikView. What's more, users can do "local" or "desktop" analysis, using the full data and interactivity of the application on laptops.

Scalable: QlikView is designed to scale easily in both the amount of data it can handle and the number of users working with it. It's simple to deploy to thousands of users – utilizing all available hardware power across all available processors and cores, requiring only a web browser.

An Overview of QlikView

QlikView is revolutionizing business intelligence with fast, powerful and visual analysis that's simple to use. QlikView's patented technology offers all of the features of "traditional" analytics solutions – dashboards and alerts, multi-dimensional analyses, slice-and-dice of data – without the limitations, cost or complexity of traditional BI applications. QlikView solutions can be deployed in days, users can be trained in minutes, and end users get results instantly.


The QlikView Platform

QlikView offers all of the capabilities that traditionally required a complex and costly suite of products, on a single unified platform. QlikView provides flexible ad-hoc analysis capabilities, powerful analytic applications, and simple printable reports. This allows organizations to deploy QlikView to everyone – highly skilled analysts doing ad-hoc detailed reporting, executives requiring a dashboard of critical business information and plant supervisors analyzing output performance. Further, QlikView allows organizations to eliminate unused paper reports, and replace them with demand-driven reporting.

QlikView Enterprise – For the developer

QlikView Enterprise is the complete developer’s tool for building QlikView applications. QlikView Enterprise lets developers load disparate data sources for access in a single application. The data load script supports over 150 functions for data cleansing, manipulation and aggregation. An intuitive, wizard-driven interface allows powerful, visually interactive applications to be developed quickly.

QlikView Publisher – For distribution

QlikView Publisher ensures that the right information reaches the right user at the right time. As the use of business analysis spreads throughout the organization, controlling the distribution of analysis becomes increasingly important. QlikView Publisher allows for complete control of the distribution of a company's QlikView applications, automating the data refresh process for QlikView application data. In addition, it ensures that applications are distributed to the correct users when they need them.

QlikView Server – For security

QlikView Server is the central source of truth in an organization. With today's distributed workforce, QlikView Server provides a simple way for organizations to ensure that everyone has access to the latest data and analysis regardless of their location. Regardless of the client chosen – zero-footprint DHTML, Windows, ActiveX plug-in, or Java – QlikView Server provides access to the latest version of each QlikView application.

QlikView Professional – For the power user

QlikView Professional lets power users build, change or modify the layout of existing QlikView applications. QlikView Professional users can refresh existing data sources, and can choose to work with either local applications or applications distributed via QlikView Server. Power users can work with local data, including offline enterprise applications, with no limitations.

QlikView Analyzer – For the general user

QlikView Analyzer lets end-users connect to server-based QlikView applications. QlikView Analyzer has a number of deployment options, including Java clients (supporting Sun and MSFT-Java), plug-in for MSFT IE and AJAX zero footprint clients. The installed Analyzer EXE client also provides offline analysis and reporting capabilities.

QlikView Architecture

Most traditional databases are built upon a relational model. Records are broken apart to reduce redundancy, and key fields are used to put the records back together at the time they are used. Database programmers are required to trade off increased speed against more space and more time to add or edit records, and the database user often suffers based on these decisions. QlikView was built with a simple architectural premise – all data should be held in memory, and all calculations should be performed when requested and not prior. QlikTech's goal is to deliver powerful analytic and reporting solutions in a quarter of the time, at half the cost, and with twice the value of competing OLAP cube (Online Analytical Processing)-based products. QlikView is designed so that the entire application (data model included) is held in RAM – this is what makes it uniquely efficient compared to traditional OLAP cube-based applications. It creates an in-memory data model as it loads data from a data source, enabling it to access millions of cells of data and still respond to queries in less than a second.

High-speed associations occur as the user clicks in the various sheet objects and the display is updated immediately. QlikView operates much faster and requires significantly less space than an equivalent relational database because it optimizes the data as it loads – removing redundant field data and automatically linking tables together. Indexes are not required, making every field available as a search field without any performance penalty. Because of this design, QlikView typically requires about 1/10th of the space required for the same data represented in a relational model, i.e. 100GB of data fits into 10GB of memory. There is no limit to the number of tables allowed in an application, or to the number of fields, rows or cells in a single table. RAM is the only factor that limits the size of an application. QlikView offers three components in an integrated solution:

Fast Query Engine: Loading the data into memory allows QlikView to query, or subset, the data instantly to reveal only the data which is relevant to a given user. In addition, QlikView shows users the data which is excluded by a selection.

On Demand Calculation Engine: Charts, graphs, and tables of all types in QlikView are multidimensional analyses. That is, they show one or more measures (e.g., metrics, KPIs, expressions, etc.) across one or more dimensions (example: total sales by region). The major difference is that these calculations are performed as the user clicks and never prior.

Visually Interactive User Interface (UI): QlikView offers hundreds of possible chart and table types and varieties; there are list boxes for navigating dimensions, statistic boxes, and many other UI elements. Every UI element can be clicked on to query.


QlikView Technical Features

Data and Data Loading

QlikView loads data directly from most data sources (i.e., ODBC and OLEDB sources, using vendor specific drivers), from any text or table data file (i.e., delimited text files, Excel files, XML files, etc.) in practically any format, as well as from data warehouses and data marts (although these are not required). QlikView also offers a plug-in model for loading custom data sources (web services). QlikView is designed to handle a remarkable amount of data. There is no limit to the number of tables allowed in an application. In addition, there is no limit to the number of fields, rows or cells in a single table – QlikView can handle billions of unique values in a given field.

RAM is the only other factor that limits the size of an application. The maximum size of a QlikView application is closely tied to the available RAM on the system where the application will run. However, it is not as easy as looking at the size of a relational database and comparing that to the RAM on the system to determine if the application is appropriate for QlikView. As QlikView loads data from a source database, the data is highly compressed and optimized, typically resulting in a QlikView application of only 10% of the size of the original source.

Load Script

QlikView can load data that is stored in a variety of formats, as mentioned above. Data can be loaded from generic tables, cross tables, mapping tables (data cleansing), and interval matching tables. Tables can be joined, concatenated, sampled and linked to external information such as other programs, bitmaps, URLs, etc.

In order to pull data from a data source, QlikView executes a load script. The load script defines the source databases and tables and fields that should be loaded into QlikView. In addition, you can calculate new variables and records using hundreds of functions available in the script. In order to help you create a load script, QlikView includes a wizard that will generate the script.
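A minimal sketch of such a script is shown below. The DSN, file, table and field names are hypothetical and only illustrate the general form a hand-written or wizard-generated script can take:

    // Connect to a relational source through ODBC (the DSN name is an assumption)
    ODBC CONNECT TO SalesDSN;

    // Passthrough SQL pulls one table from the source database
    Salesman:
    SQL SELECT SalesmanID, SalesmanName, DistributorID FROM Salesman;

    // A LOAD statement reads a spreadsheet and derives a new field during the load
    Transactions:
    LOAD TransactionID,
         SalesmanID,
         RetailerID,
         Sales,
         Year(TransactionDate) AS Year
    FROM [Transactions.xls] (biff, embedded labels, table is [Transactions$]);

Because both tables contain the SalesmanID field, QlikView associates them automatically when the script runs; no explicit join needs to be scripted.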

Visual Basic Script and JavaScript Support

Programmers can develop VBScript or JavaScript macros to add specific functionality to an application. Macros can be attached to button objects that a user must click to activate, or the macros can be attached to various QlikView events. For example, a macro can be automatically invoked whenever an application is opened, when the load script is executed, or when a selection is made in a list box.

Analysis Engine

As described earlier, QlikView's In-Memory Data Model forms the basis for every QlikView application. It holds all loaded data down to the transaction level, and is part of the QVW file (the QlikView file format), which is loaded into RAM.

The Platform is optimized to run on every available Windows platform (32 & 64-bit), and makes use of all available processing power and RAM for each specific platform.

The Selection Engine processes the user's point-and-click selections and returns the values associated with each query. It provides sub-second response times on queries made to the In-Memory Data Model.

The Chart & Table Engine handles the calculations and graphic display of the charts in the user interface. It calculates multiple "cubes" in real time (one cube for each graph in the application), and supports user selections made directly in graphs.

Clients

Supported clients include an installed Windows EXE client that connects to QlikView Server and an ActiveX component for integration with other software. The platform also offers an ActiveX plug-in for Microsoft Internet Explorer, an AJAX zero-footprint client, and a Java client compatible with Mozilla-based web browsers. An open interface enables automated integration with QlikView.

Security

The data in a QlikView application is often confidential, so access to that data needs to be controlled.

Authentication is any process by which you verify that someone is who they claim they are. QlikView can either let the Windows operating system do the authentication using the Windows log on, or prompt for a user ID and a password (different from the Windows user ID and password) or use the QlikView serial number as a simple authentication method.

Authorization is finding out if the person, once identified, is permitted to have the resource. QlikView can let the Windows operating system do the authorization, by allowing or disallowing a user, a group or a domain the access to the entire application. If a finer granularity is needed, e.g. the user is only allowed to see specific records or fields, the QlikView Publisher can be used to automate the creation of a set of applications, i.e. one application per user group.
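As a hedged, script-level sketch of this idea (the user names below are invented, and this Section Access mechanism is a common QlikView approach rather than the Publisher-based method described above), record-level authorization can also be expressed directly in the load script:

    // Section Access: field names in this section must be upper case;
    // ACCESS and NTNAME are QlikView system fields
    Section Access;
    LOAD * INLINE [
        ACCESS, NTNAME,         REGION
        ADMIN,  DOMAIN\BIADMIN, *
        USER,   DOMAIN\JSMITH,  DELHI
        USER,   DOMAIN\AKUMAR,  NOIDA
    ];
    Section Application;
    // REGION must also exist in the data model with upper-case values
    // (e.g. loaded via Upper(Region) AS REGION) for data reduction to apply.

With "Initial Data Reduction Based on Section Access" enabled in the document settings, each user then sees only the rows whose REGION value matches their entry.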

QlikView Application and User Interface

The QlikView interface is designed to provide a perfect data overview in multiple dimensions – simplifying analysis and reporting for everyone. It presents data intuitively, allowing users to question anything and everything, from all types of objects (e.g., list boxes, graphs, tables) and on any aspect of the underlying data – regardless of where the data is located in a hierarchy.


Key Elements of the User Interface

Sheets & Tabs

In QlikView, analysis is made on sheets navigated through tabs (similar to Excel). Each sheet can hold several sheet objects (list boxes, graphs, tables etc.) to analyze the underlying data model. All sheets are interconnected, meaning that selections made on one sheet affect all objects on all other sheets.

List Box

The basic building block of a QlikView application is the list box. A list box is a movable, resizable object that presents the data taken from a single column of a table. Rather than listing duplicate values, only unique values are presented. If desired, the number of occurrences of each distinct value can also be listed.

Multi Box

The multi box can hold several fields in a single object. Selections can be made through dropdown lists, by clicking, or by text search and select. The multi box displays a field's value only when a single value is selected.

Charts & Gauges

In QlikView, the results of a selection or query can be displayed in a graph. Typically, a graph holds one or more expressions which are recalculated each time a selection is made. The result can be displayed as a bar chart, line chart, heat chart, grid chart, scatter chart, or as a speedometer or gauge. All graphs are fully interactive, which means that you can make selections or queries directly by point-and-click or by "painting" the area of interest.

Tables

Just as with graphical representation of data (in graphs), the result of an analysis can be displayed in a table. QlikView provides the ability to display data in powerful pivot tables and straight tables. These tables are fully interactive, which means that you can make selections directly in the tables or by drop-down selection in the graph dimensions. Using a table box, QlikView can display any combination of fields in a single object, regardless of what source database table they came from. This feature is useful when providing listings of any kind. The table box can be sorted by any field or combination.

Reports & Send to Excel

QlikView has an integrated report editor for easy creation of application-specific reports. The reports are dynamically updated as the user makes selections. Power users can also easily create reports by a simple drag-and-drop procedure. All data displayed in the GUI is ready to be exported at any time to Excel or other applications with a simple click of a button.

User Navigation and Analysis

Point-and-Click Queries

Asking and answering questions is a simple matter of point and click. The user forms a query in QlikView simply by clicking the mouse on a field value or other item of interest. In a list box, the user clicks on one or more values of interest to select them. QlikView immediately responds to the mouse click and updates all objects displayed on the current sheet.

Multiple Sort Options

Since each field of data can be displayed in its own list box, it makes sense that you would want to sort each list box independently of all others. When you are scrolling through a list box, you want the values to appear in some sorted order appropriate to that field. QlikView allows you to sort each list box independently and according to multiple sort specifications. One or more of the following algorithms can apply to each list box, in either ascending or descending order:

* State: Selected and optional values can be sorted from the top or bottom of the list box
* Expression: Values are sorted by the result of evaluating any entered expression
* Frequency: Values are sorted by frequency of occurrence
* Numeric Value: Values are sorted according to their numeric value
* Text: Values are sorted alphabetically
* Load Order: Values are sorted according to the way they occurred in the original source database

Powerful Searching

Fortunately, QlikView allows you to search through the list as simply and quickly as typing on the keyboard. Select any list box, or open a multi box or drop down list, and start typing. QlikView immediately begins searching through the list to find values matching your criteria. Single character and multi-character wildcards are supported, as well as greater than and less than symbols to enable searching for numeric and date ranges.

Rapid Application Design and Deployment

Simple applications can be created within just a few minutes using QlikView's wizards. More complex applications integrating data from various sources and displaying trend analysis charts and pivot tables may take a little bit longer. The best way to understand how simple it is to create and use a QlikView application is to step through the process involved:

Step 1: Locate the Data Source

The first step in creating an application in QlikView is to determine what data you wish to load. While it is possible to include inline data in the QlikView load script, application data will almost always come from an existing file, spreadsheet or database. You may load data from a single source file or database, or you may load and integrate data from many different sources at the same time.

The source file will typically be arranged with each record of the file containing one record of data. However, QlikView can work with data in practically any format, including generic databases, cross-tables, hierarchical databases, multi-dimensional databases, etc. The first row may or may not contain field labels, although you can always choose to set or change the labels in the wizard or in the script. If the data will come from a text file, each file will typically be treated as a single table. When working with spreadsheets, each tabbed sheet will be treated as a table.

Step 2: Create the Load Script


Once the source data has been determined, a load script must be created to copy the data from the data source into QlikView’s associative database. Creating the load script is simplified by the use of wizards that construct script statements for supported file types.
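For example, for a delimited text file of the kind mentioned in Step 1, the wizard typically produces a statement along these lines (the file and field names are hypothetical):

    // The text file is treated as a single table; labels come from the first row
    Orders:
    LOAD OrderID,
         CustomerID,
         OrderDate,
         Amount
    FROM [Orders.csv] (txt, utf8, embedded labels, delimiter is ',');

The generated statement can then be edited by hand, for instance to rename fields or add calculated columns, before the script is run.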

Step 3: Execute the Load Script

After the load script is complete, the script must be executed either by using the “Run” button in the Edit Script dialog, or by selecting “Reload,” available on both the toolbar and the File menu. During the load process, QlikView examines each statement in the load script and processes it in sequential order. At the completion of the load script, a copy of all of the data referenced in the load script is loaded and available in the QlikView application.

Step 4: Place Objects on a Sheet

In order to use the data in the QlikView application, you must place list boxes or other objects on one or more sheets. The actual objects that should be used and how they should be grouped into sheets depends on the specific application.

Step 5: Start Using the Application

As soon as the first object is created on a sheet, the application is available for use. All objects are automatically associated together, and clicking in any object initiates a query.

Step 6: Add More Sheets and Objects as Required

Finally, continue to add and arrange objects on sheets until the application achieves the functionality desired. You may wish to add more customization to the load script by taking advantage of QlikView’s “Expression Engine,” or you may wish to add macros to automate certain actions.

Main features and benefits of QlikView:

* Use of an in-memory data model
* Allows instant, in-memory manipulation of massive datasets
* Does not require high-cost hardware
* Automated data integration and a graphical analytical environment attractive to customers
* Fast and powerful visualization capabilities
* Ease of use - end users require almost no training
* Highly scalable - near instant response time on very large data volumes
* Fast implementation - customers are live in less than one month, and most in a week
* Flexible - allows unlimited dimensions and measures and can be modified in seconds
* Integrated - an all-in-one solution: dashboards, power analysis and simple reporting on a single architecture
* Low cost - shorter implementations result in cost savings and a fast return on investment
* Risk free - available as a fully-functional free trial download

REPORTS USING

BUSINESS INTELLIGENCE TOOL

“ QLIKVIEW”


A REPORT ON BISLERI

Mineral water under the name 'Bisleri' was first introduced in Mumbai in 1965, in glass bottles in two varieties - bubbly & still - by Bisleri Ltd., a company of Italian origin. This company was started by Signor Felice Bisleri, who first brought the idea of selling bottled water to India.

Parle bought over Bisleri (India) Ltd. in 1969 & started bottling mineral water in glass bottles under the brand name 'Bisleri'. Later Parle switched over to PVC non-returnable bottles & finally advanced to PET containers.

Since 1995 Mr. Ramesh J. Chauhan has been expanding Bisleri operations substantially; the turnover has multiplied more than 20 times over a period of 10 years and the average growth rate has been around 40% over this period. Presently the company has 8 plants & 11 franchisees all over India, with a presence covering the entire span of India, and it looks to put up four more plants in 06-07. It commands a 60% market share of the organized market. The overwhelming popularity of 'Bisleri' & the fact that it pioneered bottled water in India have made it synonymous with mineral water & a household name. When you think of bottled water, you think Bisleri.

Bisleri values its customers & has therefore developed 8 unique pack sizes to suit the need of every individual. It is present in 250ml cups, 250ml bottles, 500ml, 1L, 1.5L and 2L, which are the non-returnable packs, & 5L and 20L, which are the returnable packs. Till date the Indian consumer had been offered Bisleri water; however, in an effort to bring something refreshingly new, the company has introduced Bisleri Natural Mountain Water - water brought from the foothills of the mountains situated in Himachal Pradesh. Hence the product range now comprises two variants: Bisleri with added minerals & Bisleri Mountain Water.


It is their commitment to offer every Indian pure & clean drinking water. Bisleri water is put through multiple stages of purification, ozonised & finally packed for consumption. Rigorous R&D & stringent quality controls have made the company a market leader in the bottled water segment. Strict hygiene conditions are maintained in all plants.

In their endeavour to maintain strict quality controls, each unit purchases preforms & caps only from approved vendors. They produce their own bottles in-house. They have recently procured the latest world-class, state-of-the-art machinery that puts them at par with international standards. This has not only helped improve packaging quality but has also reduced raw material wastage & doubled production capacity. You can rest assured that you are drinking safe & pure water when you consume Bisleri. Bisleri is free of impurities & 100% safe. Enjoy the sweet taste of purity!

BISLERI PRODUCTS


Bisleri with added Minerals

Bisleri Mineral Water contains minerals such as magnesium sulphate and potassium bicarbonate which are essential minerals for healthy living. They not only maintain the pH balance of the body but also help in keeping you fit and energetic at all times.

Bisleri Mountain Water

Bisleri Natural Mountain Water emanates from natural springs located in Uttaranchal and Himachal Pradesh, nestled in the vast Shivalik mountain ranges. Lauded as today's 'fountain of youth', Bisleri Natural Mountain Water resonates with the energy and vibrancy capable of taking you back to nature. It is bottled at the company's two plants in Uttaranchal and Himachal Pradesh and is available in six different pack sizes of 250ml, 500ml, 1 litre, 1.5 litre, 2 litre and 5 litres.


TECHNOLOGICAL ASPECTS

Here we create Excel files as the database, which are then linked to QlikView (the business intelligence tool) to improve the decision support system.

The Excel sheets created are as follows:

1.ZONE SHEET

In the Zone excel sheet I have taken only 3 zones of the NCR, i.e. Gurgaon, Delhi and Noida. Each zone is divided into around 100 regions. The sheet also records each region's population, area and population growth. In short, it provides full information about the geography.

2.TRANSACTION SHEET

The Transaction sheet provides 5 years of data (month and day) along with Salesman ID, Transaction ID, Retailer, Retailer ID and, lastly, Sales.

3.SALESMAN

The Salesman sheet provides the Salesman ID and Salesman Name, along with the Distributor ID of the distributor to whom the salesman is attached.

4.RETAILER


The Retailer excel sheet consists of the Retailer ID of each retailer, their name, and the zone & region to which they are associated. A sketch of how these sheets can be loaded into QlikView follows.
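As a rough sketch (the workbook name and exact column headers are assumptions), the load script for these sheets might look like this:

    Transaction:
    LOAD [Transaction ID], [Salesman ID], [Retailer ID], Month, Day, Sales
    FROM [Bisleri.xls] (biff, embedded labels, table is [Transaction$]);

    Salesman:
    LOAD [Salesman ID], [Salesman Name], [Distributor ID]
    FROM [Bisleri.xls] (biff, embedded labels, table is [Salesman$]);

    // The Zone and Retailer sheets are loaded the same way; the shared ID fields
    // (Salesman ID, Retailer ID) associate the tables without any scripted joins.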

DESCRIPTION OF REPORT

The Bisleri report is implemented on the business intelligence tool named "QlikView". The report basically covers 3 zones in the NCR, i.e. Gurgaon, Delhi & Noida. I have taken around 100 regions in each zone with their population, population growth & area. I have designed the excel sheet in such a way that each region is displayed with its population, population growth & the name of its zone; as this was a small-scale report, nothing more was done with it.

Bisleri, like any other mineral water supply company, has distributors in particular regions, and a number of sales executives work under them who are responsible for delivering orders home to home. They keep track of deliveries in their respective regions and are responsible for sales there. Once a week, on Saturday, they have to report to the office and submit the sales report for the whole week. For example, if the Bisleri office is in Delhi, sales executives from Noida, Gurgaon and Delhi report to the office every Saturday along with their daily sales report sheets; their performance and weekly sales are then reviewed to see whether they are benefiting the company and whether sales are increasing in their regions.

DATA COLLECTION

Regarding data collection, truly speaking no company wants to give its data to an outsider, but somehow I managed to get data from a sales executive named "Rajneesh Mani Tripathi" who was working at Bisleri itself.


DESCRIPTION OF IMPLEMENTATION ON TOOL

INTRODUCTION

This is the first page, or home page, of the application. As the report is on Bisleri, I planned to use a Bisleri image as the background.


BACKGROUND

The background section describes Bisleri, its distribution and the report itself. It also gives a small description of QlikView, and contains tabs which redirect to the Bisleri homepage.


HOW TO WORK

This tab describes how selections are made and how to work with the software, so that even someone unfamiliar with computers or outside the IT field can easily learn and use the tool, making it user friendly.

GEOGRAPHY

The Geography tab describes the area: it shows each zone with its corresponding regions, and their area, population and population growth are shown graphically.


If we select the Delhi zone with Connaught Place as its region, its population (in thousands) is shown with a graph.

RETAILER

The Retailer tab lists the retailers with their Retailer IDs and the zone and region in which they operate, with their sales shown graphically.


If Agarwal Restaurant (ID 1097) is selected, its corresponding sales are shown graphically along with its zone and region.

SALES

The Sales tab shows zone, region, year of sale, and day and month of sale, with sales shown graphically per salesman. A bookmark also shows the retailers with whom each particular salesman is associated.


TABLES

There is a table showing the sales of the different salesmen.

CONCLUSION

This Bisleri project was a benchmark for me for designing other reports; using the concepts from the Bisleri report, it will be more comfortable to make other reports. This report was a practice report, and through this small report I learnt how to work on the tool. It was quite an interesting tool to work with.

Through this report I learnt how to draw graphs and make different tables, as well as list boxes, multi boxes and table boxes. I also got an idea of how mineral water supply companies work. Last but not the least, working on this project was a nice experience which I will use to make other reports in the near future.


QLIKVIEW REPORT ON

“LG”

OVERVIEW OF THE COMPANY

LG believes that technological innovation is the key to success in the marketplace. Founded in 1958, the company has led the way in bringing advanced digital products and applied technologies to customers. With a commitment to innovation and assertive global business policies, it aims to become a worldwide leader in advanced digital technology.

The trajectory of LG Electronics, its growth and diversification, has always been grounded in the company ethos of making its customers' lives ever better and easier (happier, even) through increased functionality and fun.

Since its founding in 1958, LG Electronics has led the way to an ever-more advanced digital era. Along the way, its constantly evolving technological expertise has lent itself to many new products and applied technologies. Moving forward into the 21st century, LG continues on its path to becoming the finest global electronics company, bar none.

LG Electronics is pursuing the vision of becoming a true global digital leader, attracting customers worldwide through its innovative products and design. The company's goal is to rank among the top 3 consumer electronics and telecommunications companies in the world by 2010. To achieve this, it has embraced the idea of "Great Company, Great People," recognizing that only great people can create a great company.

Facts & Figures

* Established In: Jan 1997
* Managing Director: Mr. Moon B. Shin
* Corporate Office: Plot no. 51, Udyog Vihar, Surajpur Kasna Road, Greater Noida (UP)
* Corporate Website: http://www.lgindia.com
* Number of Employees: 3000+

Business Areas & Main Products

Home Entertainment

Plasma Display Panels, LCD TV , Colour TVs, Audios, Home Theater System, DVD Recorder/Player

Home Appliances

Refrigerators, Washing Machines, Microwaves, Vacuum Cleaners

AC

Split AC, Windows AC, Commercial AC’s

Business Solutions

LCD monitors, CRT monitors, Network Monitors, Graphic Monitors, Optical Storage Devices, LED Projectors, NAS (Network Attached Storage) and Digital Signage


GSM

Color Screen GSM Handsets, Camera Phones, Touch Screen Phones, 3G Phones

PERFORMANCE AND GROWTH RATE

TECHNOLOGICAL ASPECTS


Here again we create Excel files as the database, which are then linked to QlikView (the business intelligence tool) to improve the decision support system.

The Excel sheets created are as follows:

1.ZONE EXCEL SHEET:

The Zone excel sheet consists of the names of 20 states, each consisting of 10 regions, along with a Zone ID. It also contains the population of the different regions.

2.TRANSACTION EXCEL SHEET:

The Transaction excel sheet contains the names of all the retailers in the different regions across all 20 states and their sales for each year from 2005-2009, along with a total of all the sales.

3. RETAILERS EXCEL SHEET:

The Retailer excel sheet consists of the Retailer ID and the name of each retailer in the different regions across the country.

4.PROFIT & LOSS EXCEL SHEET:

In this sheet I have given a Profit & Loss ID so that the star schema can be built easily. The sheet comprises the Profit & Loss ID, the quantity supplied, and inventories for 5 years (2005-2009), and I have decided on a threshold so that we can determine the profit or loss made by the different retailers.

5.PRODUCT EXCEL SHEET:

It comprises the Product ID and the name of each product with its specifications, category, series & model number.

6.OFFERS EXCEL SHEET:

There is an offers sheet as well, which consists of the model numbers of the different products, their original price, the festival on which the discount is given by the company, the discount rate, and the new price after the deduction.

7.MANAGERS EXCEL SHEET:

The Manager excel sheet consists of the Manager ID, the names of the managers and their contact numbers so that they can be called by the company when needed.

8.INVENTORY EXCEL SHEET:


The Inventory excel sheet consists of the Zone ID, Retailer ID, Manager ID, Product ID, Profit & Loss ID, the quantity supplied per year by the retailers, the threshold, and the inventories retailers have held from 2005-2009.

9. LG PICTURES OR GRAPHICS EXCEL SHEET:

In the LG Pictures excel sheet I have given the model numbers of the products and the path to the graphics folder in which pictures of the different items are saved. A sketch of how these sheets can be wired together in the load script follows.
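A rough load-script sketch for two of these sheets is shown below; the workbook name and exact column headers are assumptions. The Inventory sheet acts as the fact table linking the dimension sheets through its ID fields, and an info load ties each model number to its picture path:

    // Inventory: the fact table, associated to the other sheets via the shared ID fields
    Inventory:
    LOAD [Zone ID], [Retailer ID], [Manager ID], [Product ID], [Profit & Loss ID],
         [Quantity Supplied], Threshold
    FROM [LG_Data.xls] (biff, embedded labels, table is [Inventory$]);

    // Info load: links each model number to its picture so it can be shown on click
    PictureInfo:
    Info LOAD [Model Number], PicturePath
    FROM [LG_Data.xls] (biff, embedded labels, table is [LG Pictures$]);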

DESCRIPTION OF REPORT


The LG report implemented on QlikView is on a wider scale; it covers nearly all of India. In this report I have taken 20 states, each consisting of 10 regions, which in turn have 5-6 retailers each. Each retailer has 5 years of sales data (2005-2009), and their profit or loss is also determined on the basis of the sales made in the corresponding years.

Nearly all the products are shown with their series and model number, and their pictures are also shown. A manager is appointed to each region so that, if needed, one can contact the regional manager directly to learn about the sales and problems in that region; their contact numbers are given along with their Manager IDs.

There are also discounts given as offers on different festivals and the new year; the original price, discount rate and discounted price are shown for the current selection.

DESCRIPTION OF IMPLEMENTATION ON TOOL


STARTING PAGE

The starting page shows the LG logo with a Get Started tab which directs you to the next page.

BACKGROUND TAB:

The Background tab gives a description of the application and the report made for LG, along with useful links which redirect you to the QlikView community, the LG homepage and further reading, and some points describing QlikView.


HOW TO GET STARTED

This tab describes how selections are made and how to work with the software, so that even someone unfamiliar with computers or outside the IT field can easily learn and use the tool, making it user friendly.


ZONE

The Zone tab describes all 20 states with their corresponding 10 regions each, their Zone IDs and their populations; a table box is shown with Zone IDs and states.

A drill-down approach is used in this tab for the different states, i.e. if we click on a state it breaks down into its different regions, and if a region is clicked the corresponding population is shown graphically, as shown in the 3 snapshots.


RETAILERS TAB

The Retailers tab shows a list of all the retailers in a list box with their corresponding Retailer ID, region & state; there is also a multi box showing complete information about the retailers.


LG PRODUCTS

Nearly all the LG products are shown in this tab. If we click on the products it drills down into the different product groups, i.e. mobile phones, computer products, home appliances etc. The tab shows the different products with their IDs, series, model number and cost.

FESTIVAL OFFERS BY THE COMPANY

This tab shows the different festival and new year offers, their discount rates, and the prices before and after discount; there is also a table box showing the discount on each item.


INVENTORY

This tab describes the quantity supplied to the different retailers per year over the 5 years from 2005-2009. A drill-down approach is used here too for the products: if we click on mobiles, for example, it drills down to the different types, then to the series and then to the model number, and the corresponding picture, where available, can be seen by clicking on the model number. There are also different charts for easy analysis (quantity vs. product, quantity vs. retailer), a table box of the quantity supplied, and the quantity supplied per model, as shown in the snapshots.


SALES

The Sales tab shows product, category, series, model number, quantity sold per model in the respective year, states and regions with the names of the retailers, and quantity sold per retailer; there are also some graphs to analyse this better, such as quantity sold per product and quantity sold per year. The current selection is reflected here as well, meaning that if a model is selected, this sheet shows the sales of that same model number.


PROFIT AND LOSS

The Profit and Loss tab consists of product, category, series, model number and quantity sold per model on a yearly basis; there are also some graphs showing quantity sold per retailer and profit and loss per model, and the state, region and name of the retailers are specified here as well. Profit and loss is basically calculated from the given threshold value: if the sale is above the threshold the retailer makes a profit, else he is in loss, as sketched in the expression below.
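A chart expression along these lines can implement that threshold rule; the field names are assumed from the sheet descriptions above:

    // Status per retailer: compare the total quantity sold with the threshold value
    If(Sum([Quantity Sold]) > Only(Threshold), 'Profit', 'Loss')

    // Or, as a numeric measure of how far above or below the threshold a retailer is
    Sum([Quantity Sold]) - Only(Threshold)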


MANAGER DETAILS

This is the last tab of the report. It shows the managers and the regions they are in charge of; their contact (mobile) numbers are also specified, and the states and regions are listed in list boxes.

SCHEMA

The schemas used here are the star and constellation schemas.


TABLE VIEWER

INTERNAL TABLE VIEWER

In the internal table viewer, the Sales table is the fact table and several dimension tables are associated with it, some of them in a snowflake arrangement; as a whole it is a star schema which also contains a snowflake schema.


EXTERNAL TABLE VIEWER


REFERENCES

* William Inmon, a book on data warehousing.
* Building Data Warehouse, by Mohanty.
* Actionable Insights for Business Decision Makers, by Don Tapscott, with Business Objects (an SAP company), SAP and Intel.

Internet References (URLs):

* http://en.wikipedia.org/wiki/Extract,_transform,_load
* http://www.etltools.org/
* http://download.oracle.com/docs/cd/B10500_01/index.htm
* http://www.cuddletech.com/articles/oracle/node66.html
* http://en.wikipedia.org/wiki/Online_transaction_processing
* http://www.keysurvey.com/
* http://www.pythagoras.co.uk/business_intelligence.aspx
* http://demo.qlikview.com/index.htm


CONCLUSION

It was a great opportunity for me to do my summer training at Wipro InfoTech; it was really a nice experience working in such a fine IT firm. It took me days to adapt to the environment there; it was not an easy task to move from college life directly into the corporate world. The environment at Wipro was tremendous and there was a lot to learn; the employees had exceptional communication skills, in which I thought I was lagging behind, but much of that depends on the environment.

Regarding the topic of my training, "DATA WAREHOUSING AND BUSINESS INTELLIGENCE USING QLIKVIEW", I must say that it was a really interesting topic that I was able to learn there. My project guide, Mr. Nitish Vij, was a really nice person; he gave me an exceptional grounding in data warehousing and business intelligence concepts, and I got his full support.

There were two reports which I made in my training tenure. The first was a report on "BISLERI"; this was a practice report through which I learnt how to work on this business intelligence tool, and I found it really interesting to work with. Being a small-scale report it had some problems: first, the hierarchy was not defined, so the drill-down approach was missing, and second, the table viewer was not properly defined. I planned to correct these two problems in my next report.

The second report I made was on "LG"; all the problems that were present in the Bisleri report were resolved in this one.

Finally, I must say that this tool was really valuable and helpful from a career perspective; using it will be really helpful for solving business problems in the coming future, as it supports business decisions in an efficient manner.

