White Paper

www.infosys.com

    Abstract

Enterprise Business Intelligence (BI) solutions today analyze growing amounts of data. More often than not, the data is historical in nature, coming from within the enterprise as well as from external channels such as the Web, mobile, and devices. This has driven data volumes to alarming levels. In traditional BI implementations, this information explosion, along with the increasing computational power needed to process high volumes of data, has been managed through expensive hardware and software upgrades. This is a highly inefficient way to meet the demands of a growing business, and one that enterprises find economically unfavorable.

Given the global scale at which large enterprises operate, the need of the hour is to make information available to partners, remotely located analysts, and managers on the move. This in turn places additional demands on infrastructure and IT.

This white paper discusses how cloud computing might help address these challenges through its round-the-clock availability and its dynamic and scalable nature. Cloud infrastructure can help offload BI storage and long-running processes and handle erratic load behavior. The solution discussed in this paper is an alternative BI architecture that extends existing BI infrastructure.

Business Intelligence Solutions on Windows Azure

Sidharth Subhash Ghag


BI Process Overview

Primarily, a BI solution has two parts: data storage and analysis. The stored raw data is an asset that needs to be cleansed and processed to derive information for making decisions. The information has to be presented to decision makers in an intuitive and highly interactive manner, so that key strategic decisions can be made in the least possible time. BI relies on data warehousing (a data repository designed to support an organization's decision making). Ineffectively managed data warehouses make it difficult for organizations to quickly extract the data needed for analysis and practical decision-making.

    The BI process can be represented using the following diagram:
Figure 1: BI Process

Online Transaction Data

Online transactional data (operational data) from multiple systems (finance, sales, and CRM) is extracted, processed to eliminate data redundancy, and optimized for storage in a data warehouse. The purpose of creating a data warehouse is to bring information from heterogeneous systems onto a common data storage platform.

    Data Warehouse

A data warehouse is an independent master store of all the historical transactional data of an enterprise. Extracting transactional data from multiple systems and then cleansing it for further analysis is the most important activity in establishing a data warehouse. The accumulation process depends largely on the source systems from which the data is retrieved, and it is typically customized to handle the various data sources and data rules, easing the transformation of data from multiple disparate systems into a single storage platform.

    Data Marts

    Although a data warehouse is a storehouse for voluminous data, it is difficult to process complex analytical queries or jobs directly off the data warehouse. Thus, the data warehouse is broken down logically or sometimes physically into smaller analysis units called data marts. Data marts can be conceptualized as units of data storage used for dedicated analysis, which is generated using specific filters and queries. Data marts contain specialized multi-dimensional data structures called data cubes. Unlike relational database tables, which have only two dimensions (row and column), a data cube has multiple dimensions.

Typical data mart queries include questions such as how grocery products sold over the last six months, or how a promotion performed in the southern region over that period. Data marts are useful for such focused analysis; a minimal sketch of this kind of slicing follows.

    Since the data warehouse is responsible for storing high volumes of historical and ever-growing data, a data warehouse solution should be cost-effective and reliable and should always be available to other components for analysis and reporting.

    Reports, Dashboards and Key Performance Matrix

Analysis is the process of slicing and dicing a set of information to interpret a pattern that can be used to explain an observed impact or to support further planning. The analytics engine works on data marts. Its purpose is to execute complex queries and present data along multiple dimensions and measures. Dimensions and measures are the key parameters in BI that help slice and dice information to make it more precise for decision makers.



Data presentation is a crucial component of analysis. The richer the presentation of the data, the easier it is for decision makers to examine the information. The presentation layer delivers reports, KPI matrices, and dashboards to the end user for slicing and dicing information. These rich reports also support what-if scenario analyses.

    A BI system is an aggregation of multiple systems and sub-systems. Data storage, information slicing and dicing tools, and reporting or rich visualization interfaces are some of the multiple sub-systems of any typical BI system. This peculiarity of structure and integration creates inherent challenges. Let us look at the typical challenges faced by enterprises in implementing and using BI solutions.

BI Implementation Challenges

Intermittent demands for storage

Since the data warehouse is the backbone of the entire BI solution, it is important to manage it properly and keep it running at all times. The data warehouse stores large datasets, and it is not feasible to keep all of the data active for on-demand analysis. In certain scenarios, historical data that has been inactive for some time may need to be activated. Activating historical data involves obtaining the backup tapes, retrieving the data, and loading and fitting it into the currently active data warehouse or data marts, none of which is simple. Even if such a situation arises only once a month, it still consumes a considerable amount of IT operational resources. Storage demand grows with every such request, because activated historical data adds to, rather than replaces, the currently active data. The need for extra storage capacity adds to the hardware investment and the pressure of managing it.

    Sub-optimal utilization of resources

When a BI solution has been in place for many years, it is highly likely that the number of users, the size of the storage, and the complexity of the systems have all increased. Growth in users adds pressure on the scalability of a solution that might have been provisioned long ago.

Alternatively, an organization may have anticipated rapid growth in the number of users and planned storage and other infrastructure capacity upfront. In such cases, the system is likely to remain underutilized, representing a lost opportunity to deploy the same investment elsewhere. The scalability challenge is thus crucial to both the utilization and the smooth running of the system.

    Lacking external dimension

On-premise BI solutions are mostly oriented around the transactional data of the enterprise. They lack the external dimensions and measures of analysis that are important for strategic analysis. A combination of internal data, such as sales data, and external data, such as government-collected data and industry trends, can be used to gain better insight and plan effective strategies.

External environmental data is available through various data marketplaces and can help enhance the quality of analytics. The increasing demand to factor external entities into the analysis adds pressure on the design and flexibility of BI solutions. Often, enterprises end up developing their own components or smaller, independent BI solutions to accommodate these external entities.

    Lacking multi-channel delivery capabilities

Most enterprises have a workforce spread all over the world. These geographically distributed stakeholders demand round-the-clock availability and accessibility from any place. Enterprises that did not anticipate this demand have ended up spending huge amounts of money and resources to address it. The need to make data warehouses and BI solutions available over the Internet through multiple delivery channels, such as RIAs, services, mobile, and browsers, is increasing. This quick, easy, always-on accessibility gives enterprises an edge, enabling them to collaborate better and make decisions quickly. It thus becomes essential for enterprises to make their BI platform available over the Internet. This requirement not only demands additional investment in infrastructure, but also adds integration touch points.

Businesses today operate in highly dynamic environments shaped by changing business scenarios, evolving compliance and governance processes, new integration requirements that add to system complexity, and increasing pressure on systems to be responsive. These challenges multiply with the growing demand for dynamism in business, processes, and technologies. It is important for every enterprise to address them and to use its BI investment to best effect.


BI Solution Based on Cloud Computing

With more and more devices getting meshed and interconnected on the information highway, demand for data and everything related to it will grow manifold. This information explosion will lead to the need for systems that can:

    Process large amounts of data efficiently and in near real-time

Store data flowing in from the various systems and devices in storage units that can hold large amounts of data

    The figure shown below depicts a typical information flow landscape of any large enterprise in the future. Thus, a BI solution has to meet the high volume requirements of an enterprise, which constantly exchanges information with multiple stakeholders, systems, and devices as part of its day-to-day operations.
[Figure 2 shows content providers, field devices and appliances, delivery channels, regulatory agencies, partners, suppliers, and customers exchanging information with enterprise systems (Sales, SCM, CRM) across geographies, feeding a transformation engine, data warehouse (DW), analytical engine, and portal and reporting layer.]

Figure 2: Typical Azure Business Intelligence Eco-System

Cloud computing, a new-generation platform for deploying and delivering software services, addresses the growth requirements of an enterprise and the commonly faced BI challenges. The value proposition of cloud computing that can address the needs of the BI platform of the future includes:

    Capability to process voluminous and rapidly-growing data over the Internet

    Replication of machines, applications, and data storage at multiple instances to provide high availability

    Dynamic, elastic capability to support scaling up and down of infrastructure within minutes

Improved Cost Efficiency

Managing complexity and Total Cost of Ownership (TCO) with cloud storage solutions is relatively more appealing than with traditional RDBMS data solutions, especially in a data warehouse scenario that deals with historical or inactive data. With cloud storage, data can be kept active at all times, without requiring IT intervention to activate historical data. Cloud storage thus addresses the challenge of intermittent data storage access, particularly when there is an urgent need to reload historical data, say to meet compliance-related queries.



Elastic and Scalable

A cloud-based solution offers users the capability to provision cloud resources such as computing, storage, and cache services instantaneously. This infrastructure-level flexibility allows workload fluctuations, both planned and unplanned, to be handled elastically without any upfront investment. The elastic and scalable nature of the cloud, along with the pay-as-you-go model, aligns well with enterprise needs, giving the business a more transparent and assured view of its IT resource consumption.

Interoperable

Since the cloud is available over the Internet and can easily provide interoperable endpoints such as REST and SOAP, the architecture supports easy integration with external services. Relatively easy and quick integration with externally available endpoints lets enterprises add external dimensions to their analysis. These rich sets of external dimensions allow an enterprise to factor into its analysis anything from competitor data and national or international growth data to neighborhood safety, climate effects, or new stores and services in the neighborhood.
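As a hedged illustration of such integration, the Python sketch below pulls an external dataset over a plain REST/JSON endpoint and blends it with an internal measure; the URL, parameters, and response shape are illustrative assumptions, not an actual marketplace API.

```python
import json
import urllib.request

# Hypothetical external dataset endpoint (e.g., a data marketplace feed).
DATASET_URL = "https://marketplace.example.com/demographics?region=south"

def fetch_external_dimension(url):
    """Pull an external dataset over a plain REST endpoint returning JSON."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))

# Blending an external measure with an internal one (canned data shown here
# so the sketch runs without the hypothetical endpoint being live):
demographics = {"region": "south", "population": 2_500_000}
internal_sales = {"south": 1_200_000.0}
revenue_per_capita = internal_sales["south"] / demographics["population"]
print(round(revenue_per_capita, 4))
```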

Available Anytime, Anywhere

The cloud is ubiquitously available and can be accessed through standard HTTP protocols. Enterprises do not have to spend extra money or resources to make the solution available over the Internet, and concerns such as provisioning and hardening become inconsequential. The cloud helps enterprises support multiple delivery channels, allowing information to reach stakeholders, including employees, mobile field agents, and external partners, with ease.

    Even as the cloud computing platform is growing, different vendors are adding to the rich set of building blocks required to develop enterprise applications on the cloud. The basic principle in developing these building blocks is to be able to integrate easily and quickly. All the vendors are striving for open and interoperable standards of integration, making it easier to use these enterprise application services on any cloud platform. It also delivers the advantage of making the system agile to handle system changes required to address dynamic business and technical needs.

These characteristics of the cloud computing platform make it possible to implement large BI solutions easily and relatively inexpensively. Cloud computing platforms are maturing, and cloud vendors are working hard to increase the functional and technical richness of their offerings and drive innovation. These innovations should help enterprises manage better, make decisions more easily, and become more competitive.

We will explore Microsoft Azure, a public cloud platform that offers Platform as a Service (PaaS), for developing the next-generation cloud-based BI solution. PaaS offers hosted, scalable application servers with the necessary supporting services such as storage, security, and integration infrastructure. The PaaS platform also provides development tools and application building blocks for developing custom solutions on the cloud. Though we have selected PaaS for our proposed solution, there are two other cloud delivery models, Software as a Service (SaaS) and Infrastructure as a Service (IaaS), which we discuss briefly later in this paper.


Azure-Based BI Solution

We will now attempt to explain a high-level design for a custom-built BI solution on Windows Azure.

Let us first get acquainted with the Azure terminology in the following table:

Windows Azure: A cloud operating system platform that provides computing capability on the cloud.

Azure Table Storage: An entity/key-value (tuple store) based storage service provided by Microsoft Azure to address large, structured, and scalable data storage.

Azure Blob Storage: Large and scalable data storage made available by Microsoft Azure for unstructured data such as documents and media files.

Azure Queue: A queue service offered by Microsoft Azure for message orchestration and asynchronous request processing.

SQL Azure: A relational database capability similar to SQL Server, made available by Microsoft Azure to address relational database needs on the cloud.

Web Role: A web server instance for running web applications, readily accessible at HTTP/HTTPS endpoints. A web role is simply a web server provided by Microsoft Azure.

Worker Role: A computing instance for executing long-running processes on Microsoft Azure.

VM Role: A role used to run a virtual hard disk image: store the image in the cloud, then load and run it on demand. This role is highly suited to moving legacy applications to the cloud with minimal effort.

AppFabric Service Bus: A service-bus-like messaging platform on the cloud that allows on-premise applications to be exposed externally and to connect seamlessly with other systems.

AppFabric Access Control Service (ACS): A claims-based authorization service that supports federated access to enterprise systems and services on the cloud. All authorization rules can be abstracted out of the application and managed independently from ACS in a standards-oriented way.

Windows Azure Data Marketplace: An information marketplace that acts as an external dataset provider. These datasets can be consumed by the BI stack to leverage external dimensioning metrics such as demographics, location, and other publicly available information to enrich the analytical reporting capabilities.

Windows Identity Foundation (WIF): An identity management framework that externalizes identity-related logic from an application. Federated single sign-on scenarios involving multiple stakeholders can be built on this framework. For the enterprise, this also helps integrate on-premise Active Directory-based authentication with the Azure-deployed application.

High-Level Design for Custom-Built BI Solution on Azure

Owing to concerns around data privacy, security, and data ownership, enterprises have been cautious in adopting cloud computing. At the same time, they have shown a keen interest in leveraging the value proposition offered by the cloud and the potential opportunity it presents for growing their businesses.

Keeping these key aspects in mind, a hybrid BI solution is proposed to alleviate enterprise challenges. As shown in the figure below, the proposed solution divides the architecture into two distinct facets: on-premise components and cloud components.


    On-Premise Components

    Data Cleansing and Profiling Agent

This agent would be responsible for collating transactional and unstructured data from on-premise systems, cleansing the data, and uploading it to a data warehouse built on Azure table storage. The component can be extended to handle disparate data sources such as Oracle, SQL Server, mainframes, and Excel data. Cleansing and profiling would also be configurable to business-specific rules, such as excluding soft-deleted data or transactional data that is not yet in the published state. The data transfer from the agent to the cloud would happen over a secured channel. This agent is usually part of the Extract, Transform, Load (ETL) component.
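The sketch below shows, in Python, the shape such an agent could take; the source rows, rules, and upload stub are illustrative assumptions rather than the actual component, which would typically be built with an ETL tool.

```python
import json

def cleanse(rows):
    """Apply business-specific rules: drop soft-deleted and unpublished rows."""
    for row in rows:
        if row.get("is_deleted") or row.get("state") != "published":
            continue
        # Normalize fields so heterogeneous sources (Oracle, SQL Server,
        # mainframe extracts, Excel) share one target schema.
        yield {"id": row["id"], "amount": float(row["amount"]), "source": row["source"]}

def upload(rows):
    """Stand-in for the secured (HTTPS) upload to cloud table storage."""
    payload = json.dumps(list(rows))
    print("POST", payload)  # a real agent would send this over SSL

upload(cleanse([
    {"id": 1, "amount": "10.5", "source": "CRM", "state": "published"},
    {"id": 2, "amount": "7.0", "source": "CRM", "state": "draft"},  # filtered out
]))
```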

    Data Integration Layer

Based on the criticality of information, an enterprise may have its structured data categorized into different levels. We will discuss data integration approaches that cover both mission-critical and non-mission-critical data.

Exposing master data on the cloud, without having to upload it to cloud storage, keeps data privacy and ownership in the hands of the enterprise. This avoids the need to physically store confidential data, such as credit card details, customer addresses, and employee salary information, on the cloud. Instead, the data is fetched from the enterprise as and when required.

    Figure 3: High-Level Design for Custom-Built BI Solution on Azure


An on-premise component that forms part of the integration layer would help expose the master data to the cloud. Technically, this can be achieved by leveraging the Azure AppFabric service bus. With its service virtualization capabilities, the AppFabric service bus allows on-premise components or services to be exposed on the cloud without physically moving the data outside the enterprise. It provides a publicly accessible virtual endpoint on the cloud for any on-premise service endpoint it manages. The channel of communication between the Azure AppFabric service bus and the on-premise service can be secured at the transport level using SSL and at the message level using standard encryption techniques.

To avoid latency issues, which could arise from the extra network hop between the on-premise and cloud environments, distributed caching can be implemented on the cloud. The analytical engine deployed on the cloud can embed a caching component, such as Azure AppFabric Cache, to cache regularly used master data and thereby reduce the effects of latency.

Data integration achieved using service virtualization addresses data security concerns, but it comes at the cost of performance. It is therefore advisable that non-critical data be transported to and physically reside on the cloud, closer to the hosted application. This can be achieved by leveraging existing data integration techniques such as ETL, Change Data Capture (CDC), and Enterprise Information Integration (EII), implemented using a tool such as Microsoft's SQL Server Integration Services (SSIS).

    Power Pivot

Power Pivot for Excel is a data analysis tool that delivers considerable computational power directly within MS Excel, a tool users are already well acquainted with. Power Pivot offers a user-friendly way to perform data analysis using familiar Excel features such as the common MS Office user interface shell, PivotTable and PivotChart views, and slicers. It lets users analyze data marts offline, without a live connection, enabling both on-premise and on-the-move analysts to perform focused analysis on the data marts at their own convenience.

    ADFS 2.0

ADFS 2.0 is an identity provider service that enables an enterprise-level identity federation solution. It is built on Windows Identity Foundation (WIF) and makes it easy to integrate web applications with authentication and authorization against on-premise Active Directory user stores. The BI portal solution proposed here would implement claims-based authentication using WIF and ADFS 2.0, allowing enterprise users to log in to the system with their existing Active Directory credentials.

    Azure Components

    Cloud Data Warehouse

All the collated data uploaded by the cleansing and profiling agent would be stored in Azure table storage. Azure table storage is highly scalable and is an appropriate fit for persisting de-normalized data owing to its entity-attribute-value (tuple store) style of storage. No analytical processing or advanced queries would be run directly against the data warehouse; hence, the cheaper Azure table storage is a better option than relational stores such as SQL Azure. Azure storage, through blobs, can also persist the data warehouse's metadata along with unstructured data such as files, documents, scanned images, and video files.

The inexpensive storage capability delivered by table storage frees data warehouse administrators from having to deactivate historical data, a practice often followed in earlier BI systems owing to the capacity limits of on-premise storage. The CAPEX spending normally involved in expanding storage to meet enterprise growth is also eliminated. Under the pay-as-you-use pricing model of Windows Azure services there would be a rise in OPEX spending, but one that aligns more closely with the demands of the growing business. A detailed assessment of the existing system, along with a year-on-year ROI analysis of the Azure platform, can provide a clear picture of the overall savings and business value that can be realized in the future.

    Analytical Engine

    The analytical engine is the most important component in the BI solution. The analytical engine:

    Prepares data required for focused analysis

    Applies algorithms for processing data based on different facts, measures, and dimensions

Analyzes structured and unstructured information to reveal patterns and predict trends that are usually difficult to spot with the naked eye or through traditional reporting

Identifies cases or exceptions in the data to isolate anomalies


At the time of writing, SQL Server Analysis Services (SSAS) is not provided as part of the SQL Azure services. Hence, a custom component must be built to provide analysis services, cube formation, and cube-querying functionality on SQL Azure.
Note: As of the Windows Azure 1.6 release (November 2011), running SSAS on Azure VM roles is not supported by Microsoft. Hence, until Microsoft recognizes SSAS as a first-class citizen of the cloud, we suggest using the data-mart processor approach described below.

    In the proposed solution, the analytical engine has the following parts:

    Batch Process (Azure worker role): This Azure worker role would be responsible for the creation of data marts and offline reports.

Data-Mart Processor: Responsible for creating new data marts (SQL Azure tables) from the data warehouse (Azure table storage) for focused analysis. The multiple requests submitted by analysts from the BI portal to create data marts would be handled asynchronously as batched requests, implemented using Azure queues (see the sketch after this list).

Offline Report Generator: Responsible for periodically generating standard reports and storing them in Azure blobs, making them readily available to the BI portal. This component would generate standard reports according to the configuration stored in Azure table storage.

Real Time Analytics (Azure web role): This Azure web role is one of the most important components for analysis. It would be responsible for fetching data from the data marts and presenting it on the BI portal. Presentation of dynamic reports and KPI matrices on the BI portal, as well as generation of ad-hoc reports on existing data marts, is achieved through this component. It services analysis requests synchronously against the existing data marts, making real-time analysis possible.

Data Marts: Since the proposed data warehouse is created using Azure table storage, which is entity-attribute-value based and non-relational, we propose creating the data marts as SQL Azure tables. This is primarily because existing analytical engines can leverage the premium RDBMS capabilities offered by SQL Azure on the cloud without any changes. SQL Azure is a relational database and makes it easy to fetch data using complex analytical queries. Power Pivot provides a quick and powerful analysis tool for use with SQL Azure. Moreover, the BI portal would be able to generate the desired reports and analyses out of SQL Azure.

    Application Data: Application data comprises configuration and customization data required as a part of the BI solution.

SQL Azure Reporting Services Reports: As part of the BI solution, standard reports can be configured using SQL Azure Reporting Services (SARS) and made available from the BI portal.

Standard Reports: As part of the BI solution, standard reports need to be generated on the data using specific dimensions and measures. These reports can be generated in a batch process to reduce latency and made available at all times. As explained previously, the batch analytics component running on the Azure worker role generates them periodically.

BI Portal: This is the web portal hosted on an Azure web role. It interacts with the analytical engine to generate dashboards, ad-hoc reports, and visual analyses of data across multiple dimensions and measures. The BI portal would be accessible anywhere over the Internet and would be made available over multiple delivery channels, including desktop, mobile, and PDAs.

Windows Azure Data Marketplace Dataset External Measures: The analytics engine can be configured to use specific datasets exposed through the Windows Azure data marketplace. These datasets would be used as external measures, alongside the data mart measures, for analysis. Examples of such external measures include demographic information about customers, upcoming businesses or stores in nearby locations, and weather conditions impacting sales for a specific location.
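The following Python sketch illustrates the queue-driven batch processing described above, with in-memory stand-ins (all names hypothetical) for the Azure queue, table storage warehouse, and SQL Azure data mart tables; a real worker role would call the corresponding Azure services instead.

```python
import json
import queue
import time

# In-memory stand-ins for the Azure queue and storage, purely for illustration.
request_queue = queue.Queue()
warehouse = [{"region": "South", "revenue": 100.0},
             {"region": "North", "revenue": 80.0}]
data_marts = {}

def build_data_mart(region, mart_name):
    """Copy filtered warehouse rows into a dedicated data mart."""
    data_marts[mart_name] = [r for r in warehouse if r["region"] == region]

def worker_loop(max_idle_polls=3):
    """Dequeue data mart creation requests and process them asynchronously."""
    idle = 0
    while idle < max_idle_polls:
        try:
            msg = request_queue.get(timeout=1)  # analogous to reading an Azure queue
        except queue.Empty:
            idle += 1                           # back off when the queue is empty
            continue
        req = json.loads(msg)
        build_data_mart(req["region"], req["mart"])

request_queue.put(json.dumps({"region": "South", "mart": "south_sales"}))
worker_loop()
print(data_marts["south_sales"])
```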

Design Considerations

Geo-location and affinity group: Applications developed on Windows Azure can be deployed across multiple data centers located around the world: South Central US, North Central US, West Europe, East Europe, East Asia, and South East Asia. The Windows Azure global footprint is growing rapidly as Microsoft continues to build new data centers for Azure deployment. Selecting appropriate data centers and creating an affinity group for deployment should be considered for the following reasons:



Regional legislations/regulations: These address the regulatory requirements of deploying the application and its data within a specific geographical location. Organizations have to abide by certain compliance requirements that oblige them to keep their data geographically close to the region of business operations. Such requirements can be addressed by deploying the Azure application in an appropriate data center.

Performance: Data center proximity to end users helps reduce network latency and improve overall application performance. Creating an affinity group for application and data instances deploys these components within the same data center, bringing them closer together. Inter-process communication within the same affinity group is faster and improves application performance, especially when large data transfers are involved in activities such as reporting and data mart creation.

Caching: Caching frequently used data, such as reference data and infrequently modified data, helps reduce data access calls and the latency of serving requests. Moreover, since multiple roles run in the Azure load-balanced environment, a distributed caching system such as Windows Azure AppFabric Caching or distributed Memcached should be considered; a cache-aside sketch follows below.
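A minimal cache-aside sketch in Python, with an in-process dict (a deliberate simplification) standing in for the distributed cache, and a simulated slow master-data fetch:

```python
import time

_cache = {}
TTL_SECONDS = 300  # refresh cached reference data every five minutes

def fetch_reference_data(key):
    """Simulated slow fetch from the on-premise master data service."""
    time.sleep(0.1)
    return {"key": key, "value": "master-record"}

def get_with_cache(key):
    """Serve from the cache when fresh; otherwise fetch and repopulate."""
    entry = _cache.get(key)
    if entry and time.time() - entry["at"] < TTL_SECONDS:
        return entry["data"]
    data = fetch_reference_data(key)
    _cache[key] = {"data": data, "at": time.time()}
    return data

print(get_with_cache("customer:42"))  # miss: fetches and caches
print(get_with_cache("customer:42"))  # hit: served from the cache
```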

Partition keys for table storage: Partition keys used for the data warehouse should not create partitions so large that queries on Azure become inefficient, and partition keys should be used in all queries for better performance. A sketch of one possible keying scheme follows.
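This Python sketch derives PartitionKey/RowKey-style identifiers for a hypothetical sales fact, grouping rows by month so that month-scoped queries touch a single partition; the field names are illustrative assumptions.

```python
def make_keys(record):
    """Derive PartitionKey/RowKey-style identifiers for a fact record."""
    partition_key = record["date"][:7]  # e.g. "2011-11" groups rows by month
    row_key = f'{record["store_id"]}-{record["txn_id"]}'  # unique within a partition
    return partition_key, row_key

print(make_keys({"date": "2011-11-28", "store_id": "S01", "txn_id": "977"}))
```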

Communication security for data in transit: Transport-level security should be ensured using SSL. For highly confidential data, message-level security, such as encryption and signatures, should also be considered.

Processing model: Business use cases should be analyzed to choose between online and batch processing. Long-running processes can be scaled effectively using the worker role approach for computation tasks, and message-queue-based asynchronous processing also provides data and processing reliability.

SQL Azure partitioning: Where the data mart grows beyond the 150 GB limit of a single SQL Azure database instance, consider horizontally partitioning a few tables. High-growth tables are natural candidates, with range-based keys or a hash of the keys used to identify a specific partition, as in the sketch below.
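A short Python sketch of hash-based horizontal partitioning; the shard names and shard count are illustrative assumptions:

```python
import hashlib

# Route each row to one of N database instances by hashing its key.
SHARDS = ["sales_mart_0", "sales_mart_1", "sales_mart_2"]

def shard_for(key):
    """Pick a stable shard for a row key via a hash of the key."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]

print(shard_for("customer-1042"))  # the same key always maps to the same shard
```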

Other Cloud-Based BI Implementation Models

According to the US National Institute of Standards and Technology (NIST), the cloud comprises three service models: SaaS, PaaS, and IaaS. The design of the cloud-hosted BI solution explained in this paper was made within the boundaries of the PaaS service model, realized using Microsoft Azure. The other cloud models available for implementing BI solutions are as follows:

SaaS: This is the highest abstraction of the cloud. In this model, a finished application or solution is offered as a service, akin to a packaged product with support for limited customization delivered through the cloud. Since it is a standard packaged solution, enterprises may face limitations in mapping their unique customizations and heterogeneous data stores onto it. SaaS may be a good fit for smaller organizations with limited BI needs.

IaaS: This is the lowest abstraction of the cloud. In this model, vendors provide basic hardware and software infrastructure as a service, and customers deploy their own software, from the operating system up to the end application. Under this model, enterprises have to address software licensing and deployment themselves, which limits the benefits of the cloud computing platform.

    Enterprises can select their cloud platform based on the criteria described in the figure below, driven by factors that make business sense in their respective domains.
[Figure 4 evaluates On-Premise, Private (IaaS), and Public (PaaS, SaaS) options against selection criteria: flexibility; ease of management (hardware, software, and infrastructure); control; functional richness; application building blocks; security and compliance; time to market; QoS (scalability, availability, reliability, and performance); and preferred procurement choice (buy, build, or subscribe).]

Figure 4: Business Intelligence Platform Evaluation Model


    The above evaluation model summarizes the business value realized in implementing a cloud-based BI solution on different cloud service models. A model of this nature can help guide enterprises in selecting the most appropriate cloud service by mapping the expected outcome of their BI initiatives to the business value realized from the different cloud service options available.

Concerns About BI in Cloud/Azure

The cloud platform addresses most of the challenges enterprises face in implementing and managing a traditional on-premise BI solution. However, there are a few concerns around using the cloud for BI. These concerns are common to any cloud implementation and are not specific to BI; let us briefly discuss them from a BI cloud adoption point of view. The most talked-about concern is data security and compliance.

Enterprises have concerns about placing their confidential data on the cloud, where it would be replicated onto multiple servers. Technically, cloud technology treats all data in a similar fashion, and that raises concerns around information security. Practically addressing this problem requires compliance rules to be amended to keep pace with the technology's evolution. At the same time, cloud vendors need to provide mechanisms that meet compliance requirements more effectively. Until then, a hybrid solution such as the high-level design proposed in this paper, wherein critical data is stored on-premise but exposed as a service for integration and aggregation purposes while transactional data is stored in the cloud, is an option worth exploring.



Conclusion

As cloud computing evolves and grows every day, it will bring several distinct changes. We foresee changes in compliance requirements and a shift in mindset toward making optimized use of cloud technology from the decision support system perspective. BI, as elucidated, has a peculiar nature and needs a customized solution approach. An integrated BI solution combining on-premise and cloud-based deployments is the most suitable option available, not only to realize the benefits of the cloud but also to address enterprise concerns around it.

This paper has discussed in detail how Microsoft Azure can be a good fit for an enterprise looking to optimize its BI solution today while enriching it for the future. It also lays out an integration pattern for hybrid cloud and on-premise solutions developed using Windows Azure. This pattern is not limited to BI; it can also be applied to problem domains such as disaster recovery, data backup, seasonal campaigning, and collaboration solutions. We hope to see strong interest in developing greenfield BI solutions, migrating existing BI solutions, or using the proposed aggregation design for implementing solutions on Windows Azure.

References

http://www.powerpivot.com/

    http://msdn.microsoft.com/en-us/security/aa570351.aspx

Acknowledgement

Sachin Kumar Sancheti, Technical Architect, for his immense contribution in preparing the initial draft and for the technical input he provided during his tenure in the organization.

Yogesh Bhatt, Principal Architect, Infosys Labs, and Sudhanshu Hate, Senior Technology Architect, Infosys Labs, for reviewing the paper.

About the Author

Sidharth Subhash Ghag ([email protected]) is a Senior Technology Architect with the Microsoft Technology Center (MTC) at Infosys. With several years of software industry experience, he currently leads solutions in Microsoft technologies in the area of cloud computing. He has also worked in the areas of SOA and service-enabling mainframe systems, and in domains such as finance, utilities, and transportation. He has been instrumental in helping Infosys clients service-orient their legacy systems, and he currently helps customers adopt cloud computing within their enterprises. He has authored papers on cloud computing and service-enabling mainframe systems. Sidharth blogs at http://www.infosysblogs.com/cloudcomputing
About Infosys

Many of the world's most successful organizations rely on Infosys to deliver measurable business value. Infosys provides business consulting, technology, engineering, and outsourcing services to help clients in over 30 countries build tomorrow's enterprise.

For more information, contact [email protected] | www.infosys.com

© 2012 Infosys Limited, Bangalore, India. Infosys believes the information in this publication is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of the trademarks and product names of other companies mentioned in this document.

