Copyright © 2006 B-EYE-Network.com
Prepared for Sybase, Inc. by Judith R. Davis
JANUARY, 2006
Right-Time Business Intelligence: Optimizing the Business Decision Cycle
TABLE OF CONTENTS
Executive Summary
Introduction
    The Business Objective
    What is Right-Time Business Intelligence?
    The Customer Perspective
Right-Time Business Intelligence Solutions
    Evolution of Business Intelligence Applications
The Three Components of Latency
Case Study: Shopzilla
    Organization Background
    The Business Problem
    The Right-Time Business Intelligence Solution
    Implementation Advice
    Summary of Benefits/Return on Investment
    Summary of Features Most Important to Success
    Future Directions
Conclusions
About Sybase
    Company Background
    Sybase IQ
    Other BI solutions
Copyright © 2006 B-EYE-network.com
All Rights Reserved
EXECUTIVE SUMMARY
Right-time business intelligence (BI) is all about delivering the right information in the right
format to the right people at the right time for decision-making purposes. Right-time BI optimizes
(“right-sizes”) the time latency between when a business event occurs and when an appropriate
action is taken. In business intelligence, this usually means shortening the overall decision cycle.
There are three segments of business intelligence latency: data latency, analysis latency, and
action latency.
Right-time business intelligence encompasses any BI data collection, analysis, or action that falls
between once-a-day and real-time. Right-time business intelligence is used to run an
organization’s daily business with the ability to modify the business intra-day. Right-time BI
applications have evolved over time to focus on operational business intelligence, the integration
of BI functionality within day-to-day business operations. Technologies for this include near real-
time updates of data-warehouse data from operational systems and the packaging of BI
capabilities as services available on demand.
In this report, we present an in-depth case study describing the right-time BI solution
implemented by Sybase customer Shopzilla to improve the effectiveness of business decisions.
Shopzilla is a well-known online shopping portal. The company uses Sybase IQ to capture and
access merchant consumer survey data in a data warehouse.
One area we explored in the case study was key steps in implementing a right-time BI solution.
Shopzilla emphasized the importance of the decision to implement near real-time updates. Once
the decision is made, the company recommended answering a series of questions to address the
implications of the decision on the BI environment.
Benefits of Shopzilla’s right-time BI solution were impressive. The major benefit to Shopzilla and
its merchants is the ability to react quickly to frequently-changing business environments. This is
particularly critical in the online world. Other benefits include enabling merchants to customize
their consumer surveys with minimal technical support, automation of survey processing, fast
response time on queries while concurrently updating the database in near real-time, availability
of more historical survey data for analysis, and simplification of the overall Shopzilla BI
infrastructure. Shopzilla also identified a significant return on investment (ROI).
Shopzilla’s right-time BI solution provides a solid platform for future enhancements to extend the
value beyond the current implementation. Examples include tripling the size of the data
warehouse and adding many new users, exploring new BI platforms, and making the Sybase IQ
data warehouse more accessible to less technical users.
Emerging trends include centralizing the BI environment to provide a single view of the business and
encapsulating BI functions as common, reusable services. The study results highlight the
significance of right-time business intelligence as another operational system that adds value to
the business.
INTRODUCTION
THE BUSINESS OBJECTIVE
When it comes to intelligence about your business, the problem statement is simple. If you need a
piece of information and you don’t have it, you don’t have it. If you need a piece of information
now and you can’t get it now, you don’t have it when you need it. If you need these pieces of
information to solve a problem—say, figure out why your product isn’t selling in certain
markets—you aren’t going to be able to solve the problem. And if your competitor does have this
information about its business, you are at a distinct disadvantage. The point here is the
importance of knowing what information you need and when you need it—what is the right
information and when is the right time to look at it—and then ensuring your business can deliver it.
We’ve been talking about this for decades—getting the right information to the right people at the
right time in the right format. Companies are serious about achieving this goal in order to be
smarter about the way they do business, a prerequisite for survival and success in today’s
competitive, global marketplace. Companies are under more spotlights than ever before—new
government compliance regulations, highly-publicized customer-satisfaction surveys, and even
news stories in addition to the traditional illumination of financial reports. Customers have high
expectations; there is a shrinking margin for error and little tolerance for lack of good business
intelligence.
Who are my customers? Which ones generate the most value for the company? What do my
customers want from a relationship with my company? When a customer calls me, what do I
already know about that customer and how can I use the information to serve the customer
better? And look really smart because I have the customer’s profile at my fingertips, plus the
context I need to decide how to handle this particular call from this customer. “Well, Mr. Smith, I
see that the red widget you are calling about was purchased within the past month from our
downtown store, and this is your second call about this problem…” Detailed data about the
customer is transformed and integrated into useful information on which to make effective,
everyday decisions at every level of the business. If we extend this approach to all aspects of the
business—suppliers, products, etc. in addition to customers—we have the foundation for “right-
time” business intelligence (BI).
WHAT IS RIGHT-TIME BUSINESS INTELLIGENCE?
Right-time business intelligence optimizes the time latency between when a business event occurs
and when an appropriate action is taken. The goal is to “right-size” the decision-making cycle, the
time lag between knowing what is happening based on internal and/or external events, and taking
appropriate action based on that knowledge. According to a recent article by Richard Hackathorn,
right-time takes into consideration the potential trade-off between time-to-action and the business
value of the action.[1] Making a decision today may have greater or lesser value than making the
decision next week.
In the BI arena, however, optimizing the decision cycle typically means shortening it, or
compressing it. It doesn’t necessarily mean minimizing the time lag, or automatically assuming
that every decision process must be completed in “real-time.” The key is to define the “right”
time for each decision cycle, one that reflects business realities and the trade-offs between risk
and cost. It is extraordinarily expensive to create a completely real-time organization. Even if an
organization can afford real-time decision-making, it may not be necessary or worth the cost.
The state-of-the-art in business intelligence for a long time was a data warehouse (DW) and/or
data marts updated overnight (within the traditional “batch window”) with data from legacy
operational systems. The overnight updates extracted operational data in batch, transformed the
data into a format for analysis (e.g., denormalized data, multidimensional OLAP cubes), and
loaded it into the data warehouse. This approach still works for businesses where analysis of daily
data is “right time.”
Over the past five years or so, however, organizations have explored technology to support more
real-time data collection, analysis, and decision-making in a BI environment. The goal is to
support intra-day analysis of up-to-date information with the ability to make immediate decisions
about the business. Example: Flight 876 has been cancelled because of mechanical problems and
we need to re-book all the passengers within the next 15 minutes. Oh, and by the way, three are
platinum frequent flyers with over five million miles each. What are our options?
We define right-time business intelligence as any BI data collection, analysis, or action that falls
between once-a-day and real-time. It is BI used to run an organization’s daily business with the
ability to modify the business intra-day. Organizations can shorten the decision-making cycle in
many ways to facilitate right-time business intelligence. Current trends include:
• Moving data from operational systems into the data warehouse more often than once a day or on a continuous basis.
• Integration of data to provide a current “single view” of the customer, of the business, etc. with up-to-the-minute business context for analytical information.
• Extending access to analytical tools and information across the organization, enabling the retail buyer in the store, the call center operator, or a dashboard to take advantage of right-time BI functionality.
• Encapsulating BI functionality as reusable services that can be called “on demand” by any application.
In the next section, we describe the evolution of right-time business intelligence solutions. The
strategic direction here is to seamlessly integrate BI functionality into business processes. This
recognizes the importance of business intelligence as yet another “operational” system that adds
value to the business.
THE CUSTOMER PERSPECTIVE
For this report, we also interviewed Shopzilla, a Sybase customer that has implemented a right-
time BI solution to improve the effectiveness of business decisions and processes. Our goal was
to understand the real-world challenges of right-time business intelligence, how the solution was
developed, and what lessons were learned. The case study below describes the specific business
problem, the right-time BI solution, benefits achieved, return on investment (ROI), features
critical to success, implementation advice, and future directions.
RIGHT-TIME BUSINESS INTELLIGENCE SOLUTIONS
EVOLUTION OF BUSINESS INTELLIGENCE APPLICATIONS
Let’s take a quick look at how business intelligence applications have evolved to support different
types of decisions and analytical time frames. Traditional approaches to business intelligence
include:
• Strategic business intelligence: Strategic use of business intelligence focuses on the achievement of long-term organizational goals such as increasing revenue or decreasing cost. The analytical time frame is typically 6 months to 1 year.
• Tactical business intelligence: Tactical use of business intelligence addresses interim, shorter-term organizational goals such as analyzing a marketing campaign for a specific product. The analysis is usually geared to achieving a strategic BI objective, and the analytical time frame involved is typically weeks or a few months.
Traditional uses of business intelligence are supported by data warehouses and data marts that are
updated daily (or less frequently) using overnight batch windows.
Organizations are now exploring operational business intelligence, or ways to optimize a wider
range of decisions by integrating BI capabilities within business operations. Operational business
intelligence uses business intelligence in running an organization’s daily business. An example is
monitoring sales during the day, enabling the organization to constantly modify its business
within the day. The analytical and decision-making time frame here is intra-day. The operational
BI space is where right-time business intelligence fits.
The ultimate in right-time business intelligence is near real-time business intelligence, where data
is analyzed as soon as possible after it flows into the organization. Examples of near real-time
business intelligence are credit-card fraud detection and financial risk analysis applications.
The goal is to close the gap between analytical applications and operational applications, creating
a closed-loop process. BI analytics become a service available when needed, or embedded within
an operational process. Ideally, this brings the benefits of business intelligence to a broader
population within an organization. People and processes in operational departments can now take
advantage of the power of business intelligence. Business intelligence is no longer a tool only for
power users or analysts.
The technologies for implementing operational business intelligence are different from those used
to implement traditional BI applications. For example, the concept of the overnight batch window
for updating the data warehouse may be replaced by the need to trickle-feed data changes into the
BI system during the operational day. Another important direction is using a service-oriented
architecture (SOA) to package BI capabilities as services available “on-demand” to business
processes. These BI services can then be linked directly into operational processes if appropriate.
Another way to categorize BI applications is whether they are event-driven or demand-driven.
• Event-driven business intelligence: An event-driven BI application monitors specific business operations, relates ongoing data events or changes to business rules, and generates alerts when appropriate. The BI application may also have the ability to take action itself based on knowledge embedded in the system. Thus, event-driven, right-time BI monitors the health and well-being of the organization. When exceptions occur, the system alerts someone and may include recommendations about what action should be taken, or the system may make a decision automatically. The closer an organization gets to real-time, the more event-driven it becomes.
• Demand-driven business intelligence: A demand-driven BI application is accessible to operational systems when analytical data is needed to make a business decision. An example of demand-driven business intelligence is a call center operator firing off a query against a federated database to get information about a customer on the phone.
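The two categories can be contrasted in a few lines of code. The following is a minimal sketch, not a description of any product: the `Event` and `Rule` types, the `hourly_sales` metric, the threshold, and the customer profile are all hypothetical names chosen for illustration. The event-driven path pushes alerts when incoming data violates a business rule; the demand-driven path simply answers a lookup when an operational system asks.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Event:
    """A single business data event, e.g. an hourly sales figure (hypothetical)."""
    metric: str
    value: float


@dataclass
class Rule:
    """A business rule that fires when `condition` holds for a matching metric."""
    metric: str
    condition: Callable[[float], bool]
    recommendation: str


def check_event(event: Event, rules: list[Rule]) -> list[str]:
    """Event-driven path: relate an incoming event to the rules and return alerts."""
    alerts = []
    for rule in rules:
        if rule.metric == event.metric and rule.condition(event.value):
            alerts.append(
                f"ALERT {event.metric}={event.value}: {rule.recommendation}"
            )
    return alerts


def lookup_customer(profiles: dict[str, dict], customer_id: str) -> Optional[dict]:
    """Demand-driven path: an operational system asks for data when it needs it."""
    return profiles.get(customer_id)


# Illustrative rule: alert when hourly sales fall below an assumed threshold.
rules = [Rule("hourly_sales", lambda v: v < 1000, "review promotion pricing")]
alerts = check_event(Event("hourly_sales", 750.0), rules)
no_alerts = check_event(Event("hourly_sales", 1500.0), rules)

# Illustrative lookup, as a call center application might issue it.
profile = lookup_customer({"c42": {"name": "Mr. Smith", "open_calls": 2}}, "c42")
```

In a real deployment the rules would live in the BI system and the lookup would run against a database; the sketch only shows which side initiates the interaction.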
THE THREE COMPONENTS OF LATENCY
Another context for right-time BI solutions is the component of the decision-making process each
addresses. There are three segments of latency on the continuum between when an event occurs
and when an informed action is taken—data latency, analysis latency, and decision/action latency
(see Figures 1 and 2 below). This concept was developed by Richard Hackathorn, Bolder
Technology, Inc.[1]
Data latency is the time it takes to collect raw data, prepare it for analysis, and store it where it
can be accessed and analyzed. Important functionality here includes data profiling, extraction,
validation, cleansing, transformation, integration, delivery, and loading. There is
a wide variety of tools and products that address one or more of these aspects of data latency.
These tools fall into several categories, including extract, transform, and load (ETL); data
replication; enterprise application integration (EAI); enterprise information integration (EII);
master data management (MDM); and others. The destination platform of the data is typically a
data warehouse or data mart. For more information on these technologies and which to use when,
please see the Data Warehousing Institute (TDWI) research report titled “Data Integration: Using
ETL, EAI, and EII Tools to Create an Integrated Enterprise” by Colin White of BI Research.[2]
Three Components of Latency
Figure 1. This shows the three components of latency between when a business event occurs and when action
is taken. The elapsed time may result in lost business value to the organization.
The Benefit of Reducing Latency
Figure 2. Reducing latency at one or more points in the decision-making time continuum can dramatically
increase the business value of the decision.
Analysis latency is the time it takes to access the data, analyze the data, turn the data into
information, apply business/exception rules, and generate alerts if appropriate. Analysis may be
done by a user or an application.
Decision latency is the time it takes to receive an alert, review the analysis, decide what action is
required, if any, based on knowledge of the business, and take action.
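The three segments add up to the total time between event and action, which makes them easy to measure once each hand-off is timestamped. The sketch below is illustrative only: the `DecisionCycle` class, its field names, and the sample timestamps are assumptions, not figures from the case study.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DecisionCycle:
    """Timestamps for one pass through the decision cycle (hypothetical fields)."""
    event_occurred: datetime      # the business event happens
    data_ready: datetime          # data is collected, prepared, and stored
    analysis_delivered: datetime  # information or an alert reaches the decision maker
    action_taken: datetime        # the decision is made and acted on

    def latencies(self) -> dict[str, timedelta]:
        """Split the total elapsed time into its three latency segments."""
        return {
            "data": self.data_ready - self.event_occurred,
            "analysis": self.analysis_delivered - self.data_ready,
            "decision": self.action_taken - self.analysis_delivered,
            "total": self.action_taken - self.event_occurred,
        }


# Made-up example: data lands 15 minutes after the event, analysis takes
# 5 minutes, and the decision maker acts 40 minutes after that.
cycle = DecisionCycle(
    event_occurred=datetime(2006, 1, 9, 9, 0),
    data_ready=datetime(2006, 1, 9, 9, 15),
    analysis_delivered=datetime(2006, 1, 9, 9, 20),
    action_taken=datetime(2006, 1, 9, 10, 0),
)
result = cycle.latencies()
```

Breaking the total down this way shows where the time actually goes. In the made-up example, the decision segment dominates, so faster data loading alone would shorten the cycle very little.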
Implementing right-time business intelligence involves detailed analysis of each business process
to look for opportunities to shorten the decision cycle. Only then can the organization assess
which BI technologies and products can be applied as part of a right-time solution. For example,
manufacturing companies with just-in-time inventories need updates throughout the day. In the
case of new compliance regulations, companies often need to report exceptions as soon as possible.
CASE STUDY: SHOPZILLA
ORGANIZATION BACKGROUND
Shopzilla is a well-known, online comparison-shopping portal. Its Web site is designed to help
Internet shoppers quickly find the products they want at the best price. Founded in 1996 as
BizRate.com, the company initially focused on collecting consumers’ ratings of their online
shopping experiences. After raising enough venture capital, the company expanded into
comparison shopping in the early 2000s. The company name was changed to Shopzilla in late
2004. Shopzilla is headquartered in Los Angeles.
Shopzilla is now an operating subsidiary of E.W. Scripps Co., a diverse media conglomerate with
businesses in areas such as newspaper publishing and broadcast TV. Shopzilla’s 2004 revenues
were $67.4 million. The number of employees is 210 and growing. Competitors include
Shopping.com, Yahoo! Shopping, NexTag, and Froogle.
Shopzilla’s Web site currently offers access to over 30 million products from more than 60,000
merchants in 2,500 categories, and these volumes are steadily growing. Shopzilla continually
updates its database of merchants, products, and prices to provide each site visitor with relevant,
up-to-date search results using its proprietary ShopRank search engine.
In addition to the U.S., Shopzilla has country-specific versions of its Web site for the United
Kingdom, France, and Germany. Shopzilla also provides shopping content for other Web sites,
such as AOL, Lycos, and Time Warner’s RoadRunner.
We interviewed Ranga Srinivasan, Director of Database Administration, and Anish Balakrishnan,
Director of Strategic Projects, for this case study.
THE BUSINESS PROBLEM
A major component of Shopzilla’s business, in addition to its shopping search engine, is customer
surveys. Shopzilla conducts customized surveys for many of its merchant partners—Circuit City,
Staples, and others. Shopzilla also conducts surveys on its own behalf. The BizRate market
research division summarizes, analyzes, and publishes the results of these surveys. Surveys serve
two purposes: (1) enable Shopzilla to rate merchants and (2) enable each merchant to improve its
business by directly analyzing its own survey results. Survey types include:
• An online point-of-sale (POS) survey: This survey collects customer feedback on the first part of the online ordering process: placing the order. Specific metrics addressed include evaluation of the online ordering process, overall satisfaction and loyalty, features influencing the purchase, and demographic information, among others.
• An online fulfillment survey: This survey collects feedback on the second part of the online ordering process: delivery and post-purchase support. In this case, the buyer receives an email from Shopzilla with a link to the survey. Metrics here are similar to those of the POS survey with a focus on post-purchase service and support. Results of fulfillment surveys are tied to POS surveys for the same consumer.
• An offline survey: Customers who buy something at a physical store are invited to provide feedback on the shopping experience via a secure URL. The customer logs on to a Web site to complete the survey. Offline survey results are treated separately from online results.
• An online panel survey: Shopzilla maintains a panel of over 800,000 buyers who can provide merchants with immediate answers to critical e-commerce questions. The panelists are called on to answer a broad range of questions, including trends in shopping behavior and awareness/effectiveness of marketing and advertising. These surveys can be targeted to distinct audiences based on demographics, Internet usage, purchasing behavior, etc.
From survey results, Shopzilla updates the merchant-rating information displayed on its Web site,
provides merchants with valuable customer feedback, and publishes its own market-research
reports.
In 2001, Shopzilla realized that its solution for survey data was too static and would not support
future requirements. Survey data was stored in a Sybase RDBMS, data was at least one day old,
query response was slow, ad-hoc reporting was not supported, and merchant surveys were not
customizable. Data volume was growing fast and there was a need for ad-hoc reporting, faster
response times, and more flexibility in addition to reliability and high availability.
A major business goal was to enhance the value of the Shopzilla relationship for the primary
customer—the merchant—and develop a closer connection with merchants. To do this, Shopzilla
wanted to offer merchants (1) the ability to completely customize their surveys based on unique
business requirements, (2) the opportunity to improve their business based on ad-hoc analysis of
18 to 24 months of survey results, and (3) near real-time access to their data with no technical
investment on their end.
Another key goal was to automate survey processing to minimize resources and time required to
update the database of survey results. Shopzilla also wanted to simplify the overall infrastructure
and provide compatibility with its Web operation. Last but not least, pricing and a low total
cost of ownership (TCO) were big issues for the company.
Given its competitive environment, Shopzilla developed a daunting list of strategic business
intelligence requirements to achieve these goals:
• 100% uptime: The business intelligence system needed to be available 24/7, 365 days a year. The challenge was to find a solution with both hot backup and no downtime for any maintenance activities, including schema changes.
• Near real-time data updates: Shopzilla wanted to update survey data in the data warehouse at least every 15 minutes while supporting concurrent ad-hoc queries.
• Instantaneous query response time for ad-hoc reporting: Shopzilla wanted fast query response time for both merchant-survey analysis and internal analysis, regardless of the complexity of the query and the amount of underlying data. The company planned to store at least 18 to 24 months of data in the data warehouse.
• Ability for merchants to customize surveys without requiring technical assistance: Shopzilla wanted to give merchants the flexibility to design their own survey questions.
• Support for a large number of columns per table: Shopzilla anticipated the need for upwards of 1,500 columns in its main survey tables, and the ability to automatically add columns and tables on-the-fly to capture custom survey questions.
• Ability to easily migrate to a new database structure: Shopzilla realized that a column-oriented database was the best architecture. The solution also had to support Transact-SQL and provide easy data migration from Sybase, Microsoft SQL Server, and Oracle.
• Handle increasing volumes of data: Shopzilla also knew that as the number of merchant clients increased, there would be a resulting increase in data volumes and database size. The data warehouse had to adapt seamlessly to such growth without adversely impacting query performance and near real-time data-refresh rates.
THE RIGHT-TIME BUSINESS INTELLIGENCE SOLUTION
Shopzilla’s right-time business intelligence solution is focused primarily on processing and
analyzing survey results. This is where Shopzilla needs near real-time updates and fast response
time on ad-hoc queries and reports. To handle the collection of survey results and the merchant
reporting system, the company implemented Sybase IQ version 12.5 on Sun servers using the
Sybase IQ Virtual Backup strategy.
Shopzilla’s environment also includes Sybase Adaptive Server Enterprise (ASE) for transactional
data. ASE feeds data overnight to Sybase IQ via flat ASCII files. As a note, Shopzilla also
maintains a traditional data warehouse in Oracle that is updated once a day from flat files and
ASE database tables. This is used for internal reporting on more static financial and historical data.
NEW SURVEYS. For a new survey, Shopzilla provides a standard survey form that the merchant
customizes. Shopzilla then enters the customized survey data into the Sybase ASE system and the
survey goes live when scheduled. Merchants can request Shopzilla to change a survey at any time
to add, delete, or modify questions. When Shopzilla updates the database, the change is
immediately reflected in the survey.
COLLECTING SURVEY RESULTS. When a customer takes a survey, he/she is connected to a
Shopzilla server. The survey questions come from the ASE transactional database to ensure that
all customers get the same survey from each merchant. Surveys are divided into pages, with five
questions on each page; a survey can have several pages. Shopzilla already has the customer’s
order number, email address, and other identifying information (even if the customer cancels
while in the middle of taking the survey). A pre-script checks survey answers for consistency
and validity. For example, if someone says he is 13 years old and has four children, the script
clears the answer as incorrect.
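A consistency check of this kind can be sketched as follows. This is an illustration under stated assumptions, not Shopzilla's script: the field names `age` and `children`, the plausible age range, and the minimum parent age of 16 are all hypothetical. The report also does not say which answer is cleared, so the sketch clears both conflicting answers.

```python
from typing import Optional

# Assumed threshold for illustration; the case study gives only the
# "13 years old with four children" example, not the actual rule.
MIN_PLAUSIBLE_PARENT_AGE = 16


def validate_answers(answers: dict[str, Optional[int]]) -> dict[str, Optional[int]]:
    """Return a copy of `answers` with implausible values cleared (set to None)."""
    cleaned = dict(answers)
    age = cleaned.get("age")
    children = cleaned.get("children")

    # Range check: an age outside a plausible range is cleared on its own.
    if age is not None and not 0 < age < 120:
        cleaned["age"] = None
        age = None

    # Cross-field check: a very young respondent reporting children is
    # inconsistent, so both conflicting answers are cleared.
    if age is not None and children and age < MIN_PLAUSIBLE_PARENT_AGE:
        cleaned["age"] = None
        cleaned["children"] = None

    return cleaned


inconsistent = validate_answers({"age": 13, "children": 4})
consistent = validate_answers({"age": 35, "children": 2})
```

Clearing an answer rather than rejecting the whole page keeps the remaining answers usable for analysis.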
As the customer completes each page, it is saved on the server in a flat file. Another post-script
runs every 15 minutes to collect all completed survey pages (even if the customer hasn’t
completed the entire survey), reformat the flat files, and load the data into the Sybase IQ survey
database. “We would like to do this more frequently but can’t because query performance
degrades to an unacceptable level, especially during the holiday season when daily data volumes
increase significantly,” says Balakrishnan.
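The collect, reformat, and load cycle described above is a micro-batch pattern. The sketch below shows the general shape only, and nearly every detail is an assumption: the pipe-delimited `.page` file layout, the `survey_answers` table, and the use of an in-memory SQLite database as a stand-in for the Sybase IQ survey database.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# The 15-minute cycle from the text. A scheduler (cron, or a loop that sleeps
# for this many seconds) would call load_completed_pages() once per interval.
LOAD_INTERVAL_SECONDS = 15 * 60


def load_completed_pages(spool_dir: Path, conn: sqlite3.Connection) -> int:
    """Load every completed page file in `spool_dir`, then delete the files.

    Returns the number of answer rows loaded in this batch.
    """
    files = sorted(spool_dir.glob("*.page"))
    rows = []
    for path in files:
        with path.open(newline="") as fh:
            # Assumed layout: survey_id|page_no|question|answer
            for survey_id, page_no, question, answer in csv.reader(fh, delimiter="|"):
                rows.append((survey_id, int(page_no), question, answer))

    with conn:  # one transaction per batch
        conn.executemany("INSERT INTO survey_answers VALUES (?, ?, ?, ?)", rows)

    for path in files:
        path.unlink()  # remove files only after the batch has committed
    return len(rows)


# Stand-in target database (SQLite here, purely for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE survey_answers "
    "(survey_id TEXT, page_no INTEGER, question TEXT, answer TEXT)"
)

with tempfile.TemporaryDirectory() as tmp:
    spool = Path(tmp)
    # Two pages saved by the survey server; the second survey is still incomplete.
    (spool / "s1_p1.page").write_text("s1|1|q1|5\ns1|1|q2|4\n")
    (spool / "s2_p1.page").write_text("s2|1|q1|3\n")
    loaded = load_completed_pages(spool, conn)
    remaining = list(spool.glob("*.page"))

total = conn.execute("SELECT COUNT(*) FROM survey_answers").fetchone()[0]
```

Loading each batch in a single transaction keeps the number of write operations per interval small, which matters when queries run concurrently with updates.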
Shopzilla also designed a rule-based system to automate the process of adding new survey
questions to the database schema. Based on the type of question, the script decides what type of
column is appropriate for storing the data and creates the column on the fly if necessary as the
data is loaded into Sybase IQ. For certain types of new questions, the script creates a table. An
example is a question that can have more than one answer, such as “Where did you hear about
us?” The POS online survey table is the largest fact table in Sybase IQ, with as many as 1,600
columns. Shopzilla also builds multiple, highly-optimized indexes on each column to capture
averages, min-max data, etc.
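The rule-based approach can be illustrated with a small mapping from question type to schema change. This is a sketch under stated assumptions rather than Shopzilla's implementation: the question-type names, the column types, the child-table layout, and the table name `pos_survey` are hypothetical, and the generated statements are generic SQL that has not been checked against Sybase IQ syntax.

```python
# Assumed mapping from survey question type to column type (illustrative only).
COLUMN_TYPE_BY_QUESTION = {
    "rating": "TINYINT",
    "yes_no": "BIT",
    "number": "INTEGER",
    "free_text": "VARCHAR(2000)",
}


def schema_changes(
    table: str,
    columns: set[str],
    tables: set[str],
    question_id: str,
    question_type: str,
) -> list[str]:
    """Return the SQL statements needed before answers to a question can be loaded.

    Single-answer questions map to one column on the survey table. Questions
    that allow several answers get their own child table, one row per answer.
    An empty list means the schema already accommodates the question.
    """
    if question_type == "multi_choice":
        child = f"{table}_{question_id}"
        if child in tables:
            return []
        return [f"CREATE TABLE {child} (response_id INTEGER, answer VARCHAR(255))"]

    if question_id in columns:
        return []
    column_type = COLUMN_TYPE_BY_QUESTION[question_type]
    return [f"ALTER TABLE {table} ADD {question_id} {column_type}"]


new_column = schema_changes("pos_survey", {"q1"}, set(), "q2", "rating")
no_change = schema_changes("pos_survey", {"q1"}, set(), "q1", "rating")
new_table = schema_changes("pos_survey", {"q1"}, set(), "heard_about_us", "multi_choice")
```

Checking the existing schema first is what makes the step safe to run on every load: a question seen before produces no statements.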
MERCHANT REPORTING SYSTEM. Shopzilla provides a secure, online environment where
a merchant can run ad-hoc reports to analyze its survey data, look at trends, and compare results
with category and industry aggregates. The merchant can make projections for a category, view
volume growth, identify days with the highest sales volume, etc.
The merchant sees a list of questions on the survey, and can apply filters to the data. Standard
filters are retail versus business customers and date ranges. The merchant can apply up to four
additional filters to slice and dice the data for each query. Available filters are based on the
questions on that survey (e.g., age ranges, sex, marital status, product type, overall customer
satisfaction, etc.). Results are displayed as graphs. “This is a reporting tool for the merchant,
something that can even be taken into a board meeting to show quarterly or annual results,” states
Balakrishnan.
The merchant can see exact customer comments, and in some cases, may be able to match a
comment to a specific order. Suppose a customer says “No” to the question “Would you shop
here again?” and gives details about why. This gives the merchant useful information about ways
to improve the business. The merchant might also decide to take action to try to retain the
customer with a special offer. As a note, Shopzilla’s “quick response” feature sends comments
from Sybase IQ out to the merchant as soon as they are received.
An additional merchant benefit accrues from Shopzilla’s partnership with Coremetrics.
Coremetrics software analyzes Web site traffic and captures and stores all customer and visitor
clickstream activity to examine the causes of site abandonment. Shopzilla ties Coremetrics data to
Shopzilla survey results to add context for merchant analysis.
INTERNAL ANALYSIS. The business intelligence environment also includes ad-hoc queries
and reporting on the inventory of 30 million products. This is a fairly recent application of
business intelligence. For example, Shopzilla uses Sybase IQ to analyze how products are
categorized. It is very important to assign every product to the most relevant category to ensure
customers find what they want.
THE PERFORMANCE CHALLENGE. A major challenge for Shopzilla was figuring out how
to load data into Sybase IQ without impacting anyone reading the data. The goal was to handle a
mixed workload of data updates, schema changes to add columns to tables, and multiple readers.
In the 2001-2002 implementation timeframe, the Multiplex version of Sybase IQ did not support
schema changes on the fly. (It does in version 12.6.) So Shopzilla installed the Simplex version of
Sybase IQ. The company faced two significant technical issues.
One was the fact that when an update transaction on a table locked out readers, queries were
terminated. Shopzilla developed its own technology to efficiently interleave schema changes,
updates, and reader processes. A system of “handshakes” (using semaphore locks) between a
write process and reader processes ensures that reader processes on a table wait until the write is
complete. Schema changes don’t take much time, adding perhaps two seconds to each query in
the worst case.
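The handshake idea can be illustrated in a few lines. The sketch below is an assumption-laden stand-in, not Shopzilla's technology: it uses a Python `threading.Condition` in place of the semaphore locks mentioned above, and the `TableGate` class and its method names are hypothetical. New readers wait while a write or schema change is in progress, and the writer waits for queries already running to finish.

```python
import threading
from contextlib import contextmanager


class TableGate:
    """Per-table gate: many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_readers = 0
        self._writer_active = False

    @contextmanager
    def reading(self):
        with self._cond:
            while self._writer_active:       # wait until the write completes
                self._cond.wait()
            self._active_readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._active_readers -= 1
                self._cond.notify_all()

    @contextmanager
    def writing(self):
        with self._cond:
            while self._writer_active:       # one writer at a time per table
                self._cond.wait()
            self._writer_active = True       # from here on, new readers wait
            while self._active_readers:      # let in-flight queries finish
                self._cond.wait()
        try:
            yield
        finally:
            with self._cond:
                self._writer_active = False
                self._cond.notify_all()
```

A query would run inside `with gate.reading():` and a load or schema change inside `with gate.writing():`, so a short schema change delays queries briefly instead of terminating them.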
Second, to maintain data consistency, Sybase IQ creates a version of the data—a snapshot—for
each reader based on the last checkpoint. However, with frequent data loads, these “old” versions
proliferate, taking up memory space that could be otherwise used for processing ad-hoc queries.
This eventually leads to a situation where Sybase IQ runs out of memory and begins paging to
disk, resulting in unacceptable performance. To handle this, Shopzilla explicitly issues “rollback”
commands after every SQL query to release read locks on each table accessed. In addition, a
script runs every five minutes to kill idle connections to the database.
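Both housekeeping steps can be sketched on the client side. In the example below, `sqlite3` is only a stand-in for a database connection, and `ManagedConnection` and `close_idle` are hypothetical names; the paper describes a script that kills idle connections at the database, whereas this sketch simply closes them from the client.

```python
import sqlite3
import time


class ManagedConnection:
    """Wraps a DB-API connection and ends the transaction after every query."""

    def __init__(self, conn):
        self.conn = conn
        self.last_used = time.monotonic()

    def query(self, sql, params=()):
        try:
            return self.conn.execute(sql, params).fetchall()
        finally:
            # Roll back immediately so the server is free to discard the
            # snapshot (data version) this reader was holding.
            self.conn.rollback()
            self.last_used = time.monotonic()


def close_idle(connections, max_idle_seconds, now=None):
    """Close connections idle longer than the limit; return the ones kept."""
    now = time.monotonic() if now is None else now
    kept = []
    for managed in connections:
        if now - managed.last_used > max_idle_seconds:
            managed.conn.close()
        else:
            kept.append(managed)
    return kept
```

A scheduler would call `close_idle(pool, 300)` every five minutes to match the interval described above.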
To achieve its goals, Shopzilla has combined these and other optimizations with regular data
purges and Sybase IQ’s native performance functionality (e.g., data compression, highly-
optimized indexes, detailed query plans) to ensure users get the desired fast response time. No
query, not even the most complex ad-hoc query, takes more than 10-15 seconds.
IMPLEMENTATION ADVICE
Balakrishnan and Srinivasan recommend answering a series of questions before considering a
business intelligence solution based on near real-time data updates. The first and most important
question is: Do you need near real-time data? “Too often, this is a nice-to-have. The effort and
costs involved are significant enough to really question the necessity of this.” Once an
organization decides to go with a near real-time solution, consider the following:
• What is the projected total data volume and what percent needs to be near real-time? A high volume of data with a small percentage in near real-time is very different from a situation where most of the data needs to be near real-time.
• What is the projected growth of the data? If it is high, consider solutions that offer data compression.
• What is the average query response time desired? There is always a performance trade-off in a mixed environment of queries and near real-time updates. To achieve fast queries with frequent updates, an organization may have to place limits on how users access the data. For example, Shopzilla wanted instantaneous query response with updates every 15 minutes on a database containing two years of data. To achieve this, the company purges unneeded historical data as often as possible, and controls user access to data through an application. Every query is restricted to a maximum of one year of data; looking at two years requires two separate queries. The results (graphs) of both queries can be viewed together. This is a trade-off Shopzilla made to get both query and update performance.
• Does the system need to support ad hoc queries? If yes, look for a solution with highly-optimized indexes to make queries as efficient as possible.
• How much data is text-oriented? Shopzilla surveys can allow the consumer to enter comments. “We get huge comments [up to 8,000 characters each] which are a huge problem to handle in our data warehouse environment. The biggest issue is retrieving the text from the database. A year’s worth can be a lot of text. We store text comments in a file-based system and have a highly-optimized way of retrieving them.”
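The one-year limit per query mentioned in the list above amounts to splitting a requested date range into consecutive windows before any query is issued. The following is a minimal sketch; the function name and the 365-day definition of a year are assumptions, not details from Shopzilla's application.

```python
from datetime import date, timedelta


def split_into_windows(start, end, max_days=365):
    """Split the inclusive range [start, end] into windows of at most max_days days."""
    windows = []
    window_start = start
    while window_start <= end:
        window_end = min(window_start + timedelta(days=max_days - 1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows
```

For a request covering 2005 and 2006, the function returns two windows, one per calendar year; an application would run one query per window and display the resulting graphs together.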
SUMMARY OF BENEFITS/RETURN ON INVESTMENT
BENEFIT. The major benefit of right-time business intelligence to Shopzilla and its merchants—
ad-hoc query and reporting on near real-time survey data—is the ability to react quickly to
frequently changing business environments. In the online world, customer satisfaction can make
you or break you. Having current information about customer experiences is critical to identifying
the potential need for corrections and the ability to take timely action to maintain or improve
satisfaction.
ROI. Since Shopzilla went live with its right-time business intelligence solution in mid-2002, it
has maintained performance even though the volume of data has tripled and the number of users
has increased dramatically. Merchants now have access to 18 to 36 months of survey data instead
of just 13 months. And Shopzilla has not had to significantly increase its investment in hardware
and software, keeping the TCO low.
SUMMARY OF FEATURES MOST IMPORTANT TO SUCCESS
Balakrishnan states that Sybase IQ “gave us the groundwork on which we could build a solution
that solved our problems.” Specifically, Sybase IQ offers these important features:
• Instant query response time out-of-the-box
• The ability to quickly load data in batch
• The ability to have up to 45,000 columns in a single table
• Automatic data compression
• Low TCO
• Support for Sybase’s T-SQL provides compatibility with the ASE server
• The ability to build specialized indexes
• The ability to import data from Oracle and Microsoft SQL Server
• Easier to manage and maintain than almost any other DBMS
• The data warehouse can grow as the business grows (add more query servers to expand)
• Out-of-the-box high availability.
FUTURE DIRECTIONS
Shopzilla has spent the past two and a half years optimizing the Sybase IQ data warehouse. The
solution is now very mature, and the company is looking to move to the next level of capability.
Future directions include:
• Tripling the size of the data warehouse and adding many new users, both merchant partners and internal users.
• Moving to the Multiplex version of Sybase IQ on Linux to enhance performance, handle peak loads, and adapt to growth while keeping TCO and the cost of adding incremental capacity low.
• Evaluating use of Sybase Replication Server to move data in near real-time from ASE to Sybase IQ.
• Making the Sybase IQ data warehouse more accessible to less technical users through more intuitive interfaces such as the Excel-like SPSS client. This will enable Shopzilla to extend the scope of the Sybase IQ data warehouse with new applications.
CONCLUSIONS
Right-time business intelligence represents the state of the art in developing BI platforms to
support timely, effective business decisions. The goal is to synthesize all available data (internal
and external); produce an integrated, consolidated view of the business (including data, business
definitions, business rules, and organizational hierarchy); and make it available (either directly or
as a callable service) to business users, applications, and processes for analysis and decision-
making. The raw data on which to make effective decisions exists in most companies. It just isn’t
available when, where, and in the format necessary. Shopzilla recognizes this and has implemented
a solution to develop business intelligence and connect it to business-decision processes.
Shopzilla’s right-time business intelligence solution has enabled the company to satisfy a difficult
set of business requirements and provide an important service to its growing community of
merchant clients—fast and easy access to near real-time survey results. An intense, three-year
effort has resulted in a high-performance, near real-time business intelligence platform that offers
Shopzilla a solid foundation on which to grow the business. Right-time business intelligence is
and will continue to be a key factor for Shopzilla in achieving success in a highly-competitive,
fast-moving business environment.
The bottom line: It’s your business. Your goal should be to get it right with right-time
business intelligence.
References
1 Hackathorn, Richard. “The BI Watch: Real-Time to Real-Value,” DM Review, January, 2004.
2 White, Colin. “Data Integration: Using ETL, EAI, and EII Tools to Create an Integrated Enterprise,” research
report published by The Data Warehousing Institute, October, 2005.
ABOUT SYBASE
COMPANY BACKGROUND
Sybase (www.sybase.com) is a global software vendor focused on managing information
throughout the enterprise. The company provides open, cross-platform solutions for information
management and delivery. The goal is to enable customers to optimize and enhance current
investments, link existing data resources, and extend the reach of business-critical information to
users on the front lines. Key industries supported are financial, government, communications, and
healthcare. Annual revenues were $789 million in 2004.
In 1987, Sybase was the first RDBMS to incorporate intelligence in the DBMS engine itself with
features such as triggers and stored procedures. Sybase has pioneered other RDBMS
developments as well, including an open architecture and data replication. The company’s family
of database servers includes Sybase Adaptive Server Enterprise (ASE) for transactional
environments; Sybase IQ for business intelligence (BI) and data warehousing (DW); and Sybase
iAnywhere for mobile, embedded, and small-to-medium business (SMB) support. Sybase
Replication Server moves data among heterogeneous sources, including IBM DB2, Informix,
Microsoft SQL Server, and Oracle in addition to Sybase.
Sybase is headquartered in Dublin, CA.
SYBASE IQ
Sybase IQ is Sybase’s strategic analytics server for BI and DW environments. Sybase IQ provides
a real-time analytics server designed for very fast query response on large volumes of data.
Sybase IQ runs on HP-UX, IBM AIX, Linux, Microsoft Windows, and Sun Solaris. The latest
version is 12.6.
Sybase IQ began as a bit-mapped indexing system to provide read-only access to data stored in
the Sybase RDBMS. The idea was to build bit-mapped indexes on the base data with the option
to include in the indexes most, if not all, of the data that would be needed to satisfy queries.
Sybase IQ was also designed to cache the indexes in memory where possible. Thus, queries ran
primarily against the cached Sybase IQ indexes, and did not require disk I/O to look up data.
Sybase IQ’s performance was fast and queries did not impact the performance of the underlying
transactional database.
Sybase IQ has since evolved into its own high-performance database-management server. To
achieve query performance, Sybase IQ is designed to support complex analytics on large data
volumes. As a result, Sybase IQ has a number of features that distinguish it from a traditional
RDBMS.
• Column-based architecture—Sybase IQ stores data as columns on disk instead of as rows. This makes retrieval extremely fast because Sybase IQ only has to access the data in the columns included in the query. It doesn’t have to read all of the data in every qualifying row to find the result. For example, to find all customers in a particular zip code, Sybase IQ would simply read the zip code column (or index) to locate the qualifying rows. And if all of the data requested in the query is included in the index, Sybase IQ would not have to do any disk I/O to return the results. Sybase IQ’s column-based architecture also adds flexibility to schema changes, such as adding or deleting columns. Sybase IQ allows up to 45,000 columns per table.
• Bit-mapped indexes—Sybase IQ’s bit-mapped indexing capability supports a variety of index types that can be tailored to the data type, data distribution, and cardinality in a particular column. To evaluate multiple query criteria, Sybase IQ overlays the bit maps for each column to find the intersection (and) or the union (or).
• Large page size—Sybase IQ supports a page size of up to 512K to optimize I/O.
• Data compression—Sybase IQ compresses data by 30% to 70%. For Sybase IQ, it is faster to read and write compressed data than to read and write uncompressed raw data. Compression minimizes the problem of data explosion as database volumes grow by reducing the cost of both storage and memory for the customer. For example, a 1TB database could require as much as 3 to 5TB or more of disk space in a traditional RDBMS. Keeping 10% of the data in memory would require 100GB of memory. Sybase IQ, in contrast, would compress 1TB down to 300 to 700GB on disk; keeping 10% in memory requires 30 to 70GB of memory. Alternatively, compression allows Sybase IQ to store more data in the same amount of disk space.
• Table-level versioning—To maintain query performance while data is loaded, Sybase IQ creates a snapshot of a table whenever the table is updated. Sybase IQ also only allows one writer at a time per table.
• Multiplexing—Sybase IQ-Multiplex uses a shared-storage architecture for parallel processing across multiple servers.
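The bit-map overlay described in the feature list above can be illustrated with a toy example. This is a teaching sketch only, not Sybase IQ's index format: each distinct value in a column gets a bitmap (held here in a Python integer) in which bit i is set when row i contains that value, and the `BitmapIndex` class and `row_ids` helper are hypothetical names.

```python
from collections import defaultdict


class BitmapIndex:
    """One bitmap per distinct column value; bit i is set when row i holds it."""

    def __init__(self, column_values):
        self._bitmaps = defaultdict(int)
        for row_id, value in enumerate(column_values):
            self._bitmaps[value] |= 1 << row_id

    def rows_equal(self, value):
        # A value that never occurs maps to the empty bitmap.
        return self._bitmaps.get(value, 0)


def row_ids(bitmap):
    """Expand a bitmap into a sorted list of row numbers."""
    ids = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            ids.append(row_id)
        bitmap >>= 1
        row_id += 1
    return ids
```

With one index on a zip code column and another on a customer type column, the rows matching both criteria are the bitwise AND (`&`) of the two bitmaps, and the rows matching either are the bitwise OR (`|`). Neither operation touches any other column of the table.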
OTHER BI SOLUTIONS
To meet evolving customer requirements, Sybase has developed solutions that combine multiple
products and target particular industries or applications. Here are some recent examples.
Dynamic Operational Data Store (ODS) is a horizontal solution for any customer that needs to
execute queries on operational data without impacting operational performance. Sybase
Replication Server or an ETL (extract, transform, and load) product, such as Informatica, is used
to “dynamically” offload data from the operational system into a separate Sybase IQ “ODS”
database for reporting and analytical queries. The ODS is dynamically loaded based on changes
in the operational system. This solution takes pressure off operational systems while giving users
query and reporting capability.
Real-Time Data Services combines Sybase replication technology, ASE triggers, and integration
with the JMS messaging bus from Tibco. The goal is to reduce the latency between when a
business event occurs and the ability to take action based on that event.
The Risk Analytics Platform is aimed at capital markets applications. This platform combines
Sybase’s analytics server, enterprise modeling tools, and a risk-analytics data model derived from
industry best practices. The data model captures inbound “trades and quotes” from the major
exchanges and updates the analytics server, where the data is integrated with historical data. The
intent is to help traders analyze and spot opportunities faster and support near-real-time trading
decisions.