Beyond Web Analytics

Analytics and Reporting for Real Business Results

By Joseph Rozenfeld

VP Strategy & Solutions

Skytide, Inc.

January, 2008

Skytide

1 Waters Park Drive,

#160

San Mateo, CA 94403

650.292.1900

www.skytide.com

Table of Contents

Overview

New Data Environment

Data Value Chain

The Skytide Process

The Old Way vs. The Skytide Way: A User Scenario

Sample Reports

Skytide Application Sets for Online Content Delivery

Excerpt from 451 Group Impact Report

Summary

Table 1

About the Author

Joseph Rozenfeld

Vice President of Strategy and Solutions (Co-Founder), Skytide

Joseph is responsible for shaping the company’s strategic vision and technologies to meet the fast-growing needs of the business intelligence marketplace. Joseph has more than 20 years of software development and management experience and has founded or co-founded four companies including Skytide and ChainCast Networks.

As executive vice president and CTO of ChainCast Networks, the first provider of commercial peer-to-peer software for broadcast streaming, he grew the company into the largest streaming provider for terrestrial radio broadcasters in the U.S. with a client list including ClearChannel, NTT, Cox, and ABC. Joseph was also a founding engineer, development manager and architect of Essbase and IBM DB2 OLAP servers at Hyperion Solutions (formerly Arbor Software). Joseph holds an M.S. in Computer Science and a B.S. in Applied Mathematics from Moscow Polytechnique University.

Overview

Online content is exploding. In parallel, user expectations and reliance on this content have set the stage for accelerated growth in this market. Delivery of online content has become a legitimate and increasing portion of bottom-line revenue for both “new-economy” and traditional business-model companies. One has only to look at Microsoft, which generated over $150M in 2007 from movie downloads alone,¹ and imagine the scores of unforeseen revenue streams yet to be realized by new media companies such as YouTube and Facebook.

Driven largely by the arrival of Web 2.0 and the availability of enterprise-level bandwidth, the online content phenomenon has spawned a host of enabling technologies and business models to help companies deliver new, diverse and compelling types of content ever faster to an increasingly hungry global audience. This phenomenon is responsible for revitalizing the content delivery market, projected to reach $4B in just a few years. From iTunes to Netflix movie downloads, maximizing the return on investment of the ongoing stream of online content is mission-critical.

Companies have implemented new ways to produce, transport, store, and make available all sorts of exciting new content and applications – from streaming video and live Flash presentations to interactive chat sessions. What has lagged behind, however, are solutions that provide content owners effective ways to understand the “who, what, when, where, why, and how” of content usage, which are critical components to monetize content. Solutions offered by web analytics and Content Delivery Networks (CDNs) fall short of delivering comprehensive analytics and reporting on content usage, and remain unable to answer critical questions, such as:

• How long is any one video clip watched?

• Which users downloaded my least trafficked content?

• What region is responsible for the most content downloads?

Answering these questions may seem simple, but it requires accessing and merging multiple sources of extremely high volumes of data: content server log files, network traffic control data, and content application log files, all notorious for their complexity, lack of standards, and huge volumes of raw data (think terabytes per day or hundreds of thousands of records generated per second).

Web analytics solutions are built around analyzing a single data source, clickstream, and are thus unable to manage multiple streams of high-volume data. Enterprise content owners and content service providers are left to cobble together a mix of traditional business intelligence/analytics tools, web analytics, and even enterprise search tools. But this mix of “old” and “new” technologies doesn’t solve the significant analytical challenges:

Traditional Business Intelligence: Business Intelligence (BI) typically relies on database technologies, which break down when dealing with very large and very diverse data sets. These high-volume data sets can cause extreme latency issues, with analysis taking days, weeks or even months, rendering the intelligence meaningless to the business user. BI is also inflexible in handling dynamic, changing data sources or complex data formats.

Web Analytics: There are a host of web analytics solutions that can offer insights into web traffic logs, but they typically involve special coding placed on corporate web pages, and often require a predictive path for further drill-down. What these solutions don’t offer is an unbiased view across all web traffic, or the ability to combine and relate the web log data to other data sources, such as streaming content logs, customer data, and marketing campaign or lead/sales tracking databases. These solutions have implemented complex, heavy infrastructure to handle a single stream of data (HTTP click-stream data), but this can’t handle the multitude of content application data streams.

Advanced Enterprise Search: Technology advances now offer faster, easier ways to search across enterprise databases and networks to locate data entities. However, enterprise search today is far from offering analysis and insights about these data streams.

• Scale: Handle massive amounts of data over time

• Flexibility: Seamlessly integrate diverse, “weird” data formats

• Speed: Offer fingertip access to meaningful results based on this data

These challenges must be solved before organizations can turn content into sustainable revenue streams, because understanding the details of content usage over time drives the ability to deliver differentiated service offerings and to maximize the return on investment in online content delivery.

The New Data Environment

Monetizing content requires an understanding of how your customers interact with all of your content, not just easy-to-reach segments. The more you know about your users’ behavior, the better you can deliver what they want. As customer and prospect interactions increasingly happen over the network and the Internet, companies need to track behavior in this new cyber environment, which is far more complex, faster, and covers greater territory than ever before.

Online user interactions are also no longer limited to HTTP transactions. Instead, users are accessing a wide range of online applications and services, each of which generates a unique data stream. Users move seamlessly between online interactions, such as:

• Viewing a Flash demo to review a new product

• Watching a TV episode

• Clicking on a special banner ad offer

• Placing an eCommerce order

• Downloading a music clip

• Viewing a single web page

Each interaction touches different network components (web servers, live media servers, ad servers, etc.) and produces complex, diverse, and extremely large streams of data describing customer interactions with content (Figure 1).

The sheer volume and complexity of the data generated from these multiple user interactions make it impossible to efficiently put the data into a database, a requirement of traditional BI solutions. Too often, the different pieces of data required to paint a complete picture of user behavior are either discarded, accessed for “point in time” analysis, stored and ignored, or sampled for a limited historic view.

Thus, businesses make decisions about online content offerings, the real basis of monetizing online content investments, based on a limited understanding of how users and customers interact with the content.

Data Value Chain: Answering the Important Business Questions

Data that is transformed into rich, timely information fuels successful businesses. The most valuable information, however, reflects the joining together of multiple sources of data, over time. An understanding of how customers and prospects access and use online content can only be obtained by combining the various data sources generated by both online and offline processes. Each data source adds a dimension to understanding the patterns of behavior, with the highest business value derived from a complete multi-dimensional view of customer behavior as shown in Figure 2.

Figure 1: Dynamic websites offer a wide range of possible user interactions, each of which connects with different network components, illustrated here. Obtaining a comprehensive view of user behavior requires that each of these interactions be combined and analyzed together, over time.

Key Question: A user goes to your website and clicks on a special promotional two-minute video clip. Will your web analytics solution show you how long they spent viewing the clip?

Answer: No. Web analytics solutions track HTTP data only for pages in which you have inserted special code. Even with the code in place, you would see only the number of times the clip was accessed, with no metrics on viewing time, such as average viewing time. This data is available only if you have access to the log files of the media server responsible for streaming the content. Web analytics solutions cannot track this data.
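
To make the answer concrete, here is a minimal Python sketch of how average viewing time could be derived once media server log records are available. The three-field log layout, the sample lines, and the function name `average_view_seconds` are assumptions made for illustration; real media server logs use vendor-specific formats with many more fields.

```python
from collections import defaultdict

# Hypothetical, simplified media-server log lines: "user clip seconds_viewed".
# This layout is an assumption for illustration, not a real server format.
SAMPLE_LOG = """\
u1 promo.wmv 120
u2 promo.wmv 30
u3 promo.wmv 60
u1 trailer.wmv 45
"""


def average_view_seconds(log_text):
    """Return {clip: (access_count, average_seconds_viewed)}."""
    totals = defaultdict(lambda: [0, 0])  # clip -> [count, total_seconds]
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _user, clip, seconds = line.split()
        totals[clip][0] += 1
        totals[clip][1] += int(seconds)
    return {clip: (count, total / count) for clip, (count, total) in totals.items()}


stats = average_view_seconds(SAMPLE_LOG)
for clip, (count, avg) in stats.items():
    print(f"{clip}: {count} accesses, {avg:.1f} s average view time")
```

A page-tagging approach would see only the access count per clip; the viewing duration exists only in the streaming server's own records, which is why the log files are needed.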

Analytics and Reporting for Online Content Page 2

Table 1 further demonstrates the types of meaningful results that are obtained when a single stream of operational data is merged with different customer data sources, permitting business users at the highest levels to better understand key factors driving both short and long-term profits and revenues.

While the benefits of combining multiple data sources to optimize content monetization are self-explanatory, the practical implementation is more difficult due to the nature of network and content traffic data: complexity, volume, and lack of standards.

The Skytide Process

The Skytide Analytical Platform™ presents a new way of processing data that makes this merging of data sources possible, even when volumes of data reach terabytes and billions of records daily. Because Skytide analyzes data directly, without the need to store data in a database, it drastically increases the ability to perform meaningful analytics and reporting on data that traditional Business Intelligence (BI) solutions cannot deal with. The Skytide breakthrough enables business intelligence in areas that were previously prohibitive for BI because of that technology’s dependency on database-centric technologies and storage.

According to leading market analyst, Dennis Drogseth, Vice President of Enterprise Management Associates (EMA), “Traditional Business Intelligence solutions lack the ability to deliver dynamic, real-time information on very large heterogeneous data sets. A new generation of business analytics, such as the Skytide Analytical Platform, will allow companies to bridge these high volumes of structured and unstructured data sets with speed and power for on-the-fly information without the overhead of a data warehouse.”

The content delivery market is a key area where Skytide has quickly gained market share because of its ability to answer this sector’s business imperative to monetize content delivery.

Figure 2: Data Value Chain shows the increasing benefits obtained from combining multiple data sources.

At a high level the Skytide process follows these steps:

1. Skytide connects to each data source where it resides, and data is pulled into the Skytide engine for processing.

2. Standard or user-defined parsers are applied to the data, transposing it “on the fly” into an XML structure. A Skytide library of standard parsers covers tens of formats and can be extended for new or proprietary formats in a matter of hours. With parsers in place, this is an automatic, near-real-time process.

3. Data flows into predefined multi-dimensional “cubed” models, reducing data volume without loss of fidelity.

4. Models are instantly available for analysis and reporting.

5. Queries are built using Skytide’s unique point-and-click Designer modeling environment, which can be used by non-technical business users without IT involvement.

6. Reports are automatically generated through a wide range of presentation layers: Corda interactive dashboards, CrystalReports, JasperSoft, Excel, etc.

7. Incremental updates are automatically performed, providing users near real-time reports.

8. Ad hoc queries and reports can be quickly created to allow further drill-down or slicing through the aggregated data in analytical models.
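
The steps above can be sketched in miniature. The following Python example is an illustration only, not Skytide's implementation: the four-field log line layout, the `parse_line` parser, and the `Cube` class are hypothetical names invented here. It shows a raw line being transposed into an XML record, records being aggregated into a small multi-dimensional model, an incremental update, and an ad hoc slice.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def parse_line(line):
    """Hypothetical parser: 'day region media bytes' -> XML <record> element."""
    day, region, media, size = line.split()
    record = ET.Element("record")
    for tag, value in (("day", day), ("region", region),
                       ("media", media), ("bytes", size)):
        ET.SubElement(record, tag).text = value
    return record


class Cube:
    """Aggregates request counts and bytes by (day, region, media)."""

    DIMENSIONS = ("day", "region", "media")

    def __init__(self):
        self.requests = Counter()
        self.byte_totals = Counter()

    def load(self, records):
        # May be called repeatedly, which models incremental updates.
        for record in records:
            key = tuple(record.findtext(d) for d in self.DIMENSIONS)
            self.requests[key] += 1
            self.byte_totals[key] += int(record.findtext("bytes"))

    def slice(self, **fixed):
        """Sum the cells matching the given dimension values."""
        requests = total_bytes = 0
        for key in self.requests:
            coords = dict(zip(self.DIMENSIONS, key))
            if all(coords[d] == v for d, v in fixed.items()):
                requests += self.requests[key]
                total_bytes += self.byte_totals[key]
        return requests, total_bytes


cube = Cube()
first_batch = ["Thu CA video 500", "Thu CA audio 200", "Fri NY video 300"]
cube.load(parse_line(line) for line in first_batch)
cube.load(parse_line(line) for line in ["Fri CA video 100"])  # incremental update

print(cube.slice(region="CA"))    # requests and bytes for California
print(cube.slice(media="video"))  # requests and bytes for video content
```

In this toy model, raw lines are not retained; only the aggregated cells are kept, so memory grows with the number of distinct dimension combinations instead of with the number of log records.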

Data Source: Network/Server Data

Type of Data: Content logs, CDN traffic logs (click-stream, IP, HTTP, syslogs, VM, Flash, Real, Quicktime)

Data Challenges/Characteristics:

• Extremely high volumes

• Very diverse, “weird” formats

• Semi-structured

Data Source: Customer Data

Type of Data: CRM, Finance & Billing Records

Data Challenges/Characteristics:

• Highly structured

• Security an issue

Data Source: Advertising Content

Type of Data: HTML, Flash, Jpg/Gif, Streaming Video

Data Challenges/Characteristics:

• Diverse data formats

• Placement issues relate to content & user profiles

• Billing issues are critical

Data Source: Ad Traffic Metrics

Type of Data: HTTP, Click-Stream

Data Challenges/Characteristics:

• Extremely high volumes

• Very diverse, “weird” formats

• Semi-structured

Table 2

The Old Way vs. The Skytide Way: A User Scenario

While every industry and company has unique data information requirements, the following scenario demonstrates the value of multiple data sources to better inform business decisions and actions for any enterprise or service provider generating online content.

The Business Model: Online Content Delivery & Sales

This sample media company generates revenue by selling online content to end-users, which could include music, video downloads & streams, and live Flash presentations. Customers access content through a password-protected user account, with automated billing per content item viewed. In addition, it delivers a range of online ad properties for its partners, ranging from graphic banners to Flash videos.

The data landscape is shown in Table 2.

Neither traditional BI nor web analytics tools are capable of transforming this type of online content traffic data into meaningful information. Here’s why:

• BI Tools: All of the data from each continuous stream would need to be transformed and placed in a structured relational database/data warehouse. The volumes of the network data alone make this impossible to achieve if timely results are expected. Using traditional BI tools for this analysis process would also require extensive storage, licensing costs, and time. This method is also inherently inflexible when dealing with ad hoc queries or changing data sources. BI in this new data environment isn’t scalable, flexible, timely, or affordable.

• Web Analytics: Web analytics tools cannot deal with the complexity of multiple large-volume data streams, because they are limited to analysis of HTTP files. Web analytics tools provide insight into online interactions, but the view remains single-dimensional. Because they also require changes to the actual HTML web pages, they require additional labor. They remain inflexible and only marginally scalable.

With Skytide, the organization can maintain a constant tap on all data streams, at the highest dimensional view. Here’s how:

Skytide Results: Skytide is deployed as described above, allowing seamless merging of the following data sources: Windows Media Server log files, Customer information records in CRM and Finance System, Ad Traffic Server, HTTP Web Analytics Files. This results in the automated generation of reports that answer key business and IT-related questions including:

• Identify the top 10 viewed content titles and media types

• Calculate average view time across different media types and content titles

• Show sales and billing revenue across customers and aggregated by region

• Identify top 10 error messages related to media type and player

• Demonstrate ad traffic views segmented by user demographics linked to associated content

• Calculate and trend ad revenue by publisher over time, segmented by top producers
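
A few of the reports listed above can be illustrated with a small Python sketch that joins media log records with customer records. All data values, field layouts, and variable names here are invented for illustration and do not come from Skytide or from any real log format.

```python
from collections import Counter, defaultdict

# Hypothetical media-server view records:
# (customer_id, title, media_type, seconds_viewed, price_billed)
views = [
    ("c1", "Clip A", "video", 120, 2.0),
    ("c2", "Clip A", "video", 60, 2.0),
    ("c1", "Song B", "audio", 200, 1.0),
    ("c3", "Clip C", "video", 30, 2.0),
    ("c3", "Clip A", "video", 90, 2.0),
]

# Hypothetical CRM extract: customer_id -> sales region
region_of = {"c1": "West", "c2": "East", "c3": "West"}

# Report: most-viewed content titles (top 1 here; top 10 in a real report)
title_counts = Counter(title for _cust, title, _media, _secs, _price in views)
top_titles = title_counts.most_common(1)

# Report: average view time per media type
seconds_by_media = defaultdict(list)
for _cust, _title, media, secs, _price in views:
    seconds_by_media[media].append(secs)
avg_seconds = {media: sum(v) / len(v) for media, v in seconds_by_media.items()}

# Report: billing revenue aggregated by region (needs the CRM join)
revenue_by_region = Counter()
for cust, _title, _media, _secs, price in views:
    revenue_by_region[region_of[cust]] += price

print(top_titles)
print(avg_seconds)
print(dict(revenue_by_region))
```

The first two reports need only the media server log; the third is possible only because each log record carries a customer identifier that can be matched against the CRM data.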

A Key Question Revisited: A user goes to your website and clicks on a special promotional two-minute video clip. With Skytide, can you track viewers of your promotional video even if this video is in the “long tail” (a video with very few users watching it)?

Answer: Yes. Skytide can provide viewing statistics on any content and any URL, not just the most popular ones. With Skytide you no longer need to worry about millions of individual users that you might want to track. Skytide can not only provide all of the details of the interaction times, it can also show you the top geographic regions where the clip was viewed, the specific times of day, which players were used to view the clip, and the top errors that may have occurred.

Sample Reports

Report 1: Traffic KPI Summary

Overview report reveals traffic patterns by content type, time intervals, unique users, and number of requests. Click-through drill down shows exact traffic details.

Traffic for this time period shows a dramatic spike on Thursday, which is made up mainly of Windows Live Media access.

Report 2: Geographic Segmentation Summary

Overview report identifies user requests across geographies, with further segmentation by demographics available in click-through drill-down reports.

The report easily identifies California as the most active region in viewing content. Further details pinpoint the San Francisco zip code 94120 as generating the most traffic.

Report 3: Basket Analysis Summary

Report provides a comprehensive overview of the most popular pages or groups of content viewed.

This report quantifies traffic across “buckets” or groups of content pages on an iTunes site. For this time period, the Spanish-language group of content was the most popular.

Report 4: Content-Uptake Drill Down (Video Traffic)

Drill-down report combines data from CRM records and video log files to show content usage by type, gender, age, and geography.

Action & Adventure, Comedy, and Sci-Fi & Fantasy were the top three types of video viewed.

Report 5: Demographic Segmentation Summary

Traffic across content types is segmented by gender and age, revealing content usage patterns.

Thursday traffic is predominantly from viewers who are over 31 years old, and the majority are male.

Skytide Applications for Online Content Delivery

The Skytide Analytical Platform includes application sets designed to help organizations delivering content over the Internet to better utilize data to uncover untapped revenue opportunities.

These applications include internal and external facing analytical applications that process extremely high volumes of diverse data sets, with a specific focus on data generated in the transfer of content over the Internet. These applications include:

• Traffic Segmentation Analysis

o Sample Reports: Network Provisioning, Network Troubleshooting, Network Usage and Billing

o Application Areas: IT, Finance

• Customer Segmentation Analysis

o Sample Reports: Customer Demographic Content Uptake, Customer Network Usage Statistics, Customer Behavior Trend Analysis

o Application Areas: Sales, Marketing, Executive Office

• Content Uptake Analysis

o Sample Reports: Content Utilization Statistics Analysis, Content Basket Analysis, Customer Content Behavior

o Application Areas: Content Owners, Marketing, IT

• Content Segmentation Analysis

o Sample Reports: Content Provisioning, Content Troubleshooting, Customer Usage & Billing

o Application Areas: Content Owners, Marketing, Finance, IT

• Application Segmentation Analysis

o Sample Reports: Application Provisioning, Application Troubleshooting, Application Usage & Billing

o Application Areas: Business Units, IT, Finance

• Storage Segmentation Analysis

o Sample Reports: Storage Provisioning, Storage Troubleshooting, Storage Usage & Billing

o Application Areas: IT, Finance

Each Skytide application set is configured for streamlined deployment that delivers immediate benefits in days, not months or years. Included is a series of pre-configured standard analytical models (cubes) that provide automated connections to each required data source. Connections are made via standard parsers, which are easily configured to each unique deployment. A variety of standard reports are then available to users, from scheduled email PDF formats to interactive web portal views.

Additionally, each standard analytical model can be accessed for fingertip ad hoc queries and reports. This makes even “out-of-the-box” deployments highly flexible in meeting changing business needs without re-tooling or added professional services. Custom models and reports can also be incorporated before, during, or after deployment.

Figure 3: Skytide is a next-generation analytical platform that performs analysis directly across multiple data types without requiring a relational database or warehouse.

[Diagram labels: Skytide Designer; Tabular & Pivot Views; Multi-dimensional Navigation; Excel; Skytide SDK; Cubes; Queries; Data; Connectors; Virtual Documents; Skytide Server; XML Rendering; XML Modeling; Engine; RDBMS; HTML; XML; Email; IM/Chat; Log Files; Presentation Formats; Analytical Apps; Reporting Tools; Portals]

© 2007 Skytide, Inc. All rights reserved. Skytide and the Skytide logo are registered trademarks of Skytide, Inc. All other trademarks are the property of their respective owners.

Skytide, Inc.

1 Waters Park Drive, Suite 160

San Mateo, CA 94403

Phone: 1.650.292.1900

Fax: 1.650.312.1400

[email protected]

About Skytide

Skytide delivers business analytical solutions that provide timely and unprecedented insight into the constantly changing environment in which today’s businesses operate. The XML-based Skytide Analytical Platform is the first and only solution available today that can understand complex data from virtually any source, including network traffic data, content server logs, and application transactions, as well as unstructured data such as e-mails and text, delivering the visibility necessary to make critical business decisions. Skytide customers include Fortune 1000 companies across a wide range of market segments, including networking, financial services, healthcare, utilities, manufacturing, and retail.

Summary

Companies across every industry are struggling to deliver and justify the investments they have made in online content delivery. Customers, end-users, and prospects expect fingertip access to content via the Internet for entertainment, to inform purchase decisions, for educational value, and to speed service & purchase requests. Up to this point, the focus of Web 2.0 applications has been on delivering the actual content or enabling the application. Now, organizations are focused on how to monetize this investment, a process that demands detailed understanding of how and why content is actually being used.

Skytide delivers a way for organizations to maximize the value of the highly diverse volumes of data generated as a result of online content usage. For enterprises delivering their own content, or Content Delivery Network providers, Skytide offers a seamless way to transform these high-volume streams of traffic control data into valuable information that helps monetize online content and drive revenue.

Put Skytide to Work for You

Discover how the Skytide Analytical Platform can provide valuable insights about your online content usage. Contact us today at [email protected] or 650.292.1900.

Excerpt From 451 Group Impact Report

The way in which Skytide deals with data makes it well suited to handling extremely large volumes of diverse data formats. By linking directly to the data sources, Skytide eliminates the need for a data warehouse. Data is then automatically aggregated, summarized and correlated across all data sources, making possible multidimensional, historical views of the data. In-memory processing speeds queries based on even the largest data sets. These functions allow Skytide customers to build highly segmented trend analyses across data sets that provide deep insight into customer behavior, quality of service, network performance, online content access, etc.

In this manner, a content-delivery network (CDN) can demonstrate for its customers the actual performance details of all content objects transmitted by customer or segment; CDN customers can track performance across multiple CDN providers; and individual organizations delivering content across their own networks could segment content delivery by region, content type or even individual user.

— Krishna Roy, Analyst, 451 Group

¹ 2007 Morgan Stanley, Content Delivery Market Report

1-D View

Key Question Answered: How is my network performing?

Data Types: Netflow Data (network traffic logs)

Information/Reports Delivered:

• Reports that map IP to URL for traffic associated with websites & networks

• Network Performance: focuses on provisioning & resource allocation, segmented by network device

• Error Reports: most common error reports speed troubleshooting and improve service delivery

Primary Users: IT (Network Operations)

2-D View

Key Question Answered: What online content and/or applications are being accessed, and when?

Data Types: Netflow Data + Content Logs (content server logs, playlists, classification logs, etc.)

Information/Reports Delivered:

• Segmented Content Stats Reports: show details across content type accessed, most often viewed, time viewed

• Content Utilization Reports: allow for better content provisioning

• Trend & Point-in-time Reports across all sectors

Primary Users: Content Managers (pertinent to various business units including sales, marketing, eCommerce)

3-D View

Key Question Answered: Where is my content/application being used?

Data Types: Netflow Data + Content Logs + IP Geo Data (IP/GEO mapping data)

Information/Reports Delivered:

• Content Uptake/Traffic Segmented by Geography: shows distribution of content usage across geographic regions by all factors included in content stat reports, including time viewed, most often accessed, etc.

• Trend & Point-in-time Reports across all sectors

Primary Users: Content Managers (business units including sales, marketing, eCommerce, to better enable regional targeting of advertising & content)

4-D View

Key Question Answered: Who and/or what content drives the most profitable online transactions?

Data Types: Netflow Data + Content Logs + IP Geo Data + User Data (CRM records, Finance & Accounting files, user demographics)

Information/Reports Delivered:

• User Behavior Segmentation Reports: track content usage by individual customer, segmented across demographics (age, gender, income, etc.)

• Billing Reports: granular billing details used to generate more accurate customer billing for online content and advertising revenue streams

• Sales Reports: show revenue generated by key drivers, such as sales regions, reps, content types, advertisers, top customers, etc.

• Trend & Point-in-time Reports across all sectors

Primary Users: Content Owners (ad revenue owners, marketing & sales executives, to drive improved ad insertion revenue, improve recommendation engines, drive promotions, and improve search capabilities)

Table 1
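
As a rough illustration of how adding a data source adds a dimension, the Python sketch below enriches content requests with a geographic lookup, corresponding to the step from the 2-D to the 3-D view. The `GEO_TABLE` mapping, the `region_for` helper, and the sample requests are hypothetical; the IP ranges are reserved documentation ranges, and a production system would rely on a full IP/GEO mapping data set.

```python
import ipaddress
from collections import Counter

# Hypothetical IP-to-region table built on reserved documentation ranges.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "California",
    ipaddress.ip_network("198.51.100.0/24"): "New York",
}


def region_for(ip):
    """Return the region whose network contains ip, or 'Unknown'."""
    address = ipaddress.ip_address(ip)
    for network, region in GEO_TABLE.items():
        if address in network:
            return region
    return "Unknown"


# Hypothetical content requests: (client_ip, content_type)
requests = [
    ("192.0.2.10", "video"),
    ("192.0.2.77", "video"),
    ("198.51.100.5", "audio"),
    ("203.0.113.9", "video"),
]

# 2-D style question: what is being accessed?
by_type = Counter(content_type for _ip, content_type in requests)

# 3-D style question: where is it being accessed?
by_region = Counter(region_for(ip) for ip, _content_type in requests)
by_region_and_type = Counter(
    (region_for(ip), content_type) for ip, content_type in requests
)

print(by_type)
print(by_region)
print(by_region_and_type)
```

Joining a further source, such as CRM records keyed by customer, would extend the same pattern toward the 4-D view.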
