Web analyticsandbigdata techweek2011

Date post: 21-Aug-2015
Upload: raghu-kashyap
Hadoop and Big Data to Drive Web Analytics

Raghu Kashyap & Michael Wetta

@ Orbitz Worldwide

About Us

Raghu Kashyap - Director Web Analytics

Twitter: @ragskashyap

Blog: http://kashyaps.com

Email: [email protected]

Michael Wetta - Marketing Strategy & Analytics

Email: [email protected]

Overview

Web Analytics journey

Orbitz Worldwide

What challenges exist?

Big Data Analysis

Business Testimonial

Centralized Decentralization

Dos and Don’ts

What Hadoop is being used for beyond Web Analytics at Orbitz

Where else?

Conclusion

What is Web Analytics?

Understand the impact and economic value of the website

Rigorous outcome analysis

Passion for customer centricity by embracing voice-of-customer initiatives

Fail faster by leveraging the power of experimentation (MVT)

Travel details

Shopping patterns

Visit patterns

Page navigation

Demand source

Behavioral attributes

Web Analytics History

Early 1990s – Hit counters

Reference - http://www.theedifier.com

1993 – Web server logs (Webtrends)

213.60.233.243 - - [25/May/2004:00:17:09 +1200] "GET /internet/index.html HTTP/1.1" 200 6792 "http://www.mediacollege.com/video/streaming/http.html" "Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.6) Gecko/20040413 Debian/1.6-5"

151.44.15.252 - - [25/May/2004:00:17:20 +1200] "GET /cgi-bin/forum/commentary.pl/noframes/read/209 HTTP/1.1" 200 6863 "http://search.virgilio.it/search/cgi/search.cgi?qs=download+video+illegal+Berg&lr=&dom=s&offset=0&hits=10&switch=0&f=us" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
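
The two entries above follow the common "combined" web server log layout (host, identity, user, timestamp, request, status, bytes, referrer, user agent). As a rough illustration of how such lines could be turned into structured records for analysis, here is a minimal Python sketch. The names `LOG_PATTERN` and `parse_log_line` are made up for this example and are not taken from the talk or from any tool mentioned in it.

```python
import re
from datetime import datetime

# Assumed layout (combined log format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Timestamps look like 25/May/2004:00:17:09 +1200
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # The request field is "METHOD PATH PROTOCOL"; keep method and path if present.
    parts = record["request"].split()
    if len(parts) >= 2:
        record["method"], record["path"] = parts[0], parts[1]
    else:
        record["method"], record["path"] = None, None
    return record


sample = (
    '213.60.233.243 - - [25/May/2004:00:17:09 +1200] '
    '"GET /internet/index.html HTTP/1.1" 200 6792 '
    '"http://www.mediacollege.com/video/streaming/http.html" '
    '"Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.6) Gecko/20040413 Debian/1.6-5"'
)
record = parse_log_line(sample)
```

Fields such as `referrer` and `path` are the raw material for the traffic source and page navigation reports discussed later in the deck.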

1997 – JavaScript tag collection

Server side or Client side tagging?

2005 – Google Analytics

Reference - http://www.theedifier.com

2009/2010 – Major acquisitions (Adobe, IBM, Comscore)

2009/2010 – Big Data (IBM, Facebook, Orbitz, Informatica, Greenplum)

Web Analytics today

Site Analytics

Multi Variate Testing (MVT)

Voice of Customer (VOC)

Competitive Intelligence

Site Analytics

The “What” of Web Analytics

Helps measure:

Visits/Visitors

Page views

Conversion

SEO activities

Traffic Source

Multi Variate Testing (MVT)

The “Why” of Web Analytics

Fail faster

Experiment or Die

Voice of Customer (VOC)

The “Why” of Web Analytics

Surveys

Lab usability tests

Competitive Intelligence

The “What else” of Web Analytics

Data Collection: Toolbar, Panel, ISP

About Orbitz Worldwide

Challenges

Site Analytics

Lack of multi-dimensional capabilities

Hard to find the right insight

Heavy investment in the tools

Precision vs. direction

Challenges (continued)

Big Data

No data unification or uniform platform across organizations and business units

No easy data extraction capabilities

Business

Distinction between reporting and testing (MVT)

Minimal measurement of outcomes

Web Analytics & Big Data

Orbitz Worldwide (OWW) generates a couple of million air and hotel searches every day.

Massive amounts of data: over a hundred GB of log data per day.

Storing and processing this data with the existing data infrastructure is expensive and difficult.

Big Data Infrastructure

Infrastructure provides:

Long-term storage for very large data sets.

Open access to developers and analysts.

Allows for ad-hoc querying of data and rapid deployment of reporting applications.

Processing of Web Analytics Data

Aggregating data into Data Warehouse

Data Analysis Jobs

Traffic Source and Campaign activities

Daily jobs, Weekly analysis

MapReduce job:

~20 minutes to process one day of raw logs

~3 minutes to load the results into Hive tables

Generates more than 25 million records per month
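
To make the shape of such a daily job concrete, here is a minimal single-process Python sketch of the map, shuffle and reduce steps for counting visits per traffic source. It is an illustration only and does not use the Hadoop API or reproduce the actual Orbitz job. The function names, the `day` and `referrer` record fields, and the sample records are all assumptions made for this example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def map_record(record):
    """Map step: emit ((day, traffic_source), 1) for one parsed log record."""
    # Use the referrer's host name as the traffic source; "-" or empty means direct.
    source = urlparse(record.get("referrer", "")).netloc or "direct"
    yield (record["day"], source), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts for one (day, traffic_source) key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in memory over an iterable of records."""
    pairs = (pair for record in records for pair in map_record(record))
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample records, standing in for one day of parsed raw logs.
records = [
    {"day": "2004-05-25", "referrer": "http://www.mediacollege.com/video/streaming/http.html"},
    {"day": "2004-05-25", "referrer": "http://search.virgilio.it/search/cgi/search.cgi?qs=video"},
    {"day": "2004-05-25", "referrer": "-"},
    {"day": "2004-05-25", "referrer": "http://www.mediacollege.com/"},
]
daily_counts = run_job(records)
```

On a real cluster the map and reduce functions would run in parallel across many machines, and the reduced output is the kind of aggregate that could then be loaded into a Hive table for ad-hoc querying.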

Data Categories

Traffic acquisition

Marketing optimization

User engagement

Ad optimization

User behaviour

Crossing the Chasm: Shifting from Innovation to Mainstream Consumption

Adapted from Geoffrey A. Moore – Technology Adoption Lifecycle

1. Background on Analytics at Orbitz

2. Crossing the Chasm Framework

3. Application

Innovators Visionaries Mainstream

Key Components of Adoption:

1. Consistent Message of Capabilities

2. Understanding and Handling Reservations

3. Inclusion in the development cycle

4. Storage and Accessibility

Centralized Decentralization

Web Analytics team + SEO team + Hotel optimization team

Model for success

Measure the performance of your feature and fail fast

Experimentation and testing should be ingrained into every key feature.

Break down into smaller chunks of data extraction

Should everyone do this?

Do you have the Technology strength to invest and use Big data?

Analytics using Big Data comes with a price (resource, time)

Big Data mining != analysis

Key data warehouse challenges still exist (time, data validity)

Other Key Projects

Machine Learning team

Measuring page download performance using site analytics logs

Storing and processing production application logs

Data cache analysis

Where else?

Amazon - Was Amazon's recommendation engine crucial to the company's success?

Facebook – A Petabyte Scale Data Warehouse using Hadoop

EBay – The power of the Elephant

Apple – iAds, UX and Data analytics

Conclusion

Invest in the 10/90 rule ($10 on tools and $90 on people) – Avinash Kaushik

Analytical thinking engineers/analysts

Empower individual feature teams to manage their own analytics (Centralized Decentralization)

Focus on Analysis more than reporting

References

Web Analytics Association: http://www.webanalyticsassociation.org/

Avinash Kaushik: http://kaushik.net

Twitter: #measure

Analysis Exchange: http://www.webanalyticsdemystified.com

Questions?

http://careers.orbitz.com

