E-insights, LLC © 2000 All rights reserved. Understanding Web Traffic Michael Whelan Part 1 of 2.

Date post: 23-Dec-2015

E-insights, LLC © 2000 All rights reserved. www.e-insights.com

Understanding Web Traffic

Michael Whelan

Part 1 of 2


Why Do You Analyze Traffic?

• Management wants to track performance.

• Need to know inventory & usage information to support sales efforts.

• Audit requirements.

• Reconciliation with contracts/vendors.

• May be used for performance bonus targets.


Goals

• To understand the capabilities and limitations of web traffic analysis

• Identify the major pitfalls & workarounds

• Be able to identify erroneous data quickly

• Be able to track down inconsistencies

• Be able to extract marketing/customer support benefits from traffic analysis


DNS

• Domain Name
– xxx.yyy => ‘yyy’ is the top level domain.
– ‘xxx.yyy’ is a domain name.
– abc.xxx.yyy is a machine name, as are a.b.c.d.e.xxx.yyy and aspen.xxx.yyy.

• Domain Name System (DNS) maps from a machine name to an Internet address, like a ‘telephone book’.
– www.e-insights.com => 209.10.106.30
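The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `split_machine_name` is made up for this example, and it assumes the domain is always the last two labels, which does not hold for every real top level domain. An actual lookup of the address would go through the resolver (for example Python's `socket.gethostbyname`), which is not shown here because it needs a network.

```python
def split_machine_name(name):
    """Split a dotted machine name into the parts named on the slide."""
    labels = name.lower().rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError(f"expected at least 'xxx.yyy', got {name!r}")
    return {
        "top_level_domain": labels[-1],
        # Assumption: the domain name is always the last two labels.
        "domain_name": ".".join(labels[-2:]),
        "machine_name": ".".join(labels),
    }


parts = split_machine_name("www.e-insights.com")
print(parts)
```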


Inverse DNS

• Map from IP address to machine name.
– Was not part of the original DNS spec.
– Does not have to be supported (may be required for security in certain situations).
– Frequently (>40%) simply does not exist.

‘unlisted numbers’
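Because the inverse mapping is often missing, any report that shows machine names needs a fallback. A minimal sketch, assuming you are happy to show the raw IP address when no name exists; the function name `machine_name_or_ip` and the `resolver` parameter are illustrative, while `socket.gethostbyaddr` is the standard library call that performs the inverse lookup.

```python
import socket


def machine_name_or_ip(ip, resolver=socket.gethostbyaddr):
    """Return the machine name for an IP, or the IP itself if none exists."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError;
        # they are raised when the 'number is unlisted'.
        return ip
```

The `resolver` argument exists only so the function can be tried without a network, for example by passing a function that always raises `socket.herror`.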


Start with the Browser

1. Browser: HTTP://www.e-insights.com/index.html
2. Browser to DNS server: who is www.e-insights.com?
3. DNS server to browser: 209.10.106.30
4. Browser: connect to 209.10.106.30
5. Browser to E-insights server: GET /index.html HTTP/1.0…..
6. E-insights server to browser: <html><BODY BGCOLOR=red>Hi There.<IMG SRC=/images/E-logoA.gif></body></html>
7. Browser: render the page


HTML - a little more

• Colors/font sizes & styles

• Actual text (and links).

• JavaScript code.

• Frame set definitions.

• References to Images, style sheets, and possibly frames.

• References to Java or Shockwave, etc.

• References to JavaScript files.

Each ‘referenced’ element involves a separate transaction with the server.
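To see how one page turns into several transactions, the referenced elements can be pulled out of the HTML with Python's standard `html.parser`. This is a rough sketch: the `FETCHED` table is an assumption and is far from complete, and a real browser may skip elements it already has cached. Ordinary `<a href>` links are left out on purpose, since they are only requested when the user clicks them.

```python
from html.parser import HTMLParser

# Assumption: a small, incomplete table of tags whose attribute names a
# resource the browser requests on its own.
FETCHED = {"img": "src", "script": "src", "frame": "src",
           "iframe": "src", "embed": "src", "link": "href"}


class ReferenceCollector(HTMLParser):
    """Collect the URLs of elements the browser would request separately."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        wanted = FETCHED.get(tag)  # tag and attribute names arrive lowercased
        for name, value in attrs:
            if name == wanted and value:
                self.references.append(value)


def referenced_elements(html):
    collector = ReferenceCollector()
    collector.feed(html)
    collector.close()
    return collector.references


page = ('<html><body bgcolor="red">Hi There.'
        '<img src="/images/E-logoA.gif">'
        '<a href="/next.html">Next</a></body></html>')
refs = referenced_elements(page)
transactions = 1 + len(refs)  # the page itself plus each referenced element
print(refs, transactions)
```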


HTML Example

<html>

<head><title>Yahoo! Shopping</title></head>

<body bgcolor="#ffffff">

<center><table cellpadding=2 cellspacing=0 width=675>

<tr><td valign=middle width="1%"><a href="http://shopping.yahoo.com">

<img border=0 height=35 width=314 src="http://us.i1.yimg.com/us.yimg.com/i/sh/sh41.gif" alt="Yahoo! Shopping"></a></td>

<td align=right nowrap valign=bottom><font face=arial size="-1">

<a href="http://shopping.yahoo.com">Shopping&nbsp;Home</a> -

<a href="http://www.yahoo.com">Yahoo!</a> -

<a href="http://help.yahoo.com/help/shop/">Help</a></font>

<hr size=1 noshade></td></tr>

</table></center>


References => Transactions

• Each ‘referenced’ element is separately requested and transferred to the browser. Each transfer is recorded in the server logs.


What’s in a Log ?

• The actual contents, format and ordering can be customized on the servers.

• Browser identification (e.g. IE5.1 or NS4).

• Date and time of request

• Requesting IP address

• Request & item

• Status

• Number of bytes sent


Sample Log - Apache - simple

#Fields: date time c-ip cs-method cs-uri-stem cs-protocol sc-status sc-bytes cs(User-Agent) cs(Referer)

2000-02-21 00:01:26 192.168.2.100 GET / HTTP/1.0 200 5199
    -
    -

2000-02-21 11:07:32 192.168.2.108 GET /buzzco/logs/byminplot.php3 HTTP/1.0 200 14093
    Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+DigExt)
    http://www.e-insights.com/buzzco/logs/logsummary.php3

2000-05-31 13:22:15 216.206.70.134 GET /meyers/ HTTP/1.1 401 483
    Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+NT;+DigExt)
    http://monitor2/

2000-05-31 14:58:34 206.189.239.171 GET /images/ei-logoC.gif HTTP/1.0 200 496
    Mozilla/3.01+(compatible;)
    -

Note - I have added line breaks and tabs to make this more readable; each log entry is actually recorded as a single line.
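Since the field list can be customized, a reader of this kind of log is safest taking the field names from the `#Fields:` line rather than hard-coding positions. A minimal sketch, assuming one entry per line and no spaces inside a field (true of the sample above, where spaces in the browser name are written as `+`); the function name `parse_fields_log` is made up for this example.

```python
def parse_fields_log(lines):
    """Yield one dict per log entry, keyed by the names on the #Fields line."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives, e.g. #Version
        values = line.split()
        if len(values) != len(fields):
            continue  # malformed or truncated entry: skip rather than guess
        yield dict(zip(fields, values))


sample = [
    "#Fields: date time c-ip cs-method cs-uri-stem cs-protocol "
    "sc-status sc-bytes cs(User-Agent) cs(Referer)",
    "2000-02-21 00:01:26 192.168.2.100 GET / HTTP/1.0 200 5199 - -",
    "2000-05-31 14:58:34 206.189.239.171 GET /images/ei-logoC.gif "
    "HTTP/1.0 200 496 Mozilla/3.01+(compatible;) -",
]
entries = list(parse_fields_log(sample))
print(entries[1]["cs-uri-stem"], entries[1]["sc-status"])
```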


Extended Log Options

• Cookies

• Query Strings

• Referrer Location

• Time to complete request

• Bytes received


Sample Log2 - NS/BV

format=%Ses->client.ip% - %Req->vars.auth-user% [%SYSDATE%] "%Req->reqpb.clf-request%"
    %Req->srvhdrs.clf-status% %Req->srvhdrs.content-length%
    "%Req->headers.referer%" "%Req->headers.user-agent%"

205.217.100.73 - - [27/Aug/2000:00:00:37 -0400] "GET /cgi-bin/pm/international/community.jsp?channel=International&community=Dragracing HTTP/1.1"
    200 -
    "-"
    "Mozilla/4.0 (compatible; MSIE 5.01; Windows 98; TDSNET71)"

205.188.197.51 - - [27/Aug/2000:00:00:35 -0400] "GET /cgi-bin/pm/showroom/showroomview.jsp?channel=Truckin&community=Ford&oid=12958 HTTP/1.0"
    200 -
    "http://www.grstgv.com/cgi-bin/pm/search/showroom_searchresult.jsp?ResultStart=20&ResultCount=10"
    "Mozilla/4.0 (compatible; MSIE 5.0; AOL 5.0; Windows 95; DigExt)"

207.115.63.13 - - [27/Aug/2000:00:00:34 -0400] "GET /articles/013680af/013680afp07s05.jpg HTTP/1.0"
    200 15240
    "http://www.grstgv.com/cgi-bin/pm/common/morePhotos.jsp?channel=Electronics&community=DIY&oid=24952&contentType=Feature"
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"

Note - I have added line breaks and tabs to make this more readable; each log entry is actually recorded as a single line.
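Entries in this quoted style are usually picked apart with a regular expression, because the request, referrer and browser fields contain spaces. The sketch below assumes exactly the field order of the `format=` line above and plain straight quotes; the names `ENTRY` and `parse_entry` are illustrative, and a server configured with a different format would need a different pattern.

```python
import re

# One named group per field of the format line above.
ENTRY = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<when>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_entry(line):
    """Return a dict for one single-line log entry, or None if it does not fit."""
    match = ENTRY.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # The request field is 'METHOD URI PROTOCOL'.
    parts = entry["request"].split(" ")
    entry["method"] = parts[0]
    entry["uri"] = parts[1] if len(parts) > 1 else ""
    entry["protocol"] = parts[2] if len(parts) > 2 else ""
    return entry


line = ('207.115.63.13 - - [27/Aug/2000:00:00:34 -0400] '
        '"GET /articles/013680af/013680afp07s05.jpg HTTP/1.0" 200 15240 '
        '"http://www.grstgv.com/cgi-bin/pm/common/morePhotos.jsp'
        '?channel=Electronics&community=DIY&oid=24952&contentType=Feature" '
        '"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"')
entry = parse_entry(line)
print(entry["ip"], entry["status"], entry["uri"])
```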


Sample Log3 - MS/IIS

#Version: 1.0
#Fields: date time c-ip cs-authname s-ip s-sitename cs-method cs-uri-stem cs-uri-query c-version sc-status sc-bytes cs-bytes cs(User-Agent) cs(Cookie) sc(Referer)

2000-09-02 05:00:00 205.188.197.34 - 192.168.2.2 host.whobei.com GET /jetson/Detailed_Quote.html Symbol=MAYS&nocache=278099 HTTP/1.0 200 29396 368
    "Mozilla/4.0 (compatible; MSIE 5.0; AOL 5.0; Windows 98; DigExt)"
    "TUSER=1649012.60578.rt0; AccipiterId=00000000*Def; PortfDispPrefs=0"
    ""

2000-09-02 05:00:00 24.177.29.24 - 192.168.2.2 host.whobei.com GET /jetson/Real_Time_Quote.html Type=Real&Symbol=MTIC&nocache=943545020770 HTTP/1.1 302 288 473
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    "TUSER=256182.292920.rt0; AccipiterId=00000000*Def; PortfDispPrefs=0"
    ""

2000-09-02 05:00:00 205.188.198.154 - 192.168.2.2 host.whobei.com GET /jetson/Detailed_Quote.html Symbol=BBY&nocache=282499 HTTP/1.0 200 29368 461
    "Mozilla/4.0 (compatible; MSIE 5.0; AOL 4.0; Windows 95; DigExt)"
    "GUID=000E71DE876609B052ABE9630A001608; AccipiterId=00000000*Def"
    ""

Note - I have added line breaks and tabs to make this more readable; each log entry is actually recorded as a single line. IPs and names have been modified.


Server defaults

• Upon receiving a request which does not specify a specific resource, the server looks through an ordered list of ‘defaults’ until a match is found & returns that resource.

• However, the log entry records what was asked for, not what was returned.

Example: http://www.acme.com/ returns ‘default.htm’, but the log shows ‘/’.
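One consequence is that the same page can appear in the log under two names, ‘/’ and ‘/default.htm’. A possible workaround is to fold both into one name before counting. The sketch below is only valid if you know the server's actual list of defaults; the `DEFAULT_DOCUMENTS` tuple here is an assumed example, and `canonical_page` is a made-up helper name.

```python
# Assumption: the server's ordered list of defaults. Check the real
# server configuration before relying on this.
DEFAULT_DOCUMENTS = ("default.htm", "index.html")


def canonical_page(uri_stem, defaults=DEFAULT_DOCUMENTS):
    """Map '/dir/default.htm' to '/dir/' so both spellings count as one page."""
    directory, _, filename = uri_stem.rpartition("/")
    if filename.lower() in defaults:
        return directory + "/"
    return uri_stem


print(canonical_page("/default.htm"), canonical_page("/"))
```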


HTTP Status Codes

• 200 - OK

• 300’s Moved
– 301 permanently
– 302 temporarily
– 304 not modified

• 400’s Error
– 400 bad request
– 401 unauthorized
– 403 forbidden
– 404 not found

• 500’s Server Errors
– 500 internal error
– 503 too busy

Note: not all codes are shown. Bold are most important.
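When scanning a log it helps to group status codes by their first digit, as the slide does. A small sketch using the standard library's `http.HTTPStatus` for the official reason phrases; the group labels in `GROUPS` follow the slide's wording and `describe_status` is an illustrative name.

```python
from http import HTTPStatus

# Group labels follow the slide; the first digit of the code picks the group.
GROUPS = {2: "ok", 3: "moved", 4: "error", 5: "server error"}


def describe_status(code):
    """Return (group, official phrase) for a status code from the log."""
    code = int(code)  # log fields arrive as text
    group = GROUPS.get(code // 100, "other")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "unknown"  # not a code the standard library knows
    return group, phrase


for status in ("200", "302", "404", "503"):
    print(status, describe_status(status))
```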


Some Definitions

• Page View - one person looking at one page of information they have asked for.

• Visitor - a distinct individual who came to the site at least once during a specified period.

• Visit, or visitor session - activity of a specific visitor such that there were no ‘pauses’ of greater than (for example) 30 minutes.


Basic Questions

1. How many page views were there ?

2. How many visitors were there ?

3. How many site ‘visits’ were there ?

4. How long did people stay on the site ?

5. Where did people come to the site from ?

6. When people left the site where did they go ?

7. Who (where) are these people anyway ?


Page Views

• Count all lines in the logfile which
– 1) were not errors
– 2) were not images or javascript or style ..
– 3) were not ‘monitors’ or other automated processes.

• Challenge is knowing what not to count - what to ‘filter out’. Frequently hard, sometimes impossible.
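The three filters above can be sketched as a single test applied to each parsed log entry. Everything configurable here is an assumption that has to be tuned per site: the set of statuses that count, the list of file extensions, and the `monitor` marker used to spot automated processes. Entries are assumed to be dicts with `status`, `uri` and `agent` keys, a shape chosen for this example.

```python
import posixpath

COUNTED_STATUSES = {200, 304}  # assumption: everything else is not a page view
NON_PAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".js", ".css"}  # assumption
MONITOR_MARKERS = ("monitor",)  # hypothetical marker in the browser field


def is_page_view(entry):
    """Apply the three filters from the slide to one parsed log entry."""
    # 1) not an error
    if int(entry["status"]) not in COUNTED_STATUSES:
        return False
    # 2) not an image, script or style sheet
    path = entry["uri"].split("?", 1)[0]
    if posixpath.splitext(path)[1].lower() in NON_PAGE_EXTENSIONS:
        return False
    # 3) not a monitor or other automated process
    agent = entry.get("agent", "").lower()
    return not any(marker in agent for marker in MONITOR_MARKERS)


def count_page_views(entries):
    return sum(1 for entry in entries if is_page_view(entry))
```

A monitor that does not identify itself in the browser field will slip through, which is one reason the filtering is sometimes impossible to get exactly right.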


Visitors

• Count the number of distinct visitors .

• Challenge - How do you know which traffic is from one visitor and which is from another ?
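One common heuristic, sketched below, is to use a cookie as the visitor's identity when the log has one, and otherwise fall back to the combination of IP address and browser string. This is an approximation, not a definition: several people behind one proxy can share an IP, and one person can show up under several. The entry keys `cookie`, `ip` and `agent` are assumed names for this example.

```python
def visitor_key(entry):
    """Best available identity for the visitor behind one log entry."""
    cookie = entry.get("cookie")
    if cookie and cookie != "-":
        return ("cookie", cookie)
    # Fallback heuristic: same IP and same browser string = same visitor.
    return ("ip+agent", entry["ip"], entry.get("agent", "-"))


def count_visitors(entries):
    return len({visitor_key(entry) for entry in entries})
```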


Visits

• Start at the beginning and check the log entries in time order. Keep track of when you last saw traffic from a particular visitor. If this is the first time you have seen that visitor, it’s a new visit; if the last traffic from that visitor was more than 30* minutes ago, it’s also a new visit.

• Challenge - the same as for visitors, plus the choice of time interval, so it is ‘harder’.

* 30 minutes is the value normally used, but it can vary.
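The procedure above translates almost directly into code. A minimal sketch, assuming each hit has already been reduced to a (visitor, time) pair, where the visitor identity is whatever your visitor analysis produces; the name `count_visits` is made up for this example and the 30 minute timeout is the adjustable default.

```python
from datetime import datetime, timedelta


def count_visits(hits, timeout=timedelta(minutes=30)):
    """Count visits in an iterable of (visitor, datetime) pairs."""
    last_seen = {}  # visitor -> time of that visitor's most recent traffic
    visits = 0
    for visitor, when in sorted(hits, key=lambda hit: hit[1]):  # time order
        previous = last_seen.get(visitor)
        if previous is None or when - previous > timeout:
            visits += 1  # first time seen, or the pause was too long
        last_seen[visitor] = when
    return visits


day = datetime(2000, 8, 27)
hits = [
    ("A", day.replace(hour=10, minute=0)),
    ("B", day.replace(hour=10, minute=5)),
    ("A", day.replace(hour=10, minute=20)),
    ("A", day.replace(hour=10, minute=55)),  # 35 minutes after A's last hit
]
print(count_visits(hits))
```

Changing `timeout` changes the answer for the same log, which is why two tools can report different visit counts for the same traffic.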


How Long did Visitors Stay ?

• Do the ‘visit’ analysis & record the times for each visit - take average.

• Challenges - the same as for visits, plus visits that span the beginning/end of the period, and the impact of ‘automated monitors’, which may appear as a small number of VERY LONG visits.
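A sketch of the duration calculation, under the same assumptions as any visit analysis: hits are (visitor, time) pairs and a pause longer than the timeout starts a new visit. Note that a log can only measure from the first request of a visit to the last one, so a visit with a single request gets a length of zero. The function names are made up for this example.

```python
from datetime import datetime, timedelta


def visit_durations(hits, timeout=timedelta(minutes=30)):
    """Return the length of every visit as first-to-last request time."""
    open_visits = {}  # visitor -> [first request, latest request]
    durations = []
    for visitor, when in sorted(hits, key=lambda hit: hit[1]):
        visit = open_visits.get(visitor)
        if visit is not None and when - visit[1] <= timeout:
            visit[1] = when  # same visit continues
        else:
            if visit is not None:
                durations.append(visit[1] - visit[0])  # close the old visit
            open_visits[visitor] = [when, when]
    durations.extend(last - first for first, last in open_visits.values())
    return durations


def average_duration(durations):
    if not durations:
        return timedelta(0)
    return sum(durations, timedelta(0)) / len(durations)


day = datetime(2000, 8, 27)
hits = [
    ("A", day.replace(hour=10, minute=0)),
    ("B", day.replace(hour=10, minute=5)),
    ("B", day.replace(hour=10, minute=15)),
    ("A", day.replace(hour=10, minute=20)),
    ("A", day.replace(hour=11, minute=0)),  # starts a second visit for A
]
durations = visit_durations(hits)
print(average_duration(durations))
```

Because a few very long monitor ‘visits’ can pull the average up sharply, it may be worth looking at the sorted list of durations as well as the average.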


Where Did These Visitors Come From ?

• Use the referrer field

• Challenges - it tells you the page (no query), and it is sent by the browser => it may look different & is not always logged.
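A simple way to summarize the referrer field is to tally it by host, ignoring pages on your own site. The sketch below uses the standard `urllib.parse.urlsplit`; the function name `referring_hosts`, the `own_hosts` default and the labels for missing values are choices made for this example. Lowercasing the host is one small defence against referrers that ‘look different’.

```python
from collections import Counter
from urllib.parse import urlsplit


def referring_hosts(referers, own_hosts=("www.e-insights.com",)):
    """Tally external referrers by host name."""
    counts = Counter()
    for referer in referers:
        if not referer or referer == "-":
            counts["(none logged)"] += 1
            continue
        host = (urlsplit(referer).hostname or "").lower()
        if host in own_hosts:
            continue  # movement within the site, not an arrival
        counts[host or "(unparseable)"] += 1
    return counts


sample = [
    "-",
    "http://www.e-insights.com/buzzco/logs/logsummary.php3",
    "http://monitor2/",
    "http://www.grstgv.com/cgi-bin/pm/search/showroom_searchresult.jsp"
    "?ResultStart=20&ResultCount=10",
    "http://WWW.GRSTGV.com/cgi-bin/pm/common/morePhotos.jsp",
]
counts = referring_hosts(sample)
print(counts.most_common())
```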


Where did the People Go ?

• You cannot tell at all if they simply typed in another URL or picked a site from their history list or favorites list.

• You also cannot tell if they follow a normal link on your site to another site.

• You can track where they go if the site is coded to use ‘re-direct’ scripts.
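The ‘re-direct’ idea is that links to other sites point at a script on your own server, which logs the request and then sends the browser on. A minimal sketch of the two halves that matter for analysis: building such a link, and recovering the destination from a logged request. The path `/redirect` and the parameter name `to` are hypothetical, the redirect script itself is not shown, and the log must be configured to record query strings.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

REDIRECT_PATH = "/redirect"  # hypothetical script on your own server


def tracked_link(destination):
    """Build the link to put in the page instead of the direct URL."""
    return REDIRECT_PATH + "?" + urlencode({"to": destination})


def logged_destination(uri):
    """Recover the destination from a logged request, or None if not a redirect."""
    parts = urlsplit(uri)
    if parts.path != REDIRECT_PATH:
        return None
    values = parse_qs(parts.query).get("to")
    return values[0] if values else None


link = tracked_link("http://shopping.yahoo.com/")
print(link, logged_destination(link))
```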


Who are these people anyway ?

• You don’t know the people at all.

• You may know the ‘computers’ .

• Proxies, firewalls and ISPs all ‘hide’ computers behind a single IP.

• Unless - you use cookies, possibly combined with registration.


• Continued in Part 2

