Date post: 21-Dec-2015
ANALYSE AF WEBADFÆRD (ANALYSIS OF WEB BEHAVIOUR) - OAW

OAW – LECTURE 3

SUMMARY, LECTURE 2

• Users, Visits, Pageviews
• Reach, Acquisition rate, Conversion Rate, Retention Rate, Loyalty
• Abandonment, Attrition, Churn
• Recency, Frequency, Monetary value, Duration, Yield
• Acquisition cost, Conversion cost, Net Yield, Connect rate

WEB SERVERS

• A Web server is a program that, using the client/server model and the World Wide Web's Hypertext Transfer Protocol (HTTP), serves the files that form Web pages to Web users (whose computers contain HTTP clients that forward their requests). Every computer on the Internet that contains a Web site must have a Web server program. Two leading Web servers are Apache, the most widely-installed Web server, and Microsoft's Internet Information Server (IIS). Other Web servers include Novell's Web Server for users of its NetWare operating system and IBM's family of Lotus Domino servers, primarily for IBM's OS/390 and AS/400 customers.

whatis.com, Feb. 2002

WEB SERVERS

[Figure not reproduced in this transcript. Source: Netcraft.com, Feb. 2002]

THE WEB SERVER LOG

• An access log is a list of all the requests for individual files that people have requested from a Web site. These files will include the HTML files and their imbedded graphic images and any other associated files that get transmitted. The access log (sometimes referred to as the "raw data") can be analyzed and summarized by another program. In general, an access log can be analyzed to tell you:
– The number of visitors (unique first-time requests) to a home page
– The origin of the visitors in terms of their associated server's domain name (for example, visitors from .edu, .com, and .gov sites and from the online services)
– How many requests for each page at the site, which can be presented with the pages with most requests listed first
– Usage patterns in terms of time of day, day of week, and seasonally

whatis.com, Feb. 2002
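
The summaries listed above (requests per page, usage by time of day) can be sketched in a few lines of Python. This is a minimal sketch, not a full log analyzer: the sample lines are taken from the IT-C.DK example log in this lecture and shortened to the first seven fields, and the regular expression and the function name `summarize` are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed pattern: captures the hour from the timestamp and the path from the request.
LINE = re.compile(r'\[\d{2}/\w{3}/\d{4}:(\d{2}):\d{2}:\d{2} [+-]\d{4}\] "\w+ (\S+)')

def summarize(lines):
    """Count requests per page and per hour of day."""
    pages, hours = Counter(), Counter()
    for line in lines:
        m = LINE.search(line)
        if m is None:
            continue  # skip lines that do not look like log entries
        hours[m.group(1)] += 1
        pages[m.group(2)] += 1
    return pages, hours

sample = [
    '12.75.131.29 - - [22/Oct/2001:03:42:50 +0200] "GET / HTTP/1.1" 200 77',
    '12.75.131.29 - - [22/Oct/2001:03:42:51 +0200] "GET /Internet HTTP/1.1" 301 300',
    '12.75.131.29 - - [22/Oct/2001:03:43:10 +0200] "GET /Internet HTTP/1.1" 301 300',
    '80.62.239.98 - - [22/Oct/2001:04:12:28 +0200] "GET /people/tofte HTTP/1.1" 301 298',
]
pages, hours = summarize(sample)
print(pages.most_common())    # pages with most requests listed first
print(sorted(hours.items()))  # requests per hour of day
```

Counting visitors is harder than counting requests, since one IP address is not necessarily one user.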

THE WEB SERVER LOG

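• The log contents set the boundaries for any type of log analysis
• Common Log Format (CLF) and Extended Common Log Format (ECLF)

Data Element   CLF   ECLF
Host           yes   yes
Ident          yes   yes
Authuser       yes   yes
Time           yes   yes
Request        yes   yes
Status         yes   yes
Bytes          yes   yes
Referrer       no    yes
User-agent     no    yes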

AN EXAMPLE, IT-C.DK (Oct 2001)

212.97.237.62 - - [22/Oct/2001:02:22:24 +0200] "GET / HTTP/1.1" 304 0 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
212.97.237.62 - - [22/Oct/2001:02:22:30 +0200] "GET /Internet HTTP/1.1" 301 300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
131.202.130.143 - - [22/Oct/2001:02:27:57 +0200] "GET /research/bed/ HTTP/1.1" 200 9079 "http://google.yahoo.com/bin/query?p=Boolean+expression&hc=0&hs=0" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
131.202.130.143 - - [22/Oct/2001:02:27:58 +0200] "GET /research/bed/icons/Book.gif HTTP/1.1" 200 227 "http://www.itu.dk/research/bed/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
131.202.130.143 - - [22/Oct/2001:02:27:58 +0200] "GET /research/bed/icons/Tools.gif HTTP/1.1" 200 251 "http://www.itu.dk/research/bed/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
131.202.130.143 - - [22/Oct/2001:02:27:58 +0200] "GET /people/hra/hoved_logo4.gif HTTP/1.1" 200 3643 "http://www.itu.dk/research/bed/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
209.185.143.138 - - [22/Oct/2001:02:43:58 +0200] "HEAD /people/kfl/fltk-1.0.4-linux-intel.rpm HTTP/1.0" 200 0 "-" "Slurp.so/1.0 ([email protected]; http://www.inktomi.com/slurp.html)"
216.200.130.207 - - [22/Oct/2001:03:03:08 +0200] "HEAD /courses/W2/F2001/ HTTP/1.0" 200 0 "-" "Mozilla/2.0 (compatible; Ask Jeeves)"
216.200.130.207 - - [22/Oct/2001:03:03:10 +0200] "GET /courses/W2/F2001/ HTTP/1.0" 200 39357 "-" "Mozilla/2.0 (compatible; Ask Jeeves)"
133.11.12.2 - - [22/Oct/2001:03:04:57 +0200] "HEAD /people/birkedal/papers/index.html HTTP/1.0" 200 0 "-" "-"
213.122.171.29 - - [22/Oct/2001:03:11:40 +0200] "HEAD /people/birkedal/realizability/index.html HTTP/1.0" 200 0 "-" "Mozilla/3.0 (compatible)"
202.70.68.176 - - [22/Oct/2001:03:22:03 +0200] "GET / HTTP/1.1" 200 77 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
202.70.68.176 - - [22/Oct/2001:03:22:07 +0200] "GET /Internet HTTP/1.1" 301 300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
199.172.149.172 - - [22/Oct/2001:03:31:12 +0200] "GET /people/jm/ HTTP/1.0" 200 1539 "-" "ArchitextSpider"
199.172.149.173 - - [22/Oct/2001:03:39:14 +0200] "GET /research/ddd/ HTTP/1.0" 200 2342 "-" "ArchitextSpider"
12.75.131.29 - - [22/Oct/2001:03:42:35 +0200] "GET /connection HTTP/1.1" 404 272 "http://www1.umn.edu/twincities/directory/indexi.html" "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; AT&T CSM6.0)"
12.75.131.29 - - [22/Oct/2001:03:42:50 +0200] "GET / HTTP/1.1" 200 77 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; AT&T CSM6.0)"
12.75.131.29 - - [22/Oct/2001:03:42:51 +0200] "GET /Internet HTTP/1.1" 301 300 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; AT&T CSM6.0)"
12.75.131.29 - - [22/Oct/2001:03:43:10 +0200] "GET /Internet HTTP/1.1" 301 300 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; AT&T CSM6.0)"
12.75.131.29 - - [22/Oct/2001:03:43:13 +0200] "GET /Internet HTTP/1.1" 301 300 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; AT&T CSM6.0)"
61.9.192.142 - - [22/Oct/2001:03:45:24 +0200] "GET / HTTP/1.1" 304 0 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"
61.9.192.142 - - [22/Oct/2001:03:45:26 +0200] "GET /Internet HTTP/1.1" 301 300 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"
61.9.192.142 - - [22/Oct/2001:03:46:11 +0200] "POST /main/cgi-bin/people.cgi HTTP/1.1" 200 2206 "http://www.it-c.dk/English/find_person/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"
61.9.192.142 - - [22/Oct/2001:03:47:34 +0200] "GET /courses HTTP/1.1" 301 299 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"
66.7.131.158 - - [22/Oct/2001:03:47:57 +0200] "GET /courses/GP/F2000/index.html HTTP/1.0" 200 4393 "-" "Openfind data gatherer, Openbot/3.0+([email protected];+http://www.openfind.com.tw/robot.html)"
130.226.133.8 - - [22/Oct/2001:04:00:47 +0200] "GET /sysadm/software/lprng/printcap HTTP/1.0" 200 2012 "-" "Wget/1.6"
130.226.133.92 - - [22/Oct/2001:04:01:05 +0200] "GET /sysadm/software/lprng/printcap HTTP/1.0" 200 2012 "-" "Wget/1.6"
130.226.141.6 - - [22/Oct/2001:04:02:00 +0200] "GET /sysadm/software/lprng/printcap HTTP/1.0" 200 2012 "-" "Wget/1.6"
130.226.141.15 - - [22/Oct/2001:04:02:00 +0200] "GET /sysadm/software/lprng/printcap HTTP/1.0" 200 2012 "-" "Wget/1.6"
130.226.143.195 - - [22/Oct/2001:04:02:49 +0200] "GET /sysadm/software/lprng/printcap HTTP/1.0" 200 2012 "-" "Wget/1.6"
66.7.131.158 - - [22/Oct/2001:04:08:21 +0200] "GET /courses/GP/F2000/Eksempler/JavaSoftwareSolutions/chap07/Doodle.html HTTP/1.0" 200 255 "-" "Openfind data gatherer, Openbot/3.0+([email protected];+http://www.openfind.com.tw/robot.html)"
80.62.239.98 - - [22/Oct/2001:04:12:28 +0200] "GET /people/tofte HTTP/1.1" 301 298 "http://www.it-c.dk/Internet/itu/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
80.62.239.98 - - [22/Oct/2001:04:12:28 +0200] "GET /people/tofte/leftorange.htm HTTP/1.1" 200 1279 "http://www.it-c.dk/people/tofte/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
80.62.239.98 - - [22/Oct/2001:04:12:29 +0200] "GET /people/tofte/pics/spacer22.GIF HTTP/1.1" 404 286 "http://www.itu.dk/people/tofte/leftorange.htm" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
80.62.239.98 - - [22/Oct/2001:04:12:30 +0200] "GET /people/tofte/Tofte2.jpg HTTP/1.1" 200 10618 "http://www.it-c.dk/people/tofte/madscontents.htm" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
66.7.131.158 - - [22/Oct/2001:04:15:28 +0200] "GET /courses/GP/F2000/Eksempler/JavaSoftwareSolutions/chap11/MirroredPictures.html HTTP/1.0" 200 228 "-" "Openfind data gatherer, Openbot/3.0+([email protected];+http://www.openfind.com.tw/robot.html)"
66.7.131.158 - - [22/Oct/2001:04:23:33 +0200] "GET /courses/GP/F2000/Eksempler/Tekstfiler/places.txt HTTP/1.0" 200 90 "-" "Openfind data gatherer, Openbot/3.0+([email protected];+http://www.openfind.com.tw/robot.html)"
199.172.149.172 - - [22/Oct/2001:04:24:14 +0200] "GET /people/hra/notes-index.html HTTP/1.0" 200 1670 "-" "ArchitextSpider"
66.7.131.158 - - [22/Oct/2001:04:29:43 +0200] "GET /courses/GP/F2000/hold.html HTTP/1.0" 200 5212 "-" "Openfind data gatherer, Openbot/3.0+([email protected];+http://www.openfind.com.tw/robot.html)"
61.9.149.155 - - [22/Oct/2001:04:33:22 +0200] "GET /courses/W2/ssh.html HTTP/1.1" 200 2602 "-" "-"
203.58.38.86 - - [22/Oct/2001:04:36:34 +0200] "GET /~haas/GC/c-tut.html HTTP/1.0" 200 77 "http://www.student.dtu.dk/~c971714/GC/c-tut.html" "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)"
203.58.38.86 - - [22/Oct/2001:04:36:37 +0200] "GET /~haas/GC/c-tut.php HTTP/1.0" 200 24819 "http://www.itu.dk/~haas/GC/c-tut.html" "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)"
64.55.148.54 - - [22/Oct/2001:04:43:30 +0200] "GET /people/slauesen/ HTTP/1.0" 200 11173 "-" "Mozilla/2.0 (compatible; Ask Jeeves)"
64.55.148.54 - - [22/Oct/2001:04:47:21 +0200] "GET /main/projektboers.html HTTP/1.0" 200 75604 "-" "Mozilla/2.0 (compatible; Ask Jeeves)"
198.81.17.166 - - [22/Oct/2001:04:48:13 +0200] "POST /main/cgi-bin/people.cgi HTTP/1.0" 200 928 "http://www.it-c.dk/English/find_person/" "Mozilla/4.0 (compatible; MSIE 5.01; AOL 6.0; Windows 98)"
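
An entry in this format can be split into its nine data elements with a regular expression. The following is a minimal sketch under stated assumptions: the pattern and the names `ECLF` and `parse_line` are illustrative, the last two fields are treated as optional so that plain CLF lines also match, and quoted fields are assumed not to contain escaped quotes.

```python
import re

# Assumed pattern for CLF/ECLF entries; the referrer and user-agent group is optional.
ECLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<useragent>[^"]*)")?'
)

def parse_line(line):
    """Return a dict of the log fields, or None if the line does not match."""
    m = ECLF.match(line)
    return m.groupdict() if m else None

line = ('80.62.239.98 - - [22/Oct/2001:04:12:28 +0200] '
        '"GET /people/tofte/leftorange.htm HTTP/1.1" 200 1279 '
        '"http://www.it-c.dk/people/tofte/" '
        '"Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"')
entry = parse_line(line)
for name, value in entry.items():
    print(f"{name:10} {value}")
```

All values come back as strings; status and bytes would need converting before any arithmetic.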

HOST

• Fully qualified domain name of the client or its IP address if the name is unavailable

• The address to which the server’s response will be sent

• Reverse address lookup on the fly is possible, but in most cases it is performed while post-processing the log instead

• Important issues: dial-up connections (one user may appear under many addresses) and proxies (many users may share one address)
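
Reverse lookup during post-processing can be sketched as follows. This is an illustration only: `resolve_hosts` and `fake_lookup` are hypothetical names, the addresses and host name in the demonstration are made-up documentation values, and the real resolver used by default (`socket.gethostbyaddr` from the Python standard library) needs network access.

```python
import socket

def resolve_hosts(ips, lookup=socket.gethostbyaddr):
    """Map each distinct IP address to a host name, falling back to the IP."""
    names = {}
    for ip in ips:
        if ip in names:
            continue  # each address is looked up only once
        try:
            names[ip] = lookup(ip)[0]  # first element is the primary host name
        except OSError:
            names[ip] = ip  # no name available: keep the address
    return names

# Offline demonstration with a stand-in resolver (hypothetical data).
def fake_lookup(ip):
    table = {"192.0.2.1": "client.example.org"}
    if ip not in table:
        raise OSError("no reverse entry")
    return (table[ip], [], [ip])

print(resolve_hosts(["192.0.2.1", "192.0.2.2", "192.0.2.1"], lookup=fake_lookup))
```

Caching matters because the same address typically occurs on many log lines.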

IDENT

• Identifier supplied by client applications that support identd (identification daemon)

• Used with mail, FTP and IRC; rarely with HTTP

• Also referred to as RFC 931

AUTHUSER

• The authenticated user name (if user authentication is required for that file)

TIME

• Usually the time when the web server completed responding to the HTTP request

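• Format: DD/Mon/YYYY:HH:MM:SS +HHMM, where Mon is the three-letter month abbreviation and +HHMM is the offset from GMT (for example 22/Oct/2001:04:12:28 +0200)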
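
The time field can be turned into a timezone-aware value with the Python standard library. A minimal sketch, assuming an English (or default "C") locale so that month abbreviations such as "Oct" are recognised; `parse_log_time` is an illustrative name.

```python
from datetime import datetime, timezone

def parse_log_time(stamp):
    """Parse a log timestamp such as '22/Oct/2001:04:12:28 +0200'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

t = parse_log_time("22/Oct/2001:04:12:28 +0200")
print(t.isoformat())                           # local time with offset
print(t.astimezone(timezone.utc).isoformat())  # same instant in UTC
print(t.hour, t.weekday())                     # inputs for time-of-day and day-of-week patterns
```

Converting to UTC first makes entries from servers in different time zones comparable.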

REQUEST

• The actual request from the user client. Typically it looks like the following:
"GET /people/tofte/leftorange.htm HTTP/1.1"
• Different types of requests: GET, POST, HEAD
• Protocol version included (HTTP/1.1)
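
The request field consists of three space-separated parts: method, requested path and protocol version. A small sketch of splitting it, where `split_request` is an illustrative name and anything that does not have exactly three parts is treated as malformed:

```python
def split_request(request):
    """Split 'METHOD path PROTOCOL' into its three parts, or return None."""
    parts = request.split()
    if len(parts) != 3:
        return None  # malformed or empty request line
    method, path, protocol = parts
    return {"method": method, "path": path, "protocol": protocol}

print(split_request("GET /people/tofte/leftorange.htm HTTP/1.1"))
print(split_request("-"))
```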

STATUS

• A three-digit status code, which the server returns to the browser
– Five classes of codes: Information (100 series), Success (200 series), Redirect (300 series), Client error (400 series), Server error (500 series)
• Examples
– 200 OK, 302 Moved Temporarily (redirect), 401 Unauthorized, 403 Forbidden, 404 Not Found
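
Because the class is given by the first digit, status codes are easy to group. A minimal sketch; the labels and the name `status_class` are illustrative choices, and the sample codes are picked from the example log in this lecture:

```python
from collections import Counter

STATUS_CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirect",
    4: "Client error",
    5: "Server error",
}

def status_class(code):
    """Return the class of a status code based on its first digit."""
    return STATUS_CLASSES.get(int(code) // 100, "Unknown")

codes = [304, 301, 200, 200, 404, 200]
print(Counter(status_class(c) for c in codes))
```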

BYTES

• For GET requests: Number of bytes returned by the server to the client.
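
Summing the bytes field gives a rough estimate of a site's traffic. The sketch below is an illustration under assumptions: the function names are hypothetical, a "-" is treated as zero bytes, the sample values are the bytes fields of the first six entries of the example log in this lecture, and the figure of 0.1 hours is a made-up observation period. In Apache the logged size excludes response headers, so the result is an underestimate.

```python
def total_bytes(byte_fields):
    """Sum the bytes column; '-' means no body was sent."""
    return sum(int(b) for b in byte_fields if b != "-")

def monthly_gigabytes(bytes_seen, hours_covered, hours_per_month=30 * 24):
    """Scale the observed volume to a 30-day month (assumes the sample is typical)."""
    return bytes_seen / hours_covered * hours_per_month / 1e9

sample = ["0", "300", "9079", "227", "251", "3643"]
seen = total_bytes(sample)
print(seen)  # 13500
print(monthly_gigabytes(seen, hours_covered=0.1))
```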

REFERRER

• Indicates the page where the visitor was located when making the request
• Important for path analysis
• Can be used for referral schemes and for measuring banner effects etc.
• RFC 2068 (HTTP/1.1):
– Note: Because the source of a link may be private information or may reveal an otherwise private information source, it is strongly recommended that the user be able to select whether or not the Referer field is sent.
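
A referrer can be classified as missing, internal (useful for path analysis) or external (useful for measuring where visitors come from). This is a sketch under assumptions: `describe_referrer` is a hypothetical name, the list of the site's own host names is taken from the example log in this lecture, and the query parameter names `p` and `q` are guesses at how search engines pass search terms.

```python
from urllib.parse import urlsplit, parse_qs

def describe_referrer(referrer, own_hosts=("www.itu.dk", "www.it-c.dk")):
    """Classify a referrer as none, internal or external."""
    if referrer in ("", "-"):
        return {"kind": "none"}  # typed URL, bookmark, robot, or field suppressed
    parts = urlsplit(referrer)
    if parts.hostname in own_hosts:
        return {"kind": "internal", "path": parts.path}
    query = parse_qs(parts.query)
    terms = query.get("p") or query.get("q")  # assumed search-term parameters
    return {
        "kind": "external",
        "host": parts.hostname,
        "search_terms": terms[0] if terms else None,
    }

print(describe_referrer("http://google.yahoo.com/bin/query?p=Boolean+expression&hc=0&hs=0"))
print(describe_referrer("http://www.itu.dk/research/bed/"))
print(describe_referrer("-"))
```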

USERAGENT

• Browser name/version (operating system)
– "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
– "Mozilla/3.0 (Macintosh; I; PPC)"
• Note regarding Mozilla:
– Mozilla was Netscape Communications' nickname for Navigator, its Web browser, and, more recently, the name of an open source public collaboration aimed at making improvements to Navigator.
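
Since many clients announce themselves as "Mozilla", the user-agent string has to be inspected more closely to tell browsers from robots. A rough sketch, not a complete classifier: the hint list and the category labels are assumptions derived from the agents seen in the example log in this lecture, and `classify_agent` is an illustrative name.

```python
import re

MSIE = re.compile(r"MSIE (\d+\.\d+)")
# Assumed substrings that indicate a robot or an automated tool.
ROBOT_HINTS = ("spider", "slurp", "bot", "ask jeeves", "wget")

def classify_agent(agent):
    """Return a coarse label for a user-agent string."""
    if agent in ("", "-"):
        return "unknown"
    lowered = agent.lower()
    if any(hint in lowered for hint in ROBOT_HINTS):
        return "robot"
    m = MSIE.search(agent)
    if m:
        return f"MSIE {m.group(1)}"
    if agent.startswith("Mozilla/"):
        return "Mozilla (non-MSIE)"
    return "other"

for ua in ("Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)",
           "Mozilla/3.0 (Macintosh; I; PPC)",
           "Mozilla/2.0 (compatible; Ask Jeeves)",
           "ArchitextSpider"):
    print(classify_agent(ua), "<-", ua)
```

The robot check runs first because robots such as Ask Jeeves also present a "Mozilla (compatible; ...)" string.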

USERAGENT - STATISTICS

• Example from early 2002

Rank  Browser                    Share (%)
1     MSIE 5.0                   61.48
2     MSIE 6.0                   22.99
3     MSIE 5.5                   6.75
4     MSIE 4.0                   4.62
5     Unresolved: Java Enabled   0.87
6     Netscape 4.7               0.73
7     MSIE (AOL) 5.5             0.35
8     MSIE 3.0                   0.34
9     MSIE (AOL) 5.0             0.25
10    Netscape 4.5               0.24
21    Web TV                     0.02
25    Opera 5.1                  0.01

MORE OPTIONS

• Filename
• Time-to-serve
• IP address
• Server port
• URL-requested
• Cookie

THE QUIZ

1. The referrer indicates where in the world the user is located.
2. Apache installed on Windows 2000 is an open source web server.
3. A web server receives information from the client (the browser).
4. A web server sends information to the client (the browser).
5. Web server failures return a 30x status code.
6. It is possible to calculate an estimate of the website's traffic (e.g. GB per month) from the web server log.
7. One IP number in the web server log is by definition one user.
8. A line in a web server log file is at most 80 characters.
9. Microsoft has a market share of less than one third of all web servers in the world.
10. User agent information is part of the Common Logfile Format.

AN EXAMPLE

80.62.239.98 - - [22/Oct/2001:04:12:28 +0200] "GET /people/tofte/leftorange.htm HTTP/1.1" 200 1279 "http://www.it-c.dk/people/tofte/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

MORE INFORMATION

• Apache HTTP Server Documentation, Log Files
– http://httpd.apache.org/docs/logs.html
• Microsoft IIS Log Format
– http://www.microsoft.com/windows2000/en/server/iis/htm/core/iiabtlg.htm#MicrosoftIISLogFormat
• HTTP/1.1 Documentation
– http://www.w3.org/Protocols/rfc2068/rfc2068

FURTHER ISSUES

• Proxies
• Firewalls

