Page 1: Evaluating Web Server Log Analysis Tools

Evaluating Web Server Log Analysis Tools

David Strom

SD’98 2/13/98

Page 2: Evaluating Web Server Log Analysis Tools

SD'98 (c) David Strom, Inc. 2

Summary

• Examine different log files
• What you can and can’t learn from your logs
• Pros and cons of various tools

Page 3: Evaluating Web Server Log Analysis Tools

Different types of log files

• Access
• Error
• Referral
• Other

Page 4: Evaluating Web Server Log Analysis Tools

Access logs

• Domain name
• Date, time
• Server command processed and result
• URL of visitor
• Bytes transmitted

Page 5: Evaluating Web Server Log Analysis Tools

Sample access log data

• rm258.fav.usu.edu [31/May/1995:09:03:23 +0600] "GET /NEI.html HTTP/1.0" 302 396
• rm258.fav.usu.edu [31/May/1995:09:03:28 +0600] "GET /xculture/nei/nei.html HTTP/1.0" 200 2114
• rm258.fav.usu.edu [31/May/1995:09:03:30 +0600] "GET /gifs/sedlbutton.gif HTTP/1.0" 200 1336
• 129.71.83.161 [31/May/1995:09:20:32 +0600] "GET /RELs.html HTTP/1.0" 304 0
• Leslie-Francis.tenet.edu [31/May/1995:09:36:06 +0600] "GET / HTTP/1.0" 200 1867
• ls973.ulib.albany.edu [31/May/1995:09:40:52 +0600] "GET /viii1.html HTTP/1.0" 404 244
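
As a rough illustration of how such lines break down into the fields listed on the "Access logs" slide, here is a minimal Python sketch. It assumes the abbreviated layout shown in the sample (host, timestamp, request, status, bytes); the names `LINE_RE` and `parse_access_line` are hypothetical and not part of any tool discussed in the talk.

```python
import re
from datetime import datetime

# Assumed layout, taken from the sample lines on this slide:
#   host [timestamp] "METHOD path PROTOCOL" status bytes
LINE_RE = re.compile(
    r'^(?P<host>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_access_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Some servers write "-" when no body was sent; treat that as zero bytes.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    # %z accepts numeric offsets such as +0600.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry

sample = 'rm258.fav.usu.edu [31/May/1995:09:03:23 +0600] "GET /NEI.html HTTP/1.0" 302 396'
entry = parse_access_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["bytes"])
```

Lines that do not fit the pattern return None rather than raising, so a real log with stray entries can still be scanned end to end.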

Page 6: Evaluating Web Server Log Analysis Tools

Errors reported in your logs

• Clients that time out (or leave in frustration!)
• Scripts that don’t produce any output
• Server bugs
• User authentication or configuration problems

Page 7: Evaluating Web Server Log Analysis Tools

Sample error log data

• [Thu May 30 07:25:32 1996] send timed out for bamberg.sedl.org
• [Thu May 30 07:57:41 1996] send timed out for kenya.sedl.org
• [Thu May 30 08:23:11 1996] send timed out for ppp092.kyoto-inet.or.jp
• [Thu May 30 09:15:52 1996] access to /usr/local/www/htdocs/scimath/compass/vol03 failed for 170.211.67.51, reason: File does not exist
• [Thu May 30 09:57:56 1996] send timed out for dd10-048.compuserve.com
• [Thu May 30 10:47:25 1996] read timed out for ncia110b.ncia.net
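
One way to get a quick overview of entries like these is to sort them into a few buckets. The sketch below is only an illustration: the bucket names and the `classify` helper are hypothetical, and the substring checks are based solely on the messages in the sample above.

```python
import re
from collections import Counter

# Assumed layout from the sample: "[timestamp] free-form message"
ERROR_RE = re.compile(r'^\[(?P<time>[^\]]+)\] (?P<message>.*)$')

def classify(message):
    """Map an error message to a coarse bucket (hypothetical categories)."""
    if "timed out" in message:
        return "timeout"
    if "File does not exist" in message:
        return "missing file"
    return "other"

lines = [
    "[Thu May 30 07:25:32 1996] send timed out for bamberg.sedl.org",
    "[Thu May 30 09:15:52 1996] access to /usr/local/www/htdocs/scimath/compass/vol03 "
    "failed for 170.211.67.51, reason: File does not exist",
    "[Thu May 30 10:47:25 1996] read timed out for ncia110b.ncia.net",
]

counts = Counter()
for line in lines:
    match = ERROR_RE.match(line)
    if match:
        counts[classify(match.group("message"))] += 1

print(counts)
```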

Page 8: Evaluating Web Server Log Analysis Tools

Referral logs

• Who links to your site?
• Who downloads your pages?

Page 9: Evaluating Web Server Log Analysis Tools

Sample referral log data

• http://www.isisnet.com/ ->/change/welcome.html
• http://www.ipl.org/ref/RR/EDU/Research-rr.html ->/welcome.html
• http://www.tenet.edu/snp/main.html ->/policy/networks/toc.html
• http://www.tenet.edu/new/main.html ->/policy/networks/toc.html
• http://guide-p.infoseek.com/NS/Titles?qt=teacher+training ->/resources/SCIMAST/announcement.html
• http://www.tenet.edu/new/main.html ->/policy/networks/toc.html
• http://www.tenet.edu/new/main.html ->/policy/networks/toc.html
• http://www.nwrel.org/national/regional-labs.html ->/welcome.html
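
To answer "who links to your site?" from entries like these, the referring sites and the pages they point at can simply be tallied. The following Python sketch assumes the "referrer -> target" layout of the sample; `REFERRAL_RE`, `by_site` and `by_target` are illustrative names only.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed layout from the sample: "<referring URL> -> <local path>",
# with optional whitespace on either side of the arrow.
REFERRAL_RE = re.compile(r'^(?P<referrer>\S+)\s*->\s*(?P<target>\S+)$')

referrals = [
    "http://www.isisnet.com/ ->/change/welcome.html",
    "http://www.ipl.org/ref/RR/EDU/Research-rr.html ->/welcome.html",
    "http://www.tenet.edu/snp/main.html ->/policy/networks/toc.html",
    "http://www.tenet.edu/new/main.html ->/policy/networks/toc.html",
    "http://guide-p.infoseek.com/NS/Titles?qt=teacher+training ->/resources/SCIMAST/announcement.html",
    "http://www.tenet.edu/new/main.html ->/policy/networks/toc.html",
    "http://www.tenet.edu/new/main.html ->/policy/networks/toc.html",
    "http://www.nwrel.org/national/regional-labs.html ->/welcome.html",
]

by_site = Counter()    # referring host name -> number of referrals
by_target = Counter()  # local page -> number of referrals

for line in referrals:
    match = REFERRAL_RE.match(line)
    if match is None:
        continue  # skip lines that do not follow the assumed layout
    by_site[urlsplit(match.group("referrer")).hostname] += 1
    by_target[match.group("target")] += 1

print(by_site.most_common(3))
print(by_target.most_common(3))
```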

Page 10: Evaluating Web Server Log Analysis Tools

Common log format

• Output by most standard servers
• Needed by most third-party log analyzers
• hoohoo.ncsa.uiuc.edu/docs/setup/httpd/Overview.html
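
Because analyzers expect this format, it can be worth checking a few lines of a log before handing it to a tool. The sketch below assumes the usual common log format field order (host, ident, authuser, bracketed date, quoted request, status, bytes). The `full` line is constructed for illustration by adding the two "-" placeholder fields to a sample entry; it is not taken from a real log.

```python
import re

# Common log format, as generally documented:
#   host ident authuser [date] "request" status bytes
CLF_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def is_common_log_format(line):
    """Return True if the line has all seven common log format fields."""
    return CLF_RE.match(line.strip()) is not None

# Illustrative line with the ident and authuser placeholders present.
full = 'rm258.fav.usu.edu - - [31/May/1995:09:03:23 +0600] "GET /NEI.html HTTP/1.0" 302 396'
# Abbreviated line without those two fields.
short = 'rm258.fav.usu.edu [31/May/1995:09:03:23 +0600] "GET /NEI.html HTTP/1.0" 302 396'

print(is_common_log_format(full), is_common_log_format(short))
```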

Page 11: Evaluating Web Server Log Analysis Tools

Extended/custom log formats

• Log whatever you wish in whatever order you wish

• Useful if you will read them regularly!
• But can’t work with the analyzers
• Now in IIS v4, NSCP v3, others.

Page 12: Evaluating Web Server Log Analysis Tools

What you can learn from your log files

• Hits per day
• Domain origins
• The path people take in and around your web
• Problem areas

Page 13: Evaluating Web Server Log Analysis Tools

HITS

• (How Idiots Track Success)
• Nobody uses this word anymore
• Doesn’t really measure individual users, just access
• Caching servers and proxies mess up these statistics
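
A small worked example shows why raw hits overstate traffic. The Python sketch below uses the six requests from the sample access log; the set of image extensions is an assumption made for illustration, and even the unique-host count is only a rough proxy for visitors, since proxies and shared machines blur it.

```python
from posixpath import splitext

# (host, requested path) pairs taken from the sample access log data.
requests = [
    ("rm258.fav.usu.edu", "/NEI.html"),
    ("rm258.fav.usu.edu", "/xculture/nei/nei.html"),
    ("rm258.fav.usu.edu", "/gifs/sedlbutton.gif"),
    ("129.71.83.161", "/RELs.html"),
    ("Leslie-Francis.tenet.edu", "/"),
    ("ls973.ulib.albany.edu", "/viii1.html"),
]

# Assumption: requests for these extensions are embedded images, not pages.
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}

hits = len(requests)  # every request counts as a hit
page_requests = sum(
    1 for _, path in requests
    if splitext(path)[1].lower() not in IMAGE_EXTENSIONS
)
unique_hosts = len({host.lower() for host, _ in requests})

print(hits, page_requests, unique_hosts)
```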

Page 14: Evaluating Web Server Log Analysis Tools

Domain origins

• Where users are coming from -- sometimes
• Just because they are from ibm.net doesn’t mean they work at IBM!
• Forgotten accounts, friends and family using the account
• Hacked user names
• Proxies don’t help here either
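
A domain-origin report usually amounts to grouping host names by their trailing labels. As a hedged sketch, the code below groups hosts from the sample logs by top-level domain and keeps unresolved IP addresses in their own bucket; the `origin` helper and the bucket label are hypothetical, and the result says nothing about who is actually behind each host.

```python
import ipaddress
from collections import Counter

def origin(host):
    """Return the top-level domain of a host name, or a marker for raw IPs."""
    try:
        ipaddress.ip_address(host)
        return "(unresolved IP)"
    except ValueError:
        pass  # not an IP address, so treat it as a host name
    return host.lower().rstrip(".").rsplit(".", 1)[-1]

# Host names that appear in the sample access and error logs.
hosts = [
    "rm258.fav.usu.edu",
    "129.71.83.161",
    "Leslie-Francis.tenet.edu",
    "ls973.ulib.albany.edu",
    "dd10-048.compuserve.com",
    "ppp092.kyoto-inet.or.jp",
]

counts = Counter(origin(h) for h in hosts)
print(counts.most_common())
```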

Page 15: Evaluating Web Server Log Analysis Tools

The path people take in and around your web

• Search engines help sometimes
• Which search site was the most popular front door
• Who links to you and why
• Is there a pattern or a random walk?
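
When a visitor arrives from a search engine, the referring URL often carries the search terms in its query string. The sketch below pulls them out with the standard library. The `QUERY_PARAMS` table is an assumption built from the single Infoseek entry in the sample referral log (parameter `qt`); other engines would need their own entries.

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: domain -> query parameter holding the search terms.
# Only the Infoseek entry is grounded in the sample referral log.
QUERY_PARAMS = {"infoseek.com": "qt"}

def search_terms(referrer):
    """Return the search phrase in a referrer URL, or None if not recognised."""
    parts = urlsplit(referrer)
    host = parts.hostname or ""
    for domain, param in QUERY_PARAMS.items():
        if host == domain or host.endswith("." + domain):
            values = parse_qs(parts.query).get(param)
            return values[0] if values else None
    return None

print(search_terms("http://guide-p.infoseek.com/NS/Titles?qt=teacher+training"))
print(search_terms("http://www.tenet.edu/new/main.html"))
```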

Page 16: Evaluating Web Server Log Analysis Tools

Problem areas to deal with

• Broken links (locally)
• Broken outbound links
• Time outs (sunspots?)
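
Local broken links show up in the access log as requests answered with status 404. A minimal sketch, assuming the request and status appear as in the sample access log data; `REQUEST_RE` and `broken_paths` are illustrative names.

```python
import re
from collections import Counter

# Pull the requested path and the status code out of an access log line.
REQUEST_RE = re.compile(r'"\S+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')

def broken_paths(lines):
    """Count requests per path that the server answered with 404."""
    missing = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("status") == "404":
            missing[match.group("path")] += 1
    return missing

lines = [
    'Leslie-Francis.tenet.edu [31/May/1995:09:36:06 +0600] "GET / HTTP/1.0" 200 1867',
    'ls973.ulib.albany.edu [31/May/1995:09:40:52 +0600] "GET /viii1.html HTTP/1.0" 404 244',
    '129.71.83.161 [31/May/1995:09:20:32 +0600] "GET /RELs.html HTTP/1.0" 304 0',
]

missing = broken_paths(lines)
print(missing.most_common())
```

Paths that appear often in this list are good candidates for a redirect or a fix on the page that links to them.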

Page 17: Evaluating Web Server Log Analysis Tools

What you can’t learn from your logs

• Who are these people, anyway?
  – No specific user names
  – Is it a bot or a real human?
• How long did they view a page?
  – Most people don’t spend much time on your web
  – Where did they go visit next?

Page 18: Evaluating Web Server Log Analysis Tools

What technologies are available?

• Built-in analyzer tools
• Sites that capture user info
• Secure sites with registration
• Build your own from perl
• Third-party tools

Page 19: Evaluating Web Server Log Analysis Tools

Built-in tools

• WebSite, website.ora.com
• IIS with Site Server, www.microsoft.com/iis
• Netscape servers, www.netscape.com
• Easy to use but limited

Page 20: Evaluating Web Server Log Analysis Tools

WebSite Professional v2

• Win NT, 95
• Best web server for learning about logs, best docs
• QuickStats module for instant analysis:
  – single report but nice set of information
  – shows requests and unique hosts for today and the last two days
  – IP addresses of visitors, average requests/hour

Page 21: Evaluating Web Server Log Analysis Tools

IIS Site Server

• NT Server v4 w/SP3 only
• Lots of preconfigured reports
• Two versions, Express and Full (customized reports)
• backoffice.microsoft.com/products/siteserver/express/

Page 22: Evaluating Web Server Log Analysis Tools

Netscape v3 web servers

• Various NT, Unix versions
• Reports for a few variables but nothing too extensive
• Best to use a third-party tool here

Page 23: Evaluating Web Server Log Analysis Tools

Sites that capture user info

• WebCounter, www.digits.com -- third-party hit counter

• Someone else does the programming and debugging

• But beyond your control

Page 24: Evaluating Web Server Log Analysis Tools

Secure sites with registration

• You know your users
• But many won’t register, or forget their passwords
• Requires scripting, database integration, more maintenance

Page 25: Evaluating Web Server Log Analysis Tools

Build your own from perl

• Needs some in-house support
• Works best with Unix-based webs
• Examples:
  – refstats, members.aol.com/htmlguru/refstats.html
  – surfreport, bienlogic.com/SurfReport/

Page 26: Evaluating Web Server Log Analysis Tools

Third-party tools

• WebTracker, www.CQMInc.com/webtrack
• WebTrends, www.webtrends.com
• net.Genesis, www.netgen.com
• MarketWave, www.marketwave.com
• IIS Assistant, www.go-iis.com

Page 27: Evaluating Web Server Log Analysis Tools

Third-party tools (cont’d)

• Can make very pretty reports
• Customizable
• Make sure they support your particular log format
• Not that expensive, mostly run on Windows

