2004
221-239
222 Electronic Commerce Studies
A Cache Document Replacement Mechanism Considering Document Added-Value
Toly Chen
Feng Chia University
Ju-Chi Huang
Chaoyang University of Science and Technology
Abstract
Most traditional cache document replacement policies focus on
efficiency: cached documents are replaced according to their last access
times, request frequencies, and sizes. However, the goal of an EC website
is to make a profit, so the documents cached for a user should also
encourage that user to purchase. For this reason, this study proposes a
new cache document replacement policy that also considers the value
added to the website by caching a document for a user. The new policy
therefore considers four attributes of every document: last access time,
request frequency, size, and value added to the website. To evaluate the
value added to the EC website by caching a document, data mining and web
mining techniques are applied. First, the aggregation tree of all users’
browsing paths is analyzed to find the relationship between each web
page and the payment page. The strength of this relationship is then
assessed with the association rule and the absolute-value distance.
Based on these two measures, the added value of every cached document is
derived from an equation modified from the traditional GDSF cache
replacement policy. When cache replacement occurs, the document with the
smallest added value is removed from the cache first. In this way, web
pages that are more frequently associated with consumption behavior are
more likely to remain cached, and users who exhibit consumption behavior
(i.e., consumers) receive better caching performance than other users.
An experimental EC website was constructed in this study to generate
data for evaluating the effectiveness of the proposed mechanism. In
terms of “customer hit rate” and “customer byte hit rate”, the new
mechanism outperformed all traditional cache document replacement
policies.
Keywords: E-Commerce, Cache Document Replacement Policy, Added
Value
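[Figure: flowchart with a Yes/No decision; labels not recoverable]
[Equation (1): not recoverable from the extracted text; surviving symbols: P, W, S, b, Z, n, H, C]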
$$K(P) = L + \frac{F(P) \times C(P)}{S(P)} \qquad (2)$$
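Equation (2) has the shape of the GDSF priority key, K(P) = L + F(P) x C(P) / S(P). The following is a minimal sketch of how such a key could drive replacement, not the authors' implementation. It assumes the usual GDSF reading of the symbols (L is an inflation value set to the key of the last evicted document, F(P) the request count, C(P) the retrieval cost, S(P) the size); the names `Doc`, `GDSFCache`, and `gdsf_key` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    name: str
    size: float        # S(P), e.g. in Kb
    freq: int = 1      # F(P), number of requests seen so far
    cost: float = 1.0  # C(P), retrieval cost (assumed 1 by default)
    key: float = 0.0   # K(P), priority key


def gdsf_key(L, freq, cost, size):
    """K(P) = L + F(P) * C(P) / S(P)."""
    return L + freq * cost / size


class GDSFCache:
    """Hypothetical GDSF cache: evicts the document with the smallest key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0.0
        self.L = 0.0       # inflation value (assumed standard GDSF aging)
        self.docs = {}

    def request(self, name, size, cost=1.0):
        """Return True on a cache hit, False on a miss."""
        doc = self.docs.get(name)
        if doc is not None:
            doc.freq += 1
            doc.key = gdsf_key(self.L, doc.freq, doc.cost, doc.size)
            return True
        if size > self.capacity:
            return False   # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.docs.values(), key=lambda d: d.key)
            self.L = victim.key          # age the cache
            self.used -= victim.size
            del self.docs[victim.name]
        doc = Doc(name, size, 1, cost)
        doc.key = gdsf_key(self.L, doc.freq, doc.cost, doc.size)
        self.docs[name] = doc
        self.used += size
        return False
```

Because L only grows as documents are evicted, a document that stops being requested eventually falls below newer arrivals even if it was once popular.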
[Figure: flowchart; surviving labels: cache, log, WUM, default]
ID  A  B  C  T
1   0  1  0  1
2   1  1  1  1
3   1  0  1  0
4   1  1  1  1
5   0  1  1  1
6   0  1  1  1
7   1  0  1  0
Path  Pages visited
1     ABCAT
2     ACDAD
3     ACT
4     ABC
[Figure: rule-mining flowchart with the steps: Apriori frequent item sets (level = 2); select the rules matching the meta rule; confidence > threshold?]
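The rule-mining steps can be illustrated on the seven-row 0/1 table of pages A, B, C and the payment page T. The sketch below is a simplified stand-in for the Apriori step rather than the authors' procedure: it only forms level-2 item sets {page, T}, keeps rules of the form page -> T (taken here to be the meta rule, which is an assumption), and filters by confidence. The names `ROWS`, `support_count`, and `rules_to_target`, and the threshold value in the usage note, are hypothetical; no minimum-support pruning is applied.

```python
# Session table: 1 means the page was visited in that session.
ROWS = {
    1: {"A": 0, "B": 1, "C": 0, "T": 1},
    2: {"A": 1, "B": 1, "C": 1, "T": 1},
    3: {"A": 1, "B": 0, "C": 1, "T": 0},
    4: {"A": 1, "B": 1, "C": 1, "T": 1},
    5: {"A": 0, "B": 1, "C": 1, "T": 1},
    6: {"A": 0, "B": 1, "C": 1, "T": 1},
    7: {"A": 1, "B": 0, "C": 1, "T": 0},
}


def support_count(rows, items):
    """Number of sessions in which every page in `items` was visited."""
    return sum(1 for row in rows.values() if all(row[i] == 1 for i in items))


def rules_to_target(rows, target="T", min_conf=0.0):
    """Confidence of each rule page -> target that exceeds min_conf."""
    pages = sorted({p for row in rows.values() for p in row} - {target})
    rules = {}
    for page in pages:
        base = support_count(rows, (page,))
        if base == 0:
            continue  # page never visited, rule undefined
        conf = support_count(rows, (page, target)) / base
        if conf > min_conf:
            rules[page] = conf
    return rules
```

For page C, six sessions contain C and four of those also contain T, so `rules_to_target(ROWS)["C"]` is 4/6, which rounds to the 0.67 used in the worked example for V(C). With an illustrative threshold of 0.6, only the rules B -> T and C -> T survive.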
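$$R(P) = \sum_{i=1}^{n} \frac{1}{\left| PP_i - TP_i \right|} \qquad (3)$$
where $PP_i$ is the position of page P in browsing path i, $TP_i$ is the position of the payment page T in browsing path i, and the sum runs over the browsing paths that contain both P and T. For example,
$$R(C) = \frac{1}{|3 - 5|} + \frac{1}{|2 - 3|} = 1.5.$$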
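The worked value R(C) = 1.5 can be reproduced from the four browsing paths ABCAT, ACDAD, ACT and ABC. The sketch below is one possible reading, not the authors' code: it assumes positions are 1-based, that each page is a single character, that only paths containing both the page and T contribute, and that a page visited more than once is located by its first occurrence (the source does not say which occurrence is used). The name `relation_strength` is hypothetical.

```python
PATHS = ["ABCAT", "ACDAD", "ACT", "ABC"]


def relation_strength(page, paths, target="T"):
    """Sum of 1 / |position(page) - position(target)| over qualifying paths."""
    total = 0.0
    for path in paths:
        if page in path and target in path:
            pp = path.index(page) + 1    # first occurrence (assumption)
            tp = path.index(target) + 1
            if pp != tp:                 # skip the target page itself
                total += 1.0 / abs(pp - tp)
    return total
```

For page C, path 1 contributes 1/|3 - 5| and path 3 contributes 1/|2 - 3|, so `relation_strength("C", PATHS)` gives 1.5; paths 2 and 4 never reach T and contribute nothing.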
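[Equation (4): the added value $V(P)$ of document P, expressed in terms of $C_{PT}$, $R(P)$, $L$, $F(P)$ and $S(P)$; the operators between these terms are not recoverable from the extracted text.]
[Worked example for page C: $V(C) = 1.797$, obtained from the values 0.67, 1.5, 0.3824, 1 and 250.]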
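Replacement by added value can be sketched as follows. The exact arrangement of terms in Equation (4) is not settled here, so the value function below is an assumption: it keeps the shape of the GDSF key and substitutes the product of the rule confidence and the relationship strength R(P) for the cost term. It is not claimed to reproduce the worked value for V(C). The names `added_value` and `pick_victim` are hypothetical.

```python
def added_value(L, freq, conf, strength, size):
    """ASSUMED form: V(P) = L + F(P) * conf * R(P) / S(P).

    conf     -- confidence of the rule P -> T
    strength -- R(P), the absolute-value-distance measure
    """
    return L + freq * conf * strength / size


def pick_victim(values):
    """Name of the cached document with the smallest added value."""
    return min(values, key=values.get)


# Illustrative cache contents: name -> (freq, conf, strength, size in Kb).
docs = {
    "B": (1, 1.00, 1.0 / 3.0, 250),
    "C": (1, 0.67, 1.5, 250),
    "D": (1, 0.00, 0.0, 250),   # never leads to the payment page
}
values = {name: added_value(0.0, *attrs) for name, attrs in docs.items()}
victim = pick_victim(values)
```

Under this assumed form, a page that never leads to the payment page has the smallest value among equally sized, equally popular documents and is evicted first, which matches the behavior described in the abstract.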
[Figure: hit rate (%) versus cache size (25 to 150 Kb); legend: LFU, LRU, SIZE, GDSF]
[Figure: byte hit rate (%) versus cache size (25 to 150 Kb); legend: LFU, LRU, SIZE, GDSF]
[Figure: customer hit rate (%) versus cache size (25 to 150 Kb); legend: LFU, LRU, SIZE, GDSF]
[Figure: customer byte hit rate (%) versus cache size (25 to 150 Kb); legend: LFU, LRU, SIZE, GDSF]
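The four measures plotted in these figures can be computed from a request log along the following lines. This is a sketch under stated assumptions: hit rate and byte hit rate use their standard definitions (fraction of requests, and fraction of bytes, served from the cache), and the customer variants are assumed to be the same ratios restricted to requests issued by customers. The log layout and the `customer` flag are hypothetical; the source does not show how customers are identified in the log.

```python
def rates(log, customers_only=False):
    """Return (hit rate, byte hit rate) for the selected requests."""
    rows = [r for r in log if r["customer"] or not customers_only]
    total_bytes = sum(r["size"] for r in rows)
    if not rows or total_bytes == 0:
        return 0.0, 0.0
    hits = sum(1 for r in rows if r["hit"])
    hit_bytes = sum(r["size"] for r in rows if r["hit"])
    return hits / len(rows), hit_bytes / total_bytes


# Illustrative log: one entry per request (sizes in Kb).
LOG = [
    {"hit": True,  "size": 10, "customer": True},
    {"hit": False, "size": 30, "customer": True},
    {"hit": True,  "size": 20, "customer": False},
    {"hit": False, "size": 40, "customer": False},
]

hit_rate, byte_hit_rate = rates(LOG)
cust_hit_rate, cust_byte_hit_rate = rates(LOG, customers_only=True)
```

The two byte-based measures weight each request by document size, so a policy that favors small documents can score well on hit rate while scoring poorly on byte hit rate.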
Policy      Hit rate  Byte hit rate  Customer hit rate  Customer byte hit rate  Sum
(proposed)  2         3              1                  2                       8
GDSF        1         2              2                  3                       8
LFU         4         1              5                  4                       14
LRU         5         4              4                  1                       14
SIZE        3         5              3                  5                       16

Policy      Hit rate  Byte hit rate  Customer hit rate  Customer byte hit rate  Sum
(proposed)  3         3              1                  1                       8
GDSF        1         2              2                  3                       8
LFU         4         1              5                  4                       14
SIZE        2         5              3                  5                       15
LRU         5         4              4                  2                       15

Policy      Hit rate  Byte hit rate  Customer hit rate  Customer byte hit rate  Sum
GDSF        2         2              2                  2                       8
(proposed)  3         4              1                  1                       9
SIZE        1         3              3                  5                       12
LFU         4         1              4                  3                       12
LRU         5         5              5                  4                       19

Policy  10%  30%  50%
LRU     Y    Y    Y
LFU     Y    Y    Y
SIZE    Y    Y    ---
GDSF    Y    Y    Y

Policy  10%  30%  50%
LRU     ---  Y    Y
LFU     Y    Y    Y
SIZE    Y    Y    Y
GDSF    Y    Y    Y

Policy  10%  30%  50%
LRU     Y    Y    Y
LFU     Y    Y    Y
SIZE    Y    ---  ---
GDSF    ---  ---  ---

Policy  10%  30%  50%
LRU     Y    Y    Y
LFU     ---  ---  ---
SIZE    Y    Y    Y
GDSF    ---  ---  ---
Arlitt, M., Cherkasova, L., Dilley, J., Friedrich, R. and Jin, T., Evaluating
Content Management Techniques for Web Proxy Caches,
Proceedings of the 2nd Workshop on Internet Server Performance,
Atlanta GA, 1999.
Arlitt, M., Friedrich, R. and Jin, T., Performance Evaluation of Web Proxy
Cache Replacement Policies, Performance Evaluation, Vol. 39, 2000,
pp. 149-164.
Arlitt, M., Krishnamurthy, D. and Rolia, J., Characterizing the Scalability
of a Large Web-based Shopping System, ACM Transactions on
Internet Technology, Vol. 1, No. 1, 2001, pp. 44-69.
Candan, K. S., Li, W.-S., Luo, Q., Hsiung, W.-P., and Agrawal, D.,
Enabling Dynamic Content Caching for Database-driven Web Sites,
Proceedings of the ACM SIGMOD 2001, Santa Barbara, California,
USA, 2001.
Cao, P. and Irani, S., Cost-aware World Wide Web Proxy Caching
Algorithms, Proceedings of the USENIX Symposium on Internet
Technologies and Systems (USITS), Monterey CA, 1997.
Chen, M. S., Park, J. S. and Yu, P. S., Data Mining for Path Traversal
Patterns in a Web Environment, Proceedings of the 16th International
Conference on Distributed Computing Systems, 1996, pp. 385-392.
Cooley, R., Mobasher, B. and Srivastava, J., Web Mining: Information
and Pattern Discovery on the World Wide Web, ICTAI’97, 1997.
Law, A. M. and Kelton, W. D., Simulation Modeling and Analysis,
McGraw Hill, 2nd edition, 1991.
Lorenzetti, P., Rizzo, L. and Vicisano, L., Replacement Policies for a
Proxy Cache, IEEE/ACM Transactions on Networking, Vol. 8, No. 2,
2000, pp. 158-170.
Spiliopoulou, M. and Faulstich, L., WUM: A Tool for Web Utilization
Analysis, EDBT Workshop WebDB'98, Valencia, Spain, 1998.
Srikant, R. and Yang, Y., Mining Web Logs to Improve Website
Organization, The Tenth International Conference on World Wide
Web, 2001.
Wang, J., A Survey of Web Caching Schemes for the Internet, ACM
SIGCOMM Computer Communication Review, Vol. 29, 1999, pp.
36-46.
Williams, S., Abrams, M., Standridge, C. R., Abdulla, G. and Fox, E. A.,
Removal Policies in Network Caches for World-Wide Web
Documents, Proceedings of Sigcomm'96, 1996.
Wooster, R. P. and Abrams, M., Proxy Caching that Estimates Page Load
Delays, Proceedings of the 6th International WWW Conference,
1997.