Distributed, Real-Time Computation of Community Preferences

Thomas Lutkenhouse, Michael L. Nelson, Johan Bollen

Old Dominion University, Computer Science Department

Norfolk, VA 23529 USA

{lutken,mln,jbollen}@cs.odu.edu

HT 2005 - Sixteenth ACM Conference on Hypertext and Hypermedia

6-9 Sept. 2005, Salzburg, Austria

Distributed, Real-Time Computation of Community Preferences

• Distributed: no central state
• Real-Time: changes are immediate
• Computation: not CS if you don’t compute
• Community Preferences: not personalization

Outline

• Review of technologies
  – buckets
  – Hebbian learning
  – previous results
• Experiment design
• Results
• Lessons learned
• Conclusions

Non-evolution of DL Objects

[Figure; labels: ". . .", RSS, SRW, "!?"]

Buckets

• Premise: repositories come and go, but the objects should endure
• Began as part of NASA DL research
  – focus on digital preservation
  – implementation of the “Smart Objects, Dumb Archives” (SODA) model for digital libraries
    • CACM 2001, doi.acm.org/10.1145/374308.374342
    • D-Lib, dx.doi.org/10.1045/february2001-nelson

Smart Objects

• Responsibilities generally associated with the repository are “pushed down” into the stored object
  – T&C, maintenance, logging, pagination & display, etc.
• Aggregate:
  – metadata
  – data
  – methods to operate on the metadata/data
• API examples
  – http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=getMetadata&type=all
  – http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listMethods
  – http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listPreference
  – (cheat) http://www.cs.odu.edu/~mln/teaching/cs595-f03/bucket/bucket.xml
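The API examples above share one pattern: the bucket's own URL, a `method` argument, and optional extra arguments. A minimal sketch of composing such calls, assuming nothing beyond what the example URLs show. The helper name `bucket_url` is hypothetical and not part of the bucket software.

```python
from urllib.parse import urlencode

def bucket_url(bucket, method, **args):
    """Compose a bucket method call such as ?method=getMetadata&type=all."""
    query = urlencode({"method": method, **args})
    return f"{bucket}?{query}"

BUCKET = "http://www.cs.odu.edu/~mln/teaching/cs595-f03/"
print(bucket_url(BUCKET, "getMetadata", type="all"))
print(bucket_url(BUCKET, "listMethods"))
```

Because the method name travels in the query string, a client only needs the bucket's URL to discover (`listMethods`) and invoke everything the object offers.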

Examples

• 1.6.X bucket
  – http://ntrs.nasa.gov/
  – http://www.cs.odu.edu/~mln/phd/
• 2.0 buckets
  – http://www.cs.odu.edu/~mln/teaching/cs595-f03/
  – http://www.cs.odu.edu/~lutken/bucket/
• 3.0 buckets (under development)
  – http://beaufort.cs.odu.edu:8080/
  – uses MPEG-21 DIDLs
    • cf. http://www.dlib.org/dlib/november03/bekaert/11bekaert.html

Hebbian Learning

Implementation issues:
- gather log files
  - problematic when spread across servers/domains
- determine a timeout T for session reconstruction
  - typically 5 min
- compute links & weights
- update the network periodically
  - typically monthly
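The session-reconstruction step above can be sketched as follows. This is an illustration only: the log record layout `(user, timestamp_in_seconds, document)` and the function name are assumptions, and the default of 300 seconds simply mirrors the "typically 5 min" value of T.

```python
from collections import defaultdict

def reconstruct_sessions(log, timeout=300):
    """Split (user, timestamp_seconds, document) records into sessions.

    A new session starts when the same user is idle for more than `timeout`.
    """
    by_user = defaultdict(list)
    for user, ts, doc in sorted(log, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, doc))
    sessions = []
    for hits in by_user.values():
        current = [hits[0][1]]
        for (prev_ts, _), (ts, doc) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)  # gap too long: close the session
                current = []
            current.append(doc)
        sessions.append(current)
    return sessions

log = [("u1", 0, "A"), ("u1", 60, "B"), ("u1", 1000, "C"),
       ("u2", 10, "A"), ("u2", 20, "C")]
print(reconstruct_sessions(log))
```

Each resulting session is an ordered list of documents, which is the input a periodic link-and-weight computation would consume.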

Previous, Log-Based Recommendation Implementations

• LANL Journal Recommendations
  – collection analysis based on journal readership patterns
    • D-Lib Magazine, dx.doi.org/10.1045/june2002-bollen
• NASA Technical Report Server
  – compared recommendations with those generated by a vector space model (VSM)
    • WIDM 2004, doi.acm.org/1031453.1031480
• Open Video Project
  – generated recommendations for videos (little descriptive metadata)
    • JCDL 2005, doi.acm.org/1065385.1065472

Hebbian Learning with Bucket Methods

http://a?method=display&referer=http://a&redirect=http://b?method=display%26referer=http://a

http://b?method=display&referer=http://b&redirect=http://a?method=display%26redirect=http://c?method=display%26referer=http://b
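One reading of the first URL above is that a link from bucket `a` to bucket `b` is routed through `a` itself: `a` receives the request, can record the traversal, and then redirects the browser to `b`. The sketch below rebuilds that URL under this interpretation. The function name is hypothetical, and the chained double redirect of the second URL is not reproduced.

```python
from urllib.parse import quote, urlsplit, parse_qs

def traversal_link(source, target):
    """Link from `source` to `target` that passes through `source` first."""
    onward = f"{target}?method=display&referer={source}"
    # Percent-encode the nested URL so its '&' does not split the outer query.
    return (f"{source}?method=display&referer={source}"
            f"&redirect={quote(onward, safe=':/?=')}")

link = traversal_link("http://a", "http://b")
print(link)
# The source bucket can recover the onward URL from the redirect argument.
print(parse_qs(urlsplit(link).query)["redirect"][0])
```

Encoding the inner `&` as `%26` is what keeps the onward URL intact as a single `redirect` argument.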

Experiment

• Spin Magazine’s “Top 50 Rock Bands of All Time”
  – something other than reports, journals, etc.
  – harvest allmusic.com for metadata for all LPs by the 50 bands (total = 800 LPs)
• Maintain hierarchical arrangement
  – 1 artist → N albums
• Initialize the network of 800 LPs with each LP randomly linked to 5 other LPs
• Send out email invitations to browse the network
  – have them explore, and then examine the resulting network
  – users not informed about the workings of the network
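The random initialization described above can be sketched as below. The integer item identifiers, the function name, and the fixed seed are assumptions made for illustration; the starting weight of 0.5 is the "initial" value listed on the "Hierarchical, Weighted Links" slide.

```python
import random

def init_network(n_items=800, links_per_item=5, initial_weight=0.5, seed=0):
    """Return {item: {neighbor: weight}} with random outgoing links."""
    rng = random.Random(seed)
    network = {}
    for item in range(n_items):
        others = [j for j in range(n_items) if j != item]  # no self-links
        chosen = rng.sample(others, links_per_item)        # distinct targets
        network[item] = {j: initial_weight for j in chosen}
    return network

net = init_network()
print(len(net), len(net[0]))
```

In the experiment each LP is a bucket holding its own links, so the single dictionary here only stands in for state that is actually spread across the objects.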

Display of LPs

Hierarchical, Weighted Links

weights
- initial: 0.5
- frequency: 1.0
- symmetry: 0.5
- transitivity: 0.3

<structural>
  <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/121/">
    <metadata>
      <descriptive>
        <title>Terrapin Station, Capital Centre, Landover, MD, 3/15/90</title>
      </descriptive>
      <administrative />
    </metadata>
  </element>
  <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/11/">
    <metadata>
      <descriptive>
        <title>Jealousy/Progress</title>
      </descriptive>
      <administrative />
    </metadata>
  </element>
  <element wt="3" id="~http://www.cs.odu.edu/~lutken/bucket/434/">
    <metadata>
      <descriptive>
        <title>Nevermind</title>
      </descriptive>
      <administrative />
    </metadata>
  </element>
  <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/130/">
    <metadata>
      <descriptive>
        <title>Technical Ecstasy</title>
      </descriptive>
      <administrative />
    </metadata>
  </element>
  …….
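The slide gives the weight increments but not the update rules. The sketch below assumes the usual three Hebbian-style rules: frequency reinforces the traversed link a→b, symmetry reinforces the reverse link b→a, and transitivity reinforces a→c after a user goes a→b→c. Treat both the rule definitions and the function name as assumptions. Weights start at zero here for clarity, whereas links in the experiment start at 0.5.

```python
from collections import defaultdict

FREQUENCY, SYMMETRY, TRANSITIVITY = 1.0, 0.5, 0.3  # increments from the slide

def apply_path(weights, path):
    """Update link weights in place for one user's path of visited buckets."""
    for i in range(1, len(path)):
        a, b = path[i - 1], path[i]
        weights[(a, b)] += FREQUENCY   # the link that was followed
        weights[(b, a)] += SYMMETRY    # the reverse link
        if i >= 2:
            # Two-step shortcut from the bucket visited before `a` to `b`.
            weights[(path[i - 2], b)] += TRANSITIVITY
    return weights

weights = defaultdict(float)
apply_path(weights, ["A", "B", "C"])
print(dict(weights))
```

Under these assumed rules, the `wt="3"` on the Nevermind element would correspond to the initial 0.5 plus several reinforcements.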

Respondents

• August 2004 - October 2004
• 160 respondents
  – self-identify at the beginning; exit survey at the end
  – 1200 bucket-to-bucket traversals (7.5 average traversals per session)

Table 1. Profile of the 160 Volunteers

Nationality: 1 Brazil, 1 Portugal, 4 Canada, 10 UK, 20 Belgium, 124 US
Sex: 124 Male, 36 Female
Age: High 72, Low 7, Average 37
Domain Knowledge Self-Assessment (1=low, 7=high): Average = 4.0
Assessment of link utility (1=low, 5=high): Average = 2.8

How to Evaluate the Resulting Network?

• Compute network analysis metrics:
  – PageRank
  – Degree Centrality
  – Weighted Degree Centrality
• Compare the results to:
  – Other “expert” lists (VH1, DigitalDreamDoor, original Spin Magazine list)
  – Artist / LP best seller according to RIAA
  – Artist / LP Amazon sales rank
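The three metrics listed above can be sketched in plain Python on a toy network. The slides do not say whether degree means in-degree or out-degree; in-degree is assumed here because every LP starts with the same number of outgoing links. The PageRank variant (weighted, damping 0.85, power iteration) is likewise an assumption, and every link target must also appear as a key of `net`.

```python
def in_degree(net, weighted=False):
    """In-degree per node; with weighted=True, the sum of incoming weights."""
    scores = {node: 0.0 for node in net}
    for links in net.values():
        for target, weight in links.items():
            scores[target] += weight if weighted else 1.0
    return scores

def pagerank(net, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration."""
    n = len(net)
    rank = {node: 1.0 / n for node in net}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in net}
        for node, links in net.items():
            total = sum(links.values())
            if total == 0:
                # Dangling node: spread its rank evenly over all nodes.
                for target in net:
                    new[target] += damping * rank[node] / n
            else:
                for target, weight in links.items():
                    new[target] += damping * rank[node] * weight / total
        rank = new
    return rank

# Toy network, invented for illustration: {source: {target: weight}}
net = {"A": {"B": 1.0, "C": 0.5}, "B": {"C": 3.0}, "C": {"A": 0.5}}
print(in_degree(net))
print(in_degree(net, weighted=True))
print(pagerank(net))
```

Sorting the nodes by any of these scores yields a ranking that can then be compared with the external lists.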

Expert Rankings

• No correlation with:
  – VH1 artist list
  – DigitalDreamDoor list
  – original Spin Magazine list (!)

(critics don’t agree with each other, or the record buying public)
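The slides do not say which correlation statistic was used. As one plausible way to compare two ranked lists, the sketch below computes Spearman's rank correlation from scratch. The function names are hypothetical, and the inputs must not be constant, since that would make the denominator zero.

```python
def ranks(values):
    """Rank positions (1 = largest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented scores for four artists: network metric vs. an expert list score.
print(spearman([0.9, 0.4, 0.7, 0.1], [3, 1, 4, 2]))
```

A value near zero would correspond to the "no correlation" finding reported above.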

RIAA Results

• RIAA best-seller data covered only
  – 51/800 LPs
  – 14/50 artists

(critics don’t buy records!)

Figure 6. Probability of albums being best-sellers. [Chart; x-axis "Rank": All albums, Top 50%, Top 20%, Top 10%, Top 5%, Top 2%, Top 1%; y-axis "Probability of being a best seller", 0% to 40%; series: Degree Centrality, Weighted Degree Centrality, PageRank]

Figure 7. Probability of artists being best-sellers. [Chart; x-axis "Rank": All Bands, Top 50%, Top 20%, Top 10%; y-axis "Probability of being a bestseller", 0% to 90%; series: Degree Centrality, Weighted Degree Centrality, PageRank]

*RIAA sales caveat
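The quantity plotted in Figures 6 and 7 can be read as the share of RIAA best sellers among the top-ranked fraction of items under a given metric. A minimal sketch of that computation follows. The function name, the rounding of the cutoff, and the toy data are all assumptions, not the authors' procedure.

```python
def bestseller_rate(scores, bestsellers, top_fraction):
    """Share of best sellers among the top `top_fraction` of items by score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, round(len(ranked) * top_fraction))
    top = ranked[:cutoff]
    return sum(item in bestsellers for item in top) / len(top)

# Invented scores for ten items; three of them are "best sellers".
scores = {"a": 9.0, "b": 7.0, "c": 4.0, "d": 2.0, "e": 1.0,
          "f": 0.5, "g": 0.4, "h": 0.3, "i": 0.2, "j": 0.1}
bestsellers = {"a", "c", "j"}
for fraction in (1.0, 0.5, 0.2, 0.1):
    print(fraction, bestseller_rate(scores, bestsellers, fraction))
```

With `top_fraction=1.0` the rate reduces to the overall share of best sellers, which for the data above is 51/800 ≈ 6.4% of LPs and 14/50 = 28% of artists.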

Amazon Sales Rank

• No correlation with individual LP sales rank…
• …but correlated with mean artist sales rank
  – similar to RIAA data
  – interpretation: popular artists often have obscure LPs

Relatedness(?)

Lessons Learned

• While the subject matter was interesting, it was oriented toward music geeks
  – i.e., no actual music was delivered to the users (intellectual property considerations)
  – more traversals needed
• Random initial starting points were difficult to overcome
  – “cold start problem” - pre-seed the links according to some criteria?
  – weights did not decay over time/traversals
• Choosing only artists from Spin Magazine may have pre-filtered the response
  – choose artists from Down Beat (Jazz), Vibe (Urban), Music City News (Country), etc.
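The experiment did not decay weights. Purely as an illustration of what a decay step could look like, and not something the authors implemented, the sketch below scales every link weight by a constant factor each time it is applied. The factor of 0.95, the floor, and the function name are assumptions.

```python
def decay_links(links, factor=0.95, floor=0.0):
    """Return a copy of {target: weight} with every weight scaled by `factor`."""
    return {target: max(floor, weight * factor)
            for target, weight in links.items()}

links = {"Nevermind": 3.0, "Technical Ecstasy": 0.5}
for _ in range(10):  # e.g. once per access to the bucket
    links = decay_links(links)
print(links)
```

Applying the decay on each access would keep the computation local to the bucket, in line with the "no central state" goal, and would let unused random initial links fade over time.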

Conclusions

• Can build a network of smart objects featuring adaptive, hierarchical links constructed in real-time without central state
  – network is created without latency and with computations amortized over individual accesses
• Experimental testbed with popular music LP metadata shown to approach sales rank of artists, not LPs

