
Peer to Peer Information Retrieval

Date post: 13-Apr-2017
Upload: chetan-sundarde
Transcript
Page 1: Peer to Peer Information Retrieval

Peer to Peer Information Retrieval

By Chetan K. Sundarde (@CHETANSUNDARDE)

https://www.linkedin.com/in/chetansundarde

May 3, 2023 1P2PIR

Page 2: Peer to Peer Information Retrieval

Outline:
  Peer to Peer Network
  Information Retrieval
  Peer to Peer Information Retrieval (P2PIR)
  Peer to peer IR system architectures
  Techniques used in IR in P2P networks
  Basic algorithms used in P2PIR
  Evaluation techniques used in P2PIR
  Challenges
  Conclusion
  References

Page 3: Peer to Peer Information Retrieval

Peer To Peer Network
  A collection of distributed systems
  Computers leave and join the network frequently
  Each computer acts as a server and a client simultaneously

Three tasks that every peer-to-peer network performs:
  Searching: querying and getting a list of document references.
  Locating: resolving a document reference to a concrete location of the full document.
  Transferring: downloading the document.
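The three tasks can be sketched in a few lines. This is only an illustrative in-memory toy, not how any particular network implements them: the `Peer` class, the function names and the sample documents are all hypothetical, and a real network would send these requests over the wire.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical in-memory peer: maps document ids to document text."""
    name: str
    documents: dict = field(default_factory=dict)


def search(peers, term):
    """Searching: return references (document ids) whose text mentions the term."""
    return sorted({doc_id
                   for peer in peers
                   for doc_id, text in peer.documents.items()
                   if term.lower() in text.lower()})


def locate(peers, doc_id):
    """Locating: resolve a document reference to the peers holding the full document."""
    return [peer for peer in peers if doc_id in peer.documents]


def transfer(peer, doc_id):
    """Transferring: download the document from a chosen peer."""
    return peer.documents[doc_id]


peers = [
    Peer("p1", {"d1": "books on computer networks"}),
    Peer("p2", {"d2": "network routing in P2P networks"}),
]
refs = search(peers, "routing")          # list of matching document references
holders = locate(peers, refs[0])         # peers that store the first reference
content = transfer(holders[0], refs[0])  # full text of that document
```

Keeping the three steps as separate functions mirrors the slide: a search result is only a reference, and the document itself moves in the last step.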

Page 4: Peer to Peer Information Retrieval

Applications of P2P

  Information Retrieval
  File Sharing: Gnutella, Napster, BitTorrent, etc.

Page 5: Peer to Peer Information Retrieval

Information Retrieval
  Information retrieval is the field dealing with the structure, analysis, organization, storage, searching and retrieval of information.
  It searches for relevant documents on the basis of user input.

[Figure: IR retrieval process. Labels: document collection, information need, query, IR retrieval, answer list.]

Page 6: Peer to Peer Information Retrieval

Comparison between File Sharing and Information Retrieval

                        File Sharing        Information Retrieval
  Application           Locating            Searching
  Index: content        File identifiers    Document content
  Index: size           Small               Large
  Data exchange: unit   File                Search result
  Data exchange: size   Megabyte+           Kilobyte (small)

P2PIR combines elements of file sharing networks and federated information retrieval.

Page 7: Peer to Peer Information Retrieval

Peer to Peer Information Retrieval (P2PIR)
  Searching in peer-to-peer networks
  Each peer shares its information with other peers
  A peer searches for information by sending queries to its peers
  Queries are routed to one or many other peers
  The query result is provided in the form of an index

Page 8: Peer to Peer Information Retrieval

Peer to peer IR system architectures
  Based on the relationship between peers:
    o Cooperative system
    o Uncooperative system
  Based on the network structure:
    o Centralized network
    o Structured architecture
    o Unstructured architecture
  Based on the task performed in the P2P network:
    o Centralized Global Index
    o Distributed Global Index
    o Strict Local Indices
    o Aggregated Local Indices

Page 9: Peer to Peer Information Retrieval

Peer-to-Peer architectures used in IR

[Figure: four panels showing where the index lives, with peers labelled G (global index) or L (local index): Central Global Index, Distributed Global Index, Aggregated Local Index, Strict Local Index.]
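To make the difference between strict and aggregated local indices concrete, here is a minimal sketch. It assumes each peer keeps a simple inverted index and that a hypothetical super-peer merges the indices of the peers attached to it; the names `build_local_index`, `aggregate` and the sample documents are illustrative only.

```python
from collections import defaultdict


def build_local_index(documents):
    """Inverted index for one peer: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def aggregate(local_indices):
    """Index held by a super-peer: term -> set of (peer name, document id)."""
    merged = defaultdict(set)
    for peer_name, index in local_indices.items():
        for term, doc_ids in index.items():
            merged[term].update((peer_name, doc_id) for doc_id in doc_ids)
    return merged


local_indices = {
    "p1": build_local_index({"d1": "books on computer networks"}),
    "p2": build_local_index({"d2": "network routing in P2P networks"}),
}

# Strict local indices: the query has to be evaluated at every peer.
strict_hits = {name: index.get("networks", set())
               for name, index in local_indices.items()}

# Aggregated local indices: one lookup at the super-peer covers its peers.
super_peer_index = aggregate(local_indices)
aggregated_hits = super_peer_index["networks"]
```

In the same toy terms, a central global index would keep the merged structure on a single server, and a distributed global index would split it by term across the peers.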

Page 10: Peer to Peer Information Retrieval

Algorithm used in P2PIR: Statistical IR algorithms

Vector Space Model (VSM)
  Document A: "books on computer networks"
  Document B: "network routing in P2P networks"
  Query Q: "computer network"
  Each element of the vector corresponds to the importance of the term in the document.
  Retrieved documents are ranked by the similarity between the document vector and the query vector.

  vocabulary   book   computer   network   routing
  VA           0.5    0.5        0.8       0
  VB           0      0          0.9       0.6
  VQ           0      0.5        0.8       0

  Similarity with VQ (inner product): 0.89 for VA, 0.72 for VB, so Document A ranks first.
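The two scores on this slide can be reproduced as inner products of the document and query vectors. The sketch below assumes the slide's weights read as VA = (0.5, 0.5, 0.8, 0), VB = (0, 0, 0.9, 0.6) and VQ = (0, 0.5, 0.8, 0) over the vocabulary book, computer, network, routing. The cosine variant is added for comparison and is not from the slide.

```python
import math

# Term order: book, computer, network, routing (weights as read from the slide).
VA = [0.5, 0.5, 0.8, 0.0]
VB = [0.0, 0.0, 0.9, 0.6]
VQ = [0.0, 0.5, 0.8, 0.0]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Inner product normalised by both vector lengths."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Inner-product scores: about 0.89 for A and 0.72 for B, as on the slide.
scores = {"A": dot(VA, VQ), "B": dot(VB, VQ)}
ranking = sorted(scores, key=scores.get, reverse=True)

# Length-normalised alternative (an addition for comparison).
cosine_scores = {"A": cosine(VA, VQ), "B": cosine(VB, VQ)}
```

Both measures rank Document A above Document B for this query; cosine similarity is commonly preferred when documents differ a lot in length.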

Page 11: Peer to Peer Information Retrieval

Algorithm used in P2PIR: Statistical IR algorithms

Latent Semantic Indexing (LSI)
  SVD (singular value decomposition) is applied to the terms-by-documents matrix, mapping the document vectors Va, Vb to semantic vectors V'a, V'b.
    o Reduces dimensionality
    o Discovers word semantics, e.g. Cat <-> Pet, Bus <-> Travel
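In standard notation (the symbols A, U, Σ, V and k are the usual textbook ones and do not appear on the slide), the SVD step and the mapping of a document vector Va to its semantic vector V'a can be written as:

```latex
A = U \Sigma V^{\top}, \qquad
A_k = U_k \Sigma_k V_k^{\top}, \qquad
V'_a = \Sigma_k^{-1} U_k^{\top} V_a
```

Here A is the terms-by-documents matrix, k is the number of singular values kept (much smaller than the vocabulary size), and U_k, Σ_k, V_k are the truncated factors. Queries are mapped with the same formula, so that related terms such as "cat" and "pet" can match even when they never co-occur in the query and the document.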

Page 12: Peer to Peer Information Retrieval

Algorithm used in P2PIR…

Distributed Hash Table (DHT)
  A method of hash table lookup over a decentralized distributed network
  Key-value pairs are stored in the DHT at the node responsible for the key (structured architecture)
  Any node in the DHT can then efficiently retrieve the value by providing its key
  Example keys: Kd = hash("books on computer networks"), Kq = hash("computer network")
  File sharing examples: Napster (central index) and BitTorrent (DHT-based peer lookup); DHT designs include CAN, Chord, etc.
  Extended with content-based search:
    o Full-text retrieval
    o Content-based image retrieval
    o Content-based music retrieval, etc.
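A toy DHT shows why plain key lookup is not enough for retrieval and has to be extended with content-based search. The sketch assumes a small identifier ring on which each key is stored at the first node whose identifier follows the key's hash. The class `ToyDHT`, the node names and the 16-bit ring are illustrative choices, not the design of CAN or Chord.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 16  # small identifier space, chosen only for readability


def ring_id(text):
    """Hash a string onto the identifier ring."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class ToyDHT:
    def __init__(self, node_names):
        # Nodes sorted by ring position; each node gets its own key-value store.
        self.ring = sorted((ring_id(name), name) for name in node_names)
        self.store = {name: {} for name in node_names}

    def responsible_node(self, key):
        """First node at or after the key's position, wrapping around the ring."""
        ids = [node_id for node_id, _ in self.ring]
        position = bisect_left(ids, ring_id(key)) % len(self.ring)
        return self.ring[position][1]

    def put(self, key, value):
        self.store[self.responsible_node(key)][key] = value

    def get(self, key):
        return self.store[self.responsible_node(key)].get(key)


dht = ToyDHT(["peer-a", "peer-b", "peer-c", "peer-d"])
dht.put("books on computer networks", "location of document A")

exact = dht.get("books on computer networks")  # same key, same hash: found
partial = dht.get("computer network")          # different key: nothing found
```

The query "computer network" hashes to a different key than the document title, so an exact-match DHT returns nothing for it. Full-text systems built on DHTs typically work around this by storing one entry per term rather than per document.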

Page 13: Peer to Peer Information Retrieval

P2P Information Retrieval Techniques

Unstructured
  o Blind search: BFS, RBFS (e.g. Gnutella)
  o Blind search: Random Walk
  o Indexing: Routing Indices
  o Clustering: Semantic Searching (e.g. SON)

Structured
  o Clustering: pSearch
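The two blind-search techniques for unstructured networks can be compared on a small example. This is a sketch under simple assumptions: the overlay graph, the stored documents and the function names are hypothetical, substring matching stands in for real query evaluation, and flooding is bounded by a time-to-live (TTL) hop count.

```python
import random
from collections import deque

# Hypothetical overlay: peer -> neighbours, and peer -> locally stored documents.
NEIGHBOURS = {
    "p1": ["p2", "p3"],
    "p2": ["p1", "p4"],
    "p3": ["p1", "p4"],
    "p4": ["p2", "p3", "p5"],
    "p5": ["p4"],
}
DOCUMENTS = {
    "p1": [], "p2": [], "p3": [],
    "p4": ["network routing"],
    "p5": ["computer networks"],
}


def matches(peer, term):
    """Documents stored at a peer whose text contains the query term."""
    return [doc for doc in DOCUMENTS[peer] if term in doc]


def flood_search(start, term, ttl):
    """BFS flooding: forward the query to all neighbours until the TTL runs out."""
    visited, hits = {start}, []
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        hits.extend((peer, doc) for doc in matches(peer, term))
        if remaining == 0:
            continue
        for neighbour in NEIGHBOURS[peer]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return hits, len(visited)  # results and number of peers contacted


def random_walk_search(start, term, max_steps, rng):
    """Random walk: forward the query to one randomly chosen neighbour per step."""
    peer, hits = start, []
    for _ in range(max_steps + 1):
        hits.extend((peer, doc) for doc in matches(peer, term))
        if hits:
            break
        peer = rng.choice(NEIGHBOURS[peer])
    return hits


flood_hits, peers_contacted = flood_search("p1", "network", ttl=2)
walk_hits = random_walk_search("p1", "network", max_steps=10, rng=random.Random(0))
```

Flooding contacts every peer within the TTL radius, which finds results quickly but costs many messages. The random walk sends one message per step and may miss results if it runs out of steps.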

Page 14: Peer to Peer Information Retrieval

Evaluation in P2P IR

Recall (Are all the relevant documents retrieved?)
  o The fraction of the documents relevant to the query that are successfully retrieved
  o Recall = number of relevant documents retrieved / total number of relevant documents in the collection

Precision (Are the retrieved documents relevant?)
  o The fraction of retrieved documents that are relevant to the query
  o Precision = number of relevant documents retrieved / number of documents retrieved

[Figure: the Relevant and Retrieved sets overlap; the intersection is the retrieved relevant documents.]
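Both measures follow directly from the retrieved and relevant sets. The function below is a minimal sketch; the function name and the document ids are made up for illustration, and returning 0.0 for an empty set is a convention chosen here, not something the slides specify.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    retrieved_relevant = retrieved & relevant
    precision = len(retrieved_relevant) / len(retrieved) if retrieved else 0.0
    recall = len(retrieved_relevant) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 5 relevant in the collection, 3 in common.
precision, recall = precision_recall(
    retrieved={"d1", "d2", "d3", "d7"},
    relevant={"d1", "d2", "d3", "d4", "d5"},
)
```

In this example three of the four returned documents are relevant (precision 0.75), and three of the five relevant documents were found (recall 0.6).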

Page 15: Peer to Peer Information Retrieval

Evaluation Techniques in P2P IR…

F-Score / F-measure
  o Harmonic mean of precision and recall.

Hits per Query
  o Average number of distinct relevant documents discovered per search query.
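Both measures are short to compute. The sketch below assumes the balanced F-measure (equal weight on precision and recall), and the sample values are hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def hits_per_query(relevant_hits_by_query):
    """Average number of distinct relevant documents found per query."""
    if not relevant_hits_by_query:
        return 0.0
    distinct_counts = [len(set(hits)) for hits in relevant_hits_by_query]
    return sum(distinct_counts) / len(distinct_counts)


f1 = f_measure(0.75, 0.6)  # two thirds
# Three queries; the duplicate hit for d2 in the first is counted once.
avg_hits = hits_per_query([["d1", "d2", "d2"], ["d3"], []])
```

Counting distinct documents matters in a P2P setting because the same document may be returned by several peers that hold replicas.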

Page 16: Peer to Peer Information Retrieval

Applications of P2P Information Retrieval in the Real World

YaCy (www.yacy.net)
  o Local index entries are injected into a distributed global index
  o YaCy uses no centralized servers
  o The resulting decentralized web search has about 1.4 billion documents in its index, and more than 600 peer operators contribute each month
  o About 130,000 search queries are performed with this network each day (Feb 2015)

Faroo (www.faroo.com)
  o A proprietary peer-to-peer search engine that uses a distributed global index
  o Performs distributed crawling and ranking
  o Encrypts queries and results for privacy protection
  o 2 million peers

Some other P2PIR systems: Sixearch, ODISSEA, MINERVA, Seeks, etc.

Page 17: Peer to Peer Information Retrieval

Challenges:
  Cross-language information retrieval
  Maintaining index freshness
  Security features
  Quality of service
  Efficient use of resources
  Increasing the range of the peer-to-peer network

Page 18: Peer to Peer Information Retrieval

Conclusion:
  P2PIR is one application of peer-to-peer networks
  P2PIR combines key elements of file sharing and federated information retrieval
  No single technique solves all P2PIR problems
  Recall and precision are used to evaluate P2PIR

Page 19: Peer to Peer Information Retrieval

References
  Almer S. Tigelaar, Djoerd Hiemstra and Dolf Trieschnigg. "Peer-to-Peer Information Retrieval." University of Twente, IEEE paper, Sept 2012.
  Rasanjalee Dissanayaka Mudiyanselage. "Ontology-based Search Algorithms over Large-Scale Unstructured Peer-to-Peer Networks." Georgia State University, IEEE, Oct 2014.
  Demetrios Zeinalipour-Yazti. "Information Retrieval in Peer-to-Peer Systems." University of California, Riverside, IEEE, June 2003.
  Chengye Lu. "Peer to Peer English/Chinese Cross-Language Information Retrieval." Queensland University of Technology, Sept 2008.

Page 20: Peer to Peer Information Retrieval

References
  Xiuqi Li and Jie Wu. "Searching Techniques in Peer-to-Peer Networks." Florida Atlantic University, Boca Raton, FL 33431, 2007.
  Christos Gkantsidis, Milena Mihail and Amin Saberi. "Random Walks in Peer-to-Peer Networks." Georgia Institute of Technology, Atlanta, GA, 2002.
  Taoufik Yeferny, Amel Bouzeghoub and Khedija Arour. "A Query Learning Routing Approach Based on Semantic Clusters." International Journal of Advanced Information Technology (IJAIT), Vol. 1, No. 6, December 2011.
  Yulian Yang. "Semantic Information Retrieval over P2P Networks." Université de Lyon, CNRS, INSA-Lyon, LIRIS, UMR5205, F-69621, France, 2009.
