Date post: | 23-Jun-2015 |
Category: |
Data & Analytics |
Upload: | inria-oak |
ViP2P: Views in Peer-to-Peer Network
Owner: Jesus
Presenter: Soudip
Overview
• Enables efficient distributed data management in P2P networks (distributed hash table, materialized views)
• Any peer can pose a query in the conjunctive tree pattern format
• The querying peer
– performs lookups in the DHT to find the views that correspond to the query
– rewrites the query and produces a logical plan based on the view definitions it found
– generates an optimized physical plan that implements the logical plan
– executes the physical plan and returns the result
N/W Parameter Settings
Logs
Interactive N/W Topology Visualization
Information About a Peer
Query Definition
Logical Plan of the Query
Physical Plan of the Query
Few Details
• Online repository
– https://gforge.inria.fr/projects/vip2p/
• Project webpage
– http://vip2p.saclay.inria.fr/
• Code size
– 100 packages, 75,674 LoC, 637 Java classes
• List of contributors
– Ioana Manolescu, François Goasdoué, Jesús Camacho-Rodríguez, Alexandra Roatis, Stamatis Zampetakis, Konstantinos Karanasos, Asterios Katsifodimos, Julien Leblay, Martin Goodfellow, Spyros Zoupanos, Domenica Sileo, Silviu Julean, Alin Tilea, Varunesh Mishra
• Current owner (OAK member) of the code: Jesús Camacho-Rodríguez
• Who is using the code now
– Internal
• XML navigation based on tree patterns: AMADA, PAXQuery
– External
• Asterios Katsifodimos shared (Delta code) with someone!
Architecture of ViP2P
What does the code do?
• FreePastry: open-source implementation of Pastry
– Provides the implementation of the underlying DHT layer
– Provides efficient request routing, deterministic object location, and load balancing
– DHT is used for sending small messages (index info, lookup of view definitions)
– RMI is used to send larger messages (containing view tuples)
3rd party software “used after adaptation”
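The split between the two transports can be sketched as a simple dispatch on the kind of message. This is a minimal illustration, not code from ViP2P: the class, enum, and method names are hypothetical, and the only thing taken from the slide is which kind of message goes over which transport.

```java
// Hypothetical sketch: none of these names come from the ViP2P code base.
class TransportChooser {
    enum MessageKind { INDEX_INFO, VIEW_DEFINITION_LOOKUP, VIEW_TUPLES }

    enum Transport { DHT, RMI }

    // Small control messages are routed through the DHT; bulk tuple
    // transfers go peer-to-peer over RMI, as stated on the slide.
    static Transport choose(MessageKind kind) {
        switch (kind) {
            case INDEX_INFO:
            case VIEW_DEFINITION_LOOKUP:
                return Transport.DHT;
            case VIEW_TUPLES:
                return Transport.RMI;
            default:
                throw new IllegalArgumentException("Unknown message kind: " + kind);
        }
    }

    public static void main(String[] args) {
        for (MessageKind kind : MessageKind.values()) {
            System.out.println(kind + " -> " + choose(kind));
        }
    }
}
```

Keeping bulk data off the DHT means multi-hop routing is only paid for short messages.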
What does the code do?
3rd party software “used as it is”
• Log4j
– Used for enriched logging
– Provides more control over logging functionality (with different parameter settings and different log output formats)
What does the code do?
3rd party software “used as it is”
• Apache Commons Configuration
– Provides the necessary means for saving/loading the configuration properties to/from the configuration files.
What does the code do?
• BerkeleyDB
– Used for storing view tuples in each peer
– Provides the routines to store, retrieve and sort entries, while guaranteeing ACID transactions when view data are written and read concurrently
3rd party software “used as it is”
What does the code do?
• XML Summary
– Provides a way of creating data guides from the XML documents, which can help us get better estimations for our execution optimizations
– Implemented using external libraries
• Piccolo XML Parser for Java
• DTDParser
3rd party software “used as it is”
What does the code do?
• XML Processors
– Saxon Home Edition provides a suite of tools for XML processing (an open-source implementation of XSLT 2.0, XPath 2.0, and XQuery 1.0)
– Xalan-Java is an XSLT processor for transforming XML documents into HTML, text, or other XML document types
3rd party software “used as it is”
What does the code do?
– Before publishing a new document, the view lookup module identifies the view definitions (from the DHT) to which the document may contribute data
– It passes the view definitions to the data extraction module
What does the code do?
– View Extractor at publisher peer extracts tuples matching each view from the document
– It sends the results (via RMI) in parallel to the different consumers
– It is capable of matching several views on a given document simultaneously.
Query Management
– Given a query, performs a lookup in the DHT network to retrieve the view definitions that can be used to rewrite the query
Query Management
– Given a query and the set of available view definitions, produces a logical plan which, evaluated on some views, returns exactly the results required by the query
Query Management
– Takes a logical rewriting plan from the query rewriting module and translates it to an optimized physical plan
– The optimization takes place both at the logical level (join reordering, pushing selections and projections, etc.) and at the physical level (dictating the exact flow of data during query execution)
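As an illustration of what join reordering can look like, here is a minimal sketch of a greedy smallest-input-first heuristic. The slides do not say which reordering strategy ViP2P uses, so the heuristic, the `ViewScan` class, and the cardinality estimates are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: a generic greedy heuristic, not ViP2P's optimizer.
class JoinOrderSketch {
    // One input of the join: a view and an assumed cardinality estimate.
    static class ViewScan {
        final String viewId;
        final long estimatedTuples;

        ViewScan(String viewId, long estimatedTuples) {
            this.viewId = viewId;
            this.estimatedTuples = estimatedTuples;
        }
    }

    // Orders the inputs by ascending estimated size, so that the smallest
    // views are joined first; the aim is to keep intermediate results small.
    static List<String> order(List<ViewScan> inputs) {
        List<ViewScan> sorted = new ArrayList<>(inputs);
        sorted.sort(Comparator.comparingLong((ViewScan v) -> v.estimatedTuples));
        List<String> ids = new ArrayList<>();
        for (ViewScan v : sorted) {
            ids.add(v.viewId);
        }
        return ids;
    }
}
```

A real optimizer would also take join selectivity and the network location of each view into account; this sketch only orders by size.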
Query Management
– This module provides a set of physical operators which can be deployed at any ViP2P peer, implementing the standard iterator-based execution model.
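The iterator-based model can be sketched with plain `java.util.Iterator` objects: each operator pulls tuples from its child only when its parent asks for the next one. The operator names and the `String[]` tuple representation below are assumptions for illustration and do not reflect ViP2P's actual operator classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical sketch of pull-based operators; not ViP2P's operator API.
class IteratorOperators {
    // Leaf operator: iterates over tuples already stored for a view.
    static Iterator<String[]> scan(List<String[]> storedTuples) {
        return storedTuples.iterator();
    }

    // Selection: pulls from its child until a tuple satisfies the predicate.
    static Iterator<String[]> select(Iterator<String[]> child, Predicate<String[]> pred) {
        return new Iterator<String[]>() {
            private String[] next = advance();

            private String[] advance() {
                while (child.hasNext()) {
                    String[] t = child.next();
                    if (pred.test(t)) {
                        return t;
                    }
                }
                return null; // child exhausted
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public String[] next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                String[] out = next;
                next = advance();
                return out;
            }
        };
    }

    // Projection: keeps only the requested columns of each child tuple.
    static Iterator<String[]> project(Iterator<String[]> child, int... cols) {
        return new Iterator<String[]>() {
            @Override
            public boolean hasNext() {
                return child.hasNext();
            }

            @Override
            public String[] next() {
                String[] in = child.next();
                String[] out = new String[cols.length];
                for (int i = 0; i < cols.length; i++) {
                    out[i] = in[cols[i]];
                }
                return out;
            }
        };
    }

    // Drives the plan from the root and renders each tuple as "a|b|c".
    static List<String> drain(Iterator<String[]> root) {
        List<String> out = new ArrayList<>();
        while (root.hasNext()) {
            out.add(String.join("|", root.next()));
        }
        return out;
    }
}
```

A plan is built by nesting the calls, for example `project(select(scan(tuples), t -> t[1].equals("2009")), 0)`, and nothing beyond the selection's one-tuple lookahead is read until the root is drained.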
View Management
– The view materialization module receives tuples from remote publishers and stores them in the respective BerkeleyDB database
– It implements a back-pressure tuple send/receive protocol which informs the publisher when the incoming buffer at the consumer is full, to save bandwidth
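The idea behind the back-pressure protocol can be illustrated with a single-threaded simulation: the consumer has a bounded incoming buffer and refuses tuples when it is full, and the publisher pauses until space is freed. This is a sketch under stated assumptions, not the ViP2P protocol: the class names, the boolean "buffer full" signal, and the flush step standing in for writing to the store are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical single-threaded simulation; not the ViP2P wire protocol.
class BackPressureSketch {
    static class Consumer {
        private final int capacity;
        private final Deque<String> buffer = new ArrayDeque<>();
        final List<String> stored = new ArrayList<>();

        Consumer(int capacity) {
            if (capacity < 1) {
                throw new IllegalArgumentException("capacity must be at least 1");
            }
            this.capacity = capacity;
        }

        // Returns false to signal "buffer full"; the tuple is not accepted.
        boolean receive(String tuple) {
            if (buffer.size() >= capacity) {
                return false;
            }
            buffer.add(tuple);
            return true;
        }

        // Moves buffered tuples to the store (assumed here to stand in for
        // the local database), freeing the incoming buffer.
        void flush() {
            stored.addAll(buffer);
            buffer.clear();
        }
    }

    // Sends all tuples in order and returns how often the publisher paused.
    static int publish(List<String> tuples, Consumer consumer) {
        int pauses = 0;
        int i = 0;
        while (i < tuples.size()) {
            if (consumer.receive(tuples.get(i))) {
                i++;
            } else {
                pauses++;
                // In a real system the publisher would wait for a resume
                // signal; here the consumer is simply flushed in place.
                consumer.flush();
            }
        }
        consumer.flush();
        return pauses;
    }
}
```

With five tuples and a buffer of two, the publisher pauses twice and every tuple still arrives in order.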
View Management
– View indexing makes the definitions of all the views declared in the ViP2P network visible to all network peers
– When a new view is defined, the indexer inserts in the DHT the (key, value) pairs used to describe it, based on one of the four indexing strategies (https://gforge.inria.fr/scm/viewvc.php/*checkout*/trunk/ViP2P/documentation/programmerguide/programmerguide.pdf?root=vip2p)
• Label indexing
• Return label indexing
• Leaf path indexing
• Return path indexing
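A rough sketch of how the first strategy might work follows. It assumes that label indexing means using each node label of a view's tree pattern as a DHT key, with the view identifier as the value; the slides only name the strategies, so this reading is an assumption. The in-memory map standing in for the DHT and all names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: an in-memory map stands in for the DHT.
class LabelIndexSketch {
    // key (node label) -> ids of the views registered under that label
    private final Map<String, Set<String>> dht = new HashMap<>();

    // Indexing: one (label, viewId) pair per node label of the view.
    void indexView(String viewId, List<String> labels) {
        for (String label : labels) {
            dht.computeIfAbsent(label, k -> new TreeSet<>()).add(viewId);
        }
    }

    // Lookup: the union of the views found under each label of the query.
    // These are only candidates; the rewriting step decides which help.
    Set<String> lookup(List<String> queryLabels) {
        Set<String> candidates = new TreeSet<>();
        for (String label : queryLabels) {
            candidates.addAll(dht.getOrDefault(label, Collections.<String>emptySet()));
        }
        return candidates;
    }
}
```

For example, after indexing `v1` under `book` and `title` and `v2` under `book` and `author`, a lookup for `title` and `author` returns both views, while a lookup for `price` returns none.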
Sub Projects
• AnnoVIP: extended pattern dialect for views and queries (tree patterns with value joins, a subset of XQuery).
• LiquidXML: its main feature is adapting the set of materialized views on each peer to improve query processing performance in the network.
Thank you!!