
I2.1: In-Network Storage PI Presentation by Arun Iyengar

Date post: 06-Jan-2016
I2.1: In-Network Storage PI Presentation by Arun Iyengar NS-CTA INARC Meeting 23-24 March 2011 Cambridge, MA
Transcript
Page 1: I2.1: In-Network Storage PI Presentation by Arun Iyengar

I2.1: In-Network Storage
PI Presentation by Arun Iyengar

NS-CTA INARC Meeting, 23-24 March 2011
Cambridge, MA

Page 2: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Key Aspects of this Work
• Make better use of space within network storage
  – Remove redundant content
    • From communications between network nodes
    • From network storage itself
    • From images which may have similar content
  – Only store most relevant content
    • Use proper cache replacement policies
• Handle disruptions and failures
  – Nodes may fail, become unreachable
  – Packet losses may occur
• Data Issues
  – Dealing with data provenance
  – Data consistency
    • How provenance affects data consistency decisions

Page 3: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Key Problem
• Nodes within a network need adequate storage and memory
  – Mobile devices may have limited storage/memory
  – Content may be replicated for high availability, increasing storage/memory requirements
  – Even if persistent storage is sufficient, maintaining as much content as possible in main memory may be desirable for performance reasons
• Our work
  – Redundancy elimination at multiple levels
  – Making better use of existing memory/storage space

Page 4: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Redundancy Elimination at Multiple Levels

File Level (Disk)

Page Level (main memory)

Application Level (e.g. cached items)

Communication Protocol Level

Semantic Level (e.g. redundancy in visual images)

Redundancy elimination at different levels can cause mutual interference and complicated behavior

Page 5: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Redundancy Elimination at Communication Protocol Level

• In-network caching algorithms to reduce network utilization by removing redundant bytes communicated

[Figure: network diagram with a gateway (GW)]

Page 6: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Motivation
• Several in-network caching algorithms have previously been suggested (e.g. Spring and Wetherall, SIGCOMM 2000)
• However, evaluations were mainly conducted on packet traces
• When deploying those algorithms on protocols (TCP/UDP/IP), several issues can arise
  – E.g., protocol correctness (e.g., termination)
• In addition, packet-trace studies are restrictive
  – Can only analyze certain metrics (e.g., byte savings)
  – Cannot evaluate other metrics (e.g., TCP delay) and interactions between mechanisms (e.g., TCP exponential backoff)

Page 7: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Goals
• Analyze protocol correctness
• Conduct a more comprehensive evaluation of performance in real environments
  – Packet losses, packet re-ordering, etc.
  – Bytes, delay
• Design new algorithms that are more robust
• Develop analytical models to rigorously analyze and predict performance

Page 8: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Redundancy Elimination at Communication Protocol Level

• Communications between nodes over a network may contain redundant content
  – Removing those redundancies can reduce network utilization and congestion
• Fingerprint: number computed from a byte string using a one-way function.
  – Fingerprint consumes less space than byte string
  – Low probability of different strings having same fingerprints
• Rabin fingerprint for string of bytes $t_1, t_2, t_3, \ldots, t_n$:
  – $RF(t_1, t_2, \ldots, t_n) = (t_1 p^{n-1} + t_2 p^{n-2} + \cdots + t_{n-1} p + t_n) \bmod M$
  – $p$ and $M$ are constant integers
• Computationally cheap to compute next fingerprint from previous one:
  – $RF(t_{i+1}, \ldots, t_{n+i}) = \big((RF(t_i, \ldots, t_{n+i-1}) - t_i\, p^{n-1})\, p + t_{n+i}\big) \bmod M$
  – For faster execution, all values of $t_i\, p^{n-1}$ can be precomputed and stored in a table
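The rolling update can be illustrated with a minimal Python sketch. It assumes the convention in which the oldest byte of an n-byte window carries weight p^(n-1) and the newest byte weight 1. The constants `P` and `M` and the function names are illustrative choices, not values from the presentation.

```python
P = 257            # assumed base, not taken from the slides
M = 1_000_000_007  # assumed modulus, not taken from the slides


def fingerprint(window: bytes) -> int:
    """Direct evaluation of (t1*p^(n-1) + ... + tn) mod M using Horner's rule."""
    fp = 0
    for byte in window:
        fp = (fp * P + byte) % M
    return fp


def rolling_fingerprints(data: bytes, n: int) -> list[int]:
    """Fingerprints of every n-byte window, each derived from the previous one."""
    if n <= 0 or len(data) < n:
        return []
    top = pow(P, n - 1, M)                      # p^(n-1) mod M
    drop = [(b * top) % M for b in range(256)]  # precomputed table of t * p^(n-1)
    fp = fingerprint(data[:n])
    out = [fp]
    for i in range(len(data) - n):
        # Remove the oldest byte, shift by p, append the newest byte.
        fp = ((fp - drop[data[i]]) * P + data[i + n]) % M
        out.append(fp)
    return out
```

Each step costs one table lookup, one multiplication, and one addition, regardless of the window length.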

Page 9: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Redundancy Elimination Algorithm

• Maintain caches (at both sender and receiver) of recent packets indexed by fingerprints
• Generate representative fingerprints for packets being sent
• Look up each fingerprint in cache
• If match found, compare bytes to make sure that there was no fingerprint collision
• If bytes match, expand match to largest possible matching string
• Update sender cache
• For byte strings with matching content, send tokens identifying fingerprint instead of actual bytes
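The steps above can be sketched in Python as follows. This is a simplified illustration, not the implementation discussed in the talk: the window size `W`, the constants `P` and `M`, the sampling rule (keep fingerprints whose low bits are zero), the token layout, and all names are assumptions. Matches are expanded forward only, and the caches never evict.

```python
W = 8                        # window size in bytes (assumed)
P, M = 257, (1 << 31) - 1    # fingerprint constants (assumed)


def window_fingerprints(payload: bytes):
    """Yield (offset, fingerprint) for every W-byte window of the payload."""
    if len(payload) < W:
        return
    top = pow(P, W - 1, M)
    fp = 0
    for b in payload[:W]:
        fp = (fp * P + b) % M
    yield 0, fp
    for i in range(len(payload) - W):
        fp = ((fp - payload[i] * top) * P + payload[i + W]) % M
        yield i + 1, fp


class PacketCache:
    """Cache of past payloads, indexed by representative fingerprints."""

    def __init__(self, mask: int = 0x3):
        self.mask = mask      # a fingerprint is representative if fp & mask == 0
        self.packets = []     # payloads in arrival order
        self.index = {}       # fingerprint -> (packet number, offset)

    def add(self, payload: bytes) -> None:
        pid = len(self.packets)
        self.packets.append(payload)
        for off, fp in window_fingerprints(payload):
            if fp & self.mask == 0:
                self.index[fp] = (pid, off)


def encode(payload: bytes, cache: PacketCache) -> list:
    """Return literals (bytes) and tokens ('ref', packet, offset, length)."""
    fps = dict(window_fingerprints(payload))
    out, literal_start, i = [], 0, 0
    while i + W <= len(payload):
        hit = cache.index.get(fps[i])
        if hit is not None:
            pid, off = hit
            old = cache.packets[pid]
            if old[off:off + W] == payload[i:i + W]:   # rule out a collision
                length = W
                while (i + length < len(payload) and off + length < len(old)
                       and payload[i + length] == old[off + length]):
                    length += 1                        # expand the match forward
                if literal_start < i:
                    out.append(payload[literal_start:i])
                out.append(("ref", pid, off, length))
                i += length
                literal_start = i
                continue
        i += 1
    if literal_start < len(payload):
        out.append(payload[literal_start:])
    cache.add(payload)                                 # update sender cache
    return out


def decode(items: list, cache: PacketCache) -> bytes:
    """Rebuild the payload at the receiver and update the receiver cache."""
    parts = []
    for item in items:
        if isinstance(item, bytes):
            parts.append(item)
        else:
            _, pid, off, length = item
            parts.append(cache.packets[pid][off:off + length])
    payload = b"".join(parts)
    cache.add(payload)
    return payload
```

The decoder only works while both caches hold the same packets in the same order, so a packet that never reaches the receiver makes later tokens that reference it undecodable.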

Page 10: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Our Implementation

• Original Spring-Wetherall paper did not implement a system with this redundancy scheme

• Our implementation encountered the following issues

Page 11: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Illustration of Interactions

• RE creates dependencies between packets
• Wireless links are loss-prone
• Chains of packets may be undecodable
• TCP performance may be severely affected (e.g., exponential backoff)

[Figure: packets IP 1, IP 2 and IP 3 sent through a gateway (GW), with a loss marked on the link; IP 2 and IP 3 cannot be decoded]

In-network caching algorithms create dependencies between IP packets, which increase correlated losses and trigger TCP algorithms (e.g., exponential backoff)

Page 12: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Illustration of Protocol Correctness Violation

[Figure: packet exchange through a gateway (GW) with a loss marked on the link. Labels: "Re-transmit IP 1" (twice), "IP 2 cannot be decoded", "IP 3 cannot be decoded", "IP 4 cannot be decoded", "Same object"]

A single lost (or re-ordered) packet can stall a TCP connection!

Page 13: I2.1: In-Network Storage PI Presentation by Arun Iyengar

New Algorithms
• Algorithm 1:
  – Check TCP sequence numbers
  – Packets only encoded with previous packets
  – Variant: I-P frames (similar to MPEG)
• Algorithm 2:
  – Flush caches upon packet retransmissions
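One way to read the two rules is sketched below in Python. The slides list them as separate algorithms; they are combined in one class here only for brevity. The class name, the method name, and the treatment of sequence numbers as plain increasing integers (ignoring wraparound and byte offsets) are assumptions made for illustration.

```python
class EncoderState:
    """Tracks which cached packets a new packet may be encoded against."""

    def __init__(self):
        self.cache = {}          # sequence number -> payload
        self.highest_seq = -1    # highest sequence number seen so far

    def on_packet(self, seq: int, payload: bytes) -> list:
        """Return the sequence numbers this packet may reference."""
        if seq <= self.highest_seq:
            # Algorithm 2: a retransmission was observed, so flush the cache
            # and send this packet without references.
            self.cache.clear()
            allowed = []
        else:
            # Algorithm 1: only packets with lower sequence numbers qualify.
            allowed = sorted(s for s in self.cache if s < seq)
            self.highest_seq = seq
        self.cache[seq] = payload
        return allowed
```

For example, after packets 1, 2 and 3, a retransmission of packet 2 empties the cache, so packet 4 can reference only the retransmitted copy of packet 2.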

Page 14: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Results

Page 15: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Results: I/P frames

Page 16: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Memory Deduplication
• Memory system, cache, file system may have multiple entities which are identical
  – For example, two pages in a memory system may be identical
• Challenge is to identify duplicate entities, combine them so that only one copy is stored and shared
• Maintain a hash of stored entities.
• When two items hash to same value, do a byte-by-byte comparison to verify that the entities are in fact identical
  – Use of hash function significantly reduces number of comparison operations to check for identical entities
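A minimal Python sketch of this scheme follows. The class name, the use of SHA-256 as the hash, and the key-to-content mapping are illustrative assumptions; the slides do not name a hash function or a data layout.

```python
import hashlib


class DedupStore:
    """Stores one shared copy of each distinct content."""

    def __init__(self):
        self._by_hash = {}   # digest -> list of distinct contents with that digest
        self._refs = {}      # key -> shared bytes object

    def put(self, key: str, content: bytes) -> bool:
        """Store content under key; return True if a new copy was stored."""
        digest = hashlib.sha256(content).digest()
        bucket = self._by_hash.setdefault(digest, [])
        for existing in bucket:
            if existing == content:          # byte-by-byte verification
                self._refs[key] = existing   # share the existing copy
                return False
        bucket.append(content)
        self._refs[key] = content
        return True

    def get(self, key: str) -> bytes:
        return self._refs[key]

    def unique_count(self) -> int:
        return sum(len(bucket) for bucket in self._by_hash.values())
```

Only items that fall into the same hash bucket are ever compared byte by byte, which is where the reduction in comparison operations comes from.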

Page 17: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Delta Encoding
• Multiple objects within a cache may be similar but not identical
  – Deduplication will not work
• Identify similar cached objects by taking Rabin fingerprints of parts of cached objects, looking for objects with similar Rabin fingerprints
• For objects with similar content, some of the objects $o_1, o_2, \ldots, o_n$ can be stored as differences from one or more base objects
  – No need to store complete data for $o_1, o_2, \ldots, o_n$
• If overhead of decoding a differenced object is an issue, delta encoding can be restricted to cached objects which are infrequently requested
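Storing an object as a difference from a base object can be sketched in Python as below. The sketch uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever differencing method the authors used, and the function names and operation format are assumptions. `SequenceMatcher` is slow on large inputs, so this is only suitable for illustration.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes) -> list:
    """Describe target as copies from base plus literal data."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse base[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", target[j1:j2]))   # bytes not present in base
        # "delete" contributes nothing to the target
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Reconstruct the target from the base object and its delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Every read of a differenced object pays the cost of `apply_delta`, which is the decoding overhead the last bullet refers to.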

Page 18: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Compression
• Cached objects can be compressed to further decrease memory/storage consumption
• If computational overhead of compression is a problem, compression should only be applied to cached objects which are infrequently accessed.
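A small Python sketch of this policy is shown below. It keeps objects compressed with `zlib` until they have been read often enough to count as hot, and then stores them uncompressed. The class name, the hit-count threshold, and the choice of `zlib` are illustrative assumptions.

```python
import zlib


class ColdCompressingCache:
    """Keeps rarely read objects compressed and hot objects uncompressed."""

    def __init__(self, hot_threshold: int = 3):
        self.hot_threshold = hot_threshold
        self._items = {}   # key -> [stored bytes, is_compressed, hit count]

    def put(self, key: str, data: bytes) -> None:
        # New objects have no access history, so they start out compressed.
        self._items[key] = [zlib.compress(data), True, 0]

    def get(self, key: str) -> bytes:
        entry = self._items[key]
        entry[2] += 1
        data = zlib.decompress(entry[0]) if entry[1] else entry[0]
        if entry[1] and entry[2] >= self.hot_threshold:
            entry[0], entry[1] = data, False   # promote to uncompressed
        return data

    def stored_bytes(self) -> int:
        return sum(len(entry[0]) for entry in self._items.values())
```

The threshold trades memory for CPU time: a higher value keeps more objects compressed at the cost of more decompression work on reads.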

Page 19: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Cache Replacement
• Determine how to make best use of cache space
• LRU and Greedy-dual-size have been used in the past
• We have developed new cache replacement algorithms which are superior for DTNs
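For reference, one of the earlier policies named above, GreedyDual-Size, can be sketched in Python as follows. This illustrates the general published idea (priority = inflation value + cost/size, evict the lowest priority), not the new DTN algorithms from this work. The class name, the default cost of 1.0, and the linear-scan eviction are simplifying assumptions.

```python
class GreedyDualSizeCache:
    """GreedyDual-Size style cache: evicts the entry with the lowest priority."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.inflation = 0.0   # rises to the priority of each evicted entry
        self.entries = {}      # key -> (priority, size, cost)

    def access(self, key: str, size: int, cost: float = 1.0) -> bool:
        """Record an access; return True on a hit, False on a miss."""
        if key in self.entries:
            _, size, cost = self.entries[key]
            self.entries[key] = (self.inflation + cost / size, size, cost)
            return True
        if size > self.capacity:
            return False       # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda k: self.entries[k][0])
            priority, victim_size, _ = self.entries.pop(victim)
            self.inflation = priority
            self.used -= victim_size
        self.entries[key] = (self.inflation + cost / size, size, cost)
        self.used += size
        return False
```

With equal costs, large objects receive low priorities and are evicted first, which favors keeping many small objects in the cache.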

Page 20: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Caching: Basic Idea
• Utility-based data placement
  – A unified probabilistic framework
  – Ensure that the more popular data is always cached nearer to the brokers
• Data utility is calculated based on
  – Its chance to be forwarded to the brokers
  – Its popularity in the network
• Each pair of caching nodes optimizes its data placement upon contact

Page 21: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Data Utility

• Data specifications
  – Data $i$ is destined for brokers $B_1, \ldots, B_k$
  – Data $i$ has a finite lifetime $T$
• The utility $u_{ij}$ of data $i$ at node $j$ evaluates the benefit of caching data $i$ at node $j$
  – $c_{ij}$: The probability that $i$ can be forwarded to the brokers within $T$
  – $w_i$: The probability that data $i$ will be retrieved in the future

Page 22: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Cached Data Placement

• Whenever two nodes come into contact, they exchange their cached data to maximize the utilities of the data they cache
  – Hard replacement: only for data which is currently requested by the brokers
  – Soft replacement: for the other unrequested data
• Hard replacement is prioritized to ensure that the requested data is forwarded to the brokers

Page 23: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Unified Knapsack Formulation
• When nodes A and B come into contact, put the data cached on both A and B into the same selection pool with size k
  – $u_{iA}$: utility of data $i$ at A
  – $s_i$: size of data $i$
  – $S_A$: buffer size of A
• Similar for node B
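The optimization formula itself did not survive in this transcript. A standard 0/1 knapsack consistent with the symbols above would choose, for node A, the subset of the pool that maximizes total utility while the total size stays within A's buffer. The Python sketch below implements that reading with dynamic programming. The function name, the integer sizes, and the tuple layout are assumptions, and the authors' exact formulation may differ.

```python
def select_for_buffer(pool: list, buffer_size: int) -> tuple:
    """0/1 knapsack over (name, utility, size) items with integer sizes.

    Returns (total utility, chosen names) for the best subset that fits.
    """
    # best[c] holds the best (utility, names) found so far within capacity c.
    best = [(0.0, [])] * (buffer_size + 1)
    for name, utility, size in pool:
        # Iterate capacities downward so each item is used at most once.
        for c in range(buffer_size, size - 1, -1):
            candidate = best[c - size][0] + utility
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + [name])
    return best[buffer_size]


# Hypothetical pool: utilities stand in for u_iA, sizes for s_i.
pool = [("a", 0.75, 4), ("b", 0.5, 3), ("c", 0.5, 3)]
total, chosen = select_for_buffer(pool, buffer_size=6)
```

In this example the two smaller items together are worth more than the single most useful item, so a greedy choice by utility alone would not be optimal.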

Page 24: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Hit Rate

Page 25: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Data access delay

Page 26: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Overhead

Page 27: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Military Relevance

• Our in-network storage techniques are important for communications between soldiers in real deployments
  – Handling disruptions
  – Dealing with failed/captured nodes and unreliable network links
  – Handling lost communications, packet losses
• Collaborations with Robert Cole (CERDEC), John Hancock (ArtisTech/IRC), Matthew Aguirre (ArtisTech/IRC), Alice Leung (BBN/IRC)
  – They are studying how our techniques can be applied to military scenarios
  – Studying our DTN caching techniques to run experiments on traces typical of military scenarios
• Joint work with Guohong Cao’s group, CNARC

Page 28: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Impact and Collaborations
• ICDCS 2011 paper co-authored with Guohong Cao (Penn State) of CNARC
• Collaborations with Robert Cole (CERDEC), John Hancock (ArtisTech/IRC), Matthew Aguirre (ArtisTech/IRC), Alice Leung (BBN/IRC)
  – They have used our DTN caching code to run experiments on traces typical of military scenarios

Page 29: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Future Work
• Refine techniques on redundancy elimination in network communications
• Study redundancy at other levels in the system
• Study interactions, interferences, and synergies between redundancy elimination at different levels of the system

Page 30: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Summary and Conclusion
• New techniques for redundancy elimination
  – Reduce space requirements for in-network storage
• Redundancy elimination being performed at multiple levels of the entire system
• New methods for making better use of caches in DTNs
  – These methods are of interest to other collaborators in NS-CTA

Page 31: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Arun Iyengar’s NS-CTA Contributions

• I2.1 - In-Network Storage is the only task funding Arun’s research
• Research contributions in caching and redundancy elimination are summarized in previous slides
• Publications:
  – "Supporting Cooperative Caching in Disruption Tolerant Networks". W. Gao, G. Cao, A. Iyengar, M. Srivatsa. Accepted in ICDCS 2011
  – "Social-Aware Data Diffusion in Delay Tolerant MANETs". Yang Zhang, Wei Gao, Guohong Cao, Tom La Porta, Bhaskar Krishnamachari, Arun Iyengar. Book chapter to appear in Handbook of Optimization in Complex Networks: Communication and Social Networks, Springer.
  – "Provenance driven data dissemination in disruption tolerant networks". M. Srivatsa, W. Gao and A. Iyengar. Under submission, Fusion 2011
  – "Resolving Negative Interferences between In-Network Caching Methods". F. Le, M. Srivatsa, A. Iyengar and G. Cao. Under preparation
• Patent Application:
  – "System and method for caching provenance information". Wei Gao, Arun Iyengar (lead inventor), Mudhakar Srivatsa. IBM patent application.
• Initiated and established collaboration with Guohong Cao’s group (CNARC).
  – Mentor for Wei Gao (Guohong Cao’s PhD student) for internship at IBM.
• Initiated and established collaboration with Robert Cole (CERDEC), John Hancock (ArtisTech/IRC), Matthew Aguirre (ArtisTech/IRC), Alice Leung (BBN/IRC)

Page 32: I2.1: In-Network Storage PI Presentation by Arun Iyengar

Data Consistency
• Problem: How to make sure cached data is current
  – Resolving inconsistencies between different copies
• Data consistency in DTNs
  – Limited connectivity can make it difficult to achieve strong consistency
  – Expiration times can be used for maintaining cache consistency
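Expiration-based consistency can be sketched in Python as below: each cached value carries an expiration time and is treated as absent once that time has passed, so no contact with the data source is needed to invalidate it. The class name, the injectable clock, and the per-entry time-to-live parameter are illustrative assumptions.

```python
import time


class ExpiringCache:
    """Cache whose entries are dropped once their expiration time passes."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable so behaviour can be tested
        self._items = {}       # key -> (value, expiration time)

    def put(self, key, value, ttl: float) -> None:
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]   # stale copy: discard it
            return None
        return value
```

A short time-to-live keeps copies fresher but forces more refetches, which is costly when connectivity is limited.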

