
Lecture 11: Data Synchronization Techniques for Mobile Devices

© Dimitre Trendafilov 2003

Modified by T. Suel 2004

CS623, 4/20/2004

Problem Definition

Given two versions of a data set on different machines, say an outdated and a current one, how can we update the outdated one with minimum communication cost?

Related Problem: What if the data has been changed on several machines? (Reconciling the data is then difficult and application dependent.)

Obvious Solutions

Send all of the current data.

Compress the current data and then send it.

Send only the compressed difference between the two data sets.

If the sender has both versions, use a suitable delta compression tool.

What if the sender has no access to the outdated version?

Two Aspects of the Problem

File Synchronization (rsync): update an outdated file so that it becomes identical to a current one.

Set Reconciliation (today): assume you have many small data records, but you only want to send the modified records.

E.g., a database with a set of 100-byte records.

Unordered: the order of the records is not important.

Find which records need to be transmitted, then send each of those records in its entirety.

Each record is identified by a number (hash, record ID).
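The simplest way to find which records need to be transmitted can be sketched as follows. This is only a baseline, assuming records are identified by a hash of their contents; the helper names `record_id` and `records_to_transmit` and the choice of SHA-1 are illustrative assumptions, not something the lecture specifies.

```python
import hashlib

def record_id(record: bytes) -> str:
    """Identify a record by a hash of its contents (SHA-1 chosen for illustration)."""
    return hashlib.sha1(record).hexdigest()

def records_to_transmit(current, outdated_ids):
    """Return the current records whose ID is unknown on the outdated side."""
    return [r for r in current if record_id(r) not in outdated_ids]

# The outdated host sends the IDs of its records; the current host
# answers with every record whose ID is missing from that set.
outdated = [b"rec-a", b"rec-b", b"rec-c"]
current = [b"rec-a", b"rec-b-v2", b"rec-c"]
outdated_ids = {record_id(r) for r in outdated}
missing = records_to_transmit(current, outdated_ids)
```

Note that this baseline transmits one ID per record, so its communication cost grows with the size of the set rather than with the number of differences.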

Applications for Data Synchronization

Synchronizing data between a PDA and a PC.

Microsoft Briefcase, etc.

Synchronizing databases over a network.

Synchronizing a file system in two stages: first find which files have changed (MD5 of files), then use rsync on those that have changed.
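The first stage of the two-stage file system synchronization might look like the sketch below. To stay self-contained it models each file system as an in-memory mapping from file name to contents, which is an assumption made for illustration; `md5_digest` and `changed_files` are hypothetical helper names. The second stage (running rsync on the changed files) is not shown.

```python
import hashlib

def md5_digest(data: bytes) -> str:
    """Return the MD5 digest of a file's contents as a hex string."""
    return hashlib.md5(data).hexdigest()

def changed_files(current, outdated_digests):
    """Names of files that are new or whose MD5 differs on the outdated side."""
    return sorted(
        name for name, data in current.items()
        if outdated_digests.get(name) != md5_digest(data)
    )

# File systems modeled as name -> contents (illustrative data only).
outdated_fs = {"a.txt": b"hello", "b.txt": b"world"}
current_fs = {"a.txt": b"hello", "b.txt": b"world v2", "c.txt": b"new"}

# Only the digests cross the network, not the file contents.
outdated_digests = {name: md5_digest(data) for name, data in outdated_fs.items()}
to_rsync = changed_files(current_fs, outdated_digests)
```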

Palm Hot Sync

Relies on metadata maintained on both machines.

The metadata is stored in a Palm DB.

There is one Palm DB for each application (Date Book, To Do, Address Book, etc.).

A record in a Palm DB consists of a unique ID, a pointer to the object, and a status flag.

Palm Hot Sync

Preferred mode of operation: Fast Sync.

Exchange only the modified records.

Works only if the synchronization is always done between the same two machines.

Palm Hot Sync

“Backup” mode of operation: Slow Sync.

Copy all of the data.

Used when the last synchronization was done with a different machine.
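The difference between the two modes can be illustrated with a toy model. This is not Palm's actual API: the `Record` class, its `dirty` field (standing in for the status flag), and both functions are assumptions made for this sketch, which copies in one direction only and ignores deletions and conflicts.

```python
from dataclasses import dataclass

@dataclass
class Record:
    record_id: int
    payload: str
    dirty: bool = False  # status flag: modified since the last sync

def fast_sync(source, target):
    """Copy only the records flagged as modified; return how many were sent."""
    sent = 0
    for rec in source.values():
        if rec.dirty:
            target[rec.record_id] = Record(rec.record_id, rec.payload)
            rec.dirty = False  # flag is cleared once the record is synchronized
            sent += 1
    return sent

def slow_sync(source, target):
    """Copy every record, ignoring the flags; return how many were sent."""
    target.clear()
    for rec in source.values():
        target[rec.record_id] = Record(rec.record_id, rec.payload)
        rec.dirty = False
    return len(source)

pda = {1: Record(1, "dentist 3pm"), 2: Record(2, "call Bob", dirty=True)}
pc = {1: Record(1, "dentist 3pm"), 2: Record(2, "call Rob")}
sent_fast = fast_sync(pda, pc)   # only record 2 crosses the link
sent_slow = slow_sync(pda, pc)   # every record crosses the link
```

The flags describe changes relative to one particular partner, which is why the fast mode breaks down once a third machine is involved.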

Timestamps

Maintain a timestamp for each record.

Send only the records with a timestamp greater than the timestamp of the last synchronization.

Good for synchronization between two machines, but inefficient for more than two.
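A minimal sketch of timestamp-based selection, assuming integer timestamps and a per-peer record of the last synchronization time. The function name `records_newer_than` and the sample data are illustrative assumptions.

```python
def records_newer_than(records, last_sync):
    """records maps record ID -> (timestamp, payload).

    Returns the payloads modified after the given synchronization time.
    """
    return {rid: payload
            for rid, (ts, payload) in records.items()
            if ts > last_sync}

records = {1: (100, "alice"), 2: (250, "bob"), 3: (300, "carol")}

# With more than two machines, a separate "last sync" time is needed per peer.
last_sync_with = {"desktop": 200, "laptop": 50}
to_desktop = records_newer_than(records, last_sync_with["desktop"])
to_laptop = records_newer_than(records, last_sync_with["laptop"])
```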

SyncML (www.syncml.org, now part of Open Mobile Alliance)

Fairly large initiative funded by Ericsson, IBM, Lotus, Matsushita, Motorola, Nokia.

Seeks to provide an open standard for synchronization between different platforms and devices.

Uses XML.

Based on timestamps.

A device stores a timestamp for each record and each device it communicates with.

N records and M devices result in N*M timestamps.

Not scalable!

Intellisync Anywhere

Developed by Puma Technologies.

Relies on a central server.

Similar to Fast Sync, but each device synchronizes only with the central server.

It has a single point of failure.

The central server can get congested.

Intellisync Anywhere (Puma Technologies)

Characteristic Polynomial Interpolation Synchronization (CPISync)

Time/bandwidth complexity depends on the number of differences.

Computationally expensive: cubic in the number of differences, but this can be improved.

Computations could be done on only one of the two devices (the faster one).

Works in a general peer-to-peer environment.

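CPISync Preliminaries

Each data set can be represented as a set of numbers (using hash functions).

The characteristic polynomial of a set $S = \{x_1, x_2, \ldots, x_n\}$ is:

$$\chi_S(Z) = (Z - x_1)(Z - x_2) \cdots (Z - x_n)$$

Note that for two sets $S_A$ and $S_B$, with $\Delta_A = S_A \setminus S_B$ and $\Delta_B = S_B \setminus S_A$, the factors of the common elements cancel:

$$\frac{\chi_{S_A}(Z)}{\chi_{S_B}(Z)} = \frac{\chi_{\Delta_A}(Z)}{\chi_{\Delta_B}(Z)}$$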
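As a small worked example of this cancellation (the two sets are chosen here purely for illustration), take $S_A = \{1, 2, 9, 12, 33\}$ and $S_B = \{1, 2, 9, 10, 12, 28\}$. The elements 1, 2, 9 and 12 are common to both sets, so their factors drop out of the ratio:

$$\frac{\chi_{S_A}(Z)}{\chi_{S_B}(Z)} = \frac{(Z-1)(Z-2)(Z-9)(Z-12)(Z-33)}{(Z-1)(Z-2)(Z-9)(Z-10)(Z-12)(Z-28)} = \frac{Z-33}{(Z-10)(Z-28)}$$

The numerator has the single root 33, the element only host A holds, and the denominator has the roots 10 and 28, the elements only host B holds.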

CPISync

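Hosts A and B evaluate their characteristic polynomials $\chi_{S_A}$ and $\chi_{S_B}$ at the same sample points $z_i$, $1 \le i \le \bar{m}$, where $\bar{m}$ is an upper bound on the number of differences.

Host B sends its evaluations to host A.

The evaluations are combined at host A to compute the ratios $\chi_{S_A}(z_i) / \chi_{S_B}(z_i)$, from which the reduced rational function $\chi_{\Delta_A}(Z) / \chi_{\Delta_B}(Z)$ is interpolated, with $\Delta_A = S_A \setminus S_B$ and $\Delta_B = S_B \setminus S_A$.

The zeroes of $\chi_{\Delta_A}$ and $\chi_{\Delta_B}$ are determined.

Those are the differences!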
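The steps above can be sketched end to end over a small prime field. This is a toy version under several stated assumptions, not the protocol as specified by its authors: the field modulus 8191, the restriction of set elements to values below 4096, and all function names are choices made for illustration; the exact numbers of differences on each side (`m_a`, `m_b`) are assumed to be known in advance; and the roots are found by brute force over the small universe.

```python
P = 8191          # prime field modulus (2**13 - 1), assumed for this sketch
UNIVERSE = 4096   # set elements lie in [0, UNIVERSE); sample points lie above

def char_poly_eval(s, z):
    """Evaluate chi_S(z) = prod over x in S of (z - x), modulo P."""
    result = 1
    for x in s:
        result = result * (z - x) % P
    return result

def solve_mod(matrix, rhs):
    """Gauss-Jordan elimination over F_P; assumes a unique solution exists."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # StopIteration here would mean the system is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)      # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % P for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def roots_in_universe(coeffs):
    """Brute-force roots of a polynomial given low-to-high coefficients."""
    return {x for x in range(UNIVERSE)
            if sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P == 0}

def reconcile(evals_a, evals_b, points, m_a, m_b):
    """Interpolate num/den (monic, degrees m_a and m_b) and return their roots."""
    ratios = [ea * pow(eb, -1, P) % P for ea, eb in zip(evals_a, evals_b)]
    rows, rhs = [], []
    for z, f in zip(points, ratios):
        # num(z) = f * den(z), with both leading coefficients fixed to 1.
        rows.append([pow(z, j, P) for j in range(m_a)]
                    + [-f * pow(z, j, P) % P for j in range(m_b)])
        rhs.append((f * pow(z, m_b, P) - pow(z, m_a, P)) % P)
    coeffs = solve_mod(rows, rhs)
    num = coeffs[:m_a] + [1]
    den = coeffs[m_a:] + [1]
    return roots_in_universe(num), roots_in_universe(den)

set_a = {1, 2, 9, 12, 33}
set_b = {1, 2, 9, 10, 12, 28}
points = [UNIVERSE, UNIVERSE + 1, UNIVERSE + 2]      # one point per difference
evals_a = [char_poly_eval(set_a, z) for z in points]
evals_b = [char_poly_eval(set_b, z) for z in points]  # sent from B to A
only_in_a, only_in_b = reconcile(evals_a, evals_b, points, m_a=1, m_b=2)
```

Only three field elements travel from B to A in this example, regardless of how many elements the two sets share. Solving the linear system by Gaussian elimination is what makes the computation cubic in the number of differences.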


IPSync – Finding the Number of Differences

Guess a bound.

Send evaluations at k random points.

Verify at k points.

Repeat with another bound if needed.

The probability of error is: [formula not preserved in the transcript]

IPSync vs. Slow Sync

Taxonomy of Synchronization Techniques

More Techniques: Bloom Filters

Get a Bloom filter for the receiver's data set.

Send only the elements that are not found in the Bloom filter.
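A minimal sketch of this exchange is shown below. The filter size, the number of hash functions, the use of SHA-256 to derive bit positions, and the names `BloomFilter` and `elements_to_send` are all assumptions made for illustration.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        """Derive num_hashes bit positions from salted SHA-256 digests."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

def elements_to_send(sender_set, receiver_filter):
    """Elements the filter definitely does not contain."""
    return {x for x in sender_set if x not in receiver_filter}

receiver_set = {"rec-1", "rec-2", "rec-3"}
sender_set = {"rec-1", "rec-2", "rec-3", "rec-4"}

# The receiver builds the filter and ships it to the sender.
receiver_filter = BloomFilter()
for item in receiver_set:
    receiver_filter.add(item)

to_send = elements_to_send(sender_set, receiver_filter)
```

A Bloom filter never reports a stored element as absent, so nothing the receiver already holds is sent again. It can produce false positives, however, so an element the receiver lacks may occasionally be skipped; the result is an approximate synchronization.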

More Techniques: Using Error Correction Codes

Send an error correction code for the data set.

The receiver “corrects the errors” in its outdated data set.

Reed-Solomon codes.

Decoding time depends only on the number of differences between the sets (almost linear, not cubic).

But this costs an extra factor of 2 in transmission.

