Date post: 18-Jan-2018

A study of delta sync and other optimisations in HTTP/WebDav synchronisation protocols
Do we need changes in OwnCloud protocol?

Wojciech Jarosz
AGH University of Science and Technology / CERN

CS3 Zurich, January 2016

Introduction
• OwnCloud protocol, CERNBox service
• Enhancing the current protocol
• Investigation of the following enhancements:
  o Bundling
  o Delta-syncing
  o Compression
  o Chunk size adjustment
• Context: scientific environment at CERN

Introduction

• Data from CERNBox FS and network logs

[Diagram: Analysis → Simulation → Proposed implementation → Decision]

CERNBox
• Distinguished features:
  o Integrated with 80 PB of physics data
  o Future: easy and effective sharing of experiment results
  o Future: focus on scientific usage
  o Currently: a mix of scientific and personal use

CERNBox as of Oct 15
• ~31 TB of data
• ~3700 users
• ~24 million files in ~3 million directories
• Average file size: ~1.3 MB, median file size < 100 kB
• 200k file uploads / downloads per day
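
As a quick consistency check of the figures on this slide, the sketch below derives the mean file size, files per directory and average transfer rate from the quoted totals. Decimal units (1 TB = 10^12 bytes) are an assumption, as the slide does not state which convention it uses.

```python
# Approximate figures quoted on the slide; decimal units are an assumption.
total_bytes = 31e12          # ~31 TB of data
file_count = 24e6            # ~24 million files
dir_count = 3e6              # ~3 million directories
transfers_per_day = 200_000  # uploads + downloads per day

mean_file_size_mb = total_bytes / file_count / 1e6
files_per_dir = file_count / dir_count
transfers_per_second = transfers_per_day / 86_400  # seconds in a day

print(f"mean file size:     {mean_file_size_mb:.2f} MB")
print(f"files per directory: {files_per_dir:.0f}")
print(f"transfers per second (daily average): {transfers_per_second:.2f}")
```

The mean comes out close to the ~1.3 MB quoted on the slide. A mean that far above a median below 100 kB suggests that a small number of large files carries most of the volume, while the typical file is small.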

Filesizes

[Figure: "Files by size". File count (y-axis, 0 to 8,000,000) per size bin (x-axis): 0, 1 - 9 B, 10 - 99 B, 100 - 1000 B, 1 kB - 10 kB, 10 kB - 100 kB, 100 kB - 1 MB, 1 MB - 10 MB, 10 MB - 100 MB, 100 MB - 1 GB, over 1 GB]
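
A histogram like this one can be produced by assigning each file size to a decade-wide bin. The following is a minimal sketch, not the tooling used for the talk: the function names are hypothetical, and the bin edges assume decimal units with half-open intervals (a 1000-byte file lands in "1 kB - 10 kB").

```python
from bisect import bisect_right
from collections import Counter

# Lower edges of every bin after the "0" bin (assumed decimal, half-open).
EDGES = [1, 10, 100, 1_000, 10_000, 100_000,
         1_000_000, 10_000_000, 100_000_000, 1_000_000_000]
LABELS = ["0", "1 - 9 B", "10 - 99 B", "100 - 1000 B", "1 kB - 10 kB",
          "10 kB - 100 kB", "100 kB - 1 MB", "1 MB - 10 MB",
          "10 MB - 100 MB", "100 MB - 1 GB", "over 1 GB"]


def size_bin(size_bytes):
    """Return the label of the bin that a file of size_bytes falls into."""
    return LABELS[bisect_right(EDGES, size_bytes)]


def files_by_size(sizes):
    """Count files per bin for an iterable of file sizes in bytes."""
    return Counter(size_bin(s) for s in sizes)


# Hypothetical sample of file sizes, for illustration only.
sample = [0, 5, 42, 999, 50_000, 50_001, 2_000_000_000]
for label in LABELS:
    print(f"{label:>15}: {files_by_size(sample)[label]}")
```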

Files count and size

[Figure: file count and total size (GB) per file extension; y-axis 0 to 7,000,000. Extensions shown: null (no extension), png, pdf, dat, jpg, svn, root, txt, eps, h, npy, c, html, log, gif, xml, olk14message, tex, o, f]

Where are the transfers coming from?

[Figure: chart "Transfers" by origin; legend: CERN, Universities / Institutions, Others]

Downloads vs Uploads

[Figure: chart "GETs vs PUTs" with shares of 44% and 56%; legend: PUT, GET]

Protocol - chunking
• Could be used for:
  o partial upload
  o delta-sync
  o deduplication
• Is the chunk size chosen correctly?
  o Most of the files are small
  o Modern protocols should use network-aware chunking
• Currently only ~0.15% of all PUTs are chunked
• Is dynamic chunking a viable option?
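
To make the chunk-size question concrete, here is a minimal sketch of fixed-size chunking next to one possible "network-aware" rule that sizes chunks from measured throughput. Both function names, the 5-second target and the size limits are hypothetical choices for illustration; they are not taken from the OwnCloud protocol.

```python
def split_into_chunks(file_size, chunk_size):
    """Return (offset, length) pairs covering a file of file_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(offset, min(chunk_size, file_size - offset))
            for offset in range(0, file_size, chunk_size)]


def network_aware_chunk_size(throughput_bytes_per_s, target_seconds=5.0,
                             min_size=1_000_000, max_size=100_000_000):
    """Pick a chunk size so that one chunk takes about target_seconds.

    The defaults are assumptions for this sketch, not protocol values.
    """
    wanted = int(throughput_bytes_per_s * target_seconds)
    return max(min_size, min(max_size, wanted))


# A 25 MB file on a link measured at 2 MB/s.
size = network_aware_chunk_size(2_000_000)
print(size, split_into_chunks(25_000_000, size))
```

Any file smaller than the chunk size produces a single chunk, which is consistent with very few PUTs being chunked when most files are small.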

Enhancements to the current OwnCloud protocol
Focus on bundling, delta-sync and compression

Bundling
• Typically users are active only a few days a month

[Figure: "Sample user transfers count". Transfers over time, 2/15/2015 to 10/23/2015; y-axis 0 to 200,000]

Bundling
• Even power users work in cycles

[Figure: "Power user file transfers". Transfers over time, 3/1/2015 to 9/17/2015; y-axis 0 to 25,000]

Bundling
• Typically users are active only a few days a month
• Often over 2000 requests in 10 minutes
• Small file size

Implementation?
• Simple bundling: tarball?
• Choose the right bundle size
• Send chunks in parallel
• Error reporting

[Diagram: tar, then untar]
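
The tarball idea can be sketched with Python's standard `tarfile` module: many small files are packed into a few in-memory archives on the sending side and unpacked on the receiving side. This is an illustration only; the function names and the size-based grouping rule are assumptions, and parallel sending and error reporting are left out.

```python
import io
import tarfile


def _tar(members):
    """Pack (name, data) pairs into one uncompressed tar archive (bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def make_bundles(files, max_bundle_bytes):
    """Group (name, data) pairs into bundles of roughly max_bundle_bytes payload."""
    bundles, current, current_size = [], [], 0
    for name, data in files:
        # Start a new bundle when adding this file would exceed the limit.
        if current and current_size + len(data) > max_bundle_bytes:
            bundles.append(_tar(current))
            current, current_size = [], 0
        current.append((name, data))
        current_size += len(data)
    if current:
        bundles.append(_tar(current))
    return bundles


def unpack_bundle(bundle):
    """Receiving side: return {name: data} for every regular file in the bundle."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


# Five hypothetical 100-byte files, bundled with a 250-byte payload limit.
files = [(f"f{i}.txt", bytes([i]) * 100) for i in range(5)]
bundles = make_bundles(files, max_bundle_bytes=250)
print(len(files), "files sent as", len(bundles), "requests")
```

With a bundle per request, the number of HTTP requests depends on the bundle size rather than on the file count, which is the motivation for the "choose the right bundle size" bullet.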

Bundling
• Reduce TCP slow-start effect

DROPBOX [1]
                     Before bundling    After bundling
Median flow size     16.2 kB            42.4 kB
Throughput PUT       358 kbit/s         552.92 kbit/s
Throughput GET       783 kbit/s         1294 kbit/s

CERNBOX*
                     Before bundling    After bundling
Throughput PUT       ~3600 kbit/s       Up to 400 Mbit/s ?
Throughput GET       ~7653 kbit/s       Up to 500 Mbit/s ?

[1] I. Drago, M. Mellia, M. M. Munafò, A. Sperotto, R. Sadre, and A. Pras. Inside Dropbox: Understanding Personal Cloud Storage Services. In Proceedings of the 12th ACM Internet Measurement Conference, IMC’12, pages 481–494, 2012.
* Based on users inside CERN and affiliated institutions
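
The slide attributes the gain from bundling to a reduced TCP slow-start effect. A rough way to see this is to count the round trips a new connection needs to deliver a flow of a given size. The model below is a simplification under stated assumptions that do not come from the slides: an initial congestion window of 10 segments that doubles every round trip, 1460-byte segments, no loss, and 1 kB = 1000 bytes.

```python
def round_trips(flow_bytes, mss=1460, initial_cwnd=10):
    """Round trips needed to send flow_bytes under idealised slow start."""
    segments = -(-flow_bytes // mss)  # ceiling division
    cwnd, sent, trips = initial_cwnd, 0, 0
    while sent < segments:
        sent += cwnd   # one window of segments per round trip
        cwnd *= 2      # slow start doubles the window each round trip
        trips += 1
    return trips


# Median Dropbox flow sizes before and after bundling, from the table above.
for label, size in [("before bundling", 16_200), ("after bundling", 42_400)]:
    print(f"{label}: {size} bytes in {round_trips(size)} round trips")
```

Under these assumptions both flow sizes need the same number of round trips, so the bundled flow carries about 2.6 times as much data in the same number of network round trips.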

Extensions and filesizes

[Figure: file count and total size (GB) per file extension; y-axis 0 to 7,000,000. Extensions shown: root, null, jpg, mp4, pdf, mov, enc, avi, zip, mp3, gz, img, epio, pptx, 1, wav, txt, png, iso, nef. One item on the chart is marked "?"]

Delta-sync
• About 7.8% of the files are versions
• Typically files are modified the same day
• Usually small files

[Figure: count and size per file extension; y-axis 0 to 6,000,000,000. Extensions shown: root, mov, pptx, pdf, mp4, zip, h5, key, bz2, jpg, tc, null, vdi, gz, epio, pxp, hep, tgz, f4v]

ROOT files
• Scientific software framework
• Complex file structure
• Already compressed
• Small changes scattered throughout the file

Delta-sync
• Possible implementations
  o Chunk-based
  o Byte-range request
• More data and simulation needed
• It might not be worth implementing
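
A chunk-based variant can be sketched by hashing fixed-size chunks of the old and new version and re-sending only the chunks whose digests differ. This is one simple scheme chosen for illustration (fixed offsets, SHA-256, hypothetical function names), not the design proposed in the talk, and it does not handle insertions that shift later content.

```python
import hashlib


def chunk_digests(data, chunk_size):
    """SHA-256 digest of every fixed-size chunk of data."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def changed_chunks(old, new, chunk_size):
    """Indices of chunks in new that differ from the same chunk in old."""
    old_digests = chunk_digests(old, chunk_size)
    new_digests = chunk_digests(new, chunk_size)
    return [i for i, digest in enumerate(new_digests)
            if i >= len(old_digests) or digest != old_digests[i]]


old = bytes(1000)  # hypothetical 1000-byte file, 10 chunks of 100 bytes

# One localised edit: only one chunk has to be re-sent.
local = bytearray(old)
local[5] = 1
print("localised edit:", changed_chunks(old, bytes(local), 100))

# One changed byte in every chunk: the whole file has to be re-sent.
scattered = bytearray(old)
for offset in range(0, 1000, 100):
    scattered[offset] = 1
print("scattered edits:", changed_chunks(old, bytes(scattered), 100))
```

The second case mirrors the concern raised for ROOT files: when small changes are scattered throughout a file, nearly every chunk differs and the delta saves little.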

Compression
• Of the top 20 extensions (by size), only .txt will compress well
• Compression can be slow, but almost all requests are executed from desktop clients

[Figure: file count and total size (GB) per file extension; y-axis 0 to 7,000,000. Extensions shown: root, null, jpg, mp4, pdf, mov, enc, avi, zip, mp3, gz, img, epio, pptx, 1, wav, txt, png, iso, nef]
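
The point that already-compressed formats gain little can be illustrated with `zlib` from the Python standard library. The two inputs below are synthetic stand-ins chosen for this sketch: repetitive text for a .txt-like file, and pseudo-random bytes for data that is already compressed (as in jpg, mp4, zip or ROOT files).

```python
import random
import zlib


def compression_ratio(data, level=6):
    """Compressed size divided by original size (lower is better)."""
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)


# Synthetic inputs; both are assumptions made for illustration.
text_like = b"run=1234 event=5678 status=OK\n" * 2000
rng = random.Random(0)
compressed_like = bytes(rng.getrandbits(8) for _ in range(60_000))

print(f"text-like data:       ratio {compression_ratio(text_like):.3f}")
print(f"already-compressed:   ratio {compression_ratio(compressed_like):.3f}")
```

Real text files will compress less dramatically than this highly repetitive sample, but the contrast with high-entropy data holds.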

Future

Future - service
• CERNBox fully exposed to a very large scientific repository (ATLAS, LHCb, CMS…)
• FUSE mount to the underlying CERNBox storage available everywhere at CERN
• Will users use CERNBox in new ways?

Conclusion
• The OwnCloud protocol is simple, but is it enough?
• Understand before implementation
• Work in progress!
• MSc at AGH

[Diagram: Analysis → Simulation → Proposed implementation → Decision]

Conclusion
• Bundling looks like the most viable enhancement
• Further research is needed for delta-sync and dynamic chunking
• Compression is less likely to enhance the current protocol

Contact details
Wojciech Jarosz
[email protected]
+41 22 76 75970

Opinions / questions most welcome!
• How does the usage compare to your system?
• How to implement the new features?
• Feedback, ideas, comments…

