
Yongzhi Wang, Jinpeng Wei VIAF: Verification-based Integrity Assurance Framework for MapReduce.

Date post: 29-Dec-2015

Yongzhi Wang, Jinpeng Wei

VIAF: Verification-based Integrity Assurance Framework for MapReduce


MapReduce in Brief

Satisfies the demand for large-scale data processing.

It is a parallel programming model introduced by Google in 2004 that has become popular in recent years.

Data is stored in a DFS (Distributed File System).

Computation entities consist of one Master and many Workers (Mappers and Reducers).

Computation is divided into two phases: Map and Reduce.


Word Count instance

[Figure: word count example; surviving labels: "Hello, 2" and "Hello, 4"]
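The word count instance can be sketched in a few lines of plain Python. This is only an illustration of the Map and Reduce phases, not Hadoop code: the function names and the two input splits are assumptions made up for the example, chosen so that each split contributes a count of 2 for "Hello" and the final count is 4.

```python
from collections import defaultdict

def map_words(split):
    """Map phase: emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]

def shuffle(pairs):
    """Group intermediate values by key, as happens between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(groups):
    """Reduce phase: sum the values collected for each word."""
    return {word: sum(values) for word, values in groups.items()}

def word_count(splits):
    intermediate = []
    for split in splits:  # in a real job, each split goes to a separate mapper
        intermediate.extend(map_words(split))
    return reduce_counts(shuffle(intermediate))

# Hypothetical input: two splits, each containing "Hello" twice.
splits = ["Hello World Hello", "Hello MapReduce Hello World"]
counts = word_count(splits)
print(counts["Hello"])  # prints 4
```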


Integrity Vulnerability

How can malicious mappers be detected and eliminated to guarantee high computation integrity?

Straightforward approach: Duplication

[Figure label: Non-collusive worker]


Integrity Vulnerability

But how to deal with collusive malicious workers?


VIAF Solution

Duplication

Introduce a trusted entity to perform random verification


Outline

Motivation

System Design

Analysis and Evaluation

Related work

Conclusion


Architecture


Assumptions

Trusted entities: Master, DFS, Reducers, Verifiers

Untrusted entities: Mappers

Information is not tampered with during network communication.

The result a mapper reports to the master should be consistent with its local storage (commitment-based protocol in SecureMR).


Solution

Idea: Duplication + Verification

Deterministically duplicate each task to TWO mappers to identify non-collusive malicious mappers.

Non-deterministically verify consistent results to identify collusive malicious mappers.

Each mapper accumulates credit by passing verification.

A mapper becomes trusted only when its credit reaches the Quiz Threshold.

100% accuracy in detecting non-collusive malicious mappers.

The more verifications applied to a mapper, the higher the accuracy in determining whether it is a collusive malicious mapper.
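The per-task decision described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `Mapper`, `is_trusted`, `handle_task`, and the `run_on` and `verify` callbacks are hypothetical names, and the sketch leaves out what the master does with results from mappers that are not yet trusted, since the slides do not specify it.

```python
import random
from dataclasses import dataclass

@dataclass
class Mapper:
    name: str
    credit: int = 0  # number of verifications passed so far

def is_trusted(mapper, quiz_threshold):
    """A mapper becomes trusted once its credit reaches the quiz threshold k."""
    return mapper.credit >= quiz_threshold

def handle_task(task, mapper_a, mapper_b, run_on, verify, v, rng=random):
    """Duplicate one task to two mappers, then verify with probability v.

    run_on(mapper, task) and verify(task) are hypothetical callbacks standing
    in for the untrusted map execution and the trusted verifier's re-execution.
    """
    result_a = run_on(mapper_a, task)
    result_b = run_on(mapper_b, task)
    if result_a != result_b:
        return "inconsistent"  # duplication exposes a non-collusive cheat
    if rng.random() >= v:
        return "unverified"    # consistent, but not sampled for verification
    if verify(task) != result_a:
        return "caught"        # consistent but wrong: the pair colluded
    mapper_a.credit += 1       # both mappers earn credit for passing
    mapper_b.credit += 1
    return "passed"

# Two honest mappers, verification forced with v = 1.0.
honest = lambda mapper, task: task.upper()
a, b = Mapper("a"), Mapper("b")
outcome = handle_task("hello", a, b, honest, str.upper, v=1.0)
print(outcome, a.credit, is_trusted(a, 1))  # prints: passed 1 True
```

With a verification probability below 1, a colluding pair can return the same wrong result and escape a given check, which is why credit has to accumulate over several verifications before a mapper is trusted.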


Data flow control

[Figure: data flow diagram with steps labeled 1, 2, 3, 4, 5.a, 5.b, 6]


Theoretical Analysis

Measurement metrics (for each task):

Accuracy: the probability that the reducer receives a good result from the mappers

Mapping Overhead: the average number of executions launched on the mappers

Verification Overhead: the average number of executions launched on the verifier
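As a rough illustration of how these three per-task metrics could be measured empirically, the sketch below averages them over a list of task records. The `TaskRecord` type, its field names, and the four sample records are assumptions invented for the example; they are not data from the evaluation.

```python
from dataclasses import dataclass

@dataclass
class TaskRecord:
    result_correct: bool  # did the reducer receive a good result for this task?
    mapper_runs: int      # executions launched on mappers, including re-runs
    verifier_runs: int    # executions launched on the verifier

def summarize(records):
    """Average the per-task measurements over all completed tasks."""
    n = len(records)
    return {
        "accuracy": sum(r.result_correct for r in records) / n,
        "mapping_overhead": sum(r.mapper_runs for r in records) / n,
        "verification_overhead": sum(r.verifier_runs for r in records) / n,
    }

# Hypothetical records for four tasks.
records = [
    TaskRecord(True, 2, 0),
    TaskRecord(True, 2, 1),
    TaskRecord(False, 2, 0),
    TaskRecord(True, 4, 1),
]
metrics = summarize(records)
print(metrics)
```

Because every task is duplicated to two mappers, the mapping overhead in this model cannot fall below 2; re-runs after a failed check push it higher.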


Accuracy vs Quiz Threshold

Collusive mappers only, different p:

m: malicious worker ratio

c: collusive worker ratio

p: probability that two assigned collusive workers are in one collusion group and can commit a cheat

q: probability that two collusive workers commit a cheat

r: probability that a non-collusive worker commits a cheat

v: verification probability

k: Quiz Threshold

[Figure: Accuracy vs Quiz Threshold, m=0.4, c=1.0, q=1.0, r=0.0; x-axis: Quiz Threshold k (0 to 20); y-axis: Accuracy (0.70 to 1.00); curves for p=0.3, p=0.5, p=1.0]


Overhead vs Quiz Threshold

Collusive workers only: mapper overhead for each task stays between 2.0 and 2.4.

Collusive workers only: verification overhead for each task stays between 0.2 and 0.207 when v is 0.2.

[Figure: Verification Overhead vs Quiz Threshold, m=0.4, c=1.0, q=1.0, r=0.0, v=0.2; x-axis: Quiz Threshold k (0 to 20); y-axis: Verification Overhead (0.190 to 0.210); curves for p=0.3, p=0.5, p=1.0]

[Figure: Mapper Overhead vs Quiz Threshold, m=0.4, c=1.0, q=1.0, r=0.0; x-axis: Quiz Threshold k (0 to 20); y-axis: Overhead (1.5 to 3.0); curves for p=0.3, p=0.5, p=1.0]


Experimental Evaluation

Implementation based on Hadoop 0.21.0

11 virtual machines (512 MB of RAM, 40 GB disk each, Debian 5.0.6) deployed on a 2.93 GHz, 8-core Intel Xeon CPU with 16 GB of RAM

Word count application, 400 mapping tasks and 1 reduce task

Out of 11 virtual hosts: 1 is both the Master and a benign worker; 1 is the verifier; 4 are collusive workers (malicious ratio is 40%); 5 are benign workers


Evaluation result

Quiz Threshold   Accuracy   Mapper Overhead   Verification Overhead
0                87.20%     2.000             0
1                99.42%     2.045             22.00%
2                99.83%     2.074             23.58%
3                100%       2.053             23.00%
4                100%       2.162             23.58%
5                100%       2.046             21.75%
6                100%       2.111             22.58%
7                100%       2.027             19.83%

Where c is 1.0, p is 1.0, q is 1.0, m is 0.40

[Figure: Accuracy vs Quiz Threshold, m=0.4, c=1.0, q=1.0, r=0.0; x-axis: Quiz Threshold k (0 to 20); y-axis: Accuracy (0.70 to 1.00); curves for p=0.3, p=0.5, p=1.0]


Related work

Replication, sampling, and checkpoint-based solutions were proposed in several distributed computing contexts:

Duplication-based for Cloud: SecureMR (Wei et al., ACSAC '09), Fault Tolerance Middleware for Cloud Computing (Wenbing et al., IEEE CLOUD '10)

Quiz-based for P2P: Result verification and trust based scheduling in peer-to-peer grids (S. Zhao et al., P2P '05)

Sampling-based for Grid: Uncheatable Grid Computing (Wenliang et al., ICDCS '04)

Accountability in Cloud computing research:

Hardware-based attestation: Seeding Clouds with Trust Anchors (Joshua et al., CCSW '10)

Logging and auditing: Accountability as a Service for the Cloud (Jinhui et al., ICSC '10)


Conclusion

Contributions:

Proposed a framework (VIAF) to defeat both collusive and non-collusive malicious mappers in MapReduce computation.

Implemented the system and showed, through theoretical analysis and experimental evaluation, that VIAF can guarantee high accuracy while incurring acceptable overhead.

Future work:

Further exploration without the assumption that reducers are trusted.

Reuse cached results to better utilize the verifier's resources.


THANKS!

Q&A

