Large-scale information sharing

Marc Shapiro

Cambridge Distributed Systems Group

1. Summary of previous episodes…

HDR defense (Soutenance HDR) -- 2002-12-12. Distributed object management

Career milestones

1978: Engineering degree, ENSEEIHT
1978--1980: PhD student, LAAS
1980--1982: Post-doc, MIT Lab. for Computer Science
1982--1985: Research scientist (Chargé de Recherches), CMIRH
1984--1985: INRIA
1986--1999: Research director and scientific leader of the INRIA SOR project
1993--1994: Cornell
1997: Jini, Sun Research Labs
1999+: Senior Researcher, MSR Cambridge

Theses supervised

Yek Loong Chong. U. of Cambridge, 2003.
Nicolas Richer. Paris 6, 2002.
Fabrice le Fessant (co-supervised). Polytechnique, 2001.
Xavier Blondel. CNAM, 2000.
Aline Baggio. Paris 6, 1999.
Georges Brun-Cottan. Paris 6, 1998.
Julien Maisonneuve. Paris 6, 1996.
Paulo Ferreira. Paris 6, 1996.
Hervé Soulard. Paris 6, 1995.
David Plainfossé. Paris 6, 1994.
Daniel Edelson. UC Santa Cruz, 1993.
Michel Ruffin. Paris 6, 1992.
Yvon Gourhant. Paris 6, 1991.
Sabine Habert. Paris 6, 1989.
Mesaac Mounchili Makpangou. Paris 6, 1989.

Publications

Computing Surveys, in progress
PODC 2001
Springer book, 2000
PLDI 1998
ECOOP 1998
ICDCS 1996
IWMM 1995
WDAG 1995
IEEE book, 1994
OSDI 1994
ICDCS 1994
PODC 1992
RDS 1991
Computing Systems 1989
SOSP 1989
ICDCS 1986

Large-scale information sharing

Complexity of distributed systems: parallelism, failures, events, latency
Objects: semantics not predefined
System:
  common tools
  dynamic aspects
  trade-offs
Address fundamental problems, with long-term impact

Fragmented Objects, 1981-1990

Structuring
Control of transparency
Proxy:
  programmable
  client-specific
  protocol state
SOS system
See also:
  Web proxies
  Jini
  dynamic Web pages

Stub-Scion Pair Chains, 1990-1999

Automate distributed object management
References
Local GCs + asynchronous coordination
Simplicity + efficiency
Tolerates detectable failures
Formal derivation

[Figure: diagram with nodes labelled x, y, z, t]

Asynchronous GC in replicated memory, 1993-2000

Scalability: asynch.
Consistency
Local GC
Distributed GC
Sufficient safety rules:
  Union rule
  Clean propagation
  Comprehensive scanning
  Create before delete
  Causal delivery
PerDiS

Optimistic replication, 2001+

Decentralised write sharing
IceCube: general-purpose reconciliation engine
  operation logging
  parameterised by semantics
Eventual consistency:
  model: operations + constraints
  safety + liveness
  global invariant
  describe and understand the solutions
Very preliminary

2. Optimistic replication and IceCube

Optimistic replication

Replicas of shared objects on sites
Without synchronisation:
  peer-to-peer read and update!
Applications:
  high latency networks
  disconnected operation
  cooperative work
Improves availability & performance
Consistency: a posteriori, offline
  Merge independent updates

Example: cooperative engineering with CVS

CVS: developing shared code
Local, disconnected replica: no interference
Conflicts:
  Write same file = syntactic
  Overlap in file = violates edit semantics
  Doesn't compile or fails tests = violates application semantics
Both sides of a conflict are excluded
Manual repair

Example: Bayou

General-purpose database
Any replica can update, log actions
  action = { dependency check, operation, merge-procedure }
Optimistic replication:
  epidemic exchange of logs
  { roll-back, replay }*; commit
  dep-check: semantic check for conflict
  merge-proc: semantic repair
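
The action triple above can be illustrated with a minimal sketch. It is not Bayou's actual API: the `Action` class, the `replay` function and the room-booking example are hypothetical names chosen for illustration, and the state is a plain dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]  # hypothetical database: time slot -> owner


@dataclass
class Action:
    """Hypothetical action triple: dependency check, operation, merge procedure."""
    name: str
    dep_check: Callable[[State], bool]   # True when no conflict is detected
    operation: Callable[[State], None]   # the intended update
    merge_proc: Callable[[State], None]  # semantic repair, run on conflict


def replay(log: List[Action]) -> State:
    """Roll back to the empty state, then replay the whole log in order."""
    state: State = {}
    for action in log:
        if action.dep_check(state):
            action.operation(state)
        else:
            action.merge_proc(state)
    return state


def book(user: str, slot: str, fallback: str) -> Action:
    """Book `slot` for `user`; on conflict, try `fallback` instead."""
    def operation(state: State) -> None:
        state[slot] = user

    def merge(state: State) -> None:
        if fallback not in state:
            state[fallback] = user

    return Action(f"{user} books {slot}", lambda s: slot not in s, operation, merge)


log = [book("alice", "9am", "10am"), book("bob", "9am", "11am")]
result = replay(log)  # bob's booking conflicts and is repaired to 11am
```

Replaying the same actions in the opposite order gives alice the fallback slot instead, which is why the slides treat the commit order as a separate decision.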

Operation-based model

[Figure: operation-based model diagram; labelled phases: scheduling, commitment]

Execution model

operation = code + pre/post-conditions
Schedule must satisfy them
Violation $\Rightarrow$ conflict
But pre/post-conditions often unknown
  Conservative approximations

[Figure: example executions]
$\{pre(x_0)\}\ x_1 := f(x_0)\ \{post(x_0, f(x_0))\}$
$\{pre(x_0)\}\ x'_1 := g(x_0)\ \{post(x'_1, g(x_0))\}$
$\{pre(x_1)\}\ x_2 := g(x_1)\ \{post(x_1, g(x_1))\}$
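
As a minimal sketch of this execution model, the code below runs a schedule and reports a conflict as soon as a pre- or post-condition is violated. The `Operation` class, the integer state and the two example operations `f` and `g` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Operation:
    """Hypothetical operation: code plus pre- and post-conditions."""
    name: str
    code: Callable[[int], int]
    pre: Callable[[int], bool]
    post: Callable[[int, int], bool]  # post(old_state, new_state)


def run(schedule: List[Operation], x0: int) -> Optional[int]:
    """Execute the schedule from x0; return None if any condition is violated."""
    x = x0
    for op in schedule:
        if not op.pre(x):
            return None  # precondition violated: conflict
        new = op.code(x)
        if not op.post(x, new):
            return None  # postcondition violated: conflict
        x = new
    return x


# f deposits 10; g withdraws 50 and requires enough funds beforehand.
f = Operation("f", lambda x: x + 10, lambda x: True, lambda old, new: new == old + 10)
g = Operation("g", lambda x: x - 50, lambda x: x >= 50, lambda old, new: new >= 0)

ok = run([f, g], 45)        # 45 -> 55 -> 5
conflict = run([g, f], 45)  # g's precondition fails on 45
```

The same two operations are acceptable in one order and conflicting in the other, which is the case a scheduler can exploit.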

Happens-before

True constraints unknown
e1 precedes e2 in the same process, or e1 sends and e2 receives: $e_1 \to e_2$
$\neg(e_1 \to e_2) \land \neg(e_2 \to e_1) \Leftrightarrow e_1 \parallel e_2$
$e_1 \parallel e_2$: e1 does not cause e2
$e_1 \to e_2$: e1 might cause e2
Partial order, consistent with causal dependence
Schedule consistent with $\to$

Syntactic vs. semantic mechanisms

Scalar timestamps:
  no concurrency detection
  very conservative approx. of causality
Vector timestamps:
  detect concurrency
  conservative approx. of causality
Alternative: explicit constraints
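
To make the vector-timestamp case concrete, here is a small sketch of concurrency detection. The dictionary representation and the names `leq`, `happens_before` and `concurrent` are illustrative choices, not taken from the slides.

```python
from typing import Dict

VectorClock = Dict[str, int]  # site name -> number of events known from that site


def leq(v1: VectorClock, v2: VectorClock) -> bool:
    """True when v1 is component-wise less than or equal to v2."""
    return all(v1.get(s, 0) <= v2.get(s, 0) for s in set(v1) | set(v2))


def happens_before(v1: VectorClock, v2: VectorClock) -> bool:
    """v1 -> v2: v1 is dominated by v2 and the two differ."""
    return leq(v1, v2) and not leq(v2, v1)


def concurrent(v1: VectorClock, v2: VectorClock) -> bool:
    """v1 || v2: neither timestamp dominates the other."""
    return not leq(v1, v2) and not leq(v2, v1)


e1 = {"A": 1}          # first event at site A
e2 = {"A": 1, "B": 1}  # site B receives the message sent at e1
e3 = {"A": 2}          # second event at site A, unaware of e2
```

Here `e1` happens before both `e2` and `e3`, while `e2` and `e3` are concurrent. A single scalar timestamp would impose an order on every pair and could not distinguish the concurrent one.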

Constraints between operations

Not all schedules are acceptable
Constraints, e.g.:
  x > 50
  respect causal ordering
  all-or-nothing transactions
  alternative execution paths
  conflicting operations exclude each other

IceCube: Primitive constraints

Constraint = predicate (action, schedule)

Declarative (“static binary”):
  MustHave: $a \triangleright b$
    if $a \in s$ and $a \triangleright b$ then $b \in s$ (not necessarily contiguous nor in order)
  Order: $a \to b$
    if $a, b \in s$ and $a \to b$ then $a$ before $b$ in $s$ (not necessarily both nor contiguous)
Imperative (dynamic): a.preCondition (State)
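
The two declarative constraints can be read as predicates over a schedule. The sketch below is one possible encoding, assuming actions are plain strings and each constraint is a set of pairs; the function names are hypothetical and this is not IceCube's own interface.

```python
from typing import List, Set, Tuple

Schedule = List[str]
Pairs = Set[Tuple[str, str]]


def satisfies_must_have(s: Schedule, must_have: Pairs) -> bool:
    """Pair (a, b) means a MustHave b: if a is in s, then b is in s too."""
    return all(b in s for (a, b) in must_have if a in s)


def satisfies_order(s: Schedule, order: Pairs) -> bool:
    """Pair (a, b) means Order: if both are in s, a comes before b."""
    pos = {action: i for i, action in enumerate(s)}
    return all(pos[a] < pos[b] for (a, b) in order if a in pos and b in pos)


must_have = {("pay", "reserve")}  # paying requires the reservation
order = {("reserve", "pay")}      # reserve before paying, when both appear
```

With these constraints, `["reserve", "pay"]` and `["reserve"]` pass both predicates, `["pay"]` fails MustHave, and `["pay", "reserve"]` fails Order.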

IceCube: log constraints

Express user intents ($\to$: Order, $\triangleright$: MustHave):
  Predecessor/successor: $a \to b \land b \triangleright a$
    b uses effect of a; “a causes b”
  Parcel: $a \triangleright b \land b \triangleright a$
    transaction
  Alternatives: $a \to b \land b \to a$

[Figure labels: parcel, predecessor-successor, alternatives]
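
One way to read the three log constraints is as combinations of the two primitives. The sketch below assumes that reading (predecessor/successor as Order plus MustHave, parcel as mutual MustHave, alternatives as a two-way Order that no schedule containing both can satisfy). All function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pairs = Set[Tuple[str, str]]
Constraints = Dict[str, Pairs]  # keys: "must_have" and "order"


def predecessor_successor(a: str, b: str) -> Constraints:
    """b uses the effect of a: a is ordered before b, and b MustHave a."""
    return {"order": {(a, b)}, "must_have": {(b, a)}}


def parcel(a: str, b: str) -> Constraints:
    """All-or-nothing: each action MustHave the other."""
    return {"order": set(), "must_have": {(a, b), (b, a)}}


def alternatives(a: str, b: str) -> Constraints:
    """Order in both directions: a and b cannot both be scheduled."""
    return {"order": {(a, b), (b, a)}, "must_have": set()}


def admissible(s: List[str], c: Constraints) -> bool:
    """Check a schedule against the MustHave and Order pairs in c."""
    pos = {x: i for i, x in enumerate(s)}
    closed = all(y in pos for (x, y) in c["must_have"] if x in pos)
    ordered = all(pos[x] < pos[y] for (x, y) in c["order"] if x in pos and y in pos)
    return closed and ordered
```

For instance, `admissible(["a"], parcel("a", "b"))` is false because half a parcel is missing, while `admissible(["a"], alternatives("a", "b"))` is true because only one alternative was chosen.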

IceCube: Object constraints

Shared data type advertises static semantics ($\to$: Order):
  mutually exclusive: $a \to b \land b \to a$
  best order (e.g. bank: credits before debits): $a \to b$
Only between concurrent actions
Also: dynamic constraints

[Figure labels: commute, best order, mutually exclusive]
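
The bank example can be sketched as a data type that advertises a preferred order between two concurrent actions. `BankAction`, `best_order` and `run` are hypothetical names, and the overdraft check stands in for a dynamic constraint.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BankAction:
    kind: str      # "credit" or "debit"
    account: str
    amount: int


def best_order(a: BankAction, b: BankAction) -> Optional[Tuple[BankAction, BankAction]]:
    """Static constraint advertised by the type: credits before debits."""
    if a.account != b.account or a.kind == b.kind:
        return None  # no preferred order between these two actions
    return (a, b) if a.kind == "credit" else (b, a)


def run(schedule: List[BankAction], balance: int = 0) -> Optional[int]:
    """Dynamic constraint: a debit fails (None) if it would overdraw."""
    for action in schedule:
        if action.kind == "debit":
            if action.amount > balance:
                return None
            balance -= action.amount
        else:
            balance += action.amount
    return balance


credit = BankAction("credit", "acct-1", 100)
debit = BankAction("debit", "acct-1", 80)
preferred = best_order(debit, credit)  # (credit, debit)
```

Running the preferred order leaves a balance of 20, whereas the opposite order fails on the debit.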

Optimistic concurrency control & scheduling

Two actions are either:
  Dependent
    schedule in dependence order
  Commutative
    schedule in any order
  Concurrent with favourable order
    schedule in non-conflicting order
  Concurrent and conflicting
    exclude one, the other, or both

IceCube scheduling

Insight: a conflict is a choice of which action to exclude
  maximise value
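
To illustrate the idea of choosing exclusions so as to maximise value, the sketch below does an exhaustive search for the largest schedule satisfying MustHave and Order pairs. This is an assumption-laden toy: value is taken to be the number of actions kept, and the brute-force search is only workable for a handful of actions, whereas the slides refer to heuristics.

```python
from itertools import combinations, permutations
from typing import List, Sequence, Set, Tuple

Pairs = Set[Tuple[str, str]]


def sound(s: Sequence[str], must_have: Pairs, order: Pairs) -> bool:
    """Check MustHave closure and Order consistency of a candidate schedule."""
    pos = {x: i for i, x in enumerate(s)}
    closed = all(b in pos for (a, b) in must_have if a in pos)
    ordered = all(pos[a] < pos[b] for (a, b) in order if a in pos and b in pos)
    return closed and ordered


def best_schedule(actions: List[str], must_have: Pairs, order: Pairs) -> List[str]:
    """Return a sound schedule with as many actions as possible."""
    for size in range(len(actions), -1, -1):
        for subset in combinations(actions, size):
            for candidate in permutations(subset):
                if sound(candidate, must_have, order):
                    return list(candidate)
    return []


# a and b exclude each other; c needs a.
order = {("a", "b"), ("b", "a")}
must_have = {("c", "a")}
schedule = best_schedule(["a", "b", "c"], must_have, order)
```

Excluding `b` keeps two actions, while excluding `a` would also force out `c`, so the search keeps `a` and `c`.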

IceCube scheduling model

[Figure: scheduling model diagram; labelled parts: log constraints, object constraints, dynamic constraints]

Search vs. syntactic order

[Figure: solution size vs. number of actions (5 to 250); series: Optimal, Concatenate, IceCube, Single log]

Performance of IceCube heuristics

[Figure: execution time (ms) vs. number of actions (1000 to 10000); series: Total]

3. Eventual consistency (Cohérence à terme)

Eventual consistency

Consistent with user intents
Consistent with data invariants
Replicas consistent with each other
Eventual consistency:
  Each site receives all actions
  Schedule that satisfies constraints
  Common stable prefix
  Equivalent results

Stability

Peer-to-peer, indefinite tentative update + advisory reconciliation OK
But stability needed:
  Users, external world depend on it
  Garbage collect multilog
  Only stable actions relevant for consistency
Stable: eventually decisions not changed
  Committed: definitely included in all schedules
  Aborted: definitely excluded

Eventual consistency: intuitive

Liveness: sites receive all operations
  Epidemic multicast
  Quickly
Safety: sites compute the same value
  Equivalent schedules
Stability: actions eventually not undone
  Commit / abort
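
The liveness requirement can be pictured with a toy propagation loop. The sketch below is not a real epidemic protocol: to stay reproducible it replaces random peer selection with a fixed ring in which every site pulls its successor's log once per round. All names are illustrative.

```python
from typing import List, Set, Tuple


def gossip_round(logs: List[Set[str]]) -> List[Set[str]]:
    """Each site merges the log of its ring successor (based on a snapshot)."""
    n = len(logs)
    return [logs[i] | logs[(i + 1) % n] for i in range(n)]


def propagate(logs: List[Set[str]]) -> Tuple[int, List[Set[str]]]:
    """Repeat rounds until every site holds the same set of operations."""
    rounds = 0
    while any(log != logs[0] for log in logs):
        logs = gossip_round(logs)
        rounds += 1
    return rounds, logs


initial = [{"a"}, {"b"}, {"c"}, {"d"}]  # one local operation per site
rounds, final = propagate(initial)
```

With four sites the logs become identical after three rounds. Receiving all operations is only the liveness half; the sites still have to agree on a schedule.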

Sound schedules

s sound = s satisfies constraints for all its actions ($\triangleright$: MustHave, $\to$: Order):
  Closed for MustHave:
    $a \in s \land a \triangleright b \Rightarrow b \in s$
  Consistent with Order ($\to$ acyclic):
    $(a, b \in s \land a \to b) \Rightarrow a$ before $b$ in $s$
  Actions succeed:
    $a \in s \Rightarrow$ a.preCondition (state)
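
The three soundness conditions can be checked together as in the following sketch. It assumes an integer state, actions identified by name, and constraints given as sets of name pairs; `Action` and `is_sound` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Pairs = Set[Tuple[str, str]]


@dataclass
class Action:
    name: str
    pre: Callable[[int], bool]    # dynamic precondition on the current state
    apply: Callable[[int], int]   # effect on the state


def is_sound(s: List[Action], must_have: Pairs, order: Pairs, state: int = 0) -> bool:
    pos = {a.name: i for i, a in enumerate(s)}
    # Closed for MustHave: (a, b) means a in s implies b in s.
    if any(a in pos and b not in pos for (a, b) in must_have):
        return False
    # Consistent with Order: (a, b) means a before b when both are in s.
    if any(a in pos and b in pos and pos[a] > pos[b] for (a, b) in order):
        return False
    # Actions succeed: each precondition holds in the state where it runs.
    for action in s:
        if not action.pre(state):
            return False
        state = action.apply(state)
    return True


credit = Action("credit", lambda x: True, lambda x: x + 100)
debit = Action("debit", lambda x: x >= 80, lambda x: x - 80)
audit = Action("audit", lambda x: True, lambda x: x)
```

For example, `[credit, debit]` is sound with no static constraints, `[debit, credit]` fails on the debit's precondition, and `[audit]` is unsound if `audit` MustHave `credit`.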

Maintaining local soundness

$\forall$ site $i$, schedule $s_i$:
  Legal: $\mathit{committed}_i \subseteq s_i \land \mathit{aborted}_i \cap s_i = \emptyset$
  Safe: $s_i$ sound
When aborting a, also abort actions that MustHave a
When committing a, also abort uncommitted actions that are ‘Order’ed before a
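
The two cascading rules can be sketched as follows, assuming constraints are sets of name pairs where `(b, a)` in `must_have` means b MustHave a, and `(b, a)` in `order` means b is ordered before a. The function names and example actions are hypothetical.

```python
from typing import Set, Tuple

Pairs = Set[Tuple[str, str]]


def abort(a: str, aborted: Set[str], must_have: Pairs) -> None:
    """Abort a and, transitively, every action that MustHave an aborted one."""
    pending = [a]
    while pending:
        x = pending.pop()
        if x in aborted:
            continue
        aborted.add(x)
        pending.extend(b for (b, needed) in must_have if needed == x)


def commit(a: str, committed: Set[str], aborted: Set[str],
           order: Pairs, must_have: Pairs) -> None:
    """Commit a and abort uncommitted actions that are ordered before a."""
    committed.add(a)
    for (b, later) in order:
        if later == a and b not in committed:
            abort(b, aborted, must_have)


# Aborting a reservation drags down the payment and the shipment.
aborted_1: Set[str] = set()
abort("reserve", aborted_1, {("pay", "reserve"), ("ship", "pay")})

# Committing "publish" aborts the still-tentative "draft" ordered before it.
committed_2: Set[str] = set()
aborted_2: Set[str] = set()
commit("publish", committed_2, aborted_2,
       order={("draft", "publish")}, must_have={("announce", "draft")})
```

In the second case `announce` is aborted as well, because it MustHave the aborted `draft`.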

Schedule equivalence

Equivalence: $s \equiv t$
  $s$, $t$ sound
  $a \in s \Leftrightarrow a \in t$
  ordering is irrelevant!
Eventual consistency reduces to:
  Same committed operations everywhere
  All committed operations in every schedule
  Schedules are sound

Eventual consistency

$\forall$ action $a$, sites $i, k$, schedule $s_i$:
  Legal: $\mathit{committed}_i \subseteq s_i \land \mathit{aborted}_i \cap s_i = \emptyset$
  Safe: $s_i$ sound
  Live ($\Diamond$: eventually):
    $\Diamond\,(a \in \mathit{committed}_i \cup \mathit{aborted}_i)$
    $a \in \mathit{committed}_i \Rightarrow \Diamond\,(a \in \mathit{committed}_k)$
    $a \in \mathit{aborted}_i \Rightarrow \Diamond\,(a \in \mathit{aborted}_k)$

Global safety invariant

sound $\left[ \bigcup_{t \in \mathit{time},\ i \in \mathit{sites}} \mathit{committed}_i(t) \right]$
  Closed for MustHave
  Non-conflicting: acyclic in Order
  Actions successful: $\forall s, a$: a.preCondition (state)
Very strong!
  $i$ commits $a$ at $t$ only if $j$ won’t commit conflicting $b$ at $t'$
  $a$ will succeed everywhere, anytime

Maintaining global invariant

Alternatives:
  Common knowledge: deterministic abort rule [same for commit?]
    TWR
  Unilateral abort [same for commit?]
    CVS, Holliday 2000
  Single primary site decides
    Bayou, CVS
  Consensus before deciding
    Deno, Holliday 2000-2002

Stability with TWR (Thomas's write rule)

Independent objects
Independent writes (no MustHave nor Order)
All sites take same decision:
  Given two writes to same object, abort the earlier
  Whether concurrent or not
  Write stable when seen by all sites
Disjointness: $\mathit{committed}_i = \emptyset$
Soundness: no MustHave (no transactions)
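
The deterministic abort rule can be sketched as a last-writer-wins merge. The write representation and the tie-break on site name are assumptions added so that every site reaches the same decision; `thomas_write_rule` is a hypothetical name.

```python
from typing import Dict, List, Set, Tuple

Write = Tuple[int, str, str, str]  # (timestamp, site, object, value)


def thomas_write_rule(writes: List[Write]) -> Tuple[Dict[str, str], Set[Write]]:
    """Keep, per object, the write with the greatest (timestamp, site)."""
    latest: Dict[str, Write] = {}
    aborted: Set[Write] = set()
    for w in writes:
        ts, site, obj, _value = w
        current = latest.get(obj)
        if current is None:
            latest[obj] = w
        elif (ts, site) > (current[0], current[1]):
            aborted.add(current)  # the earlier write is aborted
            latest[obj] = w
        elif w != current:
            aborted.add(w)        # w is the earlier write
    state = {obj: w[3] for obj, w in latest.items()}
    return state, aborted


arrival_1 = [(1, "A", "x", "red"), (2, "B", "x", "blue"), (1, "B", "y", "on")]
arrival_2 = list(reversed(arrival_1))  # another site sees the opposite order
state_1, aborted_1 = thomas_write_rule(arrival_1)
state_2, aborted_2 = thomas_write_rule(arrival_2)
```

Both arrival orders produce the same state and the same aborted set, which is the common-knowledge property the slide relies on.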

Stability in Bayou

Databases:
  Disjoint
  Independent: no multi-DB transaction
  1 primary / database
Log constraints: transactions, time order
Disjointness: only 1 site decides about a: the primary for the database that a updates
Soundness: whole transaction commits or aborts

Holliday’s pre-commit protocol

Log constraints:
  multi-object transactions
  happens-before order
Read transactions commit locally
Read-Write transactions: consensus to commit
  convert locks to intentions
  pre-commit, vote
  commit if quorum ‘yes’
  abort if anti-quorum ‘no’ or conflict with committed
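
The voting outcome can be sketched as a small decision function. This is only an illustration of the commit and abort conditions listed above: it assumes a simple majority quorum, defines the anti-quorum as enough ‘no’ votes to make a quorum unreachable, and uses hypothetical names throughout.

```python
from typing import List


def decide(votes: List[bool], n_sites: int, conflicts_with_committed: bool = False) -> str:
    """Return "commit", "abort" or "pending" from the votes received so far."""
    quorum = n_sites // 2 + 1           # assumption: simple majority
    anti_quorum = n_sites - quorum + 1  # 'no' votes that make a quorum impossible
    yes = sum(1 for v in votes if v)
    no = len(votes) - yes
    if conflicts_with_committed or no >= anti_quorum:
        return "abort"
    if yes >= quorum:
        return "commit"
    return "pending"
```

With five sites, three ‘yes’ votes commit, three ‘no’ votes abort, and a two-two split stays pending until the fifth vote arrives.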

Trade-offs

No perfect solution
Common knowledge:
  syntactic: fast, inflexible
  aborts, doesn’t commit
Partition + primary:
  single point of failure
  no MustHave across partition boundaries
Consensus:
  slow
  scalability
  impossibility of consensus in asynchronous systems with failure

4. Conclusion

Scalability?

Write replication:
  Pessimistic concurrency control (CC) = wait
  Optimistic CC = speculate
    Progress despite failures
    Not transparent
    Limited by commit
Possible trade-offs:
  partition
  reduce granularity
  limit the number of writers

Outlook

Growing importance of sharing:
  Reading and writing
  E-commerce
Relevance of the techniques:
  Specific proxies encapsulating protocol state
  Java, .NET: distributed garbage collection
  Replication: Web centres, databases
  Disconnected work

The end

