Large-scale information sharing
Marc Shapiro
Cambridge Distributed Systems Group
1. Summary of previous episodes…
HDR defense -- 2002-12-12 -- Distributed object management
Milestones
• 1978: Engineering degree, ENSEEIHT
• 1978--1980: PhD student, LAAS
• 1980--1982: Post-doc, MIT Lab. for Computer Science
• 1982--1985: Research scientist (Chargé de Recherches), CMIRH
• 1984--1985: INRIA
• 1986--1999: Research director (Directeur de recherche), scientific lead of the INRIA project SOR
• 1993--1994: Cornell
• 1997: Jini, Sun Research Labs
• 1999+: Senior Researcher, MSR Cambridge
Theses supervised
• Yek Loong Chong. U. of Cambridge, 2003.
• Nicolas Richer. Paris 6, 2002.
• Fabrice le Fessant (co-supervised). Polytechnique, 2001.
• Xavier Blondel. CNAM, 2000.
• Aline Baggio. Paris 6, 1999.
• Georges Brun-Cottan. Paris 6, 1998.
• Julien Maisonneuve. Paris 6, 1996.
• Paulo Ferreira. Paris 6, 1996.
• Hervé Soulard. Paris 6, 1995.
• David Plainfossé. Paris 6, 1994.
• Daniel Edelson. UC Santa Cruz, 1993.
• Michel Ruffin. Paris 6, 1992.
• Yvon Gourhant. Paris 6, 1991.
• Sabine Habert. Paris 6, 1989.
• Mesaac Mounchili Makpangou. Paris 6, 1989.
Publications
Computing Surveys, in progress
PODC 2001
Springer book 2000
PLDI 1998
ECOOP 1998
ICDCS 1996
IWMM 1995
WDAG 1995
IEEE book 1994
OSDI 1994
ICDCS 1994
PODC 1992
RDS 1991
Computing Systems 1989
SOSP 1989
ICDCS 1986
Large-scale information sharing
• Complexity of distributed systems: parallelism, failures, events, latency
• Objects: semantics not predefined
• System:
  • common tools
  • dynamic aspects
  • trade-offs
• Address the fundamental problems, for long-term impact
Fragmented Objects 1981--1990
• Structuring
• Controlling transparency
• Proxy:
  • programmable
  • client-specific
  • protocol state
• SOS system
• See also:
  • Web proxy
  • Jini
  • dynamic Web pages
Stub-Scion Pair Chains 1990--1999
• Automate distributed object management
• References
• Local garbage collectors + asynchronous coordination
• Simplicity + efficiency
• Tolerates detectable failures
• Formal derivation
[Figure: references between objects x, y, z and t]
Asynchronous garbage collection in replicated memory 1993--2000
• Scalability asynch.
• Consistency
• Local GC
• Distributed GC
• Sufficient safety rules:
  • Union rule
  • Clean propagation
  • Comprehensive scanning
  • Create before delete
  • Causal delivery
• PerDiS
Optimistic replication 2001+
• Decentralised write sharing
• IceCube: general-purpose reconciliation engine
  • operation logging
  • parameterised by semantics
• Eventual consistency:
  • model: operations + constraints
  • safety + liveness
  • global invariant
  • describe and understand the solutions
• Very preliminary
2. Optimistic replication and IceCube
Optimistic replication
• Replicas of shared objects on sites
• Without synchronisation: peer-to-peer read and update!
• Applications:
  • high latency networks
  • disconnected operation
  • cooperative work
• Improves availability & performance
• Consistency: a posteriori, offline
• Merge independent updates
Example: cooperative engineering with CVS
• CVS: developing shared code
• Local, disconnected replica: no interference
• Conflicts:
  • Write same file = syntactic
  • Overlap in file = violates edit semantics
  • Doesn’t compile, test = violates application semantics
• Both sides of a conflict are excluded
• Manual repair
Example: Bayou
• General-purpose database
• Any replica can update, log actions
• action = { dependency check, operation, merge-procedure }
• Optimistic replication:
  • epidemic exchange of logs
  • { roll-back, replay }*; commit
  • dep-check: semantic check for conflict
  • merge-proc: semantic repair
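The action triple above can be illustrated with a minimal Python sketch. It is not Bayou's actual interface: the names `Action`, `apply`, `replay` and `book`, and the room-booking scenario, are assumptions made for illustration. Each logged action carries its own conflict check and repair, and a replica re-derives its state by rolling back and replaying the log.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # hypothetical database: slot name -> user id


@dataclass
class Action:
    dep_check: Callable[[State], bool]   # semantic check for conflict
    operation: Callable[[State], None]   # the intended update
    merge_proc: Callable[[State], None]  # semantic repair, run on conflict


def apply(action: Action, state: State) -> bool:
    """Run one logged action; return True if the intended operation ran."""
    if action.dep_check(state):
        action.operation(state)
        return True
    action.merge_proc(state)
    return False


def replay(log: List[Action], initial: State) -> State:
    """Roll back to `initial`, then replay the whole log in order."""
    state = dict(initial)
    for action in log:
        apply(action, state)
    return state


def book(slot: str, fallback: str, who: int) -> Action:
    """Hypothetical action: book `slot`, or `fallback` if `slot` is taken."""
    def check(s: State) -> bool:
        return slot not in s

    def op(s: State) -> None:
        s[slot] = who

    def repair(s: State) -> None:
        if fallback not in s:
            s[fallback] = who

    return Action(dep_check=check, operation=op, merge_proc=repair)


log = [book("9h", "10h", who=1), book("9h", "10h", who=2)]
final = replay(log, {})
```

Replaying the same two actions in the opposite order gives the 9h slot to the other user, which is why replicas must eventually agree on a committed order.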
Operation-based model
[Figure: operation-based model, with stages labelled scheduling and commitment]
Execution model
• operation = code + pre/post-conditions
• Schedule must satisfy them
• Violation ⇒ conflict
• But pre/post-conditions often unknown
• Conservative approximations
• pre(x0); x1 := f(x0); post(x0, f(x0))
• pre(x0); x’1 := g(x0); post(x’1, g(x0))
• pre(x1); x2 := g(x1); post(x1, g(x1))
Happens-before
• True constraints unknown
• e1 precedes e2 in process, or e1 sends and e2 receives: e1 → e2
• ¬(e1 → e2) ∧ ¬(e2 → e1): e1 || e2
• e1 || e2: e1 does not cause e2
• e1 → e2: e1 might cause e2
• Partial order, consistent with causal dependence
• Schedule consistent with →
Syntactic vs. semantic mechanisms
• Scalar timestamps:
  • no concurrency detection
  • very conservative approx. of causality
• Vector timestamps:
  • detect concurrency
  • conservative approx. of causality
• Alternative: explicit constraints
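To make the difference concrete, here is a small Python sketch of the standard vector-timestamp comparison. The dictionary representation (site name to counter, missing entries read as 0) and the function names are assumptions for illustration.

```python
from typing import Dict

Clock = Dict[str, int]  # site name -> number of events seen from that site


def leq(u: Clock, v: Clock) -> bool:
    """u <= v component-wise; a missing entry counts as 0."""
    return all(u.get(s, 0) <= v.get(s, 0) for s in set(u) | set(v))


def happened_before(u: Clock, v: Clock) -> bool:
    """u -> v: u is component-wise <= v and the two clocks differ."""
    return leq(u, v) and not leq(v, u)


def concurrent(u: Clock, v: Clock) -> bool:
    """u || v: neither clock dominates the other."""
    return not leq(u, v) and not leq(v, u)


a = {"A": 1}            # first event at site A
b = {"A": 1, "B": 1}    # site B has received a, then performed an event
c = {"C": 1}            # independent event at site C
```

Here `happened_before(a, b)` holds and `concurrent(a, c)` holds. A scalar timestamp would still impose some order between a and c, which is why it cannot detect concurrency.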
Constraints between operations
• Not all schedules are acceptable
• Constraints, e.g.:
  • x > 50
  • respect causal ordering
  • all-or-nothing transactions
  • alternative execution paths
  • conflicting operations exclude each other
IceCube: Primitive constraints
• Constraint = predicate (action, schedule)
• Declarative (“static binary”):
  • MustHave(a, b): if a ∈ s and MustHave(a, b) then b ∈ s (not necessarily contiguous nor in order)
  • Order(a, b): if a, b ∈ s and Order(a, b) then a before b in s (not necessarily both nor contiguous)
• Imperative (dynamic): a.preCondition (State)
IceCube: log constraints
[Figure: parcel, predecessor-successor and alternatives]
• Express user intents:
  • Predecessor/successor: Order(a, b) ∧ MustHave(b, a); b uses effect of a; “a causes b”
  • Parcel: MustHave(a, b) ∧ MustHave(b, a); transaction
  • Alternatives: Order(a, b) ∧ Order(b, a)
IceCube: Object constraints
• Shared data type advertises static semantics:
  • mutually exclusive: Order(a, b) ∧ Order(b, a)
  • best order (e.g. bank: credits before debits): Order(a, b)
• Only between concurrent actions
• Also: dynamic constraints
[Figure: commute, best order, mutually exclusive]
Optimistic concurrency control & scheduling
Two actions are either:
• Dependent: schedule in dependence order
• Commutative: schedule in any order
• Concurrent with favourable order: schedule in non-conflicting order
• Concurrent and conflicting: exclude one, the other, or both
IceCube scheduling
• Insight: a conflict is a choice of which action to exclude
• Maximise value
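The idea of scheduling as an optimisation can be sketched as follows. This is an exhaustive search for illustration only, not IceCube's heuristic search. It assumes that "value" means the number of actions kept, and that constraints are given as sets of pairs `(a, b)` read as MustHave(a, b) and Order(a, b); the function names are hypothetical.

```python
from itertools import combinations, permutations
from typing import List, Sequence, Set, Tuple

Pair = Tuple[str, str]


def satisfies(schedule: Sequence[str], must_have: Set[Pair], order: Set[Pair]) -> bool:
    """Check the static constraints only (no preconditions in this sketch)."""
    pos = {a: i for i, a in enumerate(schedule)}
    closed = all(b in pos for a, b in must_have if a in pos)
    ordered = all(pos[a] < pos[b] for a, b in order if a in pos and b in pos)
    return closed and ordered


def best_schedule(actions: List[str], must_have: Set[Pair], order: Set[Pair]) -> List[str]:
    """Return a largest schedule satisfying the constraints (exponential cost)."""
    for size in range(len(actions), -1, -1):
        for subset in combinations(actions, size):
            for candidate in permutations(subset):
                if satisfies(candidate, must_have, order):
                    return list(candidate)
    return []


# a and b exclude each other (Order cycle); c needs a; d should run before c.
actions = ["a", "b", "c", "d"]
must_have = {("c", "a")}
order = {("a", "b"), ("b", "a"), ("d", "c")}
result = best_schedule(actions, must_have, order)
```

In this example no schedule can contain both a and b, so the search settles for three actions and excludes b, because excluding a would also force out c.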
IceCube scheduling model
[Figure: graph of actions linked by log constraints, object constraints and dynamic constraints]
Search vs. syntactic order
[Figure: solution size (0 to 250) against number of actions (5 to 250); series: Optimal, Concatenate, IceCube, Single log]
Performance of IceCube heuristics
[Figure: execution time in ms (0 to 3000) against number of actions (1000 to 10000); series: Total]
3. Eventual consistency
Eventual consistency
• Consistent with user intents
• Consistent with data invariants
• Replicas consistent with each other
• Eventual consistency:
  • Each site receives all actions
  • Schedule that satisfies constraints
  • Common stable prefix
  • Equivalent results
Stability
• Peer-to-peer, indefinite tentative update + advisory reconciliation OK
• But stability needed:
  • Users, external world depend on it
  • Garbage collect multilog
  • Only stable actions relevant for consistency
• Stable: eventually decisions not changed
  • Committed: definitely included in all schedules
  • Aborted: definitely excluded
Eventual consistency: intuitive
• Liveness: sites receive all operations
  • Epidemic multicast
  • Quickly
• Safety: sites compute the same value
  • Equivalent schedules
• Stability: actions eventually not undone
  • Commit / abort
Sound schedules
s sound = s satisfies constraints for all its actions:
• Closed for MustHave: a ∈ s ∧ MustHave(a, b) ⇒ b ∈ s
• Consistent with Order (Order acyclic): (a, b ∈ s ∧ Order(a, b)) ⇒ a before b in s
• Actions succeed: a ∈ s ⇒ a.preCondition (state)
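The three soundness conditions can be checked mechanically. The Python sketch below is an illustration under stated assumptions: a schedule is a list of distinct action names, constraints are sets of pairs `(a, b)` read as MustHave(a, b) and Order(a, b), and each action is a hypothetical `(precondition, operation)` pair over a dictionary state.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, int]
Pair = Tuple[str, str]
Body = Tuple[Callable[[State], bool], Callable[[State], None]]  # (pre, op)


def is_sound(schedule: List[str], must_have: Set[Pair], order: Set[Pair],
             actions: Dict[str, Body], initial: State) -> bool:
    pos = {a: i for i, a in enumerate(schedule)}
    # Closed for MustHave: a in s and MustHave(a, b) implies b in s.
    if any(a in pos and b not in pos for a, b in must_have):
        return False
    # Consistent with Order: if both are in s, a comes before b.
    if any(a in pos and b in pos and pos[a] > pos[b] for a, b in order):
        return False
    # Actions succeed: each precondition holds in the state where it runs.
    state = dict(initial)
    for name in schedule:
        pre, op = actions[name]
        if not pre(state):
            return False
        op(state)
    return True


def credit(s: State) -> None:
    s["balance"] += 50


def debit(s: State) -> None:
    s["balance"] -= 80


# Bank example: the debit needs the credit and should come after it.
actions = {
    "credit": (lambda s: True, credit),
    "debit": (lambda s: s["balance"] >= 80, debit),
}
must_have = {("debit", "credit")}
order = {("credit", "debit")}
start = {"balance": 40}
```

With these definitions, `["credit", "debit"]` is sound, while `["debit", "credit"]` violates Order and `["debit"]` alone is not closed for MustHave.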
Maintaining local soundness
∀ site i, schedule s_i:
• Legal: committed_i ⊆ s_i ∧ aborted_i ∩ s_i = ∅
• Safe: s_i sound
• When aborting a, also abort actions that MustHave a
• When committing a, also abort uncommitted actions that are ‘Order’ed before a
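The two cascading rules can be sketched in Python as below. The function names, the pair encoding (`(y, x)` in `must_have` means y MustHave x, `(b, a)` in `order` means b is ordered before a) and the error handling are assumptions, not part of the original formalism.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def abort(a: str, committed: Set[str], aborted: Set[str],
          must_have: Set[Pair]) -> None:
    """Abort a, then (transitively) every action that MustHave an aborted one."""
    pending = [a]
    while pending:
        x = pending.pop()
        if x in aborted:
            continue
        if x in committed:
            raise ValueError(f"{x} is committed and cannot be aborted")
        aborted.add(x)
        pending.extend(y for (y, z) in must_have if z == x)


def commit(a: str, committed: Set[str], aborted: Set[str],
           must_have: Set[Pair], order: Set[Pair]) -> None:
    """Commit a, then abort uncommitted actions that are ordered before a."""
    if a in aborted:
        raise ValueError(f"{a} is aborted and cannot be committed")
    committed.add(a)
    for (b, c) in order:
        if c == a and b not in committed:
            abort(b, committed, aborted, must_have)


must_have = {("b", "a"), ("e", "c")}   # b needs a; e needs c
order = {("c", "d")}                   # c must run before d if both run
committed: Set[str] = set()
aborted: Set[str] = set()

commit("d", committed, aborted, must_have, order)  # aborts c, hence e
abort("a", committed, aborted, must_have)          # aborts a, hence b
```

After these two decisions only d is committed; c and e are aborted because c can no longer precede d, and a and b are aborted together.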
Schedule equivalence
• Equivalence: s ≡ t ⇔ s, t sound ∧ (a ∈ s ⇔ a ∈ t); ordering is irrelevant!
• Eventual consistency reduces to:
  • Same committed operations everywhere
  • All committed operations in every schedule
  • Schedules are sound
Eventual consistency
∀ action a, sites i, k, schedule s_i:
• Legal: committed_i ⊆ s_i ∧ aborted_i ∩ s_i = ∅
• Safe: s_i sound
• Live:
  • eventually a ∈ committed_i ∪ aborted_i
  • a ∈ committed_i ⇒ eventually a ∈ committed_k
  • a ∈ aborted_i ⇒ eventually a ∈ aborted_k
Global safety invariant
• sound [ ⋃ (t ∈ time, i ∈ sites) committed_i(t) ]
  • Closed for MustHave
  • Non-conflicting: acyclic in Order
  • Actions successful: ∀ s, a: a.preCondition (state)
• Very strong!
  • i commits a at t only if j won’t commit conflicting b at t’
  • a will succeed everywhere, anytime
Maintaining global invariant
Alternatives:
• Common knowledge: deterministic abort rule [idem commit?]
  • TWR (Thomas's write rule)
• Unilateral abort [idem commit?]
  • CVS, Holliday 2000
• Single primary site decides
  • Bayou, CVS
• Consensus before deciding
  • Deno, Holliday 2000-2002
Stability with TWR
• Independent objects
• Independent writes (no MustHave nor Order)
• All sites take same decision:
  • Given two writes to same object, abort the earlier
  • Whether concurrent or not
  • Write stable when seen by all sites
• Disjointness: committed_i = ∅
• Soundness: no MustHave (no transactions)
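A minimal Python sketch of this deterministic abort rule follows. The `Replica` class and the `(timestamp, site, value)` layout are assumptions; the site identifier is used only to break timestamp ties, so that every pair of writes has a well-defined "earlier" one.

```python
from typing import Dict, Tuple

Write = Tuple[int, str, str]  # (timestamp, site id, value); (timestamp, site) assumed unique


class Replica:
    """Keeps, per object, only the latest write seen so far."""

    def __init__(self) -> None:
        self.store: Dict[str, Write] = {}

    def receive(self, key: str, write: Write) -> None:
        current = self.store.get(key)
        # Of two writes to the same object, the earlier one is aborted,
        # whichever order they arrive in.
        if current is None or write[:2] > current[:2]:
            self.store[key] = write


writes = [
    ("x", (1, "A", "red")),
    ("x", (2, "B", "blue")),
    ("y", (1, "B", "on")),
]

r1, r2 = Replica(), Replica()
for key, w in writes:
    r1.receive(key, w)
for key, w in reversed(writes):
    r2.receive(key, w)
```

Both replicas end with the same store although they received the writes in opposite orders, because the rule depends only on the timestamps and not on delivery order.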
Stability in Bayou
• Databases:
  • Disjoint
  • Independent: no multi-DB transaction
  • 1 primary / database
• Log constraints: transactions, time order
• Disjointness: only 1 site decides about a: the primary for the database that a updates
• Soundness: whole transaction commits or aborts
Holliday’s pre-commit protocol
• Log constraints:
  • multi-object transactions
  • happens-before order
• Read transactions commit locally
• Read-Write transactions: consensus to commit
  • convert locks to intentions
  • pre-commit, vote
  • commit if quorum ‘yes’
  • abort if anti-quorum ‘no’ or conflict with committed
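The voting rule alone can be sketched as a small decision function. This is not Holliday's protocol itself: the quorum is assumed here to be a strict majority, the anti-quorum is assumed to be the smallest number of ‘no’ votes that makes a ‘yes’ quorum impossible, and the function name and the "pending" outcome are illustrative.

```python
def decide(n_sites: int, yes: int, no: int,
           conflicts_with_committed: bool = False) -> str:
    """Return "commit", "abort" or "pending" for a read-write transaction."""
    quorum = n_sites // 2 + 1              # assumed: strict majority
    anti_quorum = n_sites - quorum + 1     # enough 'no' votes to block a quorum
    if conflicts_with_committed or no >= anti_quorum:
        return "abort"
    if yes >= quorum:
        return "commit"
    return "pending"                       # keep collecting votes


outcomes = [
    decide(5, yes=3, no=1),
    decide(5, yes=2, no=3),
    decide(5, yes=2, no=1),
    decide(5, yes=3, no=0, conflicts_with_committed=True),
]
```

With five sites, three ‘yes’ votes commit and three ‘no’ votes abort; anything in between stays pending until more votes arrive.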
Trade-offs
• No perfect solution
• Common knowledge:
  • syntactic: fast, inflexible
  • aborts, doesn’t commit
• Partition + primary:
  • single point of failure
  • no MustHave across partition boundaries
• Consensus:
  • slow
  • scalability
  • impossibility of consensus in asynchronous systems with failure
4. Conclusion
Scalability?
• Write replication:
  • Pessimistic concurrency control (CC) = wait
  • Optimistic CC = speculate
    • Progress despite failures
    • Not transparent
    • Limited by commit
• Possible trade-offs:
  • partition
  • reduce granularity
  • limit the number of writers
Perspectives
• Growing importance of sharing
  • Reading and writing
  • E-commerce
• Relevance of the techniques
  • Specific proxies encapsulating protocol state
  • Java, .Net: distributed garbage collection
  • Replication: Web centres, databases
  • Disconnected work
The end