Aug 2014 HTCondor Overview 1
glideinWMS Training
HTCondor Overview
by Igor Sfiligoi, UC San Diego
Overview
● These slides present an HTCondor overview, with high-level views of
– Daemons involved
– Communication paths
– Scalability considerations
HTCondor Daemons
The basics
[Diagram: Schedd (submit side), Collector and Negotiator (glue), Startd (execute side)]
HTCondor Daemons
The basics + the master
[Diagram: Schedd (submit side), Collector and Negotiator (glue), Startd (execute side), each group with its own Master]
HTCondor Daemons
● One startd per (logical) compute resource
– Can handle multiple CPUs
● One schedd per submit node
– Can handle multiple users
● Collector has the list of all other daemons
● Negotiator matches user jobs to machine slots
● Master starts all other processes
– Will ignore it in the rest of the talk
Communication flow
[Diagram: Collector, Negotiator, Schedd, Startd]
Push: I am here and these are my properties (one ClassAd per slot)
Push: I am (still) here
Pull: Send me the list of idle jobs
Claiming protocol
● Startds keep their state current in the Collector
– By periodically pushing updates (every 5 mins by default)
● On a matchmaking cycle, the Negotiator will
– Pull the startd slot list from the Collector
  ● In Unclaimed state only (unless preemption is enabled)
– Pull the job list from the schedds
– Create a priority list of matches
– Send the matches to the relevant schedds
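The matchmaking cycle above can be sketched roughly as follows. This is a toy model, not HTCondor's negotiator code: the `Slot` and `Job` classes, the single `cpus` requirement, and the numeric `priority` field (lower is served first) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    state: str       # "Unclaimed" or "Claimed"
    cpus: int


@dataclass
class Job:
    job_id: str
    schedd: str      # name of the schedd holding the job
    cpus: int        # requested CPUs (hypothetical single requirement)
    priority: float  # lower value is served first (assumption)


def matchmaking_cycle(slots, jobs_by_schedd):
    """Return {schedd: [(job_id, slot_name), ...]} for one toy cycle."""
    # 1. Pull the slot list, keeping Unclaimed slots only (no preemption).
    free = [s for s in slots if s.state == "Unclaimed"]
    # 2. Pull the idle job list from every schedd.
    jobs = [j for queue in jobs_by_schedd.values() for j in queue]
    # 3. Walk the jobs in priority order and pick the first slot that fits.
    matches = {}
    for job in sorted(jobs, key=lambda j: j.priority):
        slot = next((s for s in free if s.cpus >= job.cpus), None)
        if slot is None:
            continue  # nothing suitable left for this job
        free.remove(slot)
        # 4. Group the matches per schedd, ready to be sent out.
        matches.setdefault(job.schedd, []).append((job.job_id, slot.name))
    return matches
```

A real Negotiator evaluates full ClassAd requirements and rank expressions and applies user fair-share; the sketch only keeps the four steps listed on the slide.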
Claiming protocol
● The schedd will contact the startd
– Once the connection is accepted, the schedd owns that slot
● The schedd will spawn a shadow
– Which takes over the connection
– The schedd moves on to other business
● Similarly, the startd spawns a starter
– And advertises a Claimed state
Communication flow
[Diagram: Collector, Negotiator, Schedd, Startd; slot state Claimed/Idle]
HTCondor Daemons
Stage 2
[Diagram: Collector, Negotiator, Schedd with Shadow, Startd with Starter; slot state Claimed/Busy]
A shadow and a starter are created for every running job
HTCondor Daemons
Stage 2
[Diagram: Collector, Negotiator, Schedd with Shadow, Startd with Starter; slot state Claimed/Busy; one link marked ✗]
If the network connection is lost, either side can re-establish it.
HTCondor Daemons
● The shadow takes ownership of a running job
– One per job
● The starter takes ownership of a claimed slot
– One per slot
● Together they babysit the two sides until the job is done and the slot can be un-claimed
● Corollary:
– Each schedd node will have O(10k) shadows
Claiming protocol
● Once the job terminates
– The starter goes away
– The schedd will send another job to the startd, unless
  ● The lease has expired
  ● There are no more suitable jobs
– The existing shadow can be reused
  ● But does not need to be
● If the schedd does not send a new job
– Startd goes into Unclaimed state
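The decision taken when a job terminates can be written down as a tiny function. This is only a sketch of the rule stated above; the function name and its two parameters are hypothetical, and the real daemons track much more state.

```python
def slot_state_after_job_exit(lease_expired: bool, suitable_jobs: int) -> str:
    """Return the slot state once a job has terminated (toy model)."""
    if lease_expired or suitable_jobs == 0:
        # The schedd sends nothing, so the startd drops the claim.
        return "Unclaimed"
    # The schedd reuses the claim and sends the next job to the same slot.
    return "Claimed/Busy"
```

Reusing the claim is what lets a schedd run many short jobs on one slot without going back through the Negotiator each time.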
Matchmaking and latency
● The Negotiator pulls the startd slot list from the Collector
– In a single transaction
● The Negotiator pulls the job list from the schedds
– Basically, one at a time!
– But it does cluster similar jobs together at Schedd level
– The idea being that it will not ask for more, if either the user runs out of priority or no more slots are available
● Negotiator thus sensitive to network latencies
– Matching Schedds far away may be limited by network latency, not Negotiator CPU use
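A back-of-the-envelope estimate shows why latency dominates. The sketch below assumes schedds are contacted strictly one after another; the number of round trips per schedd and the CPU cost per schedd are made-up illustrative values, not measured HTCondor figures.

```python
def negotiation_wall_time(rtts_s, round_trips=50, cpu_s=0.01):
    """Estimate the seconds spent talking to schedds one at a time.

    rtts_s      -- network round-trip time to each schedd, in seconds
    round_trips -- request/reply exchanges per schedd (assumed value)
    cpu_s       -- Negotiator CPU time per schedd (assumed value)
    """
    # Schedds are contacted sequentially, so the per-schedd costs add up.
    return sum(round_trips * rtt + cpu_s for rtt in rtts_s)


local = negotiation_wall_time([0.001] * 10)    # 10 schedds on the same LAN
remote = negotiation_wall_time([0.150] * 10)   # 10 schedds on another continent
print(f"local: {local:.1f} s, remote: {remote:.1f} s")
```

With these numbers the remote case takes about 75 s against well under 1 s locally, while the CPU share is the same 0.1 s in both.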
Security considerations
The glidein use case
[Diagram: Collector, Negotiator, Schedd, Startd]
Collector center of all trust
Mutual authentication between Startd and Collector using x509 whitelisting
Mutual authentication between Startd and Collector/Negotiator using x509 whitelisting
Negotiator and Collector co-located, FS auth
Security considerations
The glidein use case
[Diagram: Collector, Negotiator, Schedd, Startd]
Collector center of all trust
After initial handshake, use shared secret (label shown on two links)
Negotiator and Collector co-located, FS auth
Full x509 expensive, used only on daemon restart and periodically once every few days
Security considerations
The glidein use case
[Diagram: Collector, Schedd, Startd]
Collector center of all trust
Startd also sends shared secret for matchmaking
Security considerations
The glidein use case
[Diagram: Collector, Negotiator, Schedd, Startd]
Collector center of all trust
Negotiator only authorized user of Startd's shared secrets
Startd shared secret sent on job match
Schedd may get many secrets, one per matched job
Security considerations
The glidein use case
[Diagram: Collector, Schedd, Startd]
Collector center of all trust
Use given shared secret for auth
No other credentials in play
Security considerations
The glidein use case
[Diagram: Collector, Schedd with Shadow, Startd with Starter]
Collector center of all trust
Shadow and starter inherit the socket
Also inherit shared secret, for reconnect
Security considerations
The glidein use case
[Diagram: Collector, Schedd, Startd]
Collector center of all trust
If Startd goes into Unclaimed state, a new secret is created and sent
Security cost and scalability
The glidein use case
[Diagram: Collector, Schedd, Startd]
x509 too expensive for a single central service (both due to CPU use, and network latency issues)
Mutual authentication between Startd and Collector using x509 whitelisting
Mutual authentication between Startd and Collector/Negotiator using x509 whitelisting
Glideins can start at 10+ Hz rate
Security cost and scalability
The glidein use case
[Diagram: Schedd, Startd, main Collector, and N child Collectors]
Spread the load over multiple child Collectors
Child collectors forward all ads
Mutual authentication between Startd and child Collector using x509 whitelisting
New Schedds join rarely
Co-located, thus cheap
Randomly pick one of many child Collectors and then stick with it
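The "pick one at random, then stick with it" rule might look like the sketch below. The address layout (child collectors on consecutive ports of one host), the host name, and the `Glidein` class are hypothetical; only the selection rule comes from the slide.

```python
import random


def child_collector_addresses(host, base_port, n_children):
    """Hypothetical layout: child collectors on consecutive ports of one host."""
    return [f"{host}:{base_port + i}" for i in range(n_children)]


class Glidein:
    """Toy startd that selects its child collector once, at startup."""

    def __init__(self, children, rng=random):
        # Random choice spreads the glideins evenly over the children.
        self._collector = rng.choice(children)

    def collector(self):
        # Every later update goes to the same child, so the expensive
        # x509 handshake is not repeated against a different process.
        return self._collector


children = child_collector_addresses("cm.example.org", 9620, 4)
glidein = Glidein(children)
print(glidein.collector())
```

Sticking with one child matters because the cheap shared-secret session set up after the first handshake only exists with that particular collector process.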
Network/Firewall considerations
[Diagram: Collector, Negotiator, Schedd with Shadow, Startd with Starter]
HTCondor is conceptually a Peer-to-Peer system
Everyone talks to everyone
Network/Firewall considerations
[Diagram: Collector, Schedd with Shadow, Startd with Starter; one link marked ✓]
Execute nodes often behind firewalls and/or NATs
Network/Firewall considerations
[Diagram: Collector, Schedd, Startd; one link marked ✓]
CCB protocol creates a tunnel
Collector implements the CCB
Startd->Collector communication still direct
A separate channel over long-lived TCP socket
Network/Firewall considerations
[Diagram: Collector, Schedd, Startd; one link marked ✓]
CCB protocol creates a tunnel
Collector implements the CCB
CCB delivers messages to the startd
Network/Firewall considerations
[Diagram: Collector, Schedd, Startd; two links marked ✓]
CCB protocol creates a tunnel
Collector implements the CCB
Only callback request goes through CCB
Startd opens a long-lived TCP connection to the Schedd
All further communication on that channel from there on
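The callback idea can be modelled in a few lines without any real networking. All class and method names below are invented for illustration and do not mirror HTCondor's wire protocol; the point is only who initiates which connection.

```python
class CCB:
    """Toy broker holding one long-lived channel per registered startd."""

    def __init__(self):
        self.channels = {}

    def register(self, startd):
        # The startd connects outbound to the CCB and keeps the channel open.
        self.channels[startd.name] = startd

    def request_callback(self, startd_name, schedd):
        # Only this small request travels through the CCB.
        self.channels[startd_name].on_callback_request(schedd)


class Schedd:
    def __init__(self, name):
        self.name = name
        self.connections = []  # (initiator, acceptor) pairs

    def accept(self, startd):
        self.connections.append((startd.name, self.name))


class Startd:
    def __init__(self, name, ccb):
        self.name = name
        ccb.register(self)

    def on_callback_request(self, schedd):
        # Outbound connection from behind the firewall/NAT to the schedd;
        # all further traffic would flow over this connection.
        schedd.accept(self)


ccb = CCB()
startd = Startd("glidein1", ccb)
schedd = Schedd("submit1")
ccb.request_callback("glidein1", schedd)
print(schedd.connections)  # the startd, not the schedd, is the initiator
```

Because both connections are opened from the execute side, the firewall or NAT in front of the startd never has to accept an inbound connection.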
Network/Firewall considerations
[Diagram: Collector, Schedd with Shadow, Startd with Starter; two links marked ✓]
CCB protocol creates a tunnel
Collector implements the CCB
Shadow and Starter inherit this socket
CCB and scalability
[Diagram: Collector, Startd]
A single central service cannot really handle all the load
Processes usually limited to O(1k) sockets
We can have O(10k+) glideins
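The orders of magnitude above translate directly into a minimum number of CCB processes. The helper below is a hypothetical sizing aid; the defaults simply restate the slide's O(1k) socket limit and assume one CCB socket per glidein.

```python
def ccbs_needed(glideins, sockets_per_glidein=1, sockets_per_ccb=1000):
    """Smallest number of CCB processes able to hold all the sockets."""
    total_sockets = glideins * sockets_per_glidein
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-total_sockets // sockets_per_ccb)


print(ccbs_needed(10_000))                         # one socket per glidein
print(ccbs_needed(10_000, sockets_per_glidein=3))  # several daemons per glidein
```

With 10k glideins and a 1k socket limit this gives 10 processes at one socket each, and 30 if every glidein holds three CCB sockets.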
CCB and scalability
[Diagram: Startd, Collector, and N further Collectors]
No real need to use a single CCB, could use any number of dedicated CCB Collectors
The standard strategy is to just piggy-back on the “Child Collector” paradigm
Randomly pick one of many CCBs and then stick with it
CCB and scalability
[Diagram: Collector, Schedd, Startd; two links marked ✓]
Only callback request goes through CCB
Startd opens a long-lived TCP connection to the Schedd
The Schedd now needs to accept incoming connections
Default HTCondor mechanism of “one port per connection” does not scale (only ~30k usable ports in IP)
CCB and scalability
[Diagram: Collector, Shared_Port_Daemon, Schedd, Startd; two links marked ✓]
Only callback request goes through CCB
Startd opens a long-lived TCP connection to the shared_port_daemon
Specifying that it is for the schedd
HTCondor added shared_port_daemon to multiplex requests on a single port
CCB and scalability
[Diagram: Collector, Shared_Port_Daemon, Schedd, Startd; two links marked ✓]
Socket is moved to the schedd
Same node, local UNIX command
HTCondor added shared_port_daemon to multiplex requests on a single port
CCB and scalability
[Diagram: Collector, Shared_Port_Daemon, Shadow, Startd with Starter; one link marked ✓]
Socket is moved to the schedd
Can be used by starter to contact the Shadow, too
Same node, local UNIX command
CCB and scalability
[Diagram: Collector, Startd, and N Starters; two links marked ✓]
Starter also accepts incoming connections
Thus needs a CCB connection
Plus there is normally one for the Master, too
And there is one Starter per slot
CCB and scalability
[Diagram: Collector, Startd, Shared_Port_Daemon, and N Starters; one link marked ✓]
Adding a shared_port_daemon will cut the number of CCB connections to exactly one
Route incoming requests to the appropriate daemon
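The saving per execute node is easy to count. The function below is a hypothetical tally that assumes one CCB connection each for the Startd and the Master plus one per Starter (one Starter per slot), as described on the preceding slides.

```python
def ccb_connections_per_node(slots, shared_port):
    """Count the CCB connections held by one execute node (toy tally)."""
    if shared_port:
        # The shared_port_daemon holds the only CCB connection and routes
        # incoming requests to the right local daemon.
        return 1
    startd = 1
    master = 1
    starters = slots  # one Starter per slot
    return startd + master + starters


for slots in (1, 8, 32):
    print(slots,
          ccb_connections_per_node(slots, shared_port=False),
          ccb_connections_per_node(slots, shared_port=True))
```

For an 8-slot node the count drops from 10 connections to 1, which multiplies across the O(10k+) glideins the CCBs have to serve.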
High Availability setup
[Diagram: Schedd and Startd connected to one Central Manager Node holding Collector, Negotiator, and N child Collectors]
Using a single CM node is risky; if it dies, the pool dies with it.
Having multiple child collectors does not help
High Availability setup
[Diagram: Schedd and Startd connected to two Central Manager Nodes, each holding Collector, Negotiator, and N child Collectors]
HTCondor allows for 2 or more CM nodes
Schedds and Startds talk to all of them
Including one CCB per CM node
High Availability setup
[Diagram: Schedd and Startd connected to two Central Manager Nodes, each holding Collector, Negotiator, HAD, and N child Collectors]
There can be only one active Negotiator, to make user priority decisions
HAD daemons keep only one Negotiator alive, with the others in standby mode
High Availability setup
[Diagram: M Schedds and a Startd connected to two Central Manager Nodes, each holding Collector, Negotiator, HAD, and N child Collectors]
Schedd “HA” typically just “partition the jobs between many schedds”
Temporary Schedd downtimes result in other schedds taking over the slots
HAD daemons keep only one Negotiator alive, with the others in standby mode
No real need for Startd HA
True Schedd HA possible, but requires shared FS
The end