Scheduling & Resource Management in Distributed Systems Rajesh Rajamani, May 2001.

Page 1: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Scheduling & Resource Management in Distributed Systems

Rajesh Rajamani

http://www.cs.wisc.edu/condor

May 2001

Page 2: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Outline

• High-throughput computing and Condor
• Resource management in distributed systems
• Matchmaking
• Current research / miscellaneous

Page 3: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Power of Computing Environments

• Power = Work / Time
• High Performance Computing
    • Fixed amount of work; how much time?
    • Traditional performance metrics: FLOPS, MIPS
    • Response time/latency oriented
• High Throughput Computing
    • Fixed amount of time; how much work?
    • Application-specific performance metrics
    • Throughput oriented

Page 4: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

In other words …

• HPC: enormous amounts of computing power over relatively short periods of time
    (+) Good for applications under sharp time constraints
• HTC: large amounts of computing power for lengthy periods
    (+) What if you want to simulate 1000 applications on your latest DSP chip design over the next 3 months?

Page 5: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

The Condor Project

• Goal: to develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing (HTC) on large collections of distributively owned computing resources

Page 6: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

More about Condor

• Started in the late 1980s
• Principal Investigator: Prof. Miron Livny
• Latest version, 6.3.0, released
• Supports 14 different platforms (OS + architecture), including Linux, Solaris and WinNT
• Currently employs over 20 students and 5 staff
• We write code, debug, port, publish papers and, YES, we also provide support!

Page 7: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Distributed ownership of resources

• Underutilized: 70% of CPU cycles in a cluster go to waste
• Fragmented: resources are owned by different people
• Use these resources to provide HTC, BUT without impacting the QoS available to the owner
• Achieved by allowing the owner to set an access policy using control expressions

Page 8: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Access policy

• Current state of the resource (e.g., keyboard idle for 15 minutes or load average less than 0.2)
• Characteristics of the request (run only jobs of research associates)
• Time of day/night that jobs can be run
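
The three kinds of policy input listed above can be combined into a single boolean control expression. The sketch below is illustrative only: the class names, the "research_associate" group label and the night-time window are assumptions made for the example, not Condor configuration syntax.

```python
from dataclasses import dataclass

@dataclass
class MachineState:
    keyboard_idle_min: float   # minutes since last keyboard activity
    load_avg: float            # current load average
    hour: int                  # local hour of day, 0-23

@dataclass
class Request:
    owner: str
    group: str                 # hypothetical label for the kind of user

def owner_policy(state: MachineState, req: Request) -> bool:
    """Return True if the owner's policy allows this request to run now."""
    # Current state of the resource (thresholds taken from the slide)
    resource_ok = state.keyboard_idle_min >= 15 or state.load_avg < 0.2
    # Characteristics of the request
    request_ok = req.group == "research_associate"
    # Time of day: an assumed window of 20:00 to 06:00
    night_ok = state.hour >= 20 or state.hour < 6
    return resource_ok and request_ok and night_ok

idle_at_night = MachineState(keyboard_idle_min=30, load_avg=0.9, hour=23)
busy_at_noon = MachineState(keyboard_idle_min=1, load_avg=1.5, hour=12)
ra = Request(owner="alice", group="research_associate")

print(owner_policy(idle_at_night, ra))
print(owner_policy(busy_at_noon, ra))
```

The idle machine at night accepts the request, while the busy machine at noon refuses it, so the owner's own use is never disturbed.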

Page 9: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

What happens when you submit a job

[Diagram: Central Manager, Submitting machine, Available resource]

• Resources announce their properties periodically
1. User submits a job
2. Submitting machine sends the classad of the job
3. Matchmaker notifies parties of a match
4. Parties negotiate

Page 10: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Important Mechanisms

Mechanism        For
Matchmaking      Resource management
Checkpointing    Saving the state of a job
Bypass           Remote system calls
DAGMAN           Automatic job submission based on dependency graph
Master-Worker    Exploiting task-level parallelism
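
The DAGMAN row (automatic job submission based on a dependency graph) can be illustrated with a minimal sketch using Python's standard `graphlib` module. The job names and the `submit` function are hypothetical, and this is not how DAGMAN itself is implemented; it only shows the idea that a job is submitted once all the jobs it depends on have finished.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each job maps to the jobs it depends on.
deps = {
    "prepare": set(),
    "sim_a": {"prepare"},
    "sim_b": {"prepare"},
    "collect": {"sim_a", "sim_b"},
}

def submit(job, log):
    """Stand-in for real job submission: just record the order."""
    log.append(job)

def run_dag(deps):
    log = []
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # jobs whose parents are all done
        for job in ready:
            submit(job, log)
        sorter.done(*ready)                  # pretend they all finished
    return log

order = run_dag(deps)
print(order)
```

Jobs in the same `ready` batch are independent of each other, so a real system could run them in parallel.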

Page 11: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Condor Architecture

• Manager
    • Collector: database of resources
    • Negotiator: matchmaker
    • Accountant: priority maintenance
• Startds (represent owners of resources)
    • Implement owner's access control policy
• Schedds (represent customers of the system)
    • Maintain persistent queues of resource requests

Page 12: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Condor Architecture, cont.

Page 13: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Power of Condor

• Solved the NUG30 quadratic assignment problem, posed in 1968, over a period of 6.9 days, delivering over 96,000 CPU hours by commandeering an average of 650 machines!
• Compare this with the RSA-155 problem, posed in 1977 and solved using 300 computers (over a period of 7 months) in the late 90s. If you were to use the same amount of resources as that used to solve NUG30, this could have been done in 2 weeks!
• "It (Chorus production) was done in parallel on machines in the computer center running XXX, and on the office machines under Condor. The latter did about 90% of the work!" - Helge MEINHARD (EP division, CERN)

Page 14: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Resource management using Matchmaking

• Opportunistic Resource Exploitation
    • Resource availability is unpredictable
        – Exploit resources as soon as they are available
        – Matchmaking performed continuously
• As against a centralized scheduler, which would have to deal with:
    • Heterogeneity of resources
    • Distributed ownership: widely varying allocation policies
    • Dynamic nature of the cluster

Page 15: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Classified Advertisements

• A simple language used by resource providers and customers to express their properties/requirements to the Collector
• Uses a semi-structured data model => no specific schema is required by the matchmaker, allowing it to work naturally in a heterogeneous environment
• The language folds the query language into the data model. Constraints may be expressed as attributes of the classad
• Should conform to the advertising protocol

Page 16: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Matchmaking with Classads

4 steps to managing resources:
1. Parties requiring matchmaking advertise their characteristics, preferences, constraints, etc.
2. Advertisements are matched by a Matchmaker
3. Matched entities are notified
4. Matched entities establish an allocation through a claiming process, which could include authentication, constraint verification, negotiation of terms, etc.

Method is symmetric
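
The four steps can be sketched end to end as follows. This is a toy model under stated assumptions: ads are Python dictionaries, each ad's `Constraint` is a function of the other ad, and the `Matchmaker` class, `claim` function and party names are hypothetical. The `compatible` check treats both sides identically, which is what "symmetric" means here.

```python
def compatible(a, b):
    """Symmetric check: each ad's constraint is evaluated against the other ad."""
    return a["Constraint"](b) and b["Constraint"](a)

class Matchmaker:
    def __init__(self):
        self.ads = {}            # step 1: advertisements, keyed by party name
        self.notifications = []  # step 3: (party, partner) messages sent out

    def advertise(self, name, ad):
        self.ads[name] = ad

    def match(self):
        """Step 2: pair each job with the first compatible free machine."""
        free = [n for n, a in self.ads.items() if a["Type"] == "Machine"]
        pairs = []
        for jname, job in self.ads.items():
            if job["Type"] != "Job":
                continue
            for mname in free:
                if compatible(job, self.ads[mname]):
                    pairs.append((jname, mname))
                    free.remove(mname)       # safe: we leave the loop at once
                    break
        for jname, mname in pairs:           # step 3: notify both parties
            self.notifications += [(jname, mname), (mname, jname)]
        return pairs

def claim(job, machine):
    """Step 4: the matched parties re-verify constraints between themselves."""
    return compatible(job, machine)

mm = Matchmaker()
mm.advertise("crow", {"Type": "Machine", "Memory": 64,
                      "Constraint": lambda o: o["Type"] == "Job"})
mm.advertise("sim1", {"Type": "Job", "Memory": 31,
                      "Constraint": lambda o: o["Type"] == "Machine"
                                              and o["Memory"] >= 31})
pairs = mm.match()
print(pairs, claim(mm.ads["sim1"], mm.ads["crow"]))
```

Note that the matchmaker only introduces the parties; the allocation itself is made in `claim`, outside the matchmaker.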

Page 17: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Classad example

Sample classad of a workstation

    [
      Type       = "Machine";
      OpSys      = "Linux";
      Arch       = "INTEL";
      Memory     = 256 M;
      Constraint = true;
    ]

Sample classad of a job

    [
      Type       = "Job";
      Owner      = "run_sim";
      Constraint = other.Type == "Machine" &&
                   Arch == "INTEL" &&
                   OpSys == "Solaris251" &&
                   other.Memory >= Memory;
    ]
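
To see how the two sample ads interact, the sketch below evaluates both constraints in Python. It assumes that an unqualified name such as `Arch` is looked up in the ad itself first and then in the other ad; that lookup rule, the `lookup` helper and the dictionary encoding are assumptions of this example, not a statement of the classad language definition.

```python
machine = {"Type": "Machine", "OpSys": "Linux", "Arch": "INTEL", "Memory": 256}
job = {"Type": "Job", "Owner": "run_sim"}

def lookup(name, this, other):
    """Unqualified name: try this ad first, then the other ad (assumed rule)."""
    if name in this:
        return this[name]
    return other.get(name)

def machine_constraint(this, other):
    return True                                   # Constraint = true

def job_constraint(this, other):
    return (other.get("Type") == "Machine"
            and lookup("Arch", this, other) == "INTEL"
            and lookup("OpSys", this, other) == "Solaris251"
            and other.get("Memory") >= lookup("Memory", this, other))

# The sample workstation runs Linux, so the job's OpSys test fails.
print(machine_constraint(machine, job), job_constraint(job, machine))

# The same workstation advertised with Solaris251 would satisfy the job.
solaris = dict(machine, OpSys="Solaris251")
print(job_constraint(job, solaris))
```

Under this reading the two sample ads are not compatible as written: the workstation accepts anything, but the job insists on Solaris251.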

Page 18: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Example Classad (workstation)

    [
      Type     = "Machine";
      Activity = "Idle";
      Name     = "crow.cs.wisc.edu";
      Arch     = "INTEL";
      OpSys    = "Solaris251";
      Kflops   = 21893;
      Memory   = 64;
      Disk     = 323496;   // KB
      DayTime  = 36107;

Page 19: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Example Classad (contd.)

      ResearchGrp = {"miron", "thain", "john"};
      Untrusted   = {"bgates", "lalooyadav", "thief"};
      Rank        = member(other.Owner, ResearchGrp) * 10;
      Constraint  = !member(other.Owner, Untrusted) && Rank >= 10 ? true : false   // To prevent malicious users
    ]

Page 20: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Example Classad (Submitted job)

    [
      Type       = "Job";
      QDate      = 886799469;
      Owner      = "raman";
      Cmd        = "run_sim";
      Iwd        = "/usr/raman/sim2";
      Memory     = 31;
      Rank       = Kflops/1e3 + other.Memory/32;
      Constraint = other.Type == "Machine" && OpSys == "Solaris251" &&
                   Disk >= 10000 && other.Memory >= self.Memory;
    ]
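
As a worked example, the job's Rank and Constraint and the workstation's Rank and Constraint from pages 18 to 20 can be evaluated by hand or with the short sketch below. It assumes that the unqualified names `Kflops`, `OpSys` and `Disk` in the job ad resolve to the workstation's attributes, and that `member(...)` counts as 1 when true and 0 when false; both are assumptions of this illustration.

```python
machine = {
    "Type": "Machine", "OpSys": "Solaris251", "Kflops": 21893,
    "Memory": 64, "Disk": 323496,
    "ResearchGrp": {"miron", "thain", "john"},
    "Untrusted": {"bgates", "lalooyadav", "thief"},
}
job = {"Type": "Job", "Owner": "raman", "Memory": 31}

# Job side: Rank = Kflops/1e3 + other.Memory/32
job_rank = machine["Kflops"] / 1e3 + machine["Memory"] / 32

# Job side: Constraint
job_ok = (machine["Type"] == "Machine" and machine["OpSys"] == "Solaris251"
          and machine["Disk"] >= 10000 and machine["Memory"] >= job["Memory"])

# Machine side: Rank = member(other.Owner, ResearchGrp) * 10
machine_rank = (job["Owner"] in machine["ResearchGrp"]) * 10

# Machine side: Constraint = !member(other.Owner, Untrusted) && Rank >= 10
machine_ok = job["Owner"] not in machine["Untrusted"] and machine_rank >= 10

print(round(job_rank, 3), job_ok, machine_rank, machine_ok)
```

Under this reading the job ranks the workstation at about 23.89 (21.893 + 2) and its Constraint holds, but owner "raman" is not in the workstation's ResearchGrp, so the workstation's Rank is 0 and its own Constraint (Rank >= 10) is false. The pair as written would therefore not be matched, which shows that both sides must agree.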

Page 21: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Matchmaking

• Evaluates expressions in an environment that allows each classad to access attributes of the other
    • other.Memory >= self.Memory;
• A reference to a non-existent attribute evaluates to undefined
• Considers pairs of ads incompatible unless their Constraint expressions both evaluate to true
• Rank is then used to choose among compatible matches
• Both parties are notified about the match; the matchmaker could generate and hand off a session key for authentication and security
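
The rules above (undefined for missing attributes, both constraints must be true, Rank picks the winner) can be sketched as follows. `None` is used as a stand-in for the classad undefined value, and the machine-side policy, the helper names and the sample ads are hypothetical.

```python
UNDEFINED = None   # stand-in for the classad "undefined" value

def ge(a, b):
    """a >= b, except that an undefined operand makes the result undefined."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a >= b

def job_constraint(job, machine):
    # other.Memory >= self.Memory, seen from the job's side
    return ge(machine.get("Memory", UNDEFINED), job.get("Memory", UNDEFINED))

def machine_constraint(machine, job):
    # hypothetical owner policy: refuse owners on the machine's Untrusted list
    return job.get("Owner") not in machine.get("Untrusted", ())

def job_rank(job, machine):
    return machine["Memory"] / 32          # prefer machines with more memory

def best_match(job, machines):
    # A pair is compatible only if BOTH constraints are exactly True;
    # an undefined result does not count as true.
    ok = [m for m in machines
          if job_constraint(job, m) is True and machine_constraint(m, job) is True]
    return max(ok, key=lambda m: job_rank(job, m), default=None)

job = {"Owner": "raman", "Memory": 31}
machines = [
    {"Name": "a", "Memory": 64},
    {"Name": "b", "Memory": 128, "Untrusted": ["raman"]},
    {"Name": "c"},                          # no Memory: constraint is undefined
    {"Name": "d", "Memory": 96},
]
best = best_match(job, machines)
print(best["Name"])
```

Machine "b" has the most memory but refuses the owner, and machine "c" never advertised Memory, so the job ends up with "d", the highest-ranked machine among those where both constraints hold.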

Page 22: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Separation of Matching and Claiming

• Weak consistency requirements: claiming allows provider and customer to verify their constraints with respect to their current state
• Claiming protocol could use cryptographic techniques (authentication)
• Principals involved in a match are themselves responsible for establishing, maintaining and servicing a match
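
Why claiming is kept separate can be seen in a small sketch: the matchmaker works from an advertisement that may already be stale, so the provider checks its policy again against its current state when the claim arrives. The `Startd` class below, its attribute names and the 15-minute idle policy are assumptions made for illustration.

```python
class Startd:
    """Represents a resource owner; state may change after it advertises."""
    def __init__(self, name, memory, keyboard_idle_min):
        self.name = name
        self.memory = memory
        self.keyboard_idle_min = keyboard_idle_min

    def policy_ok(self, job):
        # assumed owner policy: machine idle for 15 minutes, enough memory
        return self.keyboard_idle_min >= 15 and self.memory >= job["Memory"]

    def advertisement(self):
        # snapshot sent to the matchmaker; it can be stale by claim time
        return {"Name": self.name, "Memory": self.memory,
                "KeyboardIdle": self.keyboard_idle_min}

    def claim(self, job):
        # claiming re-evaluates the policy against the current state
        return self.policy_ok(job)

startd = Startd("crow", memory=64, keyboard_idle_min=20)
job = {"Owner": "raman", "Memory": 31}

ad = startd.advertisement()
matched = ad["KeyboardIdle"] >= 15 and ad["Memory"] >= job["Memory"]

startd.keyboard_idle_min = 0      # the owner returns before the claim arrives
claimed = startd.claim(job)
print(matched, claimed)
```

The match succeeds on the old advertisement but the claim is refused, so the matchmaker never needs a perfectly consistent view of the pool.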

Page 23: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Work outside the Condor kernel: New challenges

• Multilateral matchmaking: Gangmatching
• I/O regulation and disk allocation: Kangaroo
• User interfaces: ClassadView
• Grid applications: Globus
• Security

Page 24: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Summary

• Matchmaking provides a scalable and robust resource management solution for HTC environments
• Classads are used by workstations and jobs
• The Matchmaker forms the match and informs the parties, who in turn invoke the claiming protocol
• The parties are responsible for establishing, maintaining and servicing a match

Questions?

Page 25: Scheduling & Resource Management in Distributed Systems Rajesh Rajamani,  May 2001.

Gangmatch request

    [
      Type  = "Job";
      Owner = "raj";
      Cmd   = "run_sim";
      Ports = {
        [
          Label     = "cpu";
          ImageSize = 28 M;
          // Rank and constraints
        ],
        [
          Label = "License";
          Host  = cpu.Name;
          // Rank and constraints
        ]
      }
    ]
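
One plausible reading of the request above is that each port must be bound to its own advertisement and that the bindings must be mutually consistent. The sketch below searches for a (cpu, license) pair under two assumptions that are not stated on the slide: the cpu port needs a machine whose `Memory` is at least the 28 M image size, and license ads carry a `Host` attribute that must equal the chosen machine's `Name`. All ads and helper names are hypothetical.

```python
from itertools import product

cpus = [{"Name": "crow", "Memory": 64}, {"Name": "hawk", "Memory": 16}]
licenses = [{"Product": "sim", "Host": "hawk"},
            {"Product": "sim", "Host": "crow"}]

def cpu_port_ok(cpu):
    return cpu["Memory"] >= 28          # assumed: ImageSize = 28 M must fit

def license_port_ok(lic, cpu):
    return lic["Host"] == cpu["Name"]   # Host = cpu.Name

def gangmatch(cpus, licenses):
    """Return the first (cpu, license) pair satisfying both ports, or None."""
    for cpu, lic in product(cpus, licenses):
        if cpu_port_ok(cpu) and license_port_ok(lic, cpu):
            return cpu, lic
    return None

result = gangmatch(cpus, licenses)
print(result)
```

Unlike bilateral matching, neither port can be decided alone: the license is acceptable only for the machine that was picked for the cpu port.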

