The Roadmap to New Releases

Date post: 20-Jan-2016
Page 1: The Roadmap to New Releases

1www.cs.wisc.edu/condor

The Roadmap to New Releases

Todd Tannenbaum
Department of Computer Sciences
University of Wisconsin-Madison
http://www.cs.wisc.edu/condor

[email protected]

Page 2: The Roadmap to New Releases

Stable vs. Development Series

› Much like the Linux kernel, Condor provides two different releases at any time:
    Stable series
    Development series

› Allows Condor to be both a research project and a production-ready system

Page 3: The Roadmap to New Releases

Stable series

› Series number in version is even (e.g. 6.2.0)

› Releases are heavily tested

› Only bug fixes and ports to new platforms are added to a stable series

Page 4: The Roadmap to New Releases

Stable series (cont.)

› A given stable release is always compatible with other releases from the same series

› Recommended for production pools

Page 5: The Roadmap to New Releases

Development Series

› Series number in the version is odd (e.g. 6.1.17, 6.3.1)

› New features and new technology are added frequently

› Versions from the same development series are not always compatible with each other
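The even/odd numbering rule can be checked mechanically. The following is a minimal sketch, assuming version strings of the form MAJOR.SERIES.RELEASE as in the examples on these slides; the function names `release_series` and `same_series` are hypothetical and are not part of Condor.

```python
def release_series(version: str) -> str:
    """Classify a version string such as "6.3.1" by its series number."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected MAJOR.SERIES[.RELEASE], got {version!r}")
    series = int(parts[1])  # the second component is the series number
    # Even series numbers are stable, odd ones are development.
    return "stable" if series % 2 == 0 else "development"


def same_series(a: str, b: str) -> bool:
    """True when two versions share the major and series numbers."""
    return a.split(".")[:2] == b.split(".")[:2]
```

A pool administrator could use a check like `same_series` before mixing releases, since only releases from the same stable series are stated to be compatible with each other.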

Page 6: The Roadmap to New Releases

Development Series (cont.)

› Releases are not as heavily tested

› Generally not recommended for production pools
    … unless new features are required
    … unless we recommend otherwise :^)

Page 7: The Roadmap to New Releases

Where is Condor Today?

› Version 6.3.2 being released asap – this is the v6.4.0 release candidate.

› We expect version 6.4.0 to be released by the end of March.

Page 8: The Roadmap to New Releases

What’s new for Condor v6.4.0?

Page 9: The Roadmap to New Releases

New Ports in 6.4.0

› Full support (with checkpointing and remote system calls):
    RedHat 7.x (Linux 2.4.x kernel + glibc 2.2.x)

Page 10: The Roadmap to New Releases

New Ports in 6.4.0 (cont.)

› "Clipped" support (no checkpointing, PVM, or remote system calls, but all other functionality is available):
    Windows 2000
    Mac OS X

Page 11: The Roadmap to New Releases

Secure Communication

› Secure network communication
    Strong user authentication
        • Multiple methods supported: Kerberos, X509, NT LanMan, …
    Encryption
    Integrity

› Authorization based on host or user

Page 12: The Roadmap to New Releases

New Job Universes

› MPI Universe
    Launch MPI jobs linked with MPICH library

› Globus Universe
    Faster, more reliable, better integrated

› Java Universe

Page 13: The Roadmap to New Releases

Java Universe Job

    universe = java
    executable = Main.class
    jar_files = MyLibrary.jar
    input = infile
    output = outfile
    arguments = Main 1 2 3
    queue

condor_submit

Page 14: The Roadmap to New Releases

Why not use Vanilla Universe for Java jobs?

› Java Universe provides more than just inserting “java” at the start of the execute line
    Knows which machines have a JVM installed
    Knows the location, version, and performance of JVM on each machine
    Provides more information about Java job completion than just JVM exit code
        • Program runs in a Java wrapper, allowing Condor to report Java exceptions, etc.

Page 15: The Roadmap to New Releases

Java support, cont.

condor_status -java

Name          JavaVendor  Ver   State   Activity LoadAv Mem
aish.cs.wisc. Sun Microsy 1.2.2 Owner   Idle     0.000  249
anfrom.cs.wis Sun Microsy 1.2.2 Owner   Idle     0.030  249
babe.cs.wisc. Sun Microsy 1.2.2 Claimed Busy     1.120  123
...

Page 16: The Roadmap to New Releases

Condor File Transfer

› Condor will transfer job files from the submit machine to the execute machine

› Files to send and/or receive specified at submit time

› Transfer is atomic
    All files are transferred, or transfer fails

› In v6.2, appeared only in Condor for Windows
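One common way to get all-or-nothing behavior is to stage the files and publish them in a single step. The sketch below illustrates that general idea only; it is not Condor's transfer implementation, and `transfer_all_or_nothing` is a hypothetical name. It assumes the destination directory does not exist yet and that staging and destination are on the same filesystem.

```python
import os
import shutil
import tempfile


def transfer_all_or_nothing(sources, dest_dir):
    """Copy every file in `sources` into a new `dest_dir`, or leave nothing behind."""
    parent = os.path.dirname(os.path.abspath(dest_dir))
    # Stage next to the destination so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    try:
        for path in sources:
            shutil.copy2(path, staging)  # raises if a source is missing or unreadable
        os.rename(staging, dest_dir)     # commit: publish the complete set at once
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)  # roll back any partial copies
        raise
```

If any single copy fails, the staging directory is removed and the caller sees the error, so the destination never holds a partial set of files.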

Page 17: The Roadmap to New Releases

File Transfer, cont.

› Example:
    transfer_input_files = x, y, z …
    transfer_output_files = a, b, c …
    transfer_files = [ ALWAYS | ONEXIT ]

› Note: Condor can automatically figure out output files
    Default: Send back any new/changed files
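The "send back any new/changed files" default can be pictured as comparing two snapshots of the job's working directory. This is a rough sketch of that idea, not Condor's actual logic; `snapshot` and `new_or_changed` are hypothetical helpers, and using file size plus modification time as the change signal is an assumption.

```python
import os


def snapshot(directory):
    """Map each file name in `directory` to its (size, mtime_ns) right now."""
    result = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            info = entry.stat()
            result[entry.name] = (info.st_size, info.st_mtime_ns)
    return result


def new_or_changed(before, after):
    """Names in `after` that are missing from `before` or differ from it."""
    return sorted(name for name, sig in after.items() if before.get(name) != sig)
```

Taking one snapshot before the job starts and one after it exits yields the list of files that would need to travel back to the submit machine.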

Page 18: The Roadmap to New Releases

Remote I/O Socket

› Job can request that the condor_starter process on the execute machine create a Remote I/O Socket

› Used for online access to files on the submit machine – without Standard Universe.
    Use in Vanilla, Java, …

› Libraries provided for Java and for C, e.g.:
    Java: FileInputStream -> ChirpInputStream
    C: open() -> chirp_open()

Page 19: The Roadmap to New Releases

[Diagram: remote I/O architecture spanning the Submission Site and the Execution Site. Components: shadow, I/O Server, Home File System, starter, I/O Proxy, Job (Fork), I/O Library. Connections: Secure Remote I/O, Local I/O (Chirp), Local System Calls.]

Page 20: The Roadmap to New Releases

Job Policy Expressions

› User can supply job policy expressions in the submit file.

› Can be used to describe a successful run.
    on_exit_remove = <expression>
    on_exit_hold = <expression>
    periodic_remove = <expression>
    periodic_hold = <expression>

Page 21: The Roadmap to New Releases

Job Policy Examples

› Do not remove if exits with a signal:
    on_exit_remove = ExitBySignal == False

› Place on hold if exits with nonzero status or ran for less than an hour:
    on_exit_hold = ((ExitBySignal==False) && (ExitCode != 0)) || ((ServerStartTime - JobStartDate) < 3600)

› Place on hold if job has spent more than 50% of its time suspended:
    periodic_hold = CumulativeSuspensionTime > (RemoteWallClockTime / 2.0)
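To make the logic of these three policies concrete, here is a sketch that restates them as Python predicates over a plain dictionary of job attributes. Condor evaluates the real expressions with its ClassAd engine, not with Python, and ClassAd handling of undefined attributes is not modeled here. The attribute names follow the slide, except `ExitCode`, which is assumed here as the attribute holding the exit status of a job that was not killed by a signal.

```python
def on_exit_remove(job):
    """Leave the queue only if the job was not killed by a signal."""
    return not job["ExitBySignal"]


def on_exit_hold(job):
    """Hold on a nonzero exit code, or on a run shorter than one hour."""
    failed = (not job["ExitBySignal"]) and job["ExitCode"] != 0
    too_short = (job["ServerStartTime"] - job["JobStartDate"]) < 3600
    return failed or too_short


def periodic_hold(job):
    """Hold once more than half of the wall-clock time was spent suspended."""
    return job["CumulativeSuspensionTime"] > job["RemoteWallClockTime"] / 2.0
```

Each function returns the value the corresponding expression is meant to produce for one job, which makes it easy to try a policy against sample attribute values before putting it in a submit file.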

Page 22: The Roadmap to New Releases

Firewall Support

› Port Restrictions
    In condor_config file can specify:
        LOWPORT = x
        HIGHPORT = y
    All dynamic ports will be between x and y inclusive

› Condor + Firewalls/Private Networks:
    Who: Se-Chang Son
    Time: 9am-12pm Weds
    Where: rm 3387
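The effect of the LOWPORT/HIGHPORT restriction above can be illustrated with a small sketch that binds a listening socket to the first free port inside an inclusive range. This shows the general technique only and is not how the Condor daemons are implemented; `bind_in_range` and the loopback default are assumptions made for the example.

```python
import socket


def bind_in_range(low, high, host="127.0.0.1"):
    """Return a listening TCP socket bound to the first free port in [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    for port in range(low, high + 1):  # inclusive on both ends
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            sock.close()  # port is taken or not permitted; try the next one
            continue
        sock.listen()
        return sock
    raise OSError(f"no free port between {low} and {high}")
```

With every dynamic port confined to a known range like this, a firewall administrator only has to open ports x through y rather than the whole ephemeral range.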

Page 23: The Roadmap to New Releases

Condor on Windows

› On both NT and Win2k

› New universes added: MPI, Java, Scheduler (and Globus in the works!)

› DAGMan ported

› CondorView ported

› Run shadow + DAGMan as the user
    Allows submission from directories on shared filesystems

Page 24: The Roadmap to New Releases

And more…

› Unix Man pages

› Fetch/consolidate log files remotely

› ClassAd chaining

› Many DAGMan improvements

› Bug fixes, etc…

Page 25: The Roadmap to New Releases

What’s Next? Future Directions

› Increased focus on standalone tools built with Condor Technology
    DAGMan
    NeST
    PFS
    HawkEye
    Condor-G
    …

Page 26: The Roadmap to New Releases

What’s Next?

› Big Item: More focus on being a service provider than just an end-user tool:
    Developer APIs / libraries
    SOAP access to services
    XML representations of user logs, ClassAds, accounting info, etc.

Page 27: The Roadmap to New Releases

More what’s next…

› Condor on Windows
    Increased support from Microsoft Research
    Remote I/O
    Complete Shared Filesystem support
    Condor-G

› MPI Scheduling Improvements

Page 28: The Roadmap to New Releases

More what’s next…

› New version of ClassAds into Condor
    Conditionals !!
        • if/then/else
    Aggregates (lists, nested classads)
    Built-in functions
        • String operations, pattern matching, time operators, unit conversions
    Clean implementations in C++ and Java
    ClassAd collections

Page 29: The Roadmap to New Releases

More what’s next…

› Re-write of the condor_schedd
    Performance enhancements and lowered resource requirements (particularly RAM)

› Re-write of the checkpoint server
    Add secure communication
    NeST technology infusion
    Enhanced support for multiple servers
    Store meta-data along with checkpoint files

Page 30: The Roadmap to New Releases

Thank you for coming to Paradyn/Condor Week!

