Legion
RYAN BARTLETT, TIMOTHY VIRGILLO
What is Legion?
The Legion project set out to build, test, deploy, and ultimately transfer to industry a robust, scalable Grid computing software infrastructure
Object-based metasystem software created at the University of Virginia
A single, coherent virtual machine that addresses key grid issues such as scalability, programming ease, fault tolerance, security, and site autonomy
The user can sit at a terminal and manipulate objects on several processors, but has the illusion of working on a single powerful computer.
1. Site Autonomy - resources are owned and controlled by an array of organizations
2. Extensible Core – The components that comprise Legion’s Core are designed to be replaceable and extensible
3. Scalable Architecture - Completely distributed system to allow for the scalability needed to handle millions of hosts
4. Easy to Use - hide the complexity of the system to create the illusion of working with a single powerful computer
5. High Performance – Large degree of parallelism, requiring parallelization of tasks, data and their arbitrary combinations
10 Key Design Objectives
6. Single Persistent Name space - one name space for file and data access
7. Security - provide mechanisms to allow users to manage the security of their own objects
8. Heterogeneous Resource Management - cross platform to support different types of hardware and software
9. Multiple Language Support - integrate different types of source languages
10. Fault Tolerance - deal with host, communication link, and disk failures through dynamic reconfiguration
The Legion Grid is comprised of an array of organizations’ resources
To encourage organizations to participate in Legion and contribute their resources, each organization must be assured control of its own resources
Site Autonomy
Cannot replace host operating systems
◦ Encourages organizations to contribute resources to the grid by not requiring that resources be dedicated solely to Legion
Cannot make changes to the interconnection network
◦ Legion cannot assume that all resource networks will be within the user's control
Cannot insist that it be run as "root"
◦ For security reasons, Legion does not require root access to its resources to function
Design Constraints
Everything is an object
◦ Defined as an active process that responds to member function invocations from other objects in the system
Objects are independent and abstracted from the logical address space
Objects communicate with each other through non-blocking method calls
Legion handles the message format and high-level protocol for object interaction, but not the programming language or communication protocol
Each object maintains a local binding cache
Objects may be in one of two states: Active or Inert
Legion Object
Active Objects
◦ Run as a process that is ready to accept member function invocations
◦ Object state is maintained in the address space of the process
Inert Objects
◦ Represented by an object-persistent representation (OPR)
◦ OPR is a set of associated bytes residing in stable storage in the Legion System
◦ OPR contains the information that enables the object to move from Inert to Active
Object States: Inert or Active
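The Active/Inert transition can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not Legion's API: the `Vault` class, the `Counter` object, the `activate`/`deactivate` helpers, and the use of JSON bytes as the OPR format are all hypothetical.

```python
import json


class Vault:
    """Hypothetical stand-in for stable storage that holds OPRs as raw bytes."""

    def __init__(self):
        self._store = {}

    def put(self, loid, opr):
        self._store[loid] = opr

    def get(self, loid):
        return self._store[loid]


class Counter:
    """Toy object; while active, its state lives in the process address space."""

    def __init__(self, loid, value=0):
        self.loid = loid
        self.value = value

    def increment(self):
        self.value += 1
        return self.value


def deactivate(obj, vault):
    """Active -> Inert: write the object's state into an OPR (a bag of bytes)."""
    vault.put(obj.loid, json.dumps({"value": obj.value}).encode("utf-8"))


def activate(loid, vault):
    """Inert -> Active: rebuild a live object from the bytes in its OPR."""
    state = json.loads(vault.get(loid).decode("utf-8"))
    return Counter(loid, **state)


vault = Vault()
counter = Counter("loid-1")
counter.increment()
counter.increment()
deactivate(counter, vault)   # state now exists only as bytes in the vault
del counter                  # the active process is gone
revived = activate("loid-1", vault)
```

The point of the sketch is that the OPR alone is enough to bring the object back: nothing from the old in-memory instance is needed.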
Replicating an object
◦ Create an Object Address with multiple physical addresses in its list
◦ Assign address semantics
◦ Bind the LOID of the object to this Object Address
Object Replication
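The three replication steps can be sketched as follows. This is an illustration under stated assumptions: the `Semantics` values, the `ObjectAddress` dataclass, the `targets` helper, and the sample addresses are hypothetical names, and the slides do not say which address semantics Legion actually defines.

```python
from dataclasses import dataclass
from enum import Enum


class Semantics(Enum):
    # Hypothetical address semantics, chosen only for illustration.
    FIRST_AVAILABLE = "first_available"   # use the first replica that is up
    ALL = "all"                           # send to every replica that is up


@dataclass(frozen=True)
class ObjectAddress:
    elements: tuple        # physical addresses of the replicas
    semantics: Semantics   # how a caller should use the list


def targets(address, is_up):
    """Pick the physical addresses to contact, following the address semantics."""
    live = [element for element in address.elements if is_up(element)]
    if address.semantics is Semantics.ALL:
        return live
    return live[:1]


# Step 1 and 2: an Object Address with several physical addresses plus semantics.
replicated = ObjectAddress(
    elements=("10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000"),
    semantics=Semantics.FIRST_AVAILABLE,
)

# Step 3: bind the object's LOID to that Object Address.
bindings = {"loid-42": replicated}

down = {"10.0.0.1:7000"}
chosen = targets(bindings["loid-42"], lambda addr: addr not in down)
```

Callers keep using the same LOID; only the Object Address behind it knows that there are several replicas.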
Legion Object Address (LOA)
◦ Physical address, or set of addresses in the case of replicated objects
Legion Object Identifiers (LOIDs)
Context Names
◦ Mapped by a directory service called Context Space
◦ Human-readable strings
◦ Each Context Name is mapped to a LOID
Legion’s 3 Level Naming System
Contexts
A typical context space has a well-known root context, which in turn “points” to other contexts, forming a directed graph
Contexts support operations that look up a single string, return all mappings, add a mapping, and delete a mapping
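A context with those four operations might look like the sketch below. It is a simplification built on assumptions: the `Context` class, the `resolve` helper, the path syntax, and the LOID strings are hypothetical, and nested contexts are held directly as Python objects rather than being reached through their own LOIDs.

```python
class Context:
    """Hypothetical context: maps human-readable names to LOIDs or other contexts."""

    def __init__(self):
        self._mappings = {}

    def add(self, name, target):
        """Add a mapping from a name to a LOID string or a nested Context."""
        self._mappings[name] = target

    def delete(self, name):
        """Delete a mapping; raises KeyError if the name is not mapped."""
        del self._mappings[name]

    def lookup(self, name):
        """Look up a single string; raises KeyError if the name is not mapped."""
        return self._mappings[name]

    def list_all(self):
        """Return a copy of all mappings in this context."""
        return dict(self._mappings)


def resolve(root, path):
    """Walk a slash-separated path from the root context to its target."""
    node = root
    for part in path.strip("/").split("/"):
        node = node.lookup(part)
    return node


root = Context()
home = Context()
root.add("home", home)
home.add("data", "loid-data")
# A second name for the same context: the space is a directed graph, not a tree.
root.add("shortcut", home)
```

Both `resolve(root, "/home/data")` and `resolve(root, "/shortcut/data")` reach the same LOID, which is what the directed-graph structure allows.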
Object Address Element contains 2 basic parts
◦ 32-bit address type field
◦ 256 bits of address-specific information
Object Addresses
An Object Address
◦ a list of Object Address Elements
◦ semantic information that describes how to utilize the list
Represents an arbitrary communication endpoint, such as a TCP socket
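To make the element layout concrete, here is a sketch that packs a TCP endpoint into a 32-bit type field followed by 256 bits (32 bytes) of address-specific information. The type code `ADDR_TYPE_TCP_IPV4`, the byte order, and the way the IPv4 address and port are laid out inside the 32 bytes are assumptions for illustration; the slides give only the field sizes.

```python
import socket
import struct

# 32-bit unsigned type field + 32 bytes (256 bits) of address-specific data.
ELEMENT_FORMAT = "!I32s"
ADDR_TYPE_TCP_IPV4 = 1   # hypothetical type code, not taken from Legion


def pack_tcp_element(host, port):
    """Pack an IPv4 host and TCP port into one 36-byte address element."""
    info = socket.inet_aton(host) + struct.pack("!H", port)
    # The 's' format pads the 6 used bytes with null bytes up to 32.
    return struct.pack(ELEMENT_FORMAT, ADDR_TYPE_TCP_IPV4, info)


def unpack_tcp_element(raw):
    """Recover (type, host, port) from a packed address element."""
    addr_type, info = struct.unpack(ELEMENT_FORMAT, raw)
    host = socket.inet_ntoa(info[:4])
    (port,) = struct.unpack("!H", info[4:6])
    return addr_type, host, port


element = pack_tcp_element("192.0.2.10", 7000)
```

An Object Address would then be a list of such elements plus the semantic information describing how to use them.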
Every Legion object is named by a Legion Object Identifier (LOID)
Legion Object Identifiers (LOIDs)
◦ Location-independent identifiers
◦ Include an RSA public key (public-key cryptosystem)
◦ Each LOID is mapped to a LOA
LegionClass is responsible for handing out unique Class Identifiers to each new class
LOIDs
Bindings from LOIDs to Object Addresses are implemented as triples
Bindings
Bindings are first-class entities that can be passed around the system and cached within objects
A binding consists of:
◦ LOID
◦ Object Address
◦ Time at which the binding becomes invalid
Binding Agents are responsible for returning a binding to an Object Address for the object that the LOID names
The persistent state of each Legion Object contains the Object Address of its Binding Agent
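The binding triple, the per-object binding cache, and the fallback to a Binding Agent can be sketched together. All names here (`Binding`, `BindingAgent`, `BindingCache`, `get_binding`, the `ttl` parameter) are hypothetical, and the expiry policy is an assumption; the slides state only that a binding carries the time at which it becomes invalid.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """The triple: LOID, Object Address, and the time the binding becomes invalid."""
    loid: str
    object_address: str
    expires_at: float

    def is_valid(self, now):
        return now < self.expires_at


class BindingAgent:
    """Hypothetical agent that maps LOIDs to Object Addresses."""

    def __init__(self, table, ttl=60.0):
        self._table = table
        self._ttl = ttl
        self.requests = 0   # counts how often the agent is consulted

    def get_binding(self, loid, now):
        self.requests += 1
        return Binding(loid, self._table[loid], now + self._ttl)


class BindingCache:
    """Local cache held by an object; asks its Binding Agent only on a miss."""

    def __init__(self, agent):
        self._agent = agent
        self._cache = {}

    def resolve(self, loid, now=None):
        now = time.time() if now is None else now
        binding = self._cache.get(loid)
        if binding is None or not binding.is_valid(now):
            binding = self._agent.get_binding(loid, now)
            self._cache[loid] = binding
        return binding.object_address


agent = BindingAgent({"loid-7": "10.0.0.5:7000"}, ttl=60.0)
cache = BindingCache(agent)
```

Repeated calls to `cache.resolve("loid-7")` within the validity window never reach the agent, which is why local caching matters for the scalability argument.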
Binding Agents
◦ Objects that map LOIDs to LOAs
Context Objects
◦ Objects that map Context Names to LOIDs
Host Objects
◦ Represent processors and create the processes that run on them
Implementation Objects
◦ Executable to handle creation or activation of an object
◦ Is transferred from a class object to a host object to enable the host to create processes with the correct characteristics
Vault
◦ Represents persistent storage, for the purpose of maintaining state, in OPRs, of the inert Legion objects supported by the vault
Legion Core Objects
Every Legion object is defined and managed by its Class object
Class objects are given system-level responsibility
◦ Create new instances
◦ Schedule them for execution
◦ Activate/Deactivate objects
◦ Provide information about their current location to client objects that wish to communicate with them
Classes and Metaclasses
Millions and MILLIONS of hosts and TRILLIONS of objects, yo
Legion is designed to be decentralized and fully distributed
Applications at the client
Legion Object shared across the Legion System
Scalable Architecture
Every object publishes an interface◦ Inheritable◦ Extendable◦ Specializable
As technology changes and improves, resources in the Legion Grid can be changed or replaced without hindering the system
Extensible Core
High Performance via Resource Selection
◦ Choosing hosts with lowest load or greatest processing power
◦ User-level scheduling agents
High Performance via Parallelism
◦ Support libraries such as MPI
◦ Support parallel languages such as MPL
◦ Offer wrapping of parallel components
◦ Exporting the run-time library interface to library, toolkit, and compiler writers
High Performance
Scheduling policies are chosen by the user
Users can create their own schedulers for specific applications
The Legion Scheduling Model
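A user-supplied scheduling policy can be sketched as a plain function handed to a placement loop. This is an illustration only: `Host`, `lowest_load`, `fastest`, and `schedule` are hypothetical names, load is modelled as a simple count of running jobs, and none of this reflects Legion's actual scheduler interface.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    running_jobs: int   # hypothetical load measure
    mflops: float       # hypothetical processing-power measure


def lowest_load(hosts):
    """Policy: pick the host with the fewest running jobs."""
    return min(hosts, key=lambda host: host.running_jobs)


def fastest(hosts):
    """Policy: pick the host with the greatest processing power."""
    return max(hosts, key=lambda host: host.mflops)


def schedule(tasks, hosts, policy):
    """Place each task using the user-chosen policy and update host load."""
    placement = {}
    for task in tasks:
        host = policy(hosts)
        placement[task] = host.name
        host.running_jobs += 1
    return placement


hosts = [Host("a", 1, 100.0), Host("b", 2, 400.0), Host("c", 5, 900.0)]
placement = schedule(["t1", "t2", "t3"], hosts, lowest_load)
```

Swapping `lowest_load` for `fastest`, or for an application-specific function, changes the placement without touching the rest of the loop.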
Problems:
Installing Legion without causing significant risk to the system it is installed on
How to protect and control resources
Solutions:
Legion does not require any special privileges or "root" access
Legion allows users to choose what types and levels of security they want for their own objects
In addition, every Legion object contains a function called "MayI"
Security
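The idea behind "MayI" can be sketched as a check that runs before every member function invocation. The class names, the `may_i` and `invoke` signatures, and the access-control-list policy are assumptions for illustration; in Legion the object's owner decides what policy MayI enforces.

```python
class AccessDenied(Exception):
    """Raised when the MayI check rejects an invocation."""


class LegionObject:
    """Hypothetical base class that guards every invocation with a MayI check."""

    def __init__(self, acl):
        # acl maps a method name to the set of callers allowed to invoke it.
        self._acl = acl

    def may_i(self, caller, method):
        """Default policy (an assumption): deny unless the caller is listed."""
        return caller in self._acl.get(method, set())

    def invoke(self, caller, method, *args):
        if not self.may_i(caller, method):
            raise AccessDenied(f"{caller} may not call {method}")
        return getattr(self, method)(*args)


class FileObject(LegionObject):
    def __init__(self, acl, data):
        super().__init__(acl)
        self._data = data

    def read(self):
        return self._data

    def write(self, data):
        self._data = data


shared = FileObject(
    acl={"read": {"alice", "bob"}, "write": {"alice"}},
    data="v1",
)
```

A subclass could override `may_i` to enforce a different policy, which matches the idea that users pick the level of security for their own objects.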
Public-key cryptography based on RSAREF 2.0.
Three message-layer security modes: private (encrypted communication), protected (fast digested communication with unforgeable secrets to ensure authentic replies to message calls), and no security.
Caching secret keys for faster encryption of multiple messages between communicating parties.
Auto-encrypted bearer credentials with free-form rights. Propagation of security modes and certificates through calling trees (e.g., if a caller demands encryption, all downstream calls will use it automatically).
Security
Security
Drop-in addition of MayI functionality to existing objects.
Persistent authentication objects that serve as the representation for users in a trust domain.
Secure Legion shell to allow users to log in to their authentication objects and obtain associated credentials and environment information.
Isolation and protection of objects using local OS accounts.
Easily checked Process Control Daemon for granting limited OS privileges to Legion Host Objects.
Context space configured with access control for multiple users.
Automatic failure detection and recovery
◦ Hosts, jobs, and queues automatically back up their current state to prevent loss of information
◦ Dynamic configuration allows processes to change resources without interrupting operations
◦ If a host is lost or unavailable, the job is automatically migrated to another host
Fault Tolerance
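Checkpoint-and-migrate recovery can be sketched with a toy job that saves its state after every step and resumes on another host when its current host is lost. The `Job` class, the `run_with_failover` function, and the `fails_at` failure-injection table are all hypothetical; this is not how Legion implements recovery, only an illustration of the behaviour the bullets describe.

```python
class Job:
    """Toy job whose only state is how many steps it has completed."""

    def __init__(self, name, total_steps, done=0):
        self.name = name
        self.total_steps = total_steps
        self.done = done

    def checkpoint(self):
        """Back up the current state as plain data."""
        return {"name": self.name, "total_steps": self.total_steps, "done": self.done}

    @classmethod
    def restore(cls, state):
        return cls(state["name"], state["total_steps"], state["done"])


def run_with_failover(job, hosts, fails_at):
    """Run the job, migrating to the next host whenever the current one is lost.

    fails_at maps a host name to the number of steps it completes before
    failing (hypothetical failure injection for the example).
    """
    saved = job.checkpoint()
    history = []
    for host in hosts:
        job = Job.restore(saved)   # resume from the last backed-up state
        history.append(host)
        steps_on_host = 0
        while job.done < job.total_steps:
            if fails_at.get(host) == steps_on_host:
                break              # host lost; the in-memory job is discarded
            job.done += 1
            steps_on_host += 1
            saved = job.checkpoint()
        else:
            return job, history    # loop finished without a host failure
    raise RuntimeError("no hosts left to migrate to")


finished, history = run_with_failover(Job("render", 5), ["a", "b"], {"a": 3})
```

Because the state is saved after each step, the second host continues from step 3 rather than starting over.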
Of the early grid-computing solutions, Legion is unique in that it took an object-oriented approach
It evolved from an academic project into a commercial product through Avaki
Avaki pushed the LOID naming conventions as an industry secure naming protocol in 2002, which Compaq, Hewlett-Packard, IBM, Platform Computing, and Sun Microsystems all welcomed
IBM adopted the system for their Life Sciences research
Though the platform in its commercial state is proprietary, it can be assumed that the Legion->Avaki->Sybase->SAP ownership chain has continued the growth and expansion of the system
Contribution
Scalability claims refer to the communication traffic required as part of the implementation model
◦ LOID binding lookups from objects to Binding Agents
◦ Binding Agent traffic required to satisfy object binding requests
Assumes that most accesses will be local
◦ Same organization
◦ Within a department or university campus
Inter user-level object communication inside of an application may or may not contain a bottleneck
◦ User implementation may have a centralized object that acts as shared memory for a large number of workers
Resource starvation results in increasingly poor performance
Drawbacks?
Comparison of Globus against Legion with Matrix Multiplication using the MPI libraries
Performance
Too many requirements and decisions are placed on the shoulders of the resource owners and users. This contradicts the overall goal of Legion being easy to use
It aspires to be multi-organizational, but lacks easy scalability across organizations
Performance
Fault tolerance is not explicitly covered in the available documentation. Avaki continued to develop the code base and likely solved these issues, but that work is commercial and proprietary.
Performance measurements may be volatile as it cannot be predicted how Legion scales across hosts
◦ Bottlenecks may occur at the application level
Personal Opinions for Improvements
Reasonable, well phrased questions?