CHAPTER 5
ONTOLOGY BASED WS-POLICY MATCHMAKING
SYSTEM
5.1 GENERAL
In this chapter, we discuss the complexity arising from the diverse
nature of the Grid infrastructure, in which every participating resource has
its own usage and access policies. For example, a typical Grid resource may
not want to contribute 100% of its resources, such as CPU power, to the Grid,
and it may want to take part in Grid operation for only 7 hours a day. Such
constraints of a resource are called its usage policy. Similarly, a resource
may allow user X to use only 50% of its storage and may not allow user Y to
use the resource at all. These statements constitute the access control
policy of a resource. The existing resource monitoring and discovery
mechanism in the Grid matches the capabilities of available Grid resources
against the user’s requirements for job execution. It does not enforce the
resource usage policies before submitting a job to a resource. Existing web
service policy languages such as the WS-Policy specification (Bajaj 2006) are
not suitable for the Grid environment because they lack appropriate syntax
for expressing Grid resource usage policies, and no standard language has
been developed for this purpose. Hence, building a policy matching system to
verify Grid resource usage policies becomes difficult, and as a result a
large community of resource providers is reluctant to participate in the Grid.
In this chapter, we address this issue by including the necessary
attributes in the WS-Policy specification to represent resource usage
policies, and we build a Policy Matching System (PMS) that verifies whether
the resource usage policy is met by the user before the user’s job is
submitted to the resource.
We also exploit the concept of ontology to enhance the policy
matching mechanism. Drawing inspiration from the Grid Scheduling Ontology
developed by the CoreGRID team (Wieder et al. 2008), a Grid Policy Ontology
is proposed to semantically represent the policy constructs and the
relationships between them. This representation is used to compare the usage
policies against the user demands on the basis of their semantics. As a
prototype implementation, we have modified the Apache implementation of
WS-Policy to compare the Grid resource usage policies represented in the
extended WS-Policy on the basis of semantics using the policy ontology. We
then integrated the policy matching system with the Gridway metascheduler to
deploy it in a real Grid environment. In short, the important issues
addressed by the proposed policy matching mechanism are as follows:
Expression of Resource Usage Policy
Developing a Policy Matching Mechanism
Implementing in Grid Environment
5.2 EXPRESSION OF RESOURCE USAGE POLICY
Currently, there is no standard specification for expressing
resource usage policies in the Grid environment. This is because the
resources are owned by autonomous institutions that have their own
constraints, so policies differ from one resource to another. However,
policies are generally formulated using attributes that refer to QoS factors
such as bandwidth and to resource configuration information such as hard
disk capacity, RAM availability, number of nodes, etc. Hence, by introducing
proper syntax and elements into a web service policy language such as
WS-Policy, it is possible to express the resource usage policy of a Grid
resource. For example, let us consider the following usage constraints of
the Grid resource A.
P1: Provide 80% of storage disk to Grid jobs
P2: Provide 40 GB of RAM to Grid experiments
P3: Available bandwidth for Grid network is 100 MBPS
P4: Provide 47 CPUs to Grid from 9 AM to 9 PM
To express the above usage constraints, attributes for hard disk
capacity, RAM capacity, bandwidth and the number of nodes are required. In
this chapter, we introduce XML elements that refer to these attributes into
the WS-Policy language constructs, so that the usage policy of resource A
can be expressed.
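As an illustration, the constraints P1 to P4 could be modelled as a small data structure before they are mapped to XML elements. The following is a minimal Python sketch and not part of the implementation described in this chapter; the class name, field names and the admits check are assumptions introduced only for this example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class UsagePolicy:
    """Usage constraints P1-P4 of resource A (field names are hypothetical)."""
    storage_percent: int   # P1: share of the disk offered to Grid jobs
    ram_gb: int            # P2: RAM offered to Grid experiments
    bandwidth_mbps: int    # P3: bandwidth available to the Grid network
    cpus: int              # P4: number of CPUs offered to the Grid
    open_from: time        # P4: start of the daily availability window
    open_until: time       # P4: end of the daily availability window

    def admits(self, cpus_needed: int, ram_gb_needed: int, at: time) -> bool:
        """True if a request fits the CPU, RAM and time-window constraints."""
        in_window = self.open_from <= at < self.open_until
        return (in_window
                and cpus_needed <= self.cpus
                and ram_gb_needed <= self.ram_gb)


# Values taken from P1-P4 in the text.
policy_a = UsagePolicy(80, 40, 100, 47, time(9, 0), time(21, 0))

print(policy_a.admits(cpus_needed=16, ram_gb_needed=8, at=time(10, 30)))
print(policy_a.admits(cpus_needed=16, ram_gb_needed=8, at=time(22, 0)))
```

The first request falls inside the 9 AM to 9 PM window and within the offered CPUs and RAM; the second is made outside the window and is therefore rejected.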
5.3 POLICY MATCHING MECHANISM
The second issue, developing a policy matching mechanism, is
addressed by a matchmaking algorithm that verifies the usage policy of a
resource against the user demands before a job is submitted to it. The
policy matching algorithm uses suitable APIs to interpret the XML elements
that refer to the attributes and compares them with the user’s needs. In the
proposed matching mechanism, policy verification is based on the semantics
of the policy constructs. The matchmaking system determines the semantic
similarity between the user’s requirements and the resource usage policies
and discovers suitable resources for job submission. To determine the
semantic similarity, a Grid policy ontology is used. Ontologies are used to
capture knowledge about some domain of interest. An ontology describes the
concepts in the domain and the relationships that hold between those
concepts.
The Grid policy ontology is developed using the attributes needed
to express the Grid resource usage policy. The attributes are described
hierarchically and the relationships between them are established. This
ontological representation supports semantic interpretation of the
attributes in question using a suitable ontology inference engine. The
policy matching engine parses the resource usage policy and extracts the
attributes to be verified. They are then compared against the user’s demands
on the basis of their semantics by referring to the Grid policy ontology.
This semantic comparison of usage policies provides better matching than
keyword verification of policies. It also identifies resources whose usage
policy does not exactly match the user’s demands but is close to what has
been requested. This allows better utilization of the resources and allows
both the user and the resource provider to negotiate the execution of the
job on the resource.
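The idea of comparing attributes through a concept hierarchy rather than by keyword can be sketched as follows. This is a toy Python example under stated assumptions: the concept names and the path-based similarity measure are hypothetical and are not taken from the Grid policy ontology or the inference engine used in this work.

```python
# Toy concept hierarchy (child -> parent). The concept names are hypothetical
# stand-ins for classes of a Grid policy ontology.
PARENT = {
    "Resource": None,
    "Storage": "Resource",
    "Memory": "Resource",
    "HardDisk": "Storage",
    "DiskSpace": "Storage",
    "RAM": "Memory",
}


def ancestors(concept):
    """Return the chain from a concept up to the root, concept first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def similarity(a, b):
    """Path-based similarity: 1 / (1 + number of edges between a and b
    through their lowest common ancestor). Identical concepts score 1.0."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = next(c for c in up_a if c in up_b)
    distance = up_a.index(common) + up_b.index(common)
    return 1.0 / (1.0 + distance)


# A keyword comparison treats "HardDisk" and "DiskSpace" as unrelated,
# whereas the hierarchy places them closer than "HardDisk" and "RAM".
print(similarity("HardDisk", "DiskSpace"))
print(similarity("HardDisk", "RAM"))
```

A measure of this kind is one simple way to rank near matches; an ontology reasoner can draw on richer relationships than the parent links used here.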
5.4 IMPLEMENTING IN GRID
The third issue we address is the integration of the proposed policy
matching mechanism into a practical Grid environment. Generally, a Grid
comprises Grid resources built using a Grid middleware such as Globus or
gLite. User jobs are scheduled across these resources through a Grid
scheduler such as Gridway or the Gridbus broker. Grid schedulers are able to
aggregate Grid resource information, such as configuration information
(hard disk, CPU availability, available RAM, etc.), by contacting the Grid
information system of the respective Grid resources.
These details are used by the scheduler to determine suitable
resources for a specific user’s job. It matches the capability needed to
execute the user’s job against that of the available resources and selects
suitable resources for job submission. However, existing schedulers only
determine capable resources for the job and do not verify the local resource
usage policies before submitting the job. In this work, the policy matching
system is integrated with the Gridway scheduler so that it obtains the
capable resources from the scheduler and selects the resources whose usage
policies exactly or closely match the user’s requirements. The list of
shortlisted resources is then provided to the scheduler for job submission.
With this approach, the resource usage policies are enforced at the
metascheduling level, as shown in Figure 5.1.
Figure 5.1 Policy Matching System in Grid
5.5 EXPRESSION OF RESOURCE USAGE POLICY
Currently, policies are expressed using various policy languages
such as XACML and WSPL, and sometimes as a plain XML document. No standard
has been followed for expressing resource usage policies in the Grid
environment. We suggest using the WS-Policy specification to express policy
information, thereby making it possible to address interoperability issues
across various Grids and Grid resources. WS-Policy provides a flexible and
extensible grammar for expressing the capabilities, requirements, and
general characteristics of entities in an XML Web services-based system.
WS-Policy defines a framework and a model for the expression of these
properties as policies, from simple declarative assertions to more
sophisticated conditional assertions. WS-Policy defines a policy to be a
collection of one or more policy assertions. Some assertions specify
traditional requirements and capabilities that will ultimately manifest on
the wire (for example, authentication scheme, transport protocol selection).
Some assertions specify requirements and capabilities that have no wire
manifestation yet are critical to proper service selection and usage (for
example, privacy policy, QoS characteristics). WS-Policy provides a single
policy grammar to allow both kinds of assertions to be reasoned about in a
consistent manner. Considering the widespread use of WS-Policy in web based
business environments, we believe that the use of WS-Policy in the Grid
would encourage the participation of many resource providers.
Although the WS-Policy language possesses the constructs and
elements needed to express policies for web services, it lacks appropriate
XML elements to express usage policies for Grid resources. This is because,
in a Grid environment, computational resources and storage resources are
available as services, and applications hosted on a resource are also
available as services. Hence, in order to express the capabilities and usage
of those Grid services, the existing policy language constructs need some
enhancement.
As mentioned in the earlier section, a usage policy is generally
formulated using attributes such as bandwidth and CPU information. If the
necessary XML elements referring to those attributes are available in the
WS-Policy language framework, it is possible to express a Grid resource
usage policy. With this motivation, we developed an XML schema defining the
necessary elements with which resource usage policies can be expressed.
Consider the following schema file, in which information such as memory,
architecture and other resource configuration attributes is defined as XML
elements. This schema can be extended by including more attributes when
needed.
<xs:schema
    targetNamespace="http://Gridlab.mit.edu/GridUsagepolicy.xsd"
    xmlns:tns="http://Gridlab.mit.edu/GridUsagepolicy.xsd"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified" blockDefault="#all">
  <xs:element name="MaximumRam" type="tns:OperatorContentType" />
  <xs:element name="HardDisk" type="tns:OperatorContentType" />
  <xs:element name="HardDisk_percentage" type="tns:OperatorContentType" />
  <xs:element name="Numb_of_Nodes" type="tns:OperatorContentType" />
  <xs:element name="Machine_type" type="tns:OperatorContentType" />
  <xs:element name="Release" type="tns:OperatorContentType" />
  <xs:element name="Bandwidth" type="tns:OperatorContentType" />
  <xs:element name="Cost" type="tns:OperatorContentType" />
  <xs:element name="Availability" type="tns:OperatorContentType" />
</xs:schema>
With the schema in hand, the Grid resource usage policy can be
expressed using the WS-Policy specification. For instance, the resource
usage policy of a resource, say resource A, can be described as follows. The
elements taken from the usage policy schema state the resource usage policy
attributes and the values that resource A contributes to the Grid operation.
<wsp:Policy xml:base="uri:thisBase" wsu:Id="UsagePolicy_A"
    xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy"
    xmlns:sec="http://schemas.xmlsoap.org/ws/2005/07/securitypolicy"
    xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"
    xmlns:xs="http://Gridlab.mit.edu/GridUsagepolicy.xsd">
  <xs:MaximumRam>40</xs:MaximumRam>
  <xs:HardDisk_percentage>80</xs:HardDisk_percentage>
  <xs:Bandwidth>100</xs:Bandwidth>
  <xs:Numb_of_Nodes>47</xs:Numb_of_Nodes>
  ….
  ….
  <sec:SecurityToken>
    <sec:TokenType>sec:X509v3</sec:TokenType>
  </sec:SecurityToken>
</wsp:Policy>
The usage policy elements in the above policy (MaximumRam,
HardDisk_percentage, Bandwidth and Numb_of_Nodes) represent the usage
constraints of a resource. With this policy, the resource announces that it
can offer 40 GB of its RAM capacity, 80% of its total hard disk and 47
computing nodes to the Grid. It also states that its bandwidth capability is
about 100 MBPS. Such a policy differs from resource to resource depending on
administrative limitations. However, defining such policies and enforcing
them enables better utilization of the resources.
5.6 POLICY MATCHMAKING MECHANISM
The main objective of the policy matchmaking mechanism is to
verify the resource usage policy against the user’s demands before a job is
submitted to the resource. The matchmaking algorithm proposed for policy
verification refers to a background policy ontology to determine the
semantic similarity between the user’s demands and the usage policy
attributes. Figure 5.2 shows the functional modules present in the policy
matchmaking system.
[Block diagram: User, Grid Scheduler, Policy Reader, Semantic Comparator, Policy Ontology, Resources]
Figure 5.2 Policy Matchmaking System
The Policy Reader component reads the resource usage policy of
every resource, expressed using WS-Policy. Since these policies are written
using the newly introduced XML elements, the component has the APIs needed
to interpret and parse them for further processing. Similarly, it reads the
user’s requirements for the job execution. The WS-Policy specification has
been implemented by Apache, and this implementation has the APIs and
function calls needed to read, compare and validate WS-Policies. Since our
approach includes new attributes to express Grid resource usage policies,
some of the APIs have been extended to read the newly added XML elements.
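The role of the Policy Reader can be illustrated with a short Python sketch that extracts the usage policy attributes from a WS-Policy document. It uses Python’s standard xml.etree.ElementTree module instead of the extended Apache WS-Policy APIs used in this work, so it shows only the parsing idea. The function name and the gup prefix are assumptions; the element names and namespace URI follow the schema of Section 5.5.

```python
import xml.etree.ElementTree as ET

# Namespace of the usage-policy schema introduced in Section 5.5.
USAGE_NS = "http://Gridlab.mit.edu/GridUsagepolicy.xsd"

# Reduced policy document; the "gup" prefix is chosen only for this example.
POLICY_XML = """
<wsp:Policy xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy"
            xmlns:gup="http://Gridlab.mit.edu/GridUsagepolicy.xsd">
  <gup:MaximumRam>40</gup:MaximumRam>
  <gup:HardDisk_percentage>80</gup:HardDisk_percentage>
  <gup:Bandwidth>100</gup:Bandwidth>
  <gup:Numb_of_Nodes>47</gup:Numb_of_Nodes>
</wsp:Policy>
"""


def read_usage_attributes(policy_xml):
    """Return {attribute name: numeric value} for every element that
    belongs to the usage-policy namespace."""
    root = ET.fromstring(policy_xml)
    prefix = "{" + USAGE_NS + "}"
    attributes = {}
    for element in root.iter():
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            attributes[name] = float(element.text.strip())
    return attributes


print(read_usage_attributes(POLICY_XML))
```

Filtering on the namespace rather than on individual element names means that attributes added to the schema later are picked up without changing the reader.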
The parsed attributes are then sent to the Semantic Comparator
component. This component relies on the policy comparator API of the
WS-Policy implementation. However, the conventional implementation of this
API matches the keywords present in the policy and does not look into their
semantics. In the proposed system, the comparator API has been modified so
that it refers to the policy ontology to determine the semantic similarity
between the attributes. It uses the Algernon inference engine to interact
with the ontology. Algernon provides versatile queries and rules that can be
executed on the ontology for information retrieval. The API has been
implemented so that it first tries to compare the attributes for exact
validation of the usage policy. If the policy is not verified, it determines
the degree of similarity of the policy with respect to the user’s
requirements. Several degrees of semantic match have been described in the
literature. An exact match is one in which the request exactly matches the
available services. A subsume match is one in which the requested capability
is less than that of the available ones. A plugin match is one in which the
requested capability is more than that of the available ones. The policy
matchmaking system publishes this degree of match to the user as a result of
policy verification, to allow for negotiation of policy attributes. However,
if the policy has been exactly validated, the list of corresponding
resources is given to the Grid scheduler. The Grid scheduler then schedules
the job to these resources depending on their availability at the moment.
The other component, the policy ontology, is used to support
semantics based verification of the resource usage policy against the user’s
requirements. The policy ontology follows the semantic web approach of
making information understandable by computers. The ontology representation
has more expressive power than the conventional XML representation. In our
approach, the terms related to policy attributes are defined in a
hierarchical manner so that the semantic similarity between them can be
determined. We use the Web Ontology Language (OWL) to construct an ontology
in which the relationships between the attributes are established. We used
the Algernon inference engine to interact with the ontology and interpret
the meaning of the attributes. For example, Figure 5.3 shows a portion of
the security ontology proposed by the Naval Research Lab, Washington D.C.
This ontology is used for determining the degree of semantic closeness
between the attributes related to security aspects.
Figure 5.3 NRL Security Ontology
(Source: chacs.nrl.navy.mil/projects/4SEA/ontology.html)
5.7 POLICY MATCHMAKING ALGORITHM
With all these components, the matchmaking algorithm compares
the user’s requirements against the resource usage policies and advises the
Grid scheduler about the resources to which the job can be submitted. It
takes the user’s requirements and the resource usage policy of every
resource as input. It determines the degree of match between the inputs and
obtains the list of resources that satisfy the user’s requirements. This
list is sent to the Grid scheduler for job submission. The policy
matchmaking algorithm is listed in Algorithm 5.1.
Algorithm: Policy Matchmaking Algorithm
Input  : User’s Requirements UP, Resource Usage Policy RP,
         Attributes Att
Output : Degree_of_Match M ∈ {Exact, Plugin, Subsume},
         Resource List RL
Parse UP and extract UP(Att1, Att2, …, Attn)
Parse RP and extract RP(Att1, Att2, …, Attn)
i = n
for each parsed UP(Att1, Att2, …, Attn) do
    if UP(Att1) = RP(Att1) then i--;
    if UP(Att2) = RP(Att2) then i--;
    if UP(Attn) = RP(Attn) then i--;
        RL = R
        M = “Exact”
    end if
    else if
        if UP(Att1) < RP(Att1) then i--;
        if UP(Att2) < RP(Att2) then i--;
        if UP(Attn) < RP(Attn) then i--;
        M = “Subsume”
    end else if
    else
        if UP(Att1) > RP(Att1) then i--;
        if UP(Att2) > RP(Att2) then i--;
        if UP(Attn) > RP(Attn) then i--;
        M = “Plugin”
end for
Algorithm 5.1 Policy Matchmaking Algorithm
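One possible reading of Algorithm 5.1 is sketched below in Python. It is an interpretation under stated assumptions rather than the implementation used in this work: attributes are taken to be numeric, a degree of match is assigned only when every attribute agrees on the same relation, the subsume and plugin tests use non-strict comparisons, and mixed or incomplete comparisons return None, a case the algorithm listing does not define. The function names and the policy values for R2 and R3 are hypothetical.

```python
def degree_of_match(user_req, resource_policy):
    """Classify one resource usage policy against the user's requirements.

    Both arguments map attribute names to numeric values. Returns "Exact",
    "Subsume", "Plugin", or None for missing attributes or a mixed result
    (an assumption; the chapter does not define this case).
    """
    if any(att not in resource_policy for att in user_req):
        return None
    if all(user_req[a] == resource_policy[a] for a in user_req):
        return "Exact"
    if all(user_req[a] <= resource_policy[a] for a in user_req):
        return "Subsume"   # the request asks for no more than is offered
    if all(user_req[a] >= resource_policy[a] for a in user_req):
        return "Plugin"    # the request asks for at least what is offered
    return None


def match_resources(user_req, policies):
    """Return (resources with an exact match, degree of match per resource)."""
    degrees = {name: degree_of_match(user_req, rp)
               for name, rp in policies.items()}
    exact = [name for name, m in degrees.items() if m == "Exact"]
    return exact, degrees


# R1 uses values from the chapter's example policy; R2 and R3 are made up.
policies = {
    "R1": {"MaximumRam": 40, "Numb_of_Nodes": 47},
    "R2": {"MaximumRam": 64, "Numb_of_Nodes": 64},
    "R3": {"MaximumRam": 16, "Numb_of_Nodes": 8},
}
exact, degrees = match_resources({"MaximumRam": 40, "Numb_of_Nodes": 47},
                                 policies)
print(exact)
print(degrees)
```

Here the exact list is what would be handed to the Grid scheduler, while the per-resource degrees are what would be published to the user for negotiation.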
5.8 TESTING WITH GRID SCHEDULER
The policy matchmaking system proposed in this work is a
standalone component. In order to deploy it in a Grid environment, we
integrate it with a Grid scheduler. In a Grid, the scheduler accepts the
user’s requirements and jobs, discovers suitable resources for job
execution, submits jobs to the selected resources, monitors the execution
and delivers the results to the user. It performs all these operations
transparently to the user, hiding the complexity of managing diverse
resources across the Grid. A conventional Grid scheduler lacks a mechanism
for verifying the resource usage policy while submitting a job to a
resource. Hence, integrating the policy matchmaking system with the Grid
scheduler complements the resource selection and scheduling operation of the
scheduler. The main issue in the integration is the point in the scheduler
at which the policy matching system should be attached, because every
scheduler adopts its own strategies for resource discovery and its own
scheduling policies for job execution on a resource.
We selected the Gridway metascheduler for integration with the
policy matchmaking algorithm. Gridway acts as a personal resource broker: it
builds a sorted list of candidate resources, submits the job to the selected
resource and monitors its execution. It also performs dynamic rescheduling
when job execution slows down or remote failures are detected, in order to
adapt to the changing conditions of the Grid environment. Above all, it can
discover suitable resources available across the Grid depending on the
user’s requirements.
Figure 5.4 shows the architecture of the Gridway metascheduler and
its various components (Huedo et al. 2005). Gridway is built over the Globus
middleware and supports Globus 4.0 and Globus 2.x. The request manager
provides a command line tool as well as an API to the user for submitting a
job along with its configuration file, or job template, which contains all
the parameters necessary for its execution as well as the resource
requirements. The dispatch manager periodically wakes up at each scheduling
interval and tries to submit pending and rescheduled jobs to Grid resources.
The information manager is invoked by the dispatch manager to obtain
information about the available Grid resources matching the job
requirements. The attributes needed for resource discovery and selection
must be collected from the information services in the Grid testbed,
typically the Globus Monitoring and Discovery System (MDS). The submission
manager is responsible for the execution of the job during its lifetime,
i.e. until it is done or stopped. The transfer manager governs the transfer
of job related files from the user end to the resources. It gathers the
results of execution and delivers them to the user.
Figure 5.4 Gridway Scheduler
Gridway employs keyword based matching of the user’s requirements
against the available resources and forms a list of suitable resources for
the submitted job, as shown in Figure 5.5. If more than one resource is
found suitable for the job execution, Gridway selects one among them by
applying a simple ranking criterion equal to 2*CPU + FreeMB. This ranking
expression identifies the most efficient resource among the suitable
resources at the moment.
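The ranking step can be illustrated with the following Python sketch. It only evaluates the expression 2*CPU + FreeMB quoted above over a list of candidate resources; the dictionary keys and the numeric values are hypothetical, and the sketch does not reproduce Gridway’s own rank evaluation.

```python
def rank(resource):
    """Ranking expression quoted in the text: 2 * CPU + FreeMB."""
    return 2 * resource["cpu"] + resource["free_mb"]


# Hypothetical candidates that already satisfy the job requirements.
candidates = [
    {"host": "R1", "cpu": 2400, "free_mb": 2048},
    {"host": "R2", "cpu": 3000, "free_mb": 512},
    {"host": "R3", "cpu": 1800, "free_mb": 4096},
]

# The candidate with the highest rank is selected for submission.
best = max(candidates, key=rank)
print(best["host"], rank(best))
```

With these values the ranks are 6848, 6512 and 7696, so R3 is selected even though it has the lowest CPU figure, because free memory dominates the sum.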
[Flow diagram: job submit (job template), gathers available resources (R1, R2, R3), matches against job requirements, performs matchmaking, selects and submits]
Figure 5.5 Conventional Gridway Scheduling
In our approach, the conventional flow of Gridway is modified so
that it consults the policy matchmaking system before selecting a resource
for job submission. Figure 5.6 shows the modified flow of the Gridway
metascheduler after integration of the PMS into it.
In this flow, Gridway employs its resource discovery strategy
and discovers suitable resources capable of executing the user’s job. This
list of resources is then sent to the PMS for policy validation. The policy
matchmaking system verifies the usage policies of the capable resources
against the user’s demands over QoS and security related attributes, such as
the security mechanism implemented. It then identifies the resources whose
usage policy matches exactly with the user’s demands and returns them to
Gridway for job submission. If the policy does not match exactly with the
user’s demands, the PMS determines the degree of closeness to allow for
negotiation of policy attributes by the user and the resource provider.
[Flow diagram: job submit (job template), gathers available resources (R1, R2, R3), matches against job requirements, invokes scheduling operation, performs matchmaking, invokes PMS (accesses resource usage policy and user policy requirements, refers to policy ontology), selects resource after policy validation, selects and submits]
Figure 5.6 Modified Gridway Scheduling
With this approach, the Gridway metascheduler can verify the usage
policies of capable resources before submitting a job to them. This enables
better utilization of the resources while ensuring that their administrative
policies are not violated.
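The modified flow can be summarised as a two-stage filter: capability matching followed by policy validation. The Python sketch below is a simplified illustration under stated assumptions, not the code integrated into Gridway. The function names, the dictionary layout and the sample values are hypothetical, and policy validation is reduced to exact agreement on the requested attributes.

```python
def capable(resource, job_req):
    """Capability matching: the resource offers at least what the job needs."""
    return all(resource["capability"].get(key, 0) >= value
               for key, value in job_req.items())


def policy_ok(resource, user_policy):
    """Stand-in for the PMS: exact agreement on every requested attribute."""
    return all(resource["usage_policy"].get(key) == value
               for key, value in user_policy.items())


def select_resources(resources, job_req, user_policy):
    """Capability matching first, then policy validation of the shortlist."""
    shortlisted = [r for r in resources if capable(r, job_req)]
    return [r["host"] for r in shortlisted if policy_ok(r, user_policy)]


# Hypothetical resources: R2 fails policy validation, R3 fails capability.
resources = [
    {"host": "R1", "capability": {"free_mem_mb": 2048},
     "usage_policy": {"TokenType": "X509v3", "MaximumRam": 40}},
    {"host": "R2", "capability": {"free_mem_mb": 4096},
     "usage_policy": {"TokenType": "Kerberos", "MaximumRam": 40}},
    {"host": "R3", "capability": {"free_mem_mb": 4},
     "usage_policy": {"TokenType": "X509v3", "MaximumRam": 40}},
]

selected = select_resources(resources,
                            job_req={"free_mem_mb": 10},
                            user_policy={"TokenType": "X509v3"})
print(selected)
```

Running policy validation only on the capability shortlist keeps the number of policies to be fetched and parsed small.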
5.9 IMPLEMENTATION
This work has been implemented at the Grid Computing Lab of Anna
University Chennai. For the prototype implementation, we surveyed several
practical usage policies and arrived at a handful of relevant attributes. A
schema file defining these attributes has been developed for
experimentation. We use three Grid resources in our lab, as shown in Figure
5.7. We created policies using the WS-Policy schema along with the newly
defined schema in a single file to represent the usage policies of those
resources. The location of the policy on the respective resource is assumed
to be known to the Grid scheduler so that it can be accessed for validation.
It is also possible to keep the usage policies of the resources in a common
repository. However, this approach does not give the resource provider the
freedom to change the policy over time. To our knowledge, policies are
administration specific and may change from time to time depending on
changes in infrastructure and other aspects. Hence, we believe that keeping
the policies at the resource site allows the resource provider to have full
control over the policies that they want to enforce.
We consider three Grid resources, namely R1, R2 and R3, each
associated with a resource usage policy, namely A, B and C, respectively.
Each policy has been created using the WS-Policy schema together with the
newly added schema for expressing resource usage. For instance, usage policy
A states that the resource “provides 80% of storage disk to Grid jobs and
40 GB of RAM to Grid experiments”. The WS-Policy is as follows:
<wsp:Policy xml:base="uri:thisBase" wsu:Id="myPolicy"
    xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy"
    xmlns:sec="http://schemas.xmlsoap.org/ws/2005/07/securitypolicy"
    xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"
    xmlns:xs="http://Gridlab.mit.edu/GridUsagepolicy.xsd">
  <sec:SecurityToken>
    <sec:TokenType>sec:X509v3</sec:TokenType>
  </sec:SecurityToken>
  <xs:MaximumRam>40</xs:MaximumRam>
  <xs:HardDisk_percentage>80</xs:HardDisk_percentage>
  <sec:Integrity wsp:Optional="false">
    <sec:MessageParts
        Dialect="http://schemas.xmlsoap.org/ws/2002/12/wsse#soap">
    </sec:MessageParts>
  </sec:Integrity>
</wsp:Policy>
[Diagram: User, Gridway, PMS (Policy Reader, Semantic Comparator, Policy Ontology) and resources R1, R2, R3 with resource usage policies A, B, C. Steps: 1. Submits job; 2. Gathers resource information; 3. Sends capable resources to PMS; 4. Reads policy and extracts attributes; 5. Submits policy requirements; 6. Verifies policy; 7. Sends list of resources matching user’s policy; 8. Submits job]
Figure 5.7 Experimental Setup
The policy ontology has been created using the Protégé ontology
editor. Several attributes related to usage policy were represented
hierarchically in the ontology and their relationships have been established
using OWL constructs, as shown in Figure 5.8a and Figure 5.8b.
Figure 5.8a Concepts of Computational Grid Resources represented in
Ontology
Further, a portion of the NRL Security Ontology was used to
determine the semantic similarity between security related attributes. This
policy file is stored in a location which is assumed to be known to both
Gridway and the PMS. The Globus middleware is installed on every resource,
and the Gridway metascheduler is installed on another resource called G.
Gridway is configured to work with the three underlying resources and to
aggregate their information for the scheduling operation.
Figure 5.8b Resource Usage Policies considered
The user submits a job to Gridway using the standard means of job
submission provided by Gridway. A sample job template that can be
submitted to Gridway is as follows:
EXECUTABLE=job.sh
REQUIREMENTS=ARCH="X86_32" & FREE_MEM_MB="10MB"
STDIN_FILE="Inp.txt"
STDOUT_FILE=Out.txt
Gridway gathers the underlying resource information using the
appropriate middleware access drivers. It then discovers suitable resources
that can execute the user’s job and creates a list of capable resources for
scheduling. The gw_rm_host_match.c file of the Gridway resource manager
performs the matchmaking operation and arrives at a list of capable
resources to which the job can be submitted. We added a code segment to
integrate the PMS, which takes the list of capable resources and parses
their usage policies. It extracts the policy attributes and sends them to
the semantic comparator, which determines the semantic similarity of the
usage policy attributes with the user’s policy attributes. It refers to the
policy ontology, obtains the degree of closeness between them and forms a
list of resources whose usage policy matches that of the user. This list is
provided to Gridway for job submission.
We used the Apache implementation of WS-Policy for parsing,
validating and comparing WS-Policies. However, the compare APIs of the
conventional implementation match the keywords present in the WS-Policy.
Hence, we introduced a piece of code in the compare API of the
PolicyComparator.java program so that it invokes another Java class for
semantic comparison of policy attributes. This Java class has been developed
using the Algernon APIs to execute appropriate queries over the ontology for
information retrieval. The class can form several different Algernon queries
based on the resource usage policy attribute to be compared with the user’s
demand. The query determines the similarity between the two attributes and
infers whether they match closely. For example, a resource provider whose
security algorithm uses a 256-bit key length may be treated as a close match
for a user who uses a 128-bit security algorithm. The matchmaking mechanism,
in its best case, tries to find an exact match. However, if an exact match
is not found, it tries to determine other closely related matches such as
subsume and plugin. This information helps both the user and the resource
provider to negotiate the service access.
The proposed knowledge layer components support an ontology
representation of Grid resources, which in turn enables the discovery of
suitable Grid resources based on the semantics of the request. This feature
complements the Grid scheduler by finding closely related resources and
suggesting them to the users, leading to a flexible way of scheduling
applications to the resources. Further, an ontology can also be used to
represent the resource usage policies; in such cases, the discovery
mechanism needs to be extended so that the policies are considered before a
scheduling decision is made. The Protégé-OWL libraries allow us to modify
the structure of the ontology template dynamically, and hence it is possible
to extend the template when new ‘concepts’ are to be included.