RISK ASSESSMENT AND ADAPTIVE GROUP
TESTING OF SEMANTIC WEB SERVICES
XIAOYING BAI
Department of Computer Science and Technology, Tsinghua University and
State Key Laboratory of Software Development Environment
BeiHang University, Beijing, 100084, [email protected]
RON S. KENETT
Department of Statistics and Applied Mathematics
Università di Torino, Italy
WEI YU
State Key Laboratory of Software Development Environment
BeiHang University, Beijing, 100084, [email protected]
Received 15 July 2011
Revised 10 September 2011
Accepted 17 September 2011
International Journal of Software Engineering and Knowledge Engineering
Vol. 22, No. 5 (2012) 595-620
© World Scientific Publishing Company
DOI: 10.1142/S0218194012500167

Int. J. Soft. Eng. Knowl. Eng. 2012.22:595-620. Downloaded from www.worldscientific.com by PEKING UNIVERSITY on 01/19/13. For personal use only.

Testing is necessary to ensure the quality of web services that are loosely coupled, dynamically bound and integrated through standard protocols. Exhaustive testing of web services is usually impossible due to unavailable source code, diversified user requirements and the large number of possible service combinations delivered by the open platform. This paper proposes a risk-based approach for selecting and prioritizing test cases for testing service-based systems. We specially address the problem in the context of semantic web services. Semantic web services introduce semantics to service integration and interoperation using ontology models and specifications. Semantic errors are considered more difficult to detect than syntactic errors. Due to the complexity of conceptual uniformity, it is hard to ensure the completeness, consistency and unified quality of the ontology model. A failure of semantic service-based software may result from many factors such as misused data, unsuccessful service binding, and unexpected usage scenarios. This work analyzes the two factors of risk estimation, failure probability and importance, from three aspects: ontology data, service and composite service. With this approach, test cases are associated to semantic features, and are scheduled based on the risks of their target features. Risk assessment is used to control the process of Web Services progressive group testing, including test case ranking, test case selection and service ruling out. This paper discusses the control architecture and adaptive measurement mechanism for adaptive group testing. As a
statistical testing technique, the proposed approach aims to detect, as early as possible, the problems with the highest impact on the users.
Keywords: Semantic web services; risk-based testing; adaptive testing; progressive group testing.
1. Introduction
Service Oriented Architecture (SOA), and its Web Services (WS) implementations,
introduce an open architecture for integrating heterogeneous software through
standard internet protocols [15, 29]. From the providers' perspective, proprietary in-
house components are encapsulated into standard programmable interfaces and
delivered as reusable services for public access and invocation. From the consumers'
perspective, applications are built following a model-driven approach where business
processes are translated into control flows and data flows, of which the constituent
functions can be automatically bound to existing services discovered by service
brokers. In this way, large-scale software reuse of Internet available resources is
enabled, providing support for agile and fast response to dynamic changing business
requirements. SOA is believed to be the current major trend of software paradigm
shift.
However, due to the potential instability, unreliability, and unpredictability of
the open environment in SOA and WS, these developments present challenging
quality issues compared with traditional software. Traditional software is built and
maintained within a trusted organization. In contrast, service-based software is
characterized by dynamic discovery and composition of loosely coupled services that
are published by independent providers. A system can be constructed, on-the-fly, by
integrating reusable services through standard protocols [26]. For example, a housing
map application can be the integration of two independent services: Google Map
service and housing rental services. In many cases, the constituent data and func-
tional services of a composite application are out of the control of the application
builder. As a consequence, service-based software has a potentially higher probability
to fail compared with in-house developed software. Moreover, as services are open to
all Internet users, the provider may not envision all the usage scenarios and track the
usage status at runtime. Hence, a failure in the service may affect a wide range of consumers and result in unpredictable consequences. For example, Gmail reported a failure of "service unavailable due to outage in contacts system" on 8 November 2008 for 1.5 hours; millions of customers were affected. In early research on software reliability, Kenett and Pollak [18] discovered that software reliability decays over time due to the side effects of bug correction and software evolution. In a service-based system, such a decaying process may not be apparent to the consumers until a failure occurs.
Testing is thus important to ensure the functionality and quality of individual
services as well as composed integrated services. Proper testing can ensure that the
selected services best satisfy the users' needs and that services that are dynamically
composed can interoperate with each other. However, testing is usually expensive and confidence in a specific service is hard to achieve, especially in an open Internet environment. The users may have multiple dimensions of expected features, properties and functional points that result in a large number of test cases. It is both time and resource consuming to run a large set of test cases against a large number of service candidates.
To overcome these issues, the concept of group testing was introduced to services
testing [3, 4, 33, 34]. With this approach, test cases are categorized into groups and activated in groups. In each group testing stage, the failed services are eliminated through a predefined ruling-out strategy. Effective test case ranking and service ruling-out strategies remove a large number of unreliable services at an early stage of testing, and thus reduce the total number of executed tests. Measurement is
essential for test ranking and selection. In particular, Bayesian techniques such as
Bayesian inference and Bayesian Networks can help for estimation and prediction. In
our previous work, Kenett et al. [19, 20] present the applications of statistics and
DOE (Design of Experiments) methods in modern industry in general, and in soft-
ware development in particular.
This paper introduces a risk-based approach to evaluate services quantitatively and to rank test cases. With this approach, test cases are prioritized and scheduled based on
the risks of their target software features. Intuitively, a risky feature deserves more testing effort and has a higher testing priority. When time and resources are limited, test engineers can select a subset of test cases representing the highest-risk targets, in order to achieve good enough quality with an affordable testing effort. An important issue in the risk-based approach is the measurement of risk. In most cases, risk factors are estimated subjectively based on experts' experience. Hence, different people may produce different results and the quality of risk assessment is hard to control. Some researchers have begun to apply ontology modeling and reasoning to operational and financial risk management, combining statistical modeling of qualitative and quantitative information on a SOA platform [31, 37]. This paper presents an objective risk measurement based on ontology and service analysis. A Bayesian network (BN) is constructed to model the complex relationships between ontology classes. Failure probability and importance are estimated at three levels (ontology, service and service workflow) based on their dependency and usage relationships.
Services change continuously online (the "Perpetual Beta" principle of Web 2.0 [26]). The composition and collaboration of services are also built on demand. As a result, testing cannot be fully planned beforehand and has to be adaptive, so that tests are selected and composed as the services change. This paper introduces an adaptive testing framework that monitors changes in service artifacts and dependencies, and reactively reassesses their risks and reschedules the test cases for group testing.
More and more researchers are beginning to realize the unique requirements
of WS testing and propose innovative methods and techniques from various
perspectives, such as collaborative testing architecture, test case generation, distributed test execution, test confidence and model checking [10]. Most of the current
WS testing research focuses on the application and adaptation of traditional testing
techniques. We address here the uniqueness of WS and discuss WS testing from a
system engineering and interdisciplinary research perspective. Compared with
existing work in this area, this paper covers the following aspects:
- It proposes an objective method to assess software risks quantitatively based on both the static structure and dynamic behavior of service-based systems.
- It analyzes the quality issues of semantic WS and measures the risks of ontology-based software services through semantic analysis using stochastic models such as Bayesian networks.
- It improves WS progressive testing with effective test ranking and selection techniques.
The rest of this paper is organized as follows. Section 2 introduces the background
technology including risk-based testing, web services group testing, and semantic
web services. Section 3 defines the problem of adaptive WS testing. Section 4 analyzes the characteristics of semantic WS and offers a method for estimating and predicting failure probability and importance. Section 5 proposes the methods of
adaptive measurement and adaptation rules. These techniques are the basis for
incorporating a dynamic mechanism into the WS group testing schema. Section 6
presents the metrics and the experiments used to evaluate the proposed approach.
Section 7 concludes the research and discusses future work.
2. Background
2.1. Risk-based testing
Testing is expensive, especially for today's software with its growing size and complexity. Exhaustive testing is not feasible due to limitations in time and resources. A key testing strategy is to improve test efficiency by selecting and planning a subset of tests with a high probability of finding defects. However, selective testing usually faces difficulties in answering questions like "What should we test first given our limited resources?" and "When can we stop testing?".
Software risk assessment identifies the most demanding and important software aspects, and provides the basis for test selection and prioritization. In general, the risk of a software feature is defined by two factors: the probability of failure, and the consequence of the failure. That is,

Risk(f) = P(f) × C(f)    (1)

where f is a software feature, Risk(f) is its risk exposure, P(f) is the failure probability and C(f) is the cost of the failure.
Intuitively, a software feature is risky if it has a high probability of failure or its failure will result in serious consequences. Hence, a risky feature deserves more testing effort and has a higher testing priority. Risk-based testing was proposed to select and schedule test cases based on software risk analysis [1, 2, 12, 20, 25, 27, 28, 32]. The general process of risk-based testing is as follows:
(1) Identify risk indicators;
(2) Evaluate and measure the failure probability of software features;
(3) Evaluate and measure the failure consequence of software features;
(4) Associate test cases to their target software features;
(5) Rank test cases based on the risks of their target features; test cases for risky features are ranked higher so that they are exercised earlier; and
(6) Define risk-related coverage to control the testing process and its exit criteria.
Kenett and Tapiero proposed a convergence between risk engineering and quality control from a statistical perspective [19, 21]. In software engineering, risk-based software testing has been gaining attention since the late 1990s. Amland [1] established a generic risk-based testing approach based on Karolak's risk management process model [17]. Bach [2] identified general categories of risks during software system development, including complexity, change, dependency, distribution, third-party components, etc. Practices and case studies show that, by spending more time on critical functions, testing can benefit from the risk-based approach in two ways: reduced resource consumption and improved quality.
2.2. Web services group testing
The group testing technique was originally developed at Bell Laboratories for efficiently inspecting products [30]. The approach was further expanded to general cases [13]. It is routinely applied in testing large numbers of blood samples to speed up the test and reduce the cost [14]. In this case, a negative test result for the group under test indicates that none of the individuals in the group have the disease; otherwise, at least one of them is affected. Group testing has been used in many areas such as medical, chemical and electrical testing, coding, etc., using either combinatorial or probabilistic mathematical models.
WS progressive group testing was proposed to enable heuristic-based selective testing in an open service environment [33, 34, 4]. We define the potency of a test case as its capability to find bugs or defects. Test cases are ranked and organized hierarchically according to their potency to detect defects, from low potency to high. Test cases are exercised layer-by-layer, following a hierarchical structure, on groups of services. Ruling-out strategies are defined so that web services that fail at one layer cannot enter the next testing layer. Test cases with high potency are exercised first with the purpose of removing as many web services as early as possible.
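The layer-by-layer ruling-out scheme described above can be sketched as follows. The service/test representation and the `run_test` predicate are hypothetical simplifications; in practice each layer would hold a group of test cases and the pass/fail verdict would come from actual test execution.

```python
# Minimal sketch of progressive group testing: test layers are ordered
# from high to low potency; a service that fails one layer is ruled out
# and never enters the next layer.

def progressive_group_test(services, layers, run_test):
    """layers: test cases ordered by descending potency.
    run_test(test, service) -> True if the service passes the test."""
    surviving = set(services)
    for test in layers:
        surviving = {s for s in surviving if run_test(test, s)}
        if not surviving:        # every candidate already ruled out
            break
    return surviving             # services that passed all layers

# Toy example: a service fails any test whose id is in its bug set.
bugs = {"s1": {"t1"}, "s2": set(), "s3": {"t2"}}
ok = progressive_group_test(
    services=["s1", "s2", "s3"],
    layers=["t1", "t2"],
    run_test=lambda t, s: t not in bugs[s],
)
```

In the toy run, s1 is ruled out at the first layer and s3 at the second, so only s2 survives; the high-potency first layer does most of the elimination, which is exactly the intent of the ranking.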
Group testing is by nature a selective-testing strategy, which is beneficial in terms of a reduced number of test runs and shortened test time. The key challenge in
progressive group testing is the ranking and selection of test cases. Bai et al. [3] further refined the model and proposed a dependency-based approach. Two test
cases tc1 and tc2 are dependent if a service that fails tc1 will also fail tc2. For any
service, a test case will be exercised only if the service passes all its dependent test
cases. We further propose the windowing technique that organizes web services in
windows [34] and incorporates an adaptive mechanism [4, 7]. It follows software
cybernetics theories [9, 11] to dynamically adjust the window size, determine web
service ranking and derive test case ranking.
Several challenges need to be addressed in setting up progressive group testing
strategies, including:
(1) Estimation of probabilities;
(2) Specification of dependencies;
(3) Dynamic updating of estimates; and
(4) Sensitivity evaluation of group testing rule parameters.
2.3. Semantic web services
WS was initiated as W3C standards to enable interoperation between heterogeneous
web-applications through self-descriptive programmable interfaces [38]. It defines the XML-encoded protocols to support service communication, registration, discovery, composition, and collaboration, such as SOAP (Simple Object Access Protocol), WSDL (Web Service Description Language), UDDI (Universal Description, Discovery and Integration), BPEL (Business Process Execution Language), and so on.
These specifications define the expected structure and behavior of the service-based system, which can be used to understand the system at an abstract level. The WS standards enable the analysis, verification and validation of the software behavior before service deployment, at runtime and during online evolution.
This paper analyzes risks based on the WS semantic model. It takes the specifications as the models of the system, particularly OWL-S (Ontology Web Language for Services) [39] as the service composition model. The Semantic Web is a new form of web content in which the semantics of information and services are defined so as to be understood by computers [8]. Ontology techniques are widely used to provide a unified conceptual model of Web semantics [16, 23, 31]. For example, Spies [31] and the MUSING project [37] introduce a comprehensive approach of ontology engineering to financial and operational risks [22]. OWL-S is an OWL-based semantic markup language. It provides a semantic model for composite services. It specifies the intended system behavior in terms of inputs, outputs, process, pre-/post-conditions and constraints using ontologies. Figure 1 shows the OWL-S ontology model. The ServiceProcess ontology is modeled as a workflow of processes including atomic, simple, and composite processes. Two components are used to define the OWL-S process model: the Process Ontology describes the service IOPE (inputs, outputs, preconditions and
effects) properties; and the Process Control Ontology describes the process control constructs such as sequence, iterate, if-then-else, and split.
Fig. 1. OWL-S ontology model.

3. Problem Statement
Given a set of services S = {s_i} and a set of test cases T = {t_i}, selective testing is the process of finding an ordered set of test cases to detect bugs as early as possible, and as many as possible. Suppose that for all s ∈ S, B(s) = {b_i} is the set of bugs in the
service, T(s) ⊆ T is the set of test cases for the service s, and for all b ∈ B(s) there exists T(b) ⊆ T(s) such that every t_i ∈ T(b) can detect b. Ideally, the potency of a test case t, ℘(t), is defined as the capability of the test case to detect bugs. That is,

℘(t) = |B(t)| / Σ_i |B(s_i)|    (2)

where B(t) is the set of bugs that test case t can detect, |B(t)| is the number of bugs t can detect, and Σ_i |B(s_i)| is the total number of bugs in the system. The problem is to find an ordered subset T′ of T such that for all t_i, t_j ∈ T′, 0 ≤ i, j ≤ n, if i < j then ℘(t_i) > ℘(t_j).
However, it is usually hard to obtain the accurate number of bugs present in the
software that the test case can detect. In this work, we transform the bug distribution problem into a failure probability so that, rather than measuring the number of bugs in a service, we measure the failure probability of the service. We further consider the failures' impact and combine the two factors into a risk indicator for ranking services. In this way, testing is a risk mitigation process. Suppose that Risk(s) = P(s) × C(s) is the risk of a service; then the potency of a test case is defined as:

℘(t) = Σ_i Risk(ts_i) / Σ_j Risk(s_j)    (3)

where {ts_i} is the set of services that t tests.
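Under Eq. (3), a test's potency is the share of total service risk it covers. The following small sketch computes it; the risk values and the coverage map are invented for illustration.

```python
# Sketch of Eq. (3): potency of a test case = summed risk of the
# services it exercises, normalized by the total risk of all services.

def potency(covered_risks, all_risks):
    """p(t) = sum(Risk(ts_i)) / sum(Risk(s_j))."""
    return sum(covered_risks) / sum(all_risks)

service_risk = {"s1": 0.40, "s2": 0.30, "s3": 0.30}  # Risk(s) = P(s)*C(s)
covers = {"t1": ["s1", "s2"], "t2": ["s3"]}          # services each test hits

rho = {t: potency([service_risk[s] for s in ss], service_risk.values())
       for t, ss in covers.items()}
```

Here t1 covers 70% of the total risk and t2 covers 30%, so t1 would be exercised first.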
WS testing and bug detection are like a "moving target" shooting problem. As testing progresses, bugs are continuously detected and removed. As a result, the bug distribution and service quality change. The services and composite services under test may change as well. Therefore, an adaptive mechanism is necessary during the testing process to re-calculate the ranking and ordering of test cases in reaction to dynamic changes in services, test case potencies and bugs.
To define the process, we make the following simplifying assumptions:
(1) A service can have at most n bugs;
(2) All the bugs are independent; and
(3) A bug is removed immediately after being detected.
The general process of risk-based test selection is as follows:
(1) Calculate the risk of each service, and order the services in a sequence {s_i} such that for all s_i, s_j ∈ S, 0 ≤ i, j ≤ n, if i < j then Risk(s_i) > Risk(s_j).
(2) Select the set of test cases for each service and order the test sets in a sequence {T_i | T_i = T(s_i)} such that for all T_i, T_j ⊆ T, 0 ≤ i, j ≤ n, if i < j then Risk(s_i) > Risk(s_j).
(3) Rank the test cases in each test set according to their potencies. That is, T_i = {t_ij} such that for all t_ij, t_ik ∈ T_i, 0 ≤ j, k ≤ n, if j < k then ℘(t_ij) > ℘(t_ik).
(4) Select the service s_i with the highest risk and select the corresponding set of test cases T_i. Exercise the test cases in T_i in sequence.
(5) Re-calculate service risks and test case potencies.
(6) Repeat steps 4-5 until certain criteria are met. We can define different exit criteria such as the percentage of services (test cases) covered, the number of bugs detected, etc.
4. Risk Assessment
Risks are analyzed based on the WS semantic model. Semantics introduce additional risks to WS. The ontology may be defined, used, and maintained by different parties. A service may define the inputs/outputs of its interface functions as instances of ontology classes in a domain model that is out of the control of the service provider and consumer. For example, AAWS (Amazon Associate Web Services) is an open service platform for online shopping. Its WSDL service interface provides 19 operations with complicated data structure definitions. By translating the WSDL data definitions into ontology definitions for semantic-based service analysis, we identified 514 classes (including ontology and property classes) and 1096 dependency relationships between the classes. Figure 2 gives an example showing the usage of an ontology class in different service contexts. In this example, ontology classes are defined for the publication domain. A class Book is defined as a subclass of Publication. Different BookStore services may use the Book ontology to specify input parameters of the operation BookDetails() in the interface BookQuery. An application
builder defines a business process BookPurchase as a workflow of services.

Fig. 2. Example of ontology usage in different contexts.

We can see from the example that a domain model can be used by various services and
workflows. Therefore, the quality of the domain model has a significant impact on the services and applications in the domain.
Moreover, ontologies introduce complex relationships among data and services, which increase the possibility of misuse of ontology classes. For example, in the publication domain, two ontology classes CourseBook and EditBook inherit from the Book ontology. From the domain modeler's perspective, the two categories of books are used for different purposes and are mutually exclusive. However, from the user's perspective, such as that of a courseware application builder, the two classes could overlap because an edited book can also be used as course reading material for graduate students. Such conflicting views may result in a misuse of the ontology classes when developing education software systems.
Based on the analysis of semantic services, the two factors of risk (failure probability and importance) are assessed at three layers: domain ontology, service and composite service.
4.1. Failure probability estimation
The failure probability of the service-based system is estimated as follows:
(1) Estimate the initial failure probability of each ontology class in the domain.
(2) Adjust the estimate for each class by taking its dependencies into consideration.
(3) Estimate the initial failure probability of a service.
(4) Adjust the service's estimate by taking into consideration the failure probabilities of its input/output parameters defined by the domain ontology.
(5) Given the failure probabilities of a set of functions and their input/output parameters, estimate the failure probability of the composite service based on analysis of its control constructs.
Here, we use Bayesian statistics to estimate the initial failure probability p of each ontology class and service function. Given an artifact a, suppose that we test a n times and observe x failures. Suppose that X is the distribution of x and that X follows a binomial distribution, that is, X ~ b(n, p). Suppose that the failure probability follows a uniform distribution, that is, p ~ U(0, 1), with density function π(p) = 1 (0 < p < 1). Then the posterior distribution P(p|X) follows β(x + 1, n − x + 1), and the posterior expectation of p is the estimator of the initial failure probability:

E_{P(p|X)}(p) = (x + 1) / (n + 2)    (4)
4.1.1. Ontology analysis
A domain usually contains a large number of ontology classes with complex rela-
tionships. An error in a class may propagate to others along their dependencies.
Hence, the initial failure probability of an ontology is adjusted with its dependency
analysis.
In this research, dependency relationships are identified from the following three perspectives:
(1) Inheritance. A subclass inherits all the properties of its super-class and can extend the super-class with its own definitions. Generally, a subclass has more restrictions than its super-class.
(2) Collection Computing. An ontology class represents a concept that can be instantiated by a set of instances. Based on set theory, ontology classes may have the following relationships:
- Equivalence. Equivalent classes must have precisely the same instances. That is, for any two classes C1 and C2 and an instance c, if C1 ≡ C2 and c ∈ C1, it implies that c ∈ C2, and vice versa.
- Disjointness. The disjointness of a set of classes guarantees that an individual of one class cannot simultaneously be an instance of another specified class. That is, for any two disjoint classes C1 and C2, C1 ∩ C2 = ∅.
(3) Containment. The relationships above can be nested to form a complex ontology class. The complex class and its nested ontologies have a containment relationship.
In addition, an ontology class may have properties. The property of an ontology class can also be defined as a class and have the relationships listed above. A dependency graph is defined to model the dependency relationships among the ontologies. In this directed graph, a node represents an ontology or property class and a link represents a dependency between two classes. Different types of links are defined to represent the various types of dependency relationships. Figure 3 illustrates an example dependency graph.
Fig. 3. Ontology dependency graph.

The dependency graph (DG) is transformed into a Bayesian Network (BN) for inferring the failure probability of each class. The nodes in the BN represent ontology
classes and links represent dependency relationships. Depending on the class dependency types, transformation rules are defined as follows:
(1) An ontology class in the DG is mapped directly to a node in the BN.
(2) Property classes in the DG are not shown in the BN. However, the failure probabilities of the property classes are used in estimating the ontology classes that the properties belong to. As an error in a property will cause errors in its ontology class, we use the product over all the properties as the affecting factor. The adjusted failure probability of the ontology class is defined as follows:

P_adj(o) = 1 − (1 − P(o)) × Π_{j=1..n} (1 − P(p_j))    (5)

where
- P(o) is the estimated failure probability of ontology o and P_adj is the adjusted probability taking relationships into consideration.
- p_j ∈ Prop(o), where Prop(o) is the set of properties of o, of size n, and P(p_j) is the failure probability of property class p_j.
(3) For two classes with an Inheritance relationship, that is, Inherit(o1, o2) where o1 is the parent of o2, a directed link from o1 to o2 is added to the BN to denote that o2 is affected by o1.
(4) For two classes with a CollectionComputing relationship, that is, Equiv(o1, o2) or Disjoint(o1, o2):
- two nodes o1′ and o2′ are added to the BN with P(o1) = P(o1′) and P(o2) = P(o2′); and
- two links are added, from o1 to o2′ and from o2 to o1′, to denote the mutual dependence relationship.
Once the BN is created, the standard BN formulas can be used to calculate the probabilities as follows:

P(o_i | E_c) = P(o_i, E_c) / P(E_c)    (6)

where E_c is the current evidence (the currently observed nodes) and o_i is the node of an ontology class.
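Eq. (5) can be computed directly once the per-property estimates are available. The sketch below assumes, as the rule states, that the class's own error and its property errors are independent; the probabilities are invented for illustration.

```python
# Sketch of Eq. (5): an ontology class fails if its own definition
# fails or any of its property classes fails (independence assumed),
# so we multiply the survival probabilities and complement the result.

def adjusted_failure_probability(p_class, p_properties):
    """P_adj(o) = 1 - (1 - P(o)) * prod_j (1 - P(p_j))."""
    survive = 1.0 - p_class
    for p in p_properties:
        survive *= 1.0 - p
    return 1.0 - survive

# Example: a class with P(o) = 0.1 and two properties at 0.2 and 0.05.
p_adj = adjusted_failure_probability(0.1, [0.2, 0.05])
```

Note that P_adj is always at least P(o): adding properties can only raise the adjusted probability, matching the intuition that property errors propagate to the class.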
4.1.2. Service analysis
Ontologies can be used to define the inputs, outputs, operations, processes and collaborations of services [35]. An ontology class can be misused in many ways, such as with different scopes, restrictions, properties, and relationships, which may cause failures in the service-based software. It is necessary to trace how an ontology is used in diversified contexts so as to facilitate software analysis, quality control and maintenance. For
example, when an error is detected in an ontology definition, all the affected atomic and composite services can be located by tracing the usage of the ontology in those software artifacts. Given an ontology domain D, we define Ont(a) = {o_i | o_i ∈ D} as the set of ontology classes used in an artifact a; and Art(o) = {a_i} as the set of service artifacts that are affected by an ontology class o ∈ D, where a_i could be any type of service artifact, such as a message, operation, interface, service endpoint or workflow.
The failure probability of a service is calculated by multiplying the failure probabilities of its functions and its ontologies, as follows:

P(s) = 1 − (1 − P_f(s)) × Π_{i=1..n} (1 − P_adj(o_i))    (7)

where
- P(s) is the failure probability of the service;
- P_f(s) is the failure probability of the service functionality;
- o_i ∈ Ont(s), the set of ontology classes used in the service definition; and
- P_adj(o_i) is the adjusted failure probability of each ontology class.
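Eq. (7) has the same independent-failure structure as Eq. (5), now combining the functional failure probability with the adjusted probabilities of the ontology classes the service uses. The example service and numbers below are hypothetical.

```python
# Sketch of Eq. (7): a service fails if its own functionality fails or
# any ontology class used in its definition fails (independence assumed).

def service_failure_probability(p_func, p_ontologies):
    """P(s) = 1 - (1 - P_f(s)) * prod_i (1 - P_adj(o_i))."""
    survive = 1.0 - p_func
    for p in p_ontologies:
        survive *= 1.0 - p
    return 1.0 - survive

# Hypothetical BookQuery-style service: functional failure 0.05, using
# two ontology classes with adjusted probabilities 0.02 and 0.10.
p_s = service_failure_probability(0.05, [0.02, 0.10])
```

Even a functionally reliable service inherits risk from a shaky domain model: here the ontology terms contribute more to P(s) than the functionality itself.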
4.1.3. Composite service analysis
A composite service defines the workflow of a set of constituent services. For example, in OWL-S, each composite process holds a ControlConstruct, which is one of the following: Sequence, Split, Split-Join, Any-Order, Iterate, If-Then-Else, and Choice. The constructs can contain each other recursively, so the composite process can model all of the possible workflows of WS.

The failure probability of a composite service depends on that of each constituent service and on the control constructs over them [6]. A service may be executed conditionally (e.g. If-Then-Else) or unconditionally (e.g. Sequence), depending on the control constructs. For unconditional execution, the failure of any service in a construct results in a construct failure; hence the product formula is used to calculate the construct failure probability. For conditional constructs, we use a weighted-sum formula, where the weight denotes the execution probability of each branch.

We use cc to denote a control construct, and S(cc) = {si} is the set of services in cc. ρi is the execution probability of a service si, with Σ ρi = 1. Table 1 shows the calculation formulas.
4.2. Importance estimation
We use an importance measure to estimate service failure consequences. A failure of an important service may result in a high loss; thus the importance of a service
implies the severity level of its failures. The importance of an element is evaluated
from two perspectives:
(1) Based on dependence analysis: the more an element is depended upon by others, the more important it is.
(2) Based on usage analysis: the more an element is used in various contexts, the more important it is.
4.2.1. Dependence-based estimation
Given a domain D, the importance of an ontology class o is calculated as a weighted sum of the number of its dependent classes, including both directly and indirectly dependent classes. For any two classes o1 and o2, if there is a path between them in DG, we define the distance Dis(o1, o2) between them as the number of links on the path. Assume that there exists at least one dependence relationship in the domain, that is, ∃ o1, o2 ∈ D such that Dep(o1, o2). Then, the dependence-based importance of an ontology class is calculated as follows:

    Cda(o) = Σ_i e^{1−i} |Dep_i(o)|                                   (8)

    Cdr(o) = Cda(o) / max_{oj∈D} Cda(oj)                              (9)

where
. o ∈ D is an ontology class in the domain; Cda(o) is the absolute importance of o and Cdr(o) is the relative importance of o under the dependence-based approach;
. Dep_i(o) = {oj} is the set of ontology classes that depend upon o with distance i, that is, ∀ oj in the set, Dep(o, oj) and Dis(o, oj) = i;
. |Dep_i(o)| is the cardinality of the set, that is, the number of dependent ontology classes; and
. max_{oj∈D} Cda(oj) is the maximum absolute importance over all oj ∈ D.
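Equations (8) and (9) can be sketched over a small dependence graph, with shortest-path distances obtained by breadth-first search. The graph below is hypothetical:

```python
import math
from collections import deque

# Hypothetical dependence graph: dep[o] = classes that directly depend on o.
dep = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}

def dep_by_distance(o):
    """Group the classes that depend on o by shortest distance (BFS)."""
    dist, queue = {o: 0}, deque([o])
    while queue:
        u = queue.popleft()
        for v in dep.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    groups = {}
    for v, d in dist.items():
        if d > 0:
            groups.setdefault(d, set()).add(v)
    return groups

def cda(o):
    """Eq. (8): absolute importance, distance-weighted dependent counts."""
    return sum(math.exp(1 - i) * len(s) for i, s in dep_by_distance(o).items())

def cdr(o):
    """Eq. (9): relative importance, normalised by the domain maximum."""
    return cda(o) / max(cda(x) for x in dep)
```

Here class A has two dependents at distance 1 and one at distance 2, so Cda(A) = 2 + e^{-1}, and Cdr(A) = 1 since A is the domain maximum.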
Table 1. Failure probability of control construct.

Category       OWL-S example                                 P(cc)
Unconditioned  Sequence, Split, Any-Order                    P(cc) = 1 − ∏_{i=1,…,n} (1 − P(si))
Conditioned    If-Then-Else, Choice, Iterate, While-Repeat   P(cc) = Σ ρi P(si)
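The two formulas of Table 1 can be sketched as follows (function names are ours; for conditioned constructs the branch weights ρi are the execution probabilities):

```python
def p_unconditioned(probs):
    """Unconditioned constructs (Sequence, Split, Any-Order):
    the construct fails if any constituent service fails."""
    survival = 1.0
    for p in probs:
        survival *= 1.0 - p
    return 1.0 - survival

def p_conditioned(branches):
    """Conditioned constructs (If-Then-Else, Choice): weighted sum over
    (rho_i, P(si)) pairs, where rho_i is the branch execution probability."""
    return sum(rho * p for rho, p in branches)
```

Because constructs nest recursively, P(cc) of an inner construct is simply fed as P(si) into the enclosing one.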
4.2.2. Usage-based estimation
As shown in Fig. 2, an ontology class can be instantiated in various services, and a service can be integrated in various business processes. The usage model tracks how an ontology or a service is used in different contexts and measures the importance of the element as a weighted sum of its usage counts over those contexts. Suppose that, for an element e (e could be an ontology class or a service), Context(e) = {cti} is the set of contexts where e is used. Assume that the element is used in at least one context at least once; then the importance can be measured as follows:

    Cu(e) = Σ_{i=1}^{|Context(e)|} wi Num(e, cti) / Σ_{i=1}^{|Context(e)|} Num(e, cti)      (10)

where
. Cu(e) is the importance of e under the usage-based approach;
. |Context(e)| is the number of contexts in which e is used;
. wi is the weight of a context cti ∈ Context(e); and
. Num(e, cti) is the number of usages of e in context cti.
Given a large number of measured elements, we can also use statistical models to normalize the values, such as the following Bayesian model for a collective choice:

    Cub(e) = [ (1/N) Σ_{i=1}^{N} Cu(ei) + Cu(e) ] / [ (1/N) Σ_{i=1}^{N} |Context(ei)| + |Context(e)| ]      (11)

where
. E = {ei} is the set of measured elements;
. N = |E| is the number of measured elements; and
. |Context(e)| is the number of contexts in which an element e is used.
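Equations (10) and (11) can be sketched directly (function names and the sample weights are ours):

```python
def cu(weighted_counts):
    """Eq. (10): Cu(e) = sum_i w_i*Num(e,ct_i) / sum_i Num(e,ct_i),
    given (weight, count) pairs, one per context of e."""
    total = sum(n for _, n in weighted_counts)
    return sum(w * n for w, n in weighted_counts) / total

def cub(cu_e, n_ctx_e, cu_all, n_ctx_all):
    """Eq. (11): Bayesian-style normalisation of Cu(e) against the
    averages over all N measured elements."""
    n = len(cu_all)
    return (sum(cu_all) / n + cu_e) / (sum(n_ctx_all) / n + n_ctx_e)
```

The Bayesian form shrinks the score of elements with few contexts toward the population average, so a rarely used element cannot dominate the ranking.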
5. Risk-Based Adaptive Group Testing
A key technique of WS group testing is to rank test cases and exercise them progressively. However, as testing progresses, the service-based system can change in various ways:

. Services could change, as providers maintain the software, update its functionality and improve its quality.
. The service composition could change. An application builder may select different service providers for a constituent service and may change the workflow of a composite service to meet changed business-process requirements.
. The risks of the software could change due to changes in services and service compositions.
. The potency of test cases could change due to changes in services and service compositions.
. The quality preference could change. For example, in a safety-critical or mission-critical usage context, comprehensive coverage of test cases may be required for a service to be accepted, while otherwise less strict criteria can be used to reduce the time and cost of testing.
To accommodate these changes, adaptation is necessary to continuously adjust the measurement of software and test cases, as well as the rules for test case selection, prioritization and service evaluation. In our previous research, we proposed an adaptive testing framework based on software cybernetics theory. Software cybernetics applies cybernetics to control the behaviors of software systems and/or software development activities [11]. The purpose is to control the software development process on a mathematical basis. This is a new discipline with initial results in process management, requirements acquisition, software integration, and software aging. Cai et al. introduced the Controlled Markov Chain model to software testing, reliability assessment, and fault tolerance [9]. As shown in Fig. 4, we applied a feedback control model to adaptive WS testing. The controller controls the testing activities, such as test generation, test selection, test execution, test case ranking, and WS ranking. The optimizer changes the parameters of the controller based on the test and evaluation results from the feedback loop. Specifically, the optimizer changes the test case ranking so that highly ranked test cases are used first.
The generic architecture is further refined for group testing with a windowing mechanism. With group testing, WS are tested simultaneously in bulk. The windowing mechanism breaks the WS into subsets called windows, and testing is exercised window by window [34]. Windowing allows re-ranking of the test cases and re-organization of the group-testing hierarchy at each window. It improves test effectiveness with fewer but more potent tests and flexibly adjusts the test strategies across windows. In the windowing approach, the controller controls:

(1) Selection of a group of WS of the window size.
(2) Potency calculation of each test case.
(3) Ranking and classification of test cases.
(4) Organization of test cases into test hierarchies.
(5) Testing of WS with the test cases through the test hierarchies.
Fig. 4. Control architecture of adaptive testing.
(6) Ranking of WS based on their testing statistics.
(7) Ruling out of WS that do not meet the passing criteria.
The optimizer optimizes the test strategies in the following respects:

(1) It adjusts the window size, which controls the rate of re-ranking. If changes are few, the refresh rate can be low.
(2) It adjusts the potency of test cases based on recent test results, so that the controller can re-rank and re-organize the test cases and choose the most potent test cases for the new window.
(3) It adjusts the WS ranking algorithm and rankings, and decides on the WS rule-out criteria.

Figure 5 shows the architecture of the controller and optimizer. Different strategies can be defined to re-rank test cases and adjust the window size.
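The controller/optimizer loop above can be sketched in a few lines. This is a simplified illustration under our own assumptions (a fixed window size, potency as a running fault-detection count, and a single pass-rate threshold); the paper's actual strategies are richer.

```python
def group_test(services, test_cases, window_size, passing_rate):
    """Sketch of windowed group testing: services are tested window by
    window; test cases are re-ranked by accumulated potency before each
    window; services whose pass rate is below `passing_rate` are ruled out.
    `test_cases` are (name, detects) pairs, where detects(s) is True when
    the test exposes a fault in service s."""
    potency = {name: 0 for name, _ in test_cases}
    survivors = []
    for start in range(0, len(services), window_size):
        window = services[start:start + window_size]
        # Re-rank: historically fault-revealing test cases run first.
        ranked = sorted(test_cases, key=lambda tc: -potency[tc[0]])
        for service in window:
            failures = 0
            for name, detects in ranked:
                if detects(service):
                    failures += 1
                    potency[name] += 1
            if 1 - failures / len(ranked) >= passing_rate:
                survivors.append(service)
    return survivors
```

An adaptive variant would also let the optimizer shrink or grow `window_size` between windows according to how much the rankings changed.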
This paper proposes a risk-based approach that introduces a dynamic mechanism to enable adaptive risk assessment and test case ranking based on runtime monitoring and profiling of the target services and systems.

Figure 6 presents an overview of the proposed approach. As shown in Fig. 6, risks of the services are identified in two ways: static and dynamic analysis. Static analysis is based on service dependencies and usage topologies. As services may be recomposed online, dynamic analysis is introduced to detect changes at runtime and recalculate the service risks based on runtime profiling of service composition, configuration, and operation. Test cases are associated with their target features under test and are ranked based on the risks of the services. With the support of runtime monitoring, information can be gathered on ontology dependencies, service usage, and service workflows. Profiling and statistical analysis of the logged information can

Fig. 5. The adaptive WS group testing architecture.
facilitate detecting changes in the system and adjusting the measurement of risks and test case potencies. Rules are defined to control the testing process. In the generic WS group testing process, rules are defined to control the following testing activities:

. The risk levels used to categorize the test cases and arrange them hierarchically into different layers.
. The strategies for ranking the test cases, such as by cost, potency, criticality, dependency, etc.
. The strategies for ruling out services after each layer of testing.
. The strategies for ranking the services, such as by importance, failure rates on the test cases, etc.
. The entry and exit criteria for each layer of group testing.

For example, in an experiment, a sensor is instrumented in the process engine of a composite service [5] to monitor service calls. The number and sequence of calls to external services are recorded. Table 2 lists the typical sequences of service
Table 2. Example service composition profile (number of invocations in the profile).

Time interval  Execution sequences                         s1   s2   s3   s4   s5   s6   s7
[t1, t2]       {s1,s2,s4,s5}, {s1,s2,s5,s4}, {s1,s3,s6}    100  80   20   80   80   20   -
[t2, t3]       {s1,s2,s4}, {s1,s3,s6}                      50   25   25   25   -    25   -
[t3, t4]       {s1,s2,s4,s7}, {s1,s2,s7,s4}, {s1,s3,s6}    200  140  60   140  -    60   140
Fig. 6. Approach overview.
invocations and the number of invocations to each service in three time intervals. Observation of the logged data shows that:

. In interval [t1, t2], s2 and s3 are conditionally executed after s1, and the execution probabilities are ρs2 = 0.8 and ρs3 = 0.2. s4 and s5 are executed in parallel after s2. s6 is executed after s3.
. In interval [t2, t3], service s5 becomes unavailable and there is no subsequent invocation of s5 after s2. In addition, the call distribution between s2 and s3 changes from 0.8:0.2 to 0.5:0.5.
. In interval [t3, t4], service s7 is newly bound to the system and invoked after s2 in parallel with s4. The call distribution between s2 and s3 changes again, to 0.7:0.3.

The changing process of the composition structure is shown in Fig. 7. Similarly to the monitored composition changes, the system can also detect changes in the ontology domain and in ontology usage. Such changes trigger a reassessment of the two factors of risk: failure probabilities and importance. For this example, Table 3 shows the changes in the risk of the composite service.

In this application, testing is controlled as a risk mitigation process. That is, the test cases that have a high probability of detecting a risky bug should be exercised first, so that risky bugs can be detected and removed early and the risk of the whole system can be reduced. As bugs are detected and removed, the rules of the strategies and criteria are also adapted to reflect the changes in the risks of the services. The rule adaptation is by nature a problem of dynamic planning. Each layer
Fig. 7. Example service composition changes.
Table 3. Example of adaptive risk assessment.

Time interval  Risk factor  s1   s2   s3   s4   s5   s6   s7   Composite service
[t1, t2]       P(s)         0.1  0.3  0.1  0.3  0.7  0.3  -    0.654
               C(s)         1    0.8  0.2  0.3  0.3  0.3  -
[t2, t3]       P(s)         0.1  0.3  0.1  0.3  -    0.3  -    0.840
               C(s)         1    0.5  0.5  0.5  -    0.5  -
[t3, t4]       P(s)         0.1  0.3  0.1  0.3  -    0.3  0.3  0.751
               C(s)         1    0.7  0.3  0.3  -    0.3  0.3
in the WS progressive group testing is a stage in decision making, and the goal is to select the set of test cases with the maximum potential risk.
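The selection rule above, taking the risk of a feature as the product of its failure probability and its importance, can be sketched as a greedy ranking. The function and data names here are illustrative, not the paper's:

```python
def rank_tests_by_risk(tests, failure_prob, importance):
    """Rank test cases by the risk of their target feature.
    tests: (test_id, target_feature) pairs; risk(f) = P(f) * C(f)."""
    return sorted(tests,
                  key=lambda t: failure_prob[t[1]] * importance[t[1]],
                  reverse=True)
```

At each layer, the highest-ranked test cases are exercised first; as risks are reassessed (as in Table 3), the ranking is recomputed.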
6. Experiments and Evaluation
6.1. Experiment setup
The BookFinder OWL ontology is used to illustrate OWL features. Suppose a WS provides book-searching services: it accepts queries, searches the repository, and returns the required book information. The user can submit many query conditions, such as ISBN, title, authors, publisher, category, cover type, publication year, edition, etc. The OWL-S file (bookfinder.owl, bookfinder for short) defines the atomic service model, as shown in Fig. 8. The service has an input of the class type Book, declared in the domain ontology, and an output of string type. The OWL file (bibtex.owl, bibtex for short) defines the domain ontology related to the service. The overall class hierarchy of the bibtex ontology is shown in Fig. 9.
6.2. Evaluation
Suppose that T = {ti} is the set of test cases, TE ⊆ T is the set of exercised test cases, B = {bugi} is the set of all bugs in the system, and BD ⊆ B is the set of bugs that have been detected. To evaluate the test results, we define the following two metrics:

(1) Test Cost. The average number of test cases required to detect a bug in a system:

    Cost(T) = |TE| / |BD|                                             (12)
Fig. 8. The example BookFinder service.
(2) Test Efficiency. Given a number of exercised test cases, the ratio between the percentage of bugs detected and the percentage of test cases exercised:

    Effect(T, TE) = (|BD| / |B|) / (|TE| / |T|)                       (13)

The goal of testing is to detect as many bugs as possible with as few test cases as possible [24]. That is, low test cost and high test efficiency are preferred.
Fig. 9. The bibtex ontology example.
Fig. 10(a). The Bayesian network.
Two evaluation experiments were exercised on the case studies. For simplicity, we only consider bugs in ontology classes. Assume that each ontology class has exactly one bug, which can be detected by one test case. Test cases are categorized based on their target ontology classes. Each ontology class is assigned 200 test cases. All test cases are independent, and each is designed for exactly one ontology class.

In Experiment 1, eight ontology classes are identified; the corresponding BN is shown in Fig. 10(a). We simulate the joint distribution of the BN. Table 4 shows the calculated risk of each node during the iterations of node risk analysis and test
Fig. 10(b). (Continued)
Table 4. Adaptive risk assessment of the ontology classes in experiment 1.

Nodes    1         2         3         4         5         6         7         8
Round 1  0.3000    0.5000    0.8000    0.5300    0.4130    0.8060    0.7612    0.8418
Round 2  0.3015    0.5036    0.8000    0.5478    0.4148    0.8617    0.7723    observed
Round 3  0.3052    0.5124    0.8000    0.5918    0.4192    observed  0.7723    observed
Round 4  0.3052    0.5124    0.8000    0.5918    0.4192    observed  observed  observed
Round 5  0.3052    0.5124    observed  0.5918    0.4592    observed  0.7723    observed
Round 6  0.3396    0.5943    observed  observed  0.4130    observed  observed  observed
Round 7  0.3333    observed  observed  observed  0.4130    observed  observed  observed
Round 8  0.3333    observed  observed  observed  observed  observed  observed  observed
selection. Figure 11 shows the results of the experiment, comparing the proposed risk-based approach (RBT) with a random testing approach (RT). In the figure, the horizontal axis lists the number of bugs detected. The curves show the trend of test cost (Fig. 11(a)) and test efficiency (Fig. 11(b)) with an increasing number of bugs detected. Here, the RT approach is estimated at two probability levels, 80% and 90%; that is, the number of test cases required to detect n bugs with that probability.

In Experiment 2, 16 ontology classes are identified; the corresponding BN is shown in Fig. 10(b). Taking Experiment 1 as a training process for Experiment 2, we introduce the adaptation mechanism to rank test cases based on their defect-detection history. Figure 12 shows the experiment results, which compare test cost (Fig. 12(a)) and test efficiency (Fig. 12(b)) with an increasing number of bugs detected.

From the experiments we can see that the risk-based approach can greatly reduce test cost and improve test efficiency. The learning process and the adaptation mechanism can greatly improve the test quality.
Fig. 11. Comparison of test cost (a) and test efficiency (b) between RBT and RT in experiment 1.
7. Conclusion and Outlook
Testing is critical to assure the quality properties of a service-based system so that services can deliver on their promise. To reduce test cost and improve test efficiency, this paper proposes a risk-based approach to ranking and selecting test cases for controlling the process of WS group testing. An adaptation mechanism is also introduced so that the risks can be dynamically measured and the control rules can be dynamically adjusted online. Motivated by unique testing issues, the work shows the convergence of various disciplines, including statistics, service-oriented computing, and semantic engineering. Some preliminary results illustrate the feasibility and advantages of the proposed approach.

Future work includes: (1) application and experiments in a real application domain; (2) service behavior analysis to identify typical fault models and risky scenarios; and (3) model enhancement to achieve better test evaluation results by using more sophisticated mathematical models and reasoning techniques.
Acknowledgments
This research is supported by the National Science Foundation of China (No. 60603035), the Open Fund of the State Key Laboratory of Software Development Environment (No. SKLSDE-2009KF-2-0X) of Beijing University of Aeronautics and
Fig. 12. Comparison of test cost (a) and test efficiency (b) between RBT and RT in experiment 2.
Astronautics, and the National Basic Research Program of China (973 Program) under Grant No. 2005CB321901. The second author was partially supported by the FP6 MUSING project (IST-FP6 27097).
References
1. S. Amland, Risk-based testing: Risk analysis fundamentals and metrics for software testing including a financial application case study, Journal of Systems and Software 53(3) (2000) 287-295.
2. J. Bach, Heuristic risk-based testing, STQE Magazine 1(6) (1999).
3. X. Bai, Z. Cao and Y. Chen, Design of a trustworthy service broker and dependence-based progressive group testing, Int. J. Simulation and Process Modelling 3(1/2) (2007) 66-79.
4. X. Bai, Y. Chen and Z. Shao, Adaptive Web Services testing, IWSC, 2007, pp. 233-236.
5. X. Bai, S. Lee, R. Liu, W. T. Tsai and Y. Chen, Collaborative Web Services monitoring with active service broker, COMPSAC, 2008, pp. 84-91.
6. L. Wang, X. Bai, Y. Chen and L. Zhou, A hierarchical reliability model of service-based software system, COMPSAC 1 (2009) 199-208.
7. X. Bai and R. S. Kenett, Risk-based adaptive group testing of Web Services, COMPSAC 2 (2009) 485-490.
8. T. Berners-Lee, J. Hendler and O. Lassila, The Semantic Web, Scientific American, May 2001 (revised 2008).
9. K.-Y. Cai, Optimal software testing and adaptive software testing in the context of software cybernetics, Information and Software Technology 44 (2002) 841-844.
10. G. Canfora and M. Di Penta, Service-oriented architectures testing: A survey, Lecture Notes in Computer Science, Vol. 5413 (2009), pp. 78-105.
11. J. W. Cangussu, S. D. Miller, K. Y. Cai and A. P. Mathur, Software cybernetics, in Encyclopedia of Computer Science and Engineering (John Wiley & Sons, to appear).
12. Y. Chen and R. L. Probert, A risk-based regression test selection strategy, in 14th ISSRE, 2003.
13. R. Dorfman, The detection of defective members of large populations, Annals of Mathematical Statistics 14 (1943) 436-440.
14. F. M. Finucan, The blood testing problem, Applied Statistics 13 (1964) 43-50.
15. M. P. Papazoglou and D. Georgakopoulos, Service-oriented computing, Commun. ACM 46(10) (2003) 25-28.
16. A. Gomez-Perez, M. Fernandez-Lopez and O. Corcho, Ontological Engineering (Springer, 2005).
17. D. V. Karolak, Software Engineering Risk Management (IEEE Computer Society Press, 1996).
18. R. S. Kenett and M. Pollak, A semi-parametric approach to testing for reliability growth with an application to software systems, IEEE Transactions on Reliability 35(3) (1986) 304-311.
19. R. S. Kenett and S. Zacks, Modern Industrial Statistics: Design and Control of Quality and Reliability (Duxbury Press, San Francisco, 1998).
20. R. S. Kenett and E. Baker, Process Improvement and CMMI for Systems and Software (Auerbach Publications, 2010).
21. R. S. Kenett and C. Tapiero, Quality and risk: Convergence and perspectives, http://ssrn.com/abstract=1433490, QPRC (2009).
22. R. S. Kenett and Y. Raanan, Operational Risk Management: A Practical Approach to Intelligent Data Analysis (John Wiley & Sons, 2010).
23. D. Martin et al., Bringing semantics to Web Services: The OWL-S approach, SWSWPC, 2004.
24. G. Myers, The Art of Software Testing (John Wiley & Sons, 1979).
25. I. Ottevanger, A risk-based test strategy, STARWest, 1999.
26. T. O'Reilly, What is Web 2.0, http://oreilly.com/web2/archive/what-is-web-20.html (2005).
27. F. Redmill, Exploring risk-based testing and its implications, Software Testing, Verification and Reliability 14 (2004) 3-15.
28. L. H. Rosenberg, R. Stapko and A. Gallo, Risk-based object oriented testing, in 24th SWE, 1999.
29. M. P. Singh and M. N. Huhns, Service-Oriented Computing (John Wiley & Sons, 2005).
30. M. Sobel and P. A. Groll, Group testing to eliminate all defectives in a binomial sample, Bell System Technical Journal 38(5) (1959) 1179-1252.
31. M. Spies, An ontology modelling perspective on business reporting, Information Systems, 2009.
32. H. Stallbaum, A. Metzger and K. Pohl, An automated technique for risk-based test case generation and prioritization, ICSE (2008), pp. 67-70.
33. W. T. Tsai, Y. Chen, Z. Cao, X. Bai, H. Huang and R. Paul, Testing Web Services using progressive group testing, Advanced Workshop on Content Computing, 2004, pp. 314-322.
34. W. T. Tsai, X. Bai, Y. Chen and X. Zhou, Web Services group testing with windowing mechanisms, SOSE, 2005, pp. 213-218.
35. W. T. Tsai, Q. Huang, J. Xu, Y. Chen and R. Paul, Ontology-based dynamic process collaboration in service-oriented architecture, SOCA, 2007, pp. 39-46.
36. G. S. Watson, A study of the group screening method, Technometrics 3(3) (1961) 371-388.
37. MUSING, Multi-Industry Semantic-Based Business Intelligence Solutions, http://www.musing.eu.
38. Web Services Architecture, W3C Working Draft, http://www.w3.org/TR/ws-arch/, Nov. 14, 2002.
39. W3C, OWL-S: Semantic Markup for Web Services, http://www.w3.org/Submission/OWL-S/.