Comparing Web Service Performance
WSTest 1.1 Benchmark Results for .NET 2.0, .NET 1.1, Sun JWSDP 1.5 and IBM WebSphere 6.0
Introduction
WSTest is a Web Service benchmark created by Sun Microsystems and augmented by Microsoft. The
benchmark tests various web service operations across varying SOAP object sizes. Sun’s original WSTest
1.0 benchmark kit can be downloaded from:
http://java.sun.com/developer/codesamples/webservices.html#Performance
WSTest 1.1 is a Microsoft implementation of WSTest 1.0 that mirrors the Sun implementation of the
EchoVoid, EchoStruct and EchoList tests. In addition, it includes one new test, GetOrder, that tests a more
complex object type that simulates a purchase order. This report details the performance of .NET 2.0 Beta 2 (build v2.0.50215) vs. .NET 1.1, Sun's latest Java Web Services Developer Pack (JWSDP 1.5) with the Sun ONE Enterprise HTTP Server 6.1 (SP3), and IBM WebSphere 6.0 with the IBM HTTP Server 6.0. All tests were conducted on 32-bit Windows Server 2003 RTM.
The WSTest benchmark does not exercise full platform capabilities. Elements like data access, security
checking, encryption and digital signatures, and so on, are all outside the scope of this benchmark
(although it could be adapted in the future to incorporate such technologies). For example, many web
services are secured, either via transport-level security such as TLS, or via SOAP-based security in WS-
Security. Encryption and digital signature operations will constitute more work performed for each web
service operation, and change these results. In summary, be careful not to draw conclusions more
broadly than the data presented here indicate. With that said, WSTest 1.1 is a good benchmark for comparing the underlying SOAP networking and SOAP XML serialization/deserialization performance of the platforms tested. Even broader benchmarks that use
complex applications that include security, data access, pooling, logging and so on, can only provide a
general indication of potential performance. To answer the question, "Which platform will perform best
for my application?", customers need to perform their own testing. Microsoft publishes the source code
to this benchmark, and all of its published .NET benchmarks, to encourage customers to perform such
testing.
Sample Code can be downloaded from:
http://www.microsoft.com/downloads/details.aspx?FamilyId=84EDAA77-551F-4124-B398-C610884FC6F5&displaylang=en
Configurations Tested
The configurations tested include:
1. Sun JWSDP 1.5/Sun ONE HTTP Server 6.1
2. IBM WebSphere 6.0/IBM HTTP Server 6.0
3. .NET 1.1/IIS 6.0
4. .NET 2.0/IIS 6.0
Extensive time was taken to properly tune all configurations following vendor best practices and iterative
testing to achieve best results for each configuration. In all cases, tests achieved close to 100% CPU
saturation under peak throughput client loads (meaning no significant bottlenecks other than CPU
saturation), indicating proper tuning of the middle tier across all products. In all configurations, tracing, logging, authentication and session state tracking are turned off. HTTP 1.1 keep-alives are enabled and set to a sufficient number to achieve best results. Java heap sizes and thread settings were tuned to achieve maximum performance. All tests were conducted on the same hardware: an HP DL585 with two 1.8 GHz AMD Opteron processors, 4 GB RAM and gigabit networking.
Discussion of Test Methodologies
Sun’s original WSTest 1.0 uses a custom driver program (one for Java, one for .NET) to drive load against
the backend Web Services deployed on the server. Their original published results were flawed in that
they did not properly configure the MaxConnections setting in the .NET Machine.Config for the .NET
version of the benchmark driver program, which essentially throttled the driver to two network
connections to the server, preventing full saturation of the server. Corrected results were published by
Microsoft shortly thereafter using the Java custom driver program to drive load, and can be viewed at:
http://download.microsoft.com/download/1/9/b/19bc8aa7-05fa-4e86-a612-c2cc181e4ee6/sun_ws_benchmark_response.pdf
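For reference, the setting in question lives in the <connectionManagement> section of the .NET Machine.Config (or the driver's application configuration file). A minimal sketch of raising the limit follows; the wildcard address and the value of 48 are illustrative, not the figures used in the corrected tests:

    <configuration>
      <system.net>
        <connectionManagement>
          <!-- the .NET default of 2 concurrent connections per host throttles a load driver -->
          <add address="*" maxconnection="48" />
        </connectionManagement>
      </system.net>
    </configuration>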
Sun’s original test methodology used a single client machine to drive load against the server using no
think time between client requests. The driver program established connections to the server across a limited number of threads (16), and the threads re-used these connections without closing and re-establishing them between requests. This methodology is not ideal, however, because it does not simulate real-world usage of a Web Service. Typically in a real-world scenario, many different clients make requests to
the web service, and the server must handle load across many physically distributed clients as they
establish new connections to the Web Service. For the results published in this paper, we have used two
test methodologies for a more complete picture of Web Service performance. Both use Mercury
LoadRunner’s SOAP client to drive load across all products tested. By using Mercury LoadRunner instead of
a custom driver program, the tests are more accurate in that all products are tested using the exact same
test harness. Furthermore, Mercury LoadRunner is capable of driving load using many distributed client
machines, while Sun’s custom driver program was meant to be run from a single client machine.
The two test methodologies employed achieve different results for the products tested. In the single-client
configuration that mirrors Sun’s original methodology, client threads re-use existing network connections
on each iteration, and the client thread does not close out its connection after executing its SOAP request.
This test methodology does not realistically simulate usage for thousands of clients, each connecting from
a unique IP address, executing a request, and then closing its connection to the server. Therefore, in
addition to this methodology, we also tested the same products using 50 physical client machines, with a one-second think time before each request and a connection close at the end of each request. This more
realistically simulates real-world usage of a deployed web service, with the server having to handle many
more open connections and physical IPs through its network infrastructure and keepalive system.
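As a rough illustration of what a single virtual user does in the multi-client methodology, the hypothetical C# sketch below thinks for one second, opens a fresh connection, issues one SOAP request and closes the connection. The actual load was driven by Mercury LoadRunner's SOAP driver; the endpoint URL, XML namespace and SOAPAction shown here are placeholders, not the kit's real values.

    using System;
    using System.Net;
    using System.Text;
    using System.Threading;

    class VirtualUser
    {
        const string Endpoint = "http://server/WSTest/WSTestService.asmx"; // placeholder URL
        const string Envelope =
            "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">" +
            "<soap:Body><echoVoid xmlns=\"http://example.org/wstest\" /></soap:Body>" +
            "</soap:Envelope>";

        static void Main()
        {
            byte[] body = Encoding.UTF8.GetBytes(Envelope);
            while (true)
            {
                Thread.Sleep(1000);                        // one-second think time
                var request = (HttpWebRequest)WebRequest.Create(Endpoint);
                request.Method = "POST";
                request.ContentType = "text/xml; charset=utf-8";
                request.Headers["SOAPAction"] = "\"http://example.org/wstest/echoVoid\"";
                request.KeepAlive = false;                 // connection closed after each request
                using (var stream = request.GetRequestStream())
                    stream.Write(body, 0, body.Length);
                using (request.GetResponse()) { }          // response discarded
            }
        }
    }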
Finally, the test results included in this paper use Sun's recommended Sun ONE Enterprise HTTP Server 6.1 instead of Tomcat, which was used in Sun's original tests. Sun recommends the use of this HTTP server in its published materials for enterprise scenarios, given its more advanced management capabilities. We also tested the latest IBM WebSphere 6.0 application server, configured to use IBM's HTTP Server. Both the IBM HTTP Server and the Sun ONE Enterprise HTTP Server are derived from Apache.
As with other .NET benchmark tests, full source code is published so that customers and other vendors
can replicate the tests and verify/comment on the results, or use the tests on other platforms or against
other products.
WSTest 1.1 Overview
WSTest 1.1 consists of 4 distinct web service methods. The methods isolate the SOAP network
stack/serialization performance of each platform tested. The web service calls included in WSTest 1.1
include:
• EchoVoid – sends and receives an empty message, with no deserialization/serialization
• EchoStruct – receives an array of any length as the input parameter, with each element consisting of a structure; the array is then sent back to the client as the return value. The repeating structures within the array contain one element each of the integer, float and string datatypes. The longer the array, the more work is required to deserialize and re-serialize the passed SOAP object to/from XML. (A minimal sketch of this contract appears after this list.)
• EchoList – sends and receives a linked list of any size; each element consists of the same structure used in EchoStruct.
• GetOrder – simulates a request for a complete purchase order, taking two integer input parameters, with an order object returned from the Web Service over SOAP. The order object is a more complex structure that includes order header information, a customer sub-object with shipping address and billing address structures, as well as any number of line item objects. For this test, the web services return exactly 50 line items for each request as part of the returned order object.
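For illustration, here is a minimal ASMX-style C# sketch of what the EchoVoid and EchoStruct contracts might look like; the type and member names are hypothetical, not the kit's actual source:

    using System.Web.Services;

    // The repeating structure: one integer, one float and one string per element.
    public class BenchStruct
    {
        public int IntValue;
        public float FloatValue;
        public string StringValue;
    }

    [WebService(Namespace = "http://example.org/wstest")]
    public class WSTestService : WebService
    {
        // Empty round trip: isolates the SOAP networking cost, no serialization payload.
        [WebMethod]
        public void EchoVoid() { }

        // Echoes the array back; serialization/deserialization work grows with array length.
        [WebMethod]
        public BenchStruct[] EchoStruct(BenchStruct[] items)
        {
            return items;
        }
    }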
In the results published in this paper, peak throughput is reported as the number of web service requests
handled per second by each product (TPS). Peak throughput is achieved at peak CPU saturation of the
server, but not over-saturation. In other words, at peak throughput each product is able to process all
incoming requests using available CPU cycles without queuing any requests. In such cases, response times
are all in the millisecond range since no requests are being queued. Care was taken in testing to ensure
that the peak throughput reported represents the top of the throughput curve for each product as client
loads were steadily increased.
System Configuration
Web Service Host Server
HP DL585 with 2 x 1.8 GHz AMD Opteron processors, 4 GB RAM, gigabit networking, running Windows Server 2003 Enterprise Edition (32-bit)
Each of the following software platforms was tested in isolation on the above hardware:
• .NET 1.1
• .NET 2.0 Beta 2 (v2.0.50215, available on MSDN)
• JWSDP 1.5 Update 2 with Sun ONE HTTP Server 6.1
• IBM WebSphere 6.0 with IBM HTTP Server 6.0
Client Test Bed
50 physical Dell client computers, each configured with 512 MB RAM, one Intel Celeron CPU @ 500 MHz, and
gigabit networking.
Each client is running Mercury LoadRunner agents, using Mercury’s SOAP driver program. Each client
machine is capable of generating many concurrent virtual users (threads), with the overall test bed
capable of generating well over 10,000 concurrent user requests to the backend system. In the more
realistic multi-client tests, each virtual user is configured with a one second think time and all 50 clients
are used. In the less realistic single-client test, a single client computer is used running 16 threads, with
no think time between requests. In both test methodologies, care was taken to ensure servers were
operating at full saturation and peak throughput was accurately captured. Because the think time used in
the multi-client test is one second, the peak TPS rate reported almost exactly matches the number of
virtual users being supported simultaneously by the web service host machine. In other words, at 1000
TPS approximately 1000 virtual users (20 per client machine) with a one-second think time are driving
load against the server.
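This arithmetic follows Little's Law, N = X(R + Z), where N is the number of virtual users, X the throughput, R the response time and Z the think time. For example, assuming a 5 ms response time (responses are in the millisecond range at peak, as noted above):

    X = N / (R + Z) = 1000 / (0.005 s + 1.000 s) ≈ 995 requests per second

so at roughly 1000 TPS, roughly 1000 virtual users with a one-second think time are active.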
Results Using 50 Physical Clients Methodology
(1-second think time; the number of simulated users varies for each product to achieve peak throughput)
Note that GetOrder is not included in the single-client methodology, as it creates a client-side Mercury LoadRunner bottleneck, preventing full saturation of all the tested products. This client-side bottleneck is not present when using 50 distributed clients to drive load. Interestingly, only the GetOrder method is affected by a client-side bottleneck in the single-client test.
The results indicate that Web Service performance is roughly 25% better in .NET 2.0 than in .NET 1.1 when the SOAP object size is large and deserialization/serialization operations are hence more intensive. The difference is even more dramatic for smaller SOAP message sizes, with the EchoStruct size-20 test showing .NET 2.0 Beta 2 performance to be roughly 40% better than .NET 1.1. In all cases using the 50-client methodology, .NET outperforms both Sun JWSDP 1.5 and IBM WebSphere 6.0, often by wide margins.
Notes on Web Container
The Web containers chosen for these tests and used with Sun's JWSDP 1.5 and IBM WebSphere 6.0 are
the http servers recommended by each vendor for enterprise deployments (Sun ONE HTTP Server and
IBM HTTP Server respectively). With the code published, customers can configure and run the tests using
different web servers and/or application servers. For example, Tomcat can be used with JWSDP very easily, and IBM's in-process HTTP listener (port 9080) can likewise be used in lieu of the IBM HTTP Server. It should be noted, however, that neither of these is the product officially recommended by its vendor for actual enterprise deployments, in part because of missing management features and in part because the Java code runs in-process with the actual web server. In the case of .NET, the
default process model is used, meaning in all cases there is full process-isolation between the .NET web
service logic and the actual web server itself. This is important for reliability/crash protection. In all cases,
a single JVM instance/.NET Worker process is sufficient to achieve full server saturation across both test
methodologies and all tests.
Conclusion
Web Service performance has been significantly improved in .NET 2.0 vs. .NET 1.1. The WSTest 1.1
Benchmark kit, downloadable from MSDN, is a good tool for measuring the web service performance of different application server platforms and for judging the raw capacity of backend hardware for processing web service requests. Customers are encouraged to download the kit (inclusive of the Mercury LoadRunner scripts) and
perform their own tests.
Appendix – Tuning Parameters
.NET 1.1 and 2.0
IIS 6.0
IIS access logging turned off
IIS authentication set to anonymous
.NET via Web.Config
Authentication set to None (to match the Java authentication settings)
Session state turned off
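These settings correspond to entries in the <system.web> section of Web.Config; a minimal sketch (the kit's actual Web.Config may contain additional settings):

    <configuration>
      <system.web>
        <!-- no authentication, to match the Java configurations -->
        <authentication mode="None" />
        <!-- session state tracking disabled -->
        <sessionState mode="Off" />
      </system.web>
    </configuration>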
JWSDP 1.5/Sun ONE HTTP Server
JWSDP 1.5
In Server.XML the following JVM options are used:
-Xmx1024m -Xms1024m -server (-Xmx512m/-Xms512m yields the same results)
Sun ONE HTTP Server 6.1
Session state turned off
Access logging turned off
The following options are used in magnus.conf:
RqThrottle 128
KeepAliveThreads 2
KeepAliveQueryMeanTime 5000
KeepAliveQueryMaxSleepTime 5000
Security off
ListenQ 2000
KeepAlive 3000
Access logging commented out
IBM WebSphere 6.0
WebSphere
Java Process Definition: -Xmx1024m and -Xms1024m (-Xmx512m/-Xms512m yields the same results)
No access logging
No session state
Inbound HTTP channel using 3000 keepalives
Web container set to 50 min and max threads
IBM HTTP Server (via httpd.conf)
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 10
<IfModule mpm_winnt.c>
  ThreadLimit 2048
  ThreadsPerChild 250
  MaxRequestsPerChild 0
</IfModule>