Network World Clear Choice Test: WAN Acceleration
Published
in Network World, 13 August 2007
Test Methodology
Version 2007081301. Copyright
2006-2007 by Network Test Inc. Vendors are encouraged to comment on this
document and any other aspect of test methodology. Network Test reserves the
right to change test parameters at any time.
A PDF version of this document is available here: http://networktest.com/wa07/wa07meth.pdf
This document describes benchmarking procedures for WAN
acceleration devices. Test results are tentatively scheduled for publication in
Network World in August 2007.
Given that Network World's readership is composed largely
of corporate network managers, a key focus of these tests is the suitability
of WAN acceleration devices for use in enterprise settings. These tests
assess devices in terms of functionality, manageability, performance, and usability.
This document is organized as follows. This section
introduces the tests to be conducted. Section 2 describes the test bed. Section
3 describes the tests to be performed. Section 4 provides a change log.
To assess the effectiveness of WAN acceleration in an
enterprise context, we have constructed a test bed that carries enterprise
traffic and simulates many aspects of enterprise WAN behavior.
The figure below illustrates the logical test bed. Bogus Corp. has a hub-and-spoke network
connecting its Boston headquarters with data centers in Portsmouth, NH and El
Segundo, CA, and branch offices in Newton, MA and San Francisco, CA.
This network covers all four permutations of low and high
bandwidth and latency. Dedicated T3 (45-Mbit/s) circuits connect the Boston
headquarters with the Portsmouth and El Segundo data centers.[1]
The links between the Boston headquarters and the Newton and San Francisco
branch offices use a VDSL service rate-controlled at 1.5 Mbit/s.
Application traffic between all offices consists of:
CIFS
MAPI (Exchange servers and Outlook clients)
HTTP
SIP/RTP (for QoS tests)
UDP/IP background traffic (for QoS tests)
HTTPS (optional, see "Optional SSL Handling" below)
This section discusses the devices to be used on the test
bed.
Each participating vendor is required to supply the
following:
WAN links impose reduced bandwidth and added delay. Our test
bed recreates these conditions using the Spirent
Converged Network Impairment Emulator (SCNIE) between all locations. In addition to standard impairment
functions, SCNIE is the first emulator to implement the TIA-921 standard for
measured impairments over time. The TIA-921 impairment model is based on actual
network conditions measured by service providers.
The following table describes the bandwidth and delay
characteristics of each link.

From BOS to… | Bandwidth | Round-trip delay (half applied in each direction) | FIFO buffer size (bytes)[2]
POR | 45 Mbit/s | 15 ms | 1,400,000
NEW | 1.5 Mbit/s | 15 ms | 48,000
LAX | 45 Mbit/s | 100 ms | 1,400,000
SFO | 1.5 Mbit/s | 100 ms | 48,000
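As footnote [2] explains, each FIFO size represents 250 ms of buffering at the link rate. The arithmetic can be sketched as follows (our illustration; the function name is ours, not part of the methodology):

```python
def fifo_bytes(link_bits_per_sec, buffer_ms=250):
    """Bytes needed to buffer `buffer_ms` worth of traffic at the link rate."""
    return link_bits_per_sec * (buffer_ms / 1000) / 8  # bits -> bytes

print(fifo_bytes(45_000_000))  # 1406250.0 -- rounded to 1,400,000 in the table
print(fifo_bytes(1_500_000))   # 46875.0   -- rounded up to 48,000
```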
Note that we introduce only bandwidth and delay restrictions,
not packet loss or jitter. While both of the latter conditions exist on
many WAN circuits, developing a meaningful multi-variable model that accounts
for them would greatly increase the number of test permutations. We
hope to model all these factors in future tests; for now, bandwidth and
delay are the only factors used in WAN emulation.
We use real Windows servers and clients to offer CIFS, MAPI,
and HTTP traffic. The standard server platform is Windows Advanced Server 2003
R2 running IIS6 and Exchange Server 2003. The standard client platform is
Windows XP Professional SP2 and Office 2007.
To automate the execution of data transfers, we use Visual
Basic scripts custom-developed for this project. Each client runs Microsoft
.Net 2.0 and Office 2007 Primary Interop Assemblies (PIA) to support the
scripts.
To test TCP connection scalability and generate HTTPS
traffic, we plan to use the Spirent
Avalanche and Reflector traffic generator/analyzers. Our Avalanche and
Reflector appliances can generate up to 4 million concurrent TCP connections. Please
advise if your system has a higher rated capacity.
To assess audio quality for VoIP traffic in the QoS tests,
we use the GL
Communications Voice Quality Testing (VQT) tool suite.
To generate background traffic in the QoS tests, we use the
Spirent SmartBits traffic generator/analyzer and Spirent's SmartWindow
application.
A growing number of WAN acceleration devices support
optimization of SSL traffic, and we plan to conduct performance tests with HTTPS
traffic on those devices that do. Because not all devices yet support SSL
optimization, and in the interest of ensuring apples-to-apples comparisons, the
main test article will discuss only those features supported by all products.
We plan to discuss SSL results in a sidebar article accompanying the main test.
Vendors MAY, at their option, configure their devices to
serve as routers at each location. If not, we will provide line-rate devices to
route traffic between sites.
Some WAN acceleration devices function as proxies and
require IP addresses; others are passive and do not. In either case, each
device also requires an IP address for device management. We provide addressing
guidelines in the following table. Please let us know if your device does
not meet these addressing requirements.
Site | Inline interface (if needed) | Management address | Management console | Default gateway
Headquarters (BOS) | 10.0.0.2/24 | 10.0.0.3/24 | 10.0.0.10/24 | 10.0.0.1/24
Data center (PRT) | 10.1.0.2/24 | 10.1.0.3/24 | 10.0.0.10/24 | 10.1.0.1/24
Branch office (NEW) | 10.2.0.2/24 | 10.2.0.3/24 | 10.0.0.10/24 | 10.2.0.1/24
Data center (LAX) | 10.3.0.2/24 | 10.3.0.3/24 | 10.0.0.10/24 | 10.3.0.1/24
Branch office (SFO) | 10.4.0.2/24 | 10.4.0.3/24 | 10.0.0.10/24 | 10.4.0.1/24
This section describes procedures used to assess devices in
terms of functionality, manageability, performance, and usability.
Given that not all WAN acceleration devices work the same
way, our assessment of WAN acceleration functionality will attempt to provide a
taxonomy of device features. The questions we plan to answer in assessing
functionality include the following:
While increased performance is the nominal reason for
deploying WAN acceleration devices, the addition of any new platform to the
network inevitably raises questions about manageability. While network management
is a huge topic, we plan to focus on configuration and monitoring tasks
specific to WAN acceleration. Among the management criteria to be evaluated:
While functionality, manageability, and usability are all
important criteria in selecting a WAN acceleration device, improving
performance is the key attraction. We
assess device performance in several ways, measuring latency and bandwidth
reduction, QoS handling, and concurrent connection scalability.
Traffic types are:
CIFS (File transfers and directory listings)
MAPI (Outlook and Exchange)
HTTP (Home pages of amazon.com, boston.com, caltech.edu,
cnn.com, and news.bbc.co.uk)
SIP/RTP voice over IP traffic (used only in QoS tests)
UDP/IP background traffic (used only in QoS tests)
HTTPS (optional, if supported)
For each of the traffic types above except VoIP, we will
measure the effective reduction in delay and bandwidth.
All tests run concurrently between the Boston headquarters
and the four branch sites.
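We express effective reduction as a percentage relative to the baseline (unaccelerated) run. A minimal sketch of the calculation, using hypothetical transfer times rather than measured results:

```python
def percent_reduction(baseline, accelerated):
    """Effective reduction relative to the unaccelerated baseline run."""
    return 100.0 * (baseline - accelerated) / baseline

# Hypothetical example: a transfer that takes 120 s unaccelerated and
# 30 s with the DUT inline shows a 75% reduction in delay.
print(percent_reduction(120.0, 30.0))  # 75.0
```

The same ratio applies to bandwidth reduction, with bytes on the wire in place of transfer time.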
The CIFS tests involve the transfer of 750 Word 2003 (not
Word 2007) files by each of two clients at each T3 site and 25 Word 2003 files
by each of two clients at each T1 site. The Word files range in size from
roughly 25 kbytes to 1 Mbyte. The file contents are "words" comprising random
characters, with word lengths averaging approximately five characters.
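Content of this sort can be generated along the following lines; this is a simplified sketch (plain text rather than Word 2003 documents, and the exact word-length distribution is our assumption):

```python
import random
import string

def random_words(target_bytes, mean_word_len=5):
    """Build 'words' of random characters averaging ~5 characters each."""
    words, size = [], 0
    while size < target_bytes:
        n = max(1, round(random.gauss(mean_word_len, 2)))
        words.append("".join(random.choices(string.ascii_lowercase, k=n)))
        size += n + 1  # +1 for the separating space
    return " ".join(words)

# Smallest file in the roughly 25-kbyte-to-1-Mbyte range:
sample = random_words(25 * 1024)
```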
Clients concurrently run a "CIFS-Pull" and a "CIFS-Push" test,
in which they download files from and upload files to a server in Boston, respectively.
In the CIFS-Pull case, clients perform the following
operations:
In the CIFS-Push case, clients perform the following
operations:
We run the CIFS tests a total of four times:
1. Baseline test with no acceleration enabled and no DUT
inline
2. Acceleration enabled, a "cold" run to allow the DUT to
learn the traffic pattern and possibly cache data
3. Acceleration enabled, a "warm" run after the DUT has
learned the traffic pattern and cached data
4. A "10 percent" run in which 10 percent of the files to be
transferred have been changed
In the MAPI tests, each Outlook 2007 client on a T3 link creates
240 messages of random length with a random number of Word 2003 file
attachments; each client on a T1 link creates 10 messages. All messages
are addressed to all other clients at all sites.
At test startup, all Outlook clients are in offline mode. A
Visual Basic script running on each client causes it to go online, sending all
messages to the Exchange server in Boston and then on to their destinations.
In the HTTP tests, Spirent Reflector emulates Web servers
and Spirent Avalanche emulates Internet Explorer Web clients. In all tests,
clients retrieve an 11-kbyte object from one of eight Web servers configured at
the headquarters site.
We conduct the test twice: once with 248 total users and
again with 2,480 total users. The following table lists the distribution of
users:
Test | LAX clients | NEW clients | PRT clients | SFO clients
248 total users | 120 users | 4 users | 120 users | 4 users
2,480 total users | 1,200 users | 40 users | 1,200 users | 40 users
The HTTPS tests are identical to the HTTP tests except that
clients retrieve objects over SSL connections.
Not all devices under test support SSL proxying. Thus,
results from this test will appear in a sidebar and will not be used in scoring
results.
In this test we deliberately oversubscribe a link with
low-priority UDP/IP traffic while simultaneously attempting to place
high-priority VoIP calls. Vendors should not use static bandwidth allocation (aka strict priority) to reserve
bandwidth for VoIP traffic; the final step of our procedure is a check against
TDM-like approaches.
The background traffic consists of UDP/IP packets with a
destination port of 111, generated by Spirent's SmartBits traffic
generator/analyzer. Note that the packets do not carry an NFS header; they are
simply correctly formed UDP/IP packets.
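For illustration, well-formed UDP datagrams of this kind can be generated with ordinary sockets; this sketch uses a loopback receiver on an ephemeral port as a stand-in for destination port 111 (binding to a port below 1024 requires privileges), and is not the SmartBits configuration itself:

```python
import socket

# Loopback receiver; its ephemeral port stands in for port 111.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(5)
dest = receiver.getsockname()

# Send correctly formed UDP/IP packets -- arbitrary padding, no NFS header.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = bytes(64)
for _ in range(10):
    sender.sendto(payload, dest)

data, addr = receiver.recvfrom(2048)  # a capture would show plain UDP/IP
sender.close()
receiver.close()
```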
Devices should use
diff-serv code points for prioritization (if supported). Further, devices should
re-mark all incoming packets with new
DSCPs; for this test, assume that the DSCP markings applied by hosts cannot be
trusted.
In this test, the WAN acceleration device should re-mark
VoIP packets with a DSCP value of 40. The device should re-mark UDP/IP
background packets with a DSCP value of 20. We will verify these settings using
a protocol analyzer to capture and decode traffic.
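When decoding captures, note that the DSCP occupies the upper six bits of the IPv4 ToS octet, so DSCP 40 appears on the wire as ToS 0xA0 and DSCP 20 as 0x50. A small sketch of the conversion (helper names are ours):

```python
def dscp_from_tos(tos_byte):
    """DSCP is the upper six bits of the IPv4 ToS octet."""
    return tos_byte >> 2

def tos_from_dscp(dscp):
    """On-the-wire ToS value for a given DSCP (ECN bits zero)."""
    return dscp << 2

# Re-marked VoIP packets (DSCP 40) show ToS 0xA0 in a capture;
# background packets (DSCP 20) show ToS 0x50.
assert dscp_from_tos(0xA0) == 40
assert dscp_from_tos(0x50) == 20
```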
This test uses the following procedure:
This test will determine the maximum number of TCP
connections one pair of WAN acceleration devices can handle.
We use the Spirent Avalanche and Reflector test
instruments to generate traffic, following this procedure:
1. Using HTTP 1.1, each client emulated by Avalanche
requests a 1-kbyte object from an IIS Web server emulated by Reflector.
2. After receiving the object, the client waits 60 seconds
before requesting the next object. This large client-side latency allows the
buildup of a large number of concurrent connections between clients and
servers.
3. Using the procedure described in the previous step, we
ramp up the number of connections made to the servers. Our two pairs of
Avalanches and Reflectors can request up to 4 million concurrent connections.
The Avalanche load specification for this test is
"connections." This load profile uses a fairly coarse-grained stair-step
pattern, setting up as many as 4 million connection attempts. We attempt to
measure to the nearest 1,000 concurrent connections.
The following table lists sample load profile phases for a
test with 4 million concurrent connections. Note that the actual counts we use
depend on the DUT's capability.
 | Phase 0 | Phase 1 | Phase 2 | Phase 3
Label | Delay | Stair Step | Steady State | Ramp Down
Pattern | Flat | Stair | Stair | Flat
Time Scale | Default | Default | Default | Default
Repetitions | NA | 10 | 1 | NA
Height | 0 | 400,000 | 0 | 0
Ramp Time | 0 | 300 | 0 | 0
Steady Time | 8 | 28 | 64 | 16
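Under this profile, the peak connection count is the Phase 1 stair height times its repetitions. A quick check of the arithmetic (our reading of the sample Avalanche parameters):

```python
# Phase 1 ramps up in 10 stair steps of 400,000 connections each,
# for a peak of 4 million concurrent connection attempts.
stair_height = 400_000
repetitions = 10
peak_connections = stair_height * repetitions
print(f"{peak_connections:,}")  # 4,000,000
```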
The metric for this test is maximum concurrent TCP
connection capacity, sustained over a 60-second steady-state period.
While usability assessments are inherently subjective, we
make an effort to render quantitative as well as qualitative judgments about
each DUT's ease of deployment and maintenance. Among the usability criteria we
plan to use:
Beyond these criteria are intrinsically subjective
ones. If it takes us nine steps on each of five menus to perform a task
that should be available on one screen, we'll say so. At the same time, we
bring no preconceived notions of "good" or "bad" UI design to this project. In
the subjective ratings, as in all other tests, the ultimate goal is describing how
well the DUT helps the network manager accelerate traffic across the WAN.
Version 2007081301
Test published; changed title to include publication date
Version 2007070301
Section 2.1: Added UDP/IP as background traffic
Section 2.2.2: Deleted jitter from description, noted that
WAN impairment tool introduces rate control and delay only
Added FIFO queue values
Section 2.2.3: Deleted LoadSim reference; added reference to
Outlook client; deleted Asterisk reference; added references to VB scripts,
Office 2007, .Net 2.0, and Office 2007 PIA; added SmartBits reference
Section 2.2.4: Noted that SSL testing will be conducted on
all products but not counted in scoring
Section 2.2.5: Changed IPv4 addressing from /24 to /16 at
each site
Section 3.3: Added UDP/IP as background traffic
Section 3.3.1: Added detailed descriptions of CIFS-Pull,
CIFS-Push, MAPI, HTTP, and HTTPS tests
Section 3.3.2: Changed background traffic from HTTP to
UDP/IP from SmartBits
Section 3.3.3: Restated objective as test between single
pair of devices; deleted concurrent connection testing between all sites
Version 2007010201
Initial public release
Version 20061222
Prerelease copy
[1] Note, however, that WAN acceleration device interfaces may be copper gigabit or fast Ethernet, as appropriate.
[2] Core routers typically offer at least 250 ms of buffering capacity to deal with transient network congestion. (See, for example, the series of articles on router buffer sizes in the July 2005 issue of ACM SIGCOMM Computer Communications Review.) The FIFO queue sizes here represent 250 ms for each link speed. Failure to define FIFO queuing in the WAN impairment tool may introduce significant and unintended packet loss.