VERITAS Storage Replicator for Volume Manager™ 3.0.2
Configuration Notes
Solaris
May 2000
100-001528
Disclaimer
The information contained in this publication is subject to change without notice.
VERITAS Software Corporation makes no warranty of any kind with regard to this
manual, including, but not limited to, the implied warranties of merchantability and
fitness for a particular purpose. VERITAS Software Corporation shall not be liable for
errors contained herein or for incidental or consequential damages in connection with the
furnishing, performance, or use of this manual.
Copyright
Copyright © 2000 VERITAS Software Corporation. All rights reserved. VERITAS is a
registered trademark of VERITAS Software Corporation in the US and other countries.
The VERITAS logo and VERITAS Storage Replicator for Volume Manager are trademarks
of VERITAS Software Corporation. All other trademarks or registered trademarks are the
property of their respective owners.
Printed in the USA, May 2000.
VERITAS Software Corporation
1600 Plymouth St.
Mountain View, CA 94043
Phone 650–335–8000
Fax 650–335–8050
www.veritas.com
Contents
Chapter 1. Effects of Configuration on Performance . . . . . . . . . . . . . . . . . . . . . . . . . .5
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Application Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Chapter 2. Configuring for Efficient Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
RLINKs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Synchronous versus Asynchronous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Latency and SRL Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Effects on Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Choosing an Appropriate Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
SRL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
SRL Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
SRL Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
SRL Sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Peak Usage Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Initialization Period Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Secondary Backup Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Downtime Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Additional Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Buffer Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Readback Buffer Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Write Buffer Space on the Primary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Buffer Space on the Secondary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
iv Storage Replicator for Volume Manager Configuration Notes
Effects of Configuration on Performance

Introduction

This chapter discusses some of the issues involved in configuring a Storage Replicator
for Volume Manager (SRVM) Replicated Data Set (RDS). To set up an efficient SRVM
configuration, it is
necessary to understand how the configuration, along with the design of SRVM, can
combine to affect SRVM and application performance. The following section describes the
flow of control in handling a write request within SRVM. Subsequent sections discuss
how these design details can affect performance.
Throughout this document, the term “application” refers to whichever program writes
directly to the raw volume. So in the case of a database using a file system mounted on a
volume, the file system is the application; if the database writes directly to a raw volume,
then it is considered the application.
Design

On the Primary side, writes enter SRVM through the normal volume interfaces. However,
rather than performing I/O operations directly on a volume, SRVM passes the request up
to the Replicated Volume Group (RVG) containing the volume.
Figure 1 on page 6 shows the flow of control for a typical SRVM configuration containing
two remote sites, one connected via an asynchronous RLINK, the other via a synchronous
one. When a write operation is passed to the RVG, the data must first be copied into a
kernel memory buffer. SRVM then writes the data and some header information, to the
Storage Replicator Log (SRL), and waits for the write to complete. As shown in Figure 1,
this completes Phase 1 of the operation, which must be executed for all write requests.
Phase 2 is divided into synchronous and asynchronous components, since an RVG may
have one or more associated RLINKs. Each RLINK may operate independently in
synchronous or asynchronous mode. This phase is responsible for sending the write
request to all RLINKs and writing it to the Primary data volume. When the synchronous
component has completed, the write is considered complete and may terminate. The
asynchronous component might complete at a later time. Until both components are
complete, the kernel memory buffer cannot be freed because the data might still need to
be sent to remote nodes.
Figure 1. SRVM Flow of Control

Flow of control for a write request on an SRVM RDS containing two remote sites, one
connected via an asynchronous RLINK, the other via a synchronous one.
The synchronous component consists of sending a write request to each RLINK operating
in synchronous mode, and then waiting for an acknowledgment that the request was
received. The acknowledgment does not indicate that the request has been committed to
the Secondary data volume. It only means that it is in a buffer on the Secondary system,
and eventually will be committed to disk, barring a system failure. When all RLINKs
operating in synchronous mode have acknowledged receiving the request, the
synchronous component, and the overall write request, is complete. If all RLINKs are in
asynchronous mode, the synchronous component becomes null, which means that the
write latency consists solely of the time to write the SRL.
The asynchronous component works similarly to the synchronous component. The
difference is that the write request may complete before the asynchronous component
does. This component consists of sending a write request to each RLINK operating in
asynchronous mode, and then waiting for an acknowledgment that the request was
received. Additionally, the asynchronous component is responsible for writing the request
to the Primary data volume. This operation is performed asynchronously to avoid adding
the penalty of a second full disk write to the overall write latency. Because the log write,
but not the data write, is performed synchronously, the SRL becomes the final arbiter as to
the correct contents of the data volume in the case of a system failure.
Finally, an RLINK operating in asynchronous mode may be behind for various reasons,
such as network outages or a burst of writes which exceed available network bandwidth.
RLINKs that are behind are not handled by the asynchronous component, but by a
separate asynchronous thread. Because the write requests for these RLINKs are no longer
guaranteed to be held in memory, the asynchronous thread has the ability to read them
back off the SRL. This allows the system to release resources for requests that can not be
satisfied immediately.
Note that there are actually two synchronous modes, FAIL and OVERRIDE. RLINKs with
synchronous=OVERRIDE, referred to as soft synchronous RLINKs, change to
asynchronous mode during any type of disconnect or pause. RLINKs with
synchronous=FAIL, referred to as hard synchronous RLINKs, fail incoming writes if they
cannot be replicated immediately because of a disconnect or pause.
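The difference between the two modes can be summarized in a short sketch. This is a conceptual model only, not SRVM code; the mode names match the synchronous attribute values described above, but the function and its return strings are illustrative.

```python
# Conceptual model of hard versus soft synchronous RLINK behavior when
# the RLINK is disconnected or paused. Illustrative only, not SRVM code.

def on_write_while_disconnected(mode):
    """Disposition of an incoming write for a synchronous RLINK that is
    currently disconnected or paused."""
    if mode == "OVERRIDE":        # soft synchronous RLINK
        return "replicate asynchronously"
    if mode == "FAIL":            # hard synchronous RLINK
        return "fail the write"
    raise ValueError("unknown synchronous mode: %s" % mode)

print(on_write_while_disconnected("OVERRIDE"))  # replicate asynchronously
print(on_write_while_disconnected("FAIL"))      # fail the write
```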
Application Bandwidth

Before attempting to configure an RDS, it is necessary to know the data
throughput that must be supported. That is the rate at which the application can be
expected to write data. Only write operations are of concern here: read operations are
always satisfied locally with very little SRVM interference. To perform the analyses
described in later sections, a profile of application bandwidth is required. For an
application with relatively constant bandwidth, the profile could take the form of certain
values, such as:
◆ Average application bandwidth
◆ Peak application bandwidth
◆ Length of peak application bandwidth period
For a more volatile application, a table of measured usages over specified intervals may be
needed.
Because matching application bandwidth to disk capacity is not an issue unique to
replication, it is not discussed here. It is assumed that an application is already running,
and that the VERITAS Volume Manager™ has been used to configure data volumes to
support the bandwidth needs of the application. In this case, the application bandwidth
characteristics may already have been measured.
If the application characteristics are not known, they can be measured by running the
application and using a tool to measure data written to all the volumes to be replicated. If
the application will be writing to a file system rather than a raw data volume, be careful to
include in the measurement all the metadata written by the file system itself. This can add
a substantial amount to the total amount of replicated data. For example, if a database is
using a file system mounted on a replicated volume, a tool such as vxstat (see
vxstat(1M)) will correctly measure the total data written to the volume, while a tool that
monitors the database and measures its requests will fail to include those made by the
underlying file system.
It is also important to consider both peak and average bandwidth created by the
application. These numbers can be used to determine the type of network connection
needed. For synchronous RLINKs, the network must support the peak application
bandwidth. For asynchronous RLINKs that are not required to keep pace with the
Primary, the network only needs to support the overall average application bandwidth.
Finally, once the measurements are made, the numbers calculated as the peak and average
bandwidths should be close to the largest obtained over the measurement period, not the
averages or medians. For example, assume that measurements are made over a 30-day
period, yielding 30 daily peaks and 30 daily averages, and then the average of each of
these is chosen as the application peak and average respectively. If the network is then
sized based on these values, then for half the time there will be insufficient network
capacity to keep up with the application. Instead, the numbers chosen should be close to
the highest obtained over the period, unless there is reason to doubt that they are valid or
typical.
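The selection rule above can be sketched as follows. The measurement arrays are hypothetical stand-ins for data gathered over the measurement period (for example, with vxstat); only the selection logic matters.

```python
# Choosing sizing values from measured daily bandwidth figures (MB/s).
# The numbers are hypothetical; substitute real measurements.
daily_peaks = [12.0, 15.5, 11.0, 14.2, 16.1]
daily_averages = [4.0, 5.1, 3.8, 4.6, 5.3]

# Averaging the daily peaks undersizes the network about half the time.
mean_of_peaks = sum(daily_peaks) / len(daily_peaks)

# Recommended: use values close to the largest obtained over the period,
# discarding only measurements known to be invalid or atypical.
sizing_peak = max(daily_peaks)
sizing_average = max(daily_averages)

print(sizing_peak, sizing_average)  # 16.1 5.3
```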
Configuring for Efficient Operation

Overview

This chapter discusses many of the decisions that must be made when setting up an RDS.
Emphasis is on how each component can affect performance. The discussion assumes an
understanding of the design details described in Chapter 1. Each major component must
be configured properly and is discussed in turn. The components include RLINKs, the
network, the SRL, and the Secondary.
In an ideal configuration, replication proceeds at the same pace that the application
generates data. As a result, all Secondary sites remain relatively up to date. For this to
occur, each component within the configuration must be able to keep up with the
incoming data. This includes the SRL, local and remote data volumes, and the network
connection. The goal in configuring SRVM is for it to be able to handle temporary
bottlenecks, such as an occasional burst of updates or an occasional network problem. If
one of the components cannot keep up with the update rate over the long term, SRVM
will not work.
The type of problem experienced depends on whether or not the lagging component is on
the critical path. The problems likely to be caused by each component are discussed in
more detail below. In general, the two most likely problems to occur are: (1) application
slowdown due to increased write latency, and (2) overflow of the SRL. If a component on
a critical path cannot keep up, additional latency may be added to each write. This in turn
leads to poor application performance. If the component is not on a critical path, the
application writes may proceed at their normal pace, with the excess accumulating in the
SRL, and possibly causing an overflow. So, it is important to examine each component in
turn to ensure that its bandwidth is sufficient to support the expected application
bandwidth.
RLINKs
Synchronous versus Asynchronous
The decision as to whether to use synchronous or asynchronous RLINKs should not be
made without a full understanding of the effects of this choice on system performance.
The relative merits of using synchronous or asynchronous RLINKs become apparent
when the underlying implementation, described in Chapter 1, is understood.
Synchronous RLINKs have the advantage that all writes are guaranteed to reach the
Secondary before completing. For some applications, this may simply be a requirement
that cannot be circumvented – in this case, performance is not a factor in the decision. For
applications where the choice is not so clear, however, this section discusses some of the
performance implications of choosing synchronous operations.
As illustrated in Figure 1 on page 6, all write requests first result in a write to the SRL. It is
only after this write completes that replication begins. Since synchronous RLINKs require
that the data reach the Secondary and be acknowledged before the write completes, the
latency for a write is:
SRL latency + Network round trip latency
Thus, synchronous RLINKs can significantly decrease application performance by adding
the network round trip to the latency of each write request.
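With rough, hypothetical component latencies, the arithmetic works out as follows; neither number is an SRVM measurement.

```python
# Illustrative write-latency figures (milliseconds); hypothetical values.
srl_write_latency_ms = 8.0   # Phase 1: write to the SRL
network_rtt_ms = 40.0        # round trip to the Secondary and back

# Synchronous RLINK: the write completes only after the Secondary
# acknowledges receipt, so the round trip is on the critical path.
sync_write_latency_ms = srl_write_latency_ms + network_rtt_ms

# Asynchronous RLINK: only the SRL write is on the critical path.
async_write_latency_ms = srl_write_latency_ms

print(sync_write_latency_ms, async_write_latency_ms)  # 48.0 8.0
```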
Asynchronous RLINKs avoid increasing the per-write latency by sending the data to the
Secondary after the write completes, thus removing the network round trip latency from
the equation. The most obvious disadvantage of this is that there is no guarantee that a
write which appears complete to the application has actually been replicated.
A more subtle effect of asynchronous RLINKs is that while application throughput should
increase due to decreased write latency, overall replication performance may decrease.
This occurs because if the asynchronous RLINK cannot keep up with incoming data, it
must begin freeing memory that is holding unsent requests for use by incoming requests.
When it is finally ready to send the old requests, they must first be read back from the
SRL. So while synchronous RLINKs always have their data available in memory,
asynchronous ones frequently have to read it off the SRL. Consequently, their
performance might suffer because of the delay of the added read. The need to perform
readbacks also has a negative impact on SRL performance. For synchronous RLINKs, the
SRL is only used for sequential writes and yields excellent performance. For
asynchronous RLINKs, however, the writes may be interspersed with occasional reads
from an earlier part of the SRL, and so performance suffers due to the increased disk head
movement.
Whether this readback slowdown effect occurs depends on whether the RLINK is able to
keep up with incoming data, and, when it cannot, on whether the available memory
buffer is large enough to hold the excess. If the RLINK always keeps up, or if it only falls
behind for short periods during which the excess is small enough to fit in memory,
readback will not be a problem. [See Section 2.6 for information on tuning the size of
SRVM and VERITAS Volume Manager memory buffers.] If readback is a problem, striping
the SRL volume over several disks using mid-sized stripes (for example, 10 times the
average write size), should aid performance. Unfortunately, this conflicts with the tactic of
striping using small stripes to improve SRL bandwidth, as discussed in “SRL Bandwidth”
on page 16.
If synchronous RLINKs are to be used, another factor to consider is that hard synchronous
RLINKs throttle incoming write requests while catching up from a checkpoint. This
means that if, after the Secondary data volumes have been initialized, the RLINK takes ten
hours to catch up, any application waiting for writes to complete will hang for ten hours if
the RLINK is in hard synchronous mode. Thus, for all practical purposes, it is necessary to
either shut down the application or temporarily set the RLINK to soft synchronous or
asynchronous modes until the Secondary has caught up after a checkpoint.
Latency and SRL Protection
RLINKs have two parameters available, latencyprot and srlprot, that provide a
compromise between synchronous and asynchronous characteristics. These parameters
allow the RLINK to fall behind, but limit the extent to which it does so.
When latencyprot is enabled, the RLINK is only allowed to fall behind by a predefined
number of requests, a high-water mark. Once this user-defined high-water mark is
reached, throttling is triggered. This forces all incoming requests to be delayed until the
RLINK has caught up to within another predefined number of requests, the low-water
mark. Thus, the average write latency seen by the application increases. However, the
behavior may appear different than with a synchronous RLINK, depending on the spread
between the high-water mark and low-water mark. A large spread causes occasional long
delays in write requests, which may appear to be application hangs, as the SRL drains
down to the low-water mark. Most other write requests will remain unaffected. A smaller
range will spread the delays more evenly over write requests, resulting in smaller but
more frequent delays. For most cases, a smaller spread is probably preferable.
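The watermark behavior can be modeled with a short sketch. The names and thresholds here are illustrative, not actual SRVM tunables; the point is how the high-water/low-water spread shapes the delays.

```python
# Conceptual model of latencyprot throttling; not SRVM code.
HIGH_WATER = 1000   # outstanding requests allowed before throttling
LOW_WATER = 800     # backlog level at which throttled writes resume

def handle_write(outstanding):
    """Return the backlog after one write; writes arriving at the
    high-water mark are (conceptually) delayed while the RLINK drains
    to the low-water mark."""
    if outstanding >= HIGH_WATER:
        # A small HIGH_WATER - LOW_WATER spread yields many short delays;
        # a large spread yields occasional long, hang-like delays.
        outstanding = LOW_WATER
    return outstanding + 1

backlog = handle_write(999)      # below the mark: no delay
print(backlog)                   # 1000
backlog = handle_write(backlog)  # at the mark: drain to LOW_WATER first
print(backlog)                   # 801
```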
The other relevant parameter, srlprot, is used to prevent the SRL from overflowing, and
has an effect similar to latencyprot. When srlprot is enabled and SRVM detects that
a write request would cause the SRL to overflow, the request and all subsequent requests
are delayed until the SRL has drained to 95% full. The parameters for this feature are not
user-tunable, so the expected behavior is that a large delay results for any writes made
while the SRL is draining. All other writes are unaffected.
Network
Effects on Performance
All replicated write requests must eventually travel over the network to one or more
Secondary nodes. Whether or not this trip is on the critical path depends on the
configuration of the RLINKs in the RVG.
Since synchronous RLINKs require that data reach the Secondary node before the write
can complete, the network is always part of the critical path for synchronous RLINKs.
This means that for any period during which application bandwidth exceeds network
capacity, write latency increases.
Conversely, asynchronous RLINKs do not impose this requirement, so write requests are
not delayed when network capacity is insufficient. Instead, excess requests accumulate in
the SRL, as long as the SRL is large enough to hold them. If there is a chronic shortfall in
network capacity, the SRL will eventually overflow. However, this setup does allow the
SRL to be used as a buffer to handle temporary shortfalls in network capacity, such as
periods of peak usage, provided that these periods are followed by periods during which
the RLINK can catch up as the SRL drains. If a configuration is planned with this
functionality in mind, you must be aware that Secondary sites will frequently be
significantly out of date.
Asynchronous RLINKs have several parameters that can change the behavior described
above, by placing the network round-trip on the critical path in certain situations. The
latencyprot and srlprot features, when enabled, can both have this effect. These
features are discussed fully in “Latency and SRL Protection” on page 13.
Choosing an Appropriate Bandwidth
The network bandwidth depends on the type of connection between the Primary and
Secondary nodes, and the use of the connection. The type of connection determines the
maximum bandwidth available between the two locations, for example, a T3 line provides
45 Mb/second.
The other important factor to consider is whether the available connection will be used by
any other applications, or be exclusively reserved for SRVM. If other applications will be
using the same line, it is important to be aware of the bandwidth requirements of these
applications and subtract them from the total network bandwidth. If any applications
sharing the line have variations in their usage pattern, it is also necessary to consider
whether their times of peak usage are likely to coincide with SRVM’s peaks.
Additionally, overhead added by SRVM and the various underlying protocols reduces
effective bandwidth by a small amount, typically 3 to 5%.
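As a rough sketch of this arithmetic (the line rate, competing usage, and overhead fraction below are all hypothetical examples):

```python
# Estimate the bandwidth actually available to SRVM on a shared link.
link_bandwidth_mbps = 45.0   # e.g. a T3 line
other_apps_mbps = 10.0       # peak usage of applications sharing the line
protocol_overhead = 0.05     # SRVM/protocol overhead, typically 3-5%

available_mbps = (link_bandwidth_mbps - other_apps_mbps) * (1 - protocol_overhead)
print(round(available_mbps, 2))  # 33.25
```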
To avoid problems caused by insufficient network bandwidth, the following general
principles should be applied:
◆ If synchronous RLINKs will be in use, the network bandwidth must at least match the
application bandwidth during its peak usage period. This leaves excess capacity
during non-peak periods, which is useful to allow initialization of new volumes using
checkpoints as described in “Peak Usage Constraint” on page 19.
◆ If only asynchronous RLINKs will be used, and you have the option of allowing the
Secondary to fall behind during peak usage, then the network bandwidth only needs
to match the overall average application bandwidth. This might require the
application to be shut down during initialization procedures, because there will be no
excess network capacity to handle the extra traffic generated by the catchup from the
checkpoint.
◆ If asynchronous RLINKs will be used with latencyprot enabled to avoid falling too
far behind, the requirements depend on how far the RLINK will be allowed to fall
behind. RLINKs with a small high-water mark should be treated as synchronous
RLINKs and therefore should have a network bandwidth sufficient to match the
application bandwidth during its peak usage period. RLINKs with a relatively large
high-water mark (that is, enough to allow the RLINK to fall behind by several hours,
or even a day), may get by with a bandwidth that only matches the average
application bandwidth, and thus be allowed to fall far behind during peak usage
periods.
SRL
SRL Layout
It is critical that there be no overlap between the physical disks comprising the SRL and
those comprising the data volumes, because all write requests to SRVM result in a write to
both the SRL and the requested data volume. Any such overlap is guaranteed to lead to
major performance problems, as the disk head thrashes between the SRL and data
sections of the disk. Slowdowns of over 100% can be expected. Note that the SRL on the
Secondary is not used as frequently, and so its placement is not considered important.
It is highly recommended that the SRL be mirrored to improve its reliability. The loss of the
SRL immediately puts all RLINKs into the STALE state. The only way to recover from this
is to perform a full resynchronization, which is a time-consuming procedure to be avoided
whenever possible. The risk of this failure can be minimized by mirroring the SRL.
SRL Bandwidth
The SRL is on the critical path for all writes, regardless of the RLINK configuration. This is
because, as illustrated in Figure 2 on page 17, all write requests perform and complete a
write to the SRL before any replication occurs. This makes it critical to ensure that the SRL
bandwidth is sufficient for the application.
Due to the design of SRVM, it may be difficult for a volume functioning as an SRL to keep
pace. Figure 2 illustrates two key points that can affect SRL volume performance:
◆ An RVG can contain multiple data volumes but only a single SRL volume.
◆ All writes to any data volume in the RVG also result in a write to the SRL volume.
This means that while writes may be spread across multiple data volumes in the RVG, all
of these writes will be concentrated on a single SRL volume. This makes it easy for the
SRL to become a bottleneck. The problem is partially mitigated by the fact that writes to
the SRL volume are sequential, while those to the data volumes are more likely to be
random. As a result, the SRL volume essentially gets a head start on each write. If a large
percentage of the application’s accesses are reads, then much of the data volumes’
capacity will be used in satisfying reads, which do not affect the SRL volume, so the SRL
volume should be able to keep up.
Figure 2. Flow of data when multiple write requests to different data volumes are being
processed. Note how writes are concentrated on the SRL volume, then distributed among
the data volumes. (For clarity, multiple data buffers and data volumes are not shown on
the Secondary.)
However, for some applications it is certainly possible that the overall bandwidth of
writes to the data volumes may exceed the physical capacity of the disk containing the
SRL volume. Since the latency of every write includes the time taken to write to the SRL
volume, this situation would cause the SRL volume to become a bottleneck, and increase
the latency of each write. In this case, it may be necessary to use standard Volume
Manager procedures to stripe the SRL volume over several physical disks to increase the
available bandwidth. The stripe size should be of the same order of magnitude as a typical
write, so that consecutive writes often end up on different physical disks.
If it is determined that the SRL is a bottleneck, but the situation is not alleviated through
the use of striping or some other solution, then the application bandwidth measured in
“Application Bandwidth” on page 8 becomes irrelevant, and the SRL bandwidth can be
used in its place when sizing the remaining components.
SRL Sizing
The size of the SRL affects the likelihood that it will overflow. When the SRL overflows for
a particular RLINK, that RLINK is marked STALE, and the corresponding remote RVG
becomes out of date until a full resynchronization with the Primary is performed. Since
this is a time-consuming process, and also renders the Secondary useless until it is
completed, SRL overflows are to be avoided whenever possible.
The SRL size needs to be large enough to satisfy four constraints:
◆ It must not overflow for asynchronous RLINKs during periods of peak usage when
replication over the RLINK may fall far behind the application.
◆ It must not overflow while a Secondary RVG is being initialized.
◆ It must not overflow while a Secondary RVG is being restored.
◆ It must not overflow during extended outages (network or Secondary node).
To determine the size needed for the SRL volume, you should determine the size required
to satisfy each of these constraints individually. Then, choose a value at least equal to the
maximum so that all will be satisfied. The information needed to perform this analysis,
presented below, includes:
◆ The maximum expected downtime for Secondary nodes
◆ The maximum expected downtime for the network connection
◆ The method for initializing Secondary data volumes with data from Primary data
volumes. If the application will be shut down to perform the initialization, then the
SRL will not grow and the method is unimportant. Otherwise, this information could
include: the time required to copy the data over a network, or the time required to
copy it to a tape or disk, to send the copy to the Secondary site, and to load the data
onto the Secondary data volumes.
Note: If the Automatic Synchronization Option is used to initialize the Secondary, the
previous paragraph is not a concern.
If Secondary backup will be performed to avoid full resynchronization in case of
Secondary data volume failure, the information needed also includes:
◆ The frequency of Secondary backups
◆ The maximum expected delay to detect and repair a failed Secondary data volume
◆ The expected time to reload backups onto the repaired Secondary data volume
Peak Usage Constraint
For some configurations, it might be common for replication to fall behind the application
during some periods and catch up during others. For example, an RLINK might fall
behind during business hours and catch up overnight if its peak bandwidth requirements
exceed the network bandwidth. Of course, for synchronous RLINKs, this does not apply,
as a shortfall in network capacity would cause each application write to be delayed, so the
application would run more slowly, but would not get ahead of replication.
For asynchronous RLINKs, the only limit to how far replication can fall behind is the size
of the SRL. If it is known that the application’s peak bandwidth requirements will exceed
the available network bandwidth, then it becomes important to consider this factor when
sizing the SRL.
Assuming that data is available providing the typical application bandwidth over a series
of intervals of equal length, it is simple to calculate the SRL size needed to support this
usage pattern:
1. Calculate the network capacity over the given interval (BWN).
2. For each interval n, calculate the SRL growth (LGn) as the excess of application
bandwidth (BWAP(n)) over network bandwidth: LGn = BWAP(n) – BWN.
3. For each interval, accumulate all the SRL growth values to find the cumulative SRL
size (LSn):

   LSn = Σ(i=1...n) LGi

The largest value obtained for any LSn is the value that should be used for SRL size as
determined by the peak usage constraint.
Table 1 shows an example of this calculation. The second column contains the maximum
likely application bandwidth per hour obtained by measuring the application as
discussed in “Application Bandwidth” on page 8. Column 4 shows, for each hour, how
much excess data the application generates that cannot be sent over the network. Column
5 shows the sums obtained for each interval. Since the largest sum is 37 GB, the SRL
would need to be at least this large for this application.
Note that several factors can reduce the maximum size to which the SRL can grow during
the peak usage period. Among these are:
◆ The latencyprot characteristic can be enabled to restrict the amount by which the
RLINK can fall behind, and thus the amount by which the SRL can grow.
◆ The network bandwidth can be increased to handle the full application bandwidth.
Table 1. Example Calculation of SRL Size Required to Support Peak Usage Period

Hour ending   Application   Network     SRL Growth   Cumulative
              (GB/hour)     (GB/hour)   (GB)         SRL Size (GB)
8 a.m.        6             5           1            1
9             10            5           5            6
10            15            5           10           16
11            15            5           10           26
12 p.m.       10            5           5            31
1             2             5           -3           28
2             6             5           1            29
3             8             5           3            32
4             8             5           3            35
5             7             5           2            37
6             3             5           -2           35
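The steps above, applied to the Table 1 figures, can be sketched as follows. The hourly application bandwidths and the 5 GB/hour network bandwidth are the example's own numbers:

```python
# Peak-usage SRL sizing: accumulate hourly SRL growth (LGn = BWAP(n) - BWN)
# and take the largest cumulative value (LSn) as the minimum SRL size.
app_gb_per_hour = [6, 10, 15, 15, 10, 2, 6, 8, 8, 7, 3]  # BWAP(n), 8 a.m. - 6 p.m.
bwn = 5  # network capacity per interval (GB/hour), from the example

cumulative = 0           # LSn, running SRL occupancy
required_srl_gb = 0      # largest LSn seen so far

for bwap in app_gb_per_hour:
    lg = bwap - bwn                          # LGn = BWAP(n) - BWN
    cumulative += lg                          # LSn = sum of LGi, i = 1...n
    required_srl_gb = max(required_srl_gb, cumulative)

print(required_srl_gb)  # -> 37
```

The result matches the largest value in the Cumulative SRL Size column of Table 1.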
Initialization Period Constraint
This section applies only if the Automatic Synchronization Option is not used. When a new
Secondary RVG is brought online, its data volumes must be initialized to match those on
the Primary unless the Primary is also starting from scratch. If the application on the
Primary can be shut down while data is copied to the Secondary, this operation becomes
trivial and the SRL size is irrelevant. However, in most cases, it will be necessary to copy
existing data from the Primary to the Secondary while the application is still running on
the Primary.
The following procedure, referred to as a Primary checkpoint, is used in this case:
1. Start a checkpoint on the Primary RVG.
2. Copy all Primary data volumes.
3. End the checkpoint.
4. Transmit the data to the Secondary site.
5. Load the Secondary data volumes with the data.
6. Start replicating from the start of the checkpoint.
If the total amount of data is small relative to network speed, then step 2, step 4, and
step 5 may be accomplished in a single operation by copying the Primary data volumes over
the network to the Secondary data volumes. However, for large databases, it is likely to be
faster to copy the Primary data volumes to tape in step 2 and ship the tapes via a courier
in step 4. For distant locations, step 4 may take almost a day if an overnight courier is
used. For large databases, writing and reading the tapes in step 2 and step 5 may also add
significant delays. (Another option would be to copy the data directly to disks, ship the
disks, and import them on the Secondary.)
During the entire initialization period between step 1 and step 6, the application is
running, so data is accumulating in the SRL. Thus, to ensure that the SRL does not
overflow during this period, it is necessary that the SRL be sized to hold as much data as
the application could write during the initialization period. After the initialization period,
this data will gradually be replicated and the Secondary will eventually catch up to the
Primary.
Note that until the Secondary catches up, it will be inconsistent and out-of-date with
respect to the Primary. This is an unavoidable consequence of the requirement that the
application continue to run during this period.
To perform the initialization period calculation, first obtain an estimate of the expected
time to perform step 1 through step 6. Although the first time an initialization is
performed, it may be possible to schedule it for a slow period such as a night or weekend,
it is possible that a future initialization could be necessary during a busy period due to the
need to resynchronize a Secondary after a failure. If this could be the case, then the
calculation should use worst-case numbers for application bandwidth during the
initialization period. If the site requirements will always allow an initialization to be
performed at the most convenient time, then best-case values for application bandwidth
can be used.
In either case, given the application profile obtained in “Application Bandwidth” on
page 8, it should be a simple matter to determine the maximum amount of data that could
be generated by the application over the time period expected for an initialization. Since
all this data must be available on the SRL at the end of initialization to bring the
Secondary up to date, the SRL must be at least this large.
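As an illustration only, the worst-case accumulation over an assumed 10-hour initialization can be sketched as below. The hourly bandwidth profile here is hypothetical, not taken from this document:

```python
# Initialization-period SRL sizing: the SRL must hold everything the
# application writes between step 1 and step 6.
init_hours = 10  # estimated total time for steps 1 through 6 (assumed)

# Assumed worst-case application bandwidth (GB/hour) for each of those
# hours: 8 busy hours followed by 2 quiet ones.  Purely illustrative.
worst_case_gb_per_hour = [1.0] * 8 + [0.25] * 2

min_srl_gb = sum(worst_case_gb_per_hour[:init_hours])
print(min_srl_gb)  # -> 8.5
```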
Secondary Backup Constraint
SRVM provides a mechanism to perform periodic backups of the Secondary data
volumes. In case of a problem that would otherwise require a full resynchronization using
a Primary checkpoint, as described in "Initialization Period Constraint" on page 21, a
Secondary backup, if available, can be used to get the Secondary back on line much more
quickly. An example of such a case would be the failure of an unmirrored Secondary data
volume.
A Secondary backup is made by defining a Secondary checkpoint and then making a copy
of all the Secondary data volumes. Should a failure occur, the Secondary data volumes can
be restored from this local copy, and then replication can proceed from the original
checkpoint, thus replaying all the data from the checkpoint to the present.
The constraint introduced by this process is that the SRL must be large enough to hold all
the data between the most recent checkpoint and the present. This depends largely on
three factors:
◆ The application data bandwidth.
◆ The SRL size.
◆ The frequency of Secondary backups.
Thus, given an application data bandwidth and frequency of Secondary backups, it is
possible to come up with a minimal SRL size. Realistically, an extra margin should be
added to an estimate arrived at using these figures to cover other possible delays,
including:
◆ Maximum delay before a data volume failure will be detected by a system
administrator.
◆ Maximum delay to repair or replace the failed drive.
◆ Delay to reload disk with data from the backup tape.
To arrive at an estimate of the SRL size needed to support this constraint, first determine
the total time period the SRL needs to support by adding the period planned between
Secondary backups to the time expected for the three factors mentioned above. Then, use
the application bandwidth data to determine, for the worst case, the amount of data the
application could generate over this time period.
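A hedged sketch of this calculation follows; the backup interval, the three delay figures, and the worst-case bandwidth are hypothetical examples, not values from this document:

```python
# Secondary-backup SRL sizing: the SRL must cover the planned interval
# between Secondary backups plus the three delay factors listed above.
backup_interval_hours = 7 * 24   # weekly Secondary backups (assumed)
detect_hours = 4                 # delay to detect the failed data volume
repair_hours = 24                # delay to repair or replace the drive
reload_hours = 3                 # delay to reload data from the backup tape

window_hours = backup_interval_hours + detect_hours + repair_hours + reload_hours

worst_case_gb_per_hour = 1.0     # assumed worst-case application bandwidth

min_srl_gb = window_hours * worst_case_gb_per_hour
print(min_srl_gb)  # -> 199.0
```

In practice, an extra safety margin would be added on top of this figure, as the text recommends.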
Downtime Constraint
When the network connection to a Secondary node, or the Secondary node itself, goes
down, the RLINK on the Primary node detects the broken connection and responds. For
an RLINK in hard synchronous mode, the response is to fail all subsequent write requests
until the connection is restored. In this case, the SRL will not grow, so the downtime
constraint is irrelevant. For all other types of RLINKs, incoming write requests
accumulate in the SRL until the connection is restored. Thus, the SRL must be large
enough to hold the maximum output that the application could be expected to generate
over the maximum possible downtime.
Maximum downtimes may be difficult to estimate. In some cases, there may be vendor
guarantees that failed hardware or network connections will be repaired within some
period. Of course, if the repair is not completed within the guaranteed period, the SRL
will overflow despite any guarantee, so it would be a good idea to add a safety margin to
any such estimate.
To arrive at an estimate of the SRL size needed to support this constraint, first obtain
estimates for the maximum downtimes which the Secondary node and network
connections could reasonably be expected to incur. Then, use the application bandwidth
data to determine, for the worst case, the amount of data the application could generate
over this time period.
Additional Factors
Once estimates of required SRL size have been obtained under each of the constraints
described above, several additional factors must be considered.
For the initialization period, downtime and Secondary backup constraints, it is not
unlikely that any of these situations could be immediately followed by a period of peak
usage. In this case, the Secondary could continue to fall further behind rather than
catching up during the peak usage period. As a result, it might be necessary to add the
size obtained from the peak usage constraint to the maximum size obtained using the
other constraints. Note that this applies even for soft synchronous RLINKs, which are not
normally affected by the peak usage constraint, because after a disconnect, they act as
asynchronous RLINKs until caught up.
Of course, other combinations of events could also occur, requiring several constraints to
be added together. For example, an initialization period could be immediately followed by a
long network failure, or a network failure could be followed by a Secondary node failure.
Whether and to what degree to plan for unlikely occurrences requires weighing the cost of
additional storage against the cost of additional downtime caused by an SRL overflow.
Once an estimate has been computed, one more adjustment must be made to account for
the fact that all data written to the SRL also includes some header information. This
adjustment must take into account the typical size of write requests. Each request uses at
least one additional disk block (512 bytes) for header information, so the adjustment
should be as follows:
If Average Write Size is:    Add this Percentage to SRL Size:
512 bytes                    100
1 K                          50
2 K                          25
5 K or more                  10
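The adjustment in the table follows from the one-extra-block-per-request overhead and can be sketched as below; the function name is illustrative:

```python
# Header-overhead adjustment: each write uses at least one additional
# 512-byte disk block for header information, so smaller average writes
# waste proportionally more SRL space.
def srl_adjustment_pct(avg_write_bytes):
    """Percentage to add to the SRL size estimate, per the table above."""
    if avg_write_bytes >= 5 * 1024:
        return 10
    # One 512-byte header block per request:
    # 512 bytes -> 100%, 1 K -> 50%, 2 K -> 25%.
    return 512 * 100 // avg_write_bytes

print(srl_adjustment_pct(2 * 1024))  # -> 25
```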
Example
This section contains an example of how a particular site might go about calculating a
reasonable SRL size for its configuration. First, all the relevant parameters for the site
must be collected. For this site, they are as follows:
Application peak write bandwidth         1 GB/hour
Duration of peak                         8 a.m. - 8 p.m.
Application off-peak write bandwidth     250 MB/hour
Average write size                       2 KB
Number of Secondary sites                1
Type of RLINK                            soft synchronous
Initialization period:
  application shutdown                   no
  copy data to tape                      3 hours
  send tapes to Secondary site           4 hours
  load data                              3 hours
  Total                                  10 hours
Maximum downtime for Secondary node      4 hours
Maximum downtime for network             24 hours
Secondary backup                         not used

Since soft synchronous RLINKs will be used, the network must be sized to handle the peak
application bandwidth, so that the SRL will not grow during the peak usage period. Thus,
the peak usage constraint is not an issue, and the largest constraint is that the network
could be out for 24 hours. The data accumulating in the SRL over this period would be:

(1 GB/hour x 12 hours) + (1/4 GB/hour x 12 hours) = 12 GB + 3 GB = 15 GB

Since the 24-hour downtime is already an extreme case, no additional adjustment is made
to handle other constraints. An adjustment of 25% is made to handle header information.
The result shows that the SRL should be at least 18.75 GB.
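The example's arithmetic can be reproduced as a short sketch:

```python
# 24-hour network outage spanning 12 peak hours and 12 off-peak hours,
# followed by the 25% header adjustment for a 2 KB average write size.
peak_gb = 1.0 * 12        # 1 GB/hour over the 12 peak hours
off_peak_gb = 0.25 * 12   # 250 MB/hour over the 12 off-peak hours

accumulated_gb = peak_gb + off_peak_gb   # 15 GB held in the SRL

min_srl_gb = accumulated_gb * 1.25       # add 25% for header information
print(min_srl_gb)  # -> 18.75
```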
Buffer Space
When a write request is made, an SRVM data buffer is allocated to it. The amount of buffer
space available affects SRVM performance, which can affect performance for the
underlying Volume Manager volumes. You can use the following tunables to allocate
buffer space on the Primary and Secondary according to your requirements:
◆ voliomem_max_readbackpool_sz
◆ voliomem_maxpool_sz
◆ voliomem_max_nmcompool_sz
These tunables can be modified by adding lines to the /etc/system file. For details on
changing the SRVM tunables, see Chapter 5, "Administering SRVM," in the VERITAS
Storage Replicator for Volume Manager Administrator's Guide. The following sections
describe each of the above tunables.
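As a hedged illustration, entries of the following form could be added to /etc/system to raise each pool to 8 MB; the vxio module prefix and the values shown are assumptions here, not taken from this document, so confirm the exact syntax in the Administrator's Guide before editing /etc/system:

```
* Hypothetical /etc/system entries raising each SRVM buffer pool
* from the 4 MB default to 8 MB (0x800000 bytes).
set vxio:voliomem_maxpool_sz=0x800000
set vxio:voliomem_max_readbackpool_sz=0x800000
set vxio:voliomem_max_nmcompool_sz=0x800000
```

A reboot is required for /etc/system changes to take effect.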
Readback Buffer Space
When a write request is made, an SRVM data buffer is allocated to it. The data buffer is
not released until the data has been written to the Primary and sent to all synchronous
Secondary data volumes. When the buffer space becomes low, several effects are possible,
depending on the configuration. SRVM will begin to free some buffers before sending the
data across the asynchronous RLINKs. This frees up more space for incoming write
requests so that they will not be delayed. The cost is that it forces the freed requests to be
read back from the SRL later, when an RLINK is ready to send them. As discussed in
“Synchronous versus Asynchronous” on page 12, the need to perform readback may have
a slight impact on write latency because it makes the SRL perform more non-sequential
I/O.
The amount of buffer space available for these readbacks is defined by the tunable
voliomem_max_readbackpool_sz, which defaults to 4 MB. To enable more readbacks
at the same time, increase the value of voliomem_max_readbackpool_sz. You may
need to increase this value if you have multiple asynchronous RLINKs. If multiple RVGs
are present on a node, this value can be increased according to your requirements.
Write Buffer Space on the Primary
The amount of buffer space that can be allocated within the operating system to handle
incoming writes is defined by the tunable voliomem_maxpool_sz, which defaults to
4 MB. If the available buffer space is too small, writes are held up while SRVM frees old
buffer space to allow new writes to be processed. The freed requests are read back from
the SRL when an RLINK is ready to send them to the Secondary. If
voliomem_maxpool_sz is large enough to hold the incoming writes, these readbacks
can be avoided. To increase the number of concurrent writes, or to reduce the number of
readbacks from the SRL, increase the value of voliomem_maxpool_sz.
Buffer Space on the Secondary
Secondary data volumes are not directly on the critical path; any individual write on the
Primary can complete before the write to the Secondary data volumes completes, even for
synchronous RLINKs. The following feedback mechanism limits the amount by which
Secondary data volumes can fall behind.
This mechanism involves a limit on the amount of memory that is allocated on a
Secondary node to handle incoming requests from the Primary node. Once this limit is
reached, the Secondary rejects incoming requests until existing requests complete their
writes to the Secondary data volumes and free their memory. Since this appears to the
Primary as an inability to send requests over the network, the consequences are identical
to those pertaining to insufficient network bandwidth. Thus, the results depend on
whether synchronous or asynchronous RLINKs are in use. For asynchronous RLINKs,
there may be no limit to how far Secondary data volumes can fall behind unless the
mechanisms discussed in “Latency and SRL Protection” on page 13 are in force.
The amount of buffer space available for requests coming in to the Secondary over the
network is determined by the SRVM tunable voliomem_max_nmcompool_sz, which
defaults to 4 MB. Since this value is global, and therefore restricts all Secondary RVGs on a
node, it may be useful to increase it if multiple Secondary RVGs will be present on the
Secondary node. If there is a high volume of requests, increase
voliomem_max_nmcompool_sz.
Glossary
hard synchronous
A characteristic of an RLINK, which, when set, indicates that if the RLINK is disconnected
or paused, any incoming write requests will be failed.
high-water mark
A parameter associated with an RLINK, used only when latencyprot is enabled. In this
case, when the RLINK falls behind by this number of requests, throttling is triggered, so
all incoming write requests are delayed until the number of requests behind drops to the
low-water mark.
low-water mark
A parameter associated with an RLINK, used only when latencyprot is enabled. In this
case, when throttling is triggered, it remains in effect (no new write requests are
processed) until the number of requests the RLINK is behind drops to this number.
RDS
Replicated Data Set
RVG
Replicated Volume Group
soft synchronous
A characteristic of an RLINK, which, when set, indicates that if the RLINK is disconnected
or paused, then the RLINK switches to asynchronous mode.
SRVM
Storage Replicator for Volume Manager
SRL
Storage Replicator Log