Dag

Date post: 27-Oct-2015
Upload: rajeev-srivastava
Description:
About DAG
My Collection

This document is provided "as-is". Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. This document does not provide you with any legal rights to any intellectual property in any Microsoft product or product name. You may copy and use

Terms of Use (http://technet.microsoft.com/cc300389.aspx) | Trademarks (http://www.microsoft.com/library/toolbar/3.0/trademarks/en-us.mspx)
Table of Contents

Chapter 1

Managing High Availability and Site Resilience: Exchange 2013 Help

Managing Database Availability Groups: Exchange 2013 Help

Managing Mailbox Database Copies: Exchange 2013 Help

Monitoring Database Availability Groups: Exchange 2013 Help

Chapter 1

Managing High Availability and Site Resilience

Applies to: Exchange Server 2013

Topic Last Modified: 2012-11-05

After you build, validate, and deploy a Microsoft Exchange Server 2013 high availability or site resilience solution, the solution transitions from the deployment phase to the operational phase of the overall solution lifecycle. The operational phase consists of several tasks, and all tasks are related to one of the following areas: database availability groups (DAGs), mailbox database copies, performing proactive monitoring, and managing switchovers and failovers.

Contents

Database availability group management

Mailbox database copy management

Proactive monitoring

Switchovers and failovers

Database availability group management

The operational management tasks associated with DAGs include:

Creating one or more DAGs: Creating a DAG is typically a one-time procedure performed during the deployment phase of the solution lifecycle. However, there may be reasons for creating DAGs that occur during the operational phase, for example:

The DAG is configured for third-party replication mode, and you want to revert to using continuous replication. You can't convert a DAG back to continuous replication; you need to create a DAG.

You have servers in multiple domains. All members of the same DAG must also be members of the same domain.

Managing DAG membership: Managing DAG members is an infrequent task typically performed during the deployment phase of the solution lifecycle. However, because of the flexibility provided by incremental deployment, managing DAG membership may also be performed throughout the solution lifecycle.

Configuring DAG properties: Each DAG has various properties that can be configured as needed. These properties include:

Witness server and witness directory: The witness server is a server outside the DAG that acts as a quorum voter when the DAG contains an even number of members. The witness directory is a directory created and shared on the witness server for use by the system in maintaining a quorum.

IP addresses: Each DAG will have one or more IPv4 addresses, and optionally, one or more IPv6 addresses. The IP addresses assigned to the DAG are used by the DAG's underlying cluster. The number of IPv4 addresses assigned to the DAG equals the number of subnets that comprise the MAPI network used by the DAG. You can configure the DAG to use static IP addresses or to obtain addresses automatically by using Dynamic Host Configuration Protocol (DHCP).

Datacenter Activation Coordination mode: Datacenter Activation Coordination mode is a property setting on a DAG that's designed to prevent split-brain conditions at the database level, in a scenario in which you're restoring service to a primary datacenter after a datacenter switchover has been performed. For more information about Datacenter Activation Coordination mode, see Datacenter Activation Coordination Mode.

Alternate witness server and alternate witness directory: The alternate witness server and alternate witness directory are values that you can preconfigure as part of the planning process for a datacenter switchover. These refer to the witness server and witness directory that will be used when a datacenter switchover has been performed.

Replication port: By default, all DAGs use TCP port 64327 for continuous replication. You can modify the DAG to use a different TCP port for replication by using the ReplicationPort parameter of the Set-DatabaseAvailabilityGroup cmdlet.

Network discovery: You can force the DAG to rediscover networks and network interfaces. This operation is used when you add or remove networks or introduce new subnets. Rediscovery of all DAG networks can be forced by using the DiscoverNetworks parameter of the Set-DatabaseAvailabilityGroup cmdlet.

Network compression: By default, DAGs use compression only between DAG networks on different subnets. You can enable compression for all DAG networks or for seeding operations only, or you can disable compression for all DAG networks.

Network encryption: By default, DAGs use encryption only between DAG networks on different subnets. You can enable encryption for all DAG networks or for seeding operations only, or you can disable encryption for all DAG networks.

Shutting down DAG members: The Exchange 2013 high availability solution is integrated with the Windows shutdown process. If an administrator or application initiates a shutdown of a Windows server in a DAG that has a mounted database that's replicated to one or more DAG members, the system will try to activate another copy of the mounted databases prior to allowing the shutdown process to complete. However, this new behavior doesn't guarantee that all of the databases on the server being shut down will experience a lossless activation. As a result, it's a best practice to perform a server switchover prior to shutting down a server that's a member of a DAG.

For detailed steps about how to create a DAG, see Create a Database Availability Group. For detailed steps about how to configure DAGs and DAG properties, see Configure Database Availability Group Properties. For more information about each of the preceding management tasks, and about managing DAGs in general, see Managing Database Availability Groups.
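The IP address rule among the DAG properties above (one IPv4 address per subnet that makes up the MAPI network) comes down to counting distinct subnets. The following Python sketch is illustrative only and isn't part of Exchange; the function name and the sample interface addresses are hypothetical.

```python
import ipaddress


def required_dag_ipv4_count(mapi_interfaces):
    """Return how many IPv4 addresses a DAG would need for its MAPI network.

    mapi_interfaces: iterable of interface addresses in CIDR form, one per
    DAG member (hypothetical sample input, not output from any cmdlet).
    """
    subnets = set()
    for cidr in mapi_interfaces:
        interface = ipaddress.ip_interface(cidr)
        if interface.version == 4:
            # .network drops the host bits, e.g. 10.0.1.11/24 -> 10.0.1.0/24
            subnets.add(interface.network)
    return len(subnets)


# Three members, two of them on the same subnet: two addresses are needed.
members = ["10.0.1.11/24", "10.0.1.12/24", "10.0.2.11/24"]
print(required_dag_ipv4_count(members))  # prints 2
```

IPv6 interfaces are skipped here because the source states the rule only for IPv4 addresses.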

Mailbox database copy management

The operational management tasks associated with mailbox database copies include:

Adding mailbox database copies: When you add a copy of a mailbox database, continuous replication is automatically enabled between the existing database and the database copy.

Configuring mailbox database copy properties: You can configure a variety of properties, such as the database activation policy, the amount of time, if any, for replay lag and truncation lag, and the activation preference for the database copy.

Suspending or resuming a mailbox database copy: You can suspend a mailbox database copy in preparation for seeding, or for other forms of maintenance. You can also suspend a mailbox database copy for activation only. This configuration prevents the system from automatically activating the copy as a result of a failure, but it still allows the system to keep the database copy up to date with log shipping and replay.

Updating a mailbox database copy: Updating, also known as seeding, is the process in which a copy of a mailbox database is added to another Mailbox server. This becomes the baseline database for the copy. After the initial seed of the baseline database copy, only in rare circumstances will the database need to be seeded again.

Activating a mailbox database copy: Activating is the process of designating a specific passive copy as the new active copy of a mailbox database. This process is referred to as a switchover. For more information, see "Switchovers and Failovers" later in this topic.

Removing a mailbox database copy: You can remove a mailbox database copy at any time. Occasionally, it may be necessary to remove a mailbox database copy. For example, you can't remove a Mailbox server from a DAG until all mailbox database copies are removed from the server. In addition, you must remove all copies of a mailbox database before you can change the path for a mailbox database.

For detailed steps about how to add a mailbox database copy, see Add a Mailbox Database Copy. For detailed steps about how to configure mailbox database copies, see Configure Mailbox Database Copy Properties. For more information about each of the preceding management tasks, and about managing mailbox database copies in general, see Managing Mailbox Database Copies. For detailed steps about how to remove a mailbox database copy, see Remove a Mailbox Database Copy.

Proactive monitoring

Making sure that your servers are operating reliably and that your database copies are healthy are key objectives for daily messaging operations. Exchange 2013 includes a number of features that can be used to perform a variety of health monitoring tasks for DAGs and mailbox database copies, including:

Get-MailboxDatabaseCopyStatus

Test-ReplicationHealth

Crimson channel event logging

In addition to monitoring the health and status, it's also critical to monitor for situations that can compromise availability. For example, we recommend that you monitor the redundancy of your replicated databases. It's critical to avoid situations where you're down to a single copy of a database. This scenario should be treated with the highest priority and resolved as soon as possible.
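The redundancy check described above can be sketched as a simple count of usable copies per database. This Python example is a hedged illustration, not an Exchange tool: the input dictionary is hypothetical sample data, and treating only the "Mounted" and "Healthy" states as usable copies is an assumption made for the sketch.

```python
def databases_at_risk(copy_status, minimum_copies=2):
    """Return the databases that have fewer usable copies than minimum_copies.

    copy_status: dict mapping a database name to the list of status strings
    of its copies (hypothetical input gathered by some monitoring script).
    """
    # Assumption: only these two states count as usable copies.
    usable_states = {"Mounted", "Healthy"}
    at_risk = []
    for database, statuses in copy_status.items():
        usable = sum(1 for status in statuses if status in usable_states)
        if usable < minimum_copies:
            at_risk.append(database)
    return sorted(at_risk)


sample = {
    "DB1": ["Mounted", "Healthy", "Healthy"],
    "DB2": ["Mounted", "FailedAndSuspended"],
    "DB3": ["Mounted"],
}
print(databases_at_risk(sample))  # prints ['DB2', 'DB3']
```

DB2 is flagged even though it has two copies, because only one of them is usable under the stated assumption.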

For more detailed information about monitoring the health and status of DAGs and mailbox database copies, see Monitoring Database Availability Groups.

Switchovers and failovers

A switchover is a manual process in which an administrator activates one or more mailbox database copies. Switchovers, which can occur at the database or server level, are typically performed as part of preparation for maintenance activities. Switchover management involves performing database or server switchovers as needed. For example, if you need to perform maintenance on a Mailbox server in a DAG, you would first perform a server switchover so that the server didn't host any active mailbox database copies. For detailed steps about how to perform a database switchover, see Activate a Mailbox Database Copy. Switchovers can also be performed at the datacenter level.

A failover is the automatic activation by the system of one or more database copies in reaction to a failure. For example, the loss of a disk drive in a RAID-less environment will trigger a database failover. The loss of the MAPI network or a power failure will trigger a server failover.

Managing Database Availability Groups

Applies to: Exchange Server 2013

Topic Last Modified: 2013-01-08

A database availability group (DAG) is a set of up to 16 Microsoft Exchange Server 2013 Mailbox servers that provides automatic, database-level recovery from a database, server, or network failure. DAGs use continuous replication and a subset of Windows failover clustering technologies to provide high availability and site resilience. Mailbox servers in a DAG monitor each other for failures. When a Mailbox server is added to a DAG, it works with the other servers in the DAG to provide automatic, database-level recovery from database failures.

When you create a DAG, it's initially empty, and a directory object is created in Active Directory that represents the DAG. The directory object is used to store relevant information about the DAG, such as server membership information. When you add the first server to a DAG, a failover cluster is automatically created for the DAG. In addition, the infrastructure that monitors the servers for network or server failures is initiated. The failover cluster heartbeat mechanism and cluster database are then used to track and manage information about the DAG that can change quickly, such as database mount status, replication status, and last mounted location.

Contents

Creating DAGs

DAG membership

Configuring DAG properties

DAG networks

Configuring DAG members

Performing maintenance on DAG members

Shutting down DAG members

Installing update rollups on DAG members

Creating DAGs

A DAG can be created using the New Database Availability Group wizard in the Exchange Administration Center (EAC), or by running the New-DatabaseAvailabilityGroup cmdlet in the Exchange Management Shell. When creating a DAG, you provide a name for the DAG, and optional witness server and witness directory settings. In addition, one or more IP addresses are assigned to the DAG, either by using static IP addresses or by allowing the DAG to be automatically assigned the necessary IP addresses using Dynamic Host Configuration Protocol (DHCP). You can manually assign IP addresses to the DAG by using the DatabaseAvailabilityGroupIpAddresses parameter. If you omit this parameter, the DAG attempts to obtain an IP address by using a DHCP server on your network.

For detailed steps about how to create a DAG, see Create a Database Availability Group.

When you create a DAG, an empty object representing the DAG with the name you specified and an object class of msExchMDBAvailabilityGroup is created in Active Directory.

DAGs use a subset of Windows failover clustering technologies, such as the cluster heartbeat, cluster networks, and cluster database (for storing data that changes or can change quickly, such as database state changes from active to passive or the reverse, or from mounted to dismounted or the reverse). Because DAGs rely on Windows failover clustering, they can only be created on Exchange 2013 Mailbox servers running the Windows Server 2008 R2 Enterprise or Datacenter operating system or the Windows Server 2012 Standard or Datacenter operating system.

Note:

The failover cluster created and used by the DAG must be dedicated to the DAG. The cluster can't be used for any other high availability solution or for any other purpose. For example, the failover cluster can't be used to cluster other applications or services. Using a DAG's underlying failover cluster for purposes other than the DAG isn't supported.

DAG witness server and witness directory

When creating a DAG, you need to specify a name for the DAG no longer than 15 characters that's unique within the Active Directory forest. In addition, each DAG is configured with a witness server and witness directory. The witness server and its directory are used only when there's an even number of members in the DAG and then only for quorum purposes. You don't need to create the witness directory in advance. Exchange automatically creates and secures the directory for you on the witness server. The directory shouldn't be used for any purpose other than for the DAG witness server.

The requirements for the witness server are as follows:

The witness server can't be a member of the DAG.

The witness server must be in the same Active Directory forest as the DAG.

The witness server must be running Windows Server 2012, Windows Server 2008 R2, Windows Server 2008, Windows Server 2003 R2, or Windows Server 2003.

A single server can serve as a witness for multiple DAGs. However, each DAG requires its own witness directory.
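The first three requirements above lend themselves to a pre-flight check before a DAG is created. The Python sketch below is only an illustration of that idea; the function, its parameters, and the sample server names are hypothetical, and nothing here queries Active Directory or Exchange.

```python
# Operating systems listed in the witness server requirements above.
SUPPORTED_WITNESS_OS = {
    "Windows Server 2012",
    "Windows Server 2008 R2",
    "Windows Server 2008",
    "Windows Server 2003 R2",
    "Windows Server 2003",
}


def witness_problems(witness, witness_forest, witness_os, dag_members, dag_forest):
    """Return a list of requirement violations (empty if none were found)."""
    problems = []
    # Server and forest names are compared case-insensitively.
    if witness.lower() in {member.lower() for member in dag_members}:
        problems.append("witness server is a member of the DAG")
    if witness_forest.lower() != dag_forest.lower():
        problems.append("witness server is in a different Active Directory forest")
    if witness_os not in SUPPORTED_WITNESS_OS:
        problems.append("witness server operating system is not supported")
    return problems


print(witness_problems("CAS3", "contoso.com", "Windows Server 2012",
                       ["MBX1", "MBX2"], "contoso.com"))  # prints []
```

The fourth requirement (a separate witness directory per DAG) isn't modeled, because it depends on the directories configured for other DAGs.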

We recommend that you use an Exchange 2013 Client Access server in the Active Directory site containing the DAG. This allows the witness server and directory to remain under the control of an Exchange administrator. Regardless of what server is used as the witness server, if the Windows Firewall is enabled on the intended witness server, you must enable the Windows Firewall exception for File and Printer Sharing.

Important:

If the witness server you specify isn't an Exchange 2013 or Exchange 2010 server, you must add the Exchange Trusted Subsystem universal security group (USG) to the local Administrators group on the witness server prior to creating the DAG. These security permissions are necessary to ensure that Exchange can create a directory and share on the witness server as needed.

Neither the witness server nor the witness directory needs to be fault tolerant or use any form of redundancy or high availability. There's no need to use a clustered file server for the witness server or employ any other form of resiliency for the witness server. There are several reasons for this. With larger DAGs (for example, six members or more), several failures are required before the witness server is needed. Because a six-member DAG can tolerate as many as two voter failures without losing quorum, it would take as many as three voters failing before the witness server would be needed to maintain a quorum. Also, if there's a failure that affects your current witness server (for example, you lose the witness server because of a hardware failure), you can use the Set-DatabaseAvailabilityGroup cmdlet to configure a new witness server and witness directory (provided you have a quorum).
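The six-member arithmetic above can be reproduced with a short calculation. This Python sketch assumes a plain majority rule (quorum needs more than half of all voters); the function names are hypothetical and the code only illustrates the counting, not how the cluster service is implemented.

```python
def quorum_votes_needed(voters):
    """Smallest number of votes that is more than half of all voters."""
    return voters // 2 + 1


def failures_tolerated(members, with_witness):
    """Number of voters that can fail while the remainder still forms a majority."""
    voters = members + (1 if with_witness else 0)
    return voters - quorum_votes_needed(voters)


# Six DAG members alone: 6 voters, 4 needed, so 2 failures are survivable.
print(failures_tolerated(6, with_witness=False))  # prints 2
# With the witness vote: 7 voters, 4 needed, so a third failure is survivable.
print(failures_tolerated(6, with_witness=True))   # prints 3
```

Under this assumption, the witness vote starts to matter only at the third failure, which matches the reasoning in the paragraph above.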

Note:

You can also use the Set-DatabaseAvailabilityGroup cmdlet to configure the witness server and witness directory in the original location if the witness server lost its storage or if someone changed the witness directory or share permissions.

As a best practice, in an environment where a DAG is extended across multiple datacenters (and Active Directory sites) and configured for site resilience, we recommend that you use a witness server in your primary datacenter (the datacenter containing the majority of your user population). If each datacenter has a similar number of users, the datacenter you choose to host the witness server is considered to be the primary datacenter from the solution's perspective. If the witness server is in the datacenter with the majority of the client population, the majority of clients retain access after a failure.

If the datacenter is remote to large user populations, this may affect your decision. You would then need to determine if there's a requirement for the primary datacenter to remain healthy and active if there's a loss of wide area network (WAN) connectivity to the other two datacenters. In that event, the witness server should also be in the primary datacenter.

Although it's supported to use a witness server in a third datacenter, we don't recommend this scenario. From an Exchange perspective, this configuration doesn't provide you with greater availability. It's important that you examine the critical path factors if you use a witness server in a third datacenter. For example, if the WAN connection between the primary datacenter and the second and third datacenters fails, the solution in the primary datacenter becomes unavailable.

Specifying a witness server and witness directory during DAG creation

When creating a DAG, you must provide a name for the DAG. You can optionally also specify a witness server and witness directory. If you specify a witness server, we recommend that you use an Exchange 2013 Client Access server, because this allows an Exchange administrator to be aware of the availability of the witness server.

When creating a DAG, the following combinations of options and behaviors are available:

You can specify only a name for the DAG, and leave the Witness server and Witness directory fields blank. In this scenario, the wizard searches for a Client Access server that doesn't have the Mailbox server installed, and it automatically creates the default directory (%SystemDrive%\DAGFileShareWitnesses\<DAGFQDN>) and default share (<DAGFQDN>) on that server and uses that Client Access server as the witness server. For example, consider the witness server CAS3 on which the operating system has been installed onto drive C. The DAG DAG1 in the domain contoso.com would use a default witness directory of C:\DAGFileShareWitnesses\DAG1.contoso.com, which would be shared as \\CAS3\DAG1.contoso.com.

You can specify a name for the DAG, the witness server that you want to use, and the directory you want created and shared on the witness server.

You can specify a name for the DAG and the witness server that you want to use, and leave the Witness directory field blank. In this scenario, the wizard creates the default directory on the specified witness server.

You can specify a name for the DAG, leave the Witness server field blank, and specify the directory you want created and shared on the witness server. In this scenario, the wizard searches for a Client Access server that doesn't have the Mailbox server installed, and it automatically creates the specified directory on that server, shares the directory, and uses that Client Access server as the witness server.
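The default naming pattern in the first option can be reproduced with simple string formatting. The Python sketch below is an illustration of that pattern only; the function name and the system_drive parameter are hypothetical, while the sample values (CAS3, DAG1, contoso.com) are the ones used in the example above.

```python
def default_witness_paths(dag_name, domain, witness_server, system_drive="C:"):
    """Build the default witness directory and share path for a DAG.

    system_drive is assumed to be given in the form "C:" (drive letter
    followed by a colon).
    """
    dag_fqdn = f"{dag_name}.{domain}"
    directory = f"{system_drive}\\DAGFileShareWitnesses\\{dag_fqdn}"
    share = f"\\\\{witness_server}\\{dag_fqdn}"
    return directory, share


directory, share = default_witness_paths("DAG1", "contoso.com", "CAS3")
print(directory)  # prints C:\DAGFileShareWitnesses\DAG1.contoso.com
print(share)      # prints \\CAS3\DAG1.contoso.com
```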

When a DAG is formed, it initially uses the Node Majority quorum model. When the second Mailbox server is added to the DAG, the quorum is automatically changed to a Node and File Share Majority quorum model. When this change occurs, the DAG's cluster begins using the witness server for maintaining quorum. If the witness directory doesn't exist, Exchange automatically creates it, shares it, and provisions the share with full control permissions for the cluster name object (CNO) computer account for the DAG.

Note:

Using a file share that's part of a Distributed File System (DFS) namespace isn't supported.

If Windows Firewall is enabled on the witness server before the DAG is created, it may block the creation of the DAG. Exchange uses Windows Management Instrumentation (WMI) to create the directory and file share on the witness server. If Windows Firewall is enabled on the witness server and there are no firewall exceptions configured for WMI, the New-DatabaseAvailabilityGroup cmdlet fails with an error. If you specify a witness server, but not a witness directory, you receive the following error message.

The task was unable to create the default witness directory on server <Server Name>. Please manually specify a witness directory.

If you specify a witness server and witness directory, you receive the following warning message.

Unable to access file shares on witness server 'ServerName'. Until this problem is corrected, the database availability group may be more vulnerable to failures. You can use the Set-DatabaseAvailabilityGroup cmdlet to try the operation again. Error: The network path was not found.

If Windows Firewall is enabled on the witness server after the DAG is created but before servers are added, it may block the addition or removal of DAG members. If Windows Firewall is enabled on the witness server and there are no firewall exceptions configured for WMI, the Add-DatabaseAvailabilityGroupServer cmdlet displays the following warning message.

Failed to create file share witness directory 'C:\DAGFileShareWitnesses\DAG_FQDN' on witness server 'ServerName'. Until this problem is corrected, the database availability group may be more vulnerable to failures. You can use the Set-DatabaseAvailabilityGroup cmdlet to try the operation again. Error: WMI exception occurred on server 'ServerName': The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)

To resolve the preceding error and warnings, do one of the following:

Manually create the witness directory and share on the witness server, and assign the CNO for the DAG full control for the directory and share.

Enable the WMI exception in Windows Firewall.

Disable Windows Firewall.

DAG membership

After a DAG has been created, you can add servers to or remove servers from the DAG using the Manage Database Availability Group wizard in the EAC, or using the Add-DatabaseAvailabilityGroupServer or Remove-DatabaseAvailabilityGroupServer cmdlets in the Shell. For detailed steps about how to manage DAG membership, see Manage Database Availability Group Membership.

Note:

Each Mailbox server that's a member of a DAG is also a node in the underlying cluster used by the DAG. As a result, at any one time, a Mailbox server can be a member of only one DAG.

If the Mailbox server being added to a DAG doesn't have the failover clustering component installed, the method used to add the server (for example, the Add-DatabaseAvailabilityGroupServer cmdlet or the Manage Database Availability Group wizard) installs the failover clustering feature.

When the first Mailbox server is added to a DAG, the following occurs:

The Windows failover clustering component is installed, if it isn't already installed.

A failover cluster is created using the name of the DAG. This failover cluster is used exclusively by the DAG, and the cluster must be dedicated to the DAG. Use of the cluster for any other purpose isn't supported.

A CNO is created in the default computers container.

The name and IP address of the DAG are registered as a Host (A) record in Domain Name System (DNS).

The server is added to the DAG object in Active Directory.

The cluster database is updated with information on the databases mounted on the added server.

In a large or multiple site environment, especially those in which the DAG is extended to multiple Active Directory sites, you must wait for Active Directory replication of the DAG object containing the first DAG member to complete. If this Active Directory object isn't replicated throughout your environment, adding the second server may cause a new cluster (and new CNO) to be created for the DAG. This is because the DAG object appears empty from the perspective of the second member being added, thereby causing the Add-DatabaseAvailabilityGroupServer cmdlet to create a cluster and CNO for the DAG, even though these objects already exist. To verify that the DAG object containing the first DAG server has been replicated, use the Get-DatabaseAvailabilityGroup cmdlet on the second server being added to verify that the first server you added is listed as a member of the DAG.

When the second and subsequent servers are added to the DAG, the following occurs:

The server is joined to the Windows failover cluster for the DAG.

The quorum model is automatically adjusted:

A Node Majority quorum model is used for DAGs with an odd number of members.

A Node and File Share Majority quorum model is used for DAGs with an even number of members.

The witness directory and share are automatically created by Exchange when needed.

The server is added to the DAG object in Active Directory.

The cluster database is updated with information about mounted databases.
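The quorum model adjustment in the list above depends only on whether the member count is odd or even. The following Python sketch illustrates that rule; the function name is hypothetical and the code doesn't interact with a cluster.

```python
def quorum_model(member_count):
    """Return the quorum model a DAG would use for the given member count."""
    if member_count < 1:
        raise ValueError("the DAG's cluster exists only after the first member is added")
    if member_count % 2 == 1:
        # Odd number of members: the members alone can always form a majority.
        return "Node Majority"
    # Even number of members: the witness provides the tie-breaking vote.
    return "Node and File Share Majority"


for count in range(1, 5):
    print(count, quorum_model(count))
```

For one to four members, this prints Node Majority for 1 and 3, and Node and File Share Majority for 2 and 4.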

Note:

The quorum model change should happen automatically. However, if the quorum model doesn't automatically change to the proper model, you can run the Set-DatabaseAvailabilityGroup cmdlet with only the Identity parameter to correct the quorum settings for the DAG.

Pre-staging the cluster name object for a DAG

The CNO is a computer account created in Active Directory and associated with the cluster's Name resource. The cluster's Name resource is tied to the CNO, which is a Kerberos-enabled object that acts as the cluster's identity and provides the cluster's security context. The formation of the DAG's underlying cluster and the CNO for that cluster is performed when the first member is added to the DAG. When the first server is added to the DAG, remote PowerShell contacts the Microsoft Exchange Replication service on the Mailbox server being added. The Microsoft Exchange Replication service installs the failover clustering feature (if it isn't already installed) and begins the cluster creation process. The Microsoft Exchange Replication service runs under the LOCAL SYSTEM security context, and cluster creation is performed under this context.

Warning:

If your DAG members are running Windows Server 2012, you must pre-stage the CNO prior to adding the first server to the DAG.

In environments where computer account creation is restricted, or where computer accounts are created in a container other than the default computers container, you can pre-stage and provision the CNO. You create and disable a computer account for the CNO, and then either:

Assign full control of the computer account to the computer account of the first Mailbox server you're adding to the DAG.

Assign full control of the computer account to the Exchange Trusted Subsystem USG.

Assigning full control of the computer account to the computer account of the first Mailbox server you're adding to the DAG ensures that the LOCAL SYSTEM security context will be able to manage the pre-staged computer account. Assigning full control of the computer account to the Exchange Trusted Subsystem USG can be used instead because the Exchange Trusted Subsystem USG contains the machine accounts of all Exchange servers in the domain.

For detailed steps about how to pre-stage and provision the CNO for a DAG, see Pre-Stage the Cluster Name Object for a Database Availability Group.

Removing servers from a DAG

Mailbox servers can be removed from a DAG by using the Manage Database Availability Group wizard in the EAC or the Remove-DatabaseAvailabilityGroupServer cmdlet in the Shell. Before a Mailbox server can be removed from a DAG, all replicated mailbox databases must first be removed from the server. If you attempt to remove a Mailbox server with replicated mailbox databases from a DAG, the task fails.

There are scenarios in which you must remove a Mailbox server from a DAG before performing certain operations. These scenarios include:

Performing a server recovery operation: If a Mailbox server that's a member of a DAG is lost, or otherwise fails and is unrecoverable and needs replacement, you can perform a server recovery operation using the Setup /m:RecoverServer switch. However, before you can perform the recovery operation, you must first remove the server from the DAG using the Remove-DatabaseAvailabilityGroupServer cmdlet with the ConfigurationOnly parameter.

Removing the database availability group: There may be situations in which you need to remove a DAG (for example, when disabling third-party replication mode). If you need to remove a DAG, you must first remove all servers from the DAG. If you attempt to remove a DAG that contains any members, the task fails.

Configuring DAG properties

After servers have been added to the DAG, you can use the EAC or the Shell to configure the properties of a DAG, including the witness server and witness directory used by the DAG, and the IP addresses assigned to the DAG.

Configurable properties include:

Witness server: The name of the server that you want to host the file share for the file share witness. We recommend that you specify a Client Access server as the witness server. This enables the system to automatically configure, secure, and use the share, as needed, and enables the messaging administrator to be aware of the availability of the witness server.

Witness directory: The name of a directory that will be used to store file share witness data. This directory will automatically be created by the system on the specified witness server.

Database availability group IP addresses: One or more IP addresses assigned to the DAG. These addresses can be configured using manually assigned static IP addresses, or they can be automatically assigned to the DAG using a DHCP server in your organization.

The Shell enables you to configure DAG properties that aren't available in the EAC, such as DAG IP addresses, network encryption and compression settings, network discovery, the TCP port used for replication, and alternate witness server and witness directory settings, and to enable Datacenter Activation Coordination mode.

For detailed steps about how to configure DAG properties, see Configure Database Availability Group Properties.
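
The following is a sketch only, showing how several of these properties might be set in the Shell. DAG1, CAS1, the directory path, and the IP address are assumed example values, not values from your organization.

```powershell
# Assumed example values: DAG1, CAS1, C:\DAG1FSW, and 192.168.1.20.
# Configure the witness server, witness directory, and a static DAG IP address.
Set-DatabaseAvailabilityGroup -Identity DAG1 -WitnessServer CAS1 -WitnessDirectory C:\DAG1FSW -DatabaseAvailabilityGroupIpAddresses 192.168.1.20

# Enable Datacenter Activation Coordination mode (Shell only).
Set-DatabaseAvailabilityGroup -Identity DAG1 -DatacenterActivationMode DagOnly

# Review the resulting configuration.
Get-DatabaseAvailabilityGroup -Identity DAG1 -Status | Format-List Name,Witness*,DatacenterActivationMode
```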

DAG network encryption

DAGs support the use of encryption by leveraging the encryption capabilities of the Windows Server operating system. DAGs use Kerberos authentication between Exchange servers. Microsoft Kerberos security support provider (SSP) EncryptMessage and DecryptMessage APIs handle encryption of DAG network traffic. Microsoft Kerberos SSP supports multiple encryption algorithms. (For the complete list, see section 3.1.5.2, "Encryption Types" of Kerberos Protocol Extensions.) The Kerberos authentication handshake selects the strongest encryption protocol supported in the list: typically Advanced Encryption Standard (AES) 256-bit, potentially with a SHA Hash-based Message Authentication Code (HMAC) to maintain integrity of the data. For details, see HMAC.

Network encryption is a property of the DAG and not a DAG network. You can configure DAG network encryption using the Set-DatabaseAvailabilityGroup cmdlet in the Shell. The possible encryption settings for DAG network communications are shown in the following table.

DAG network communication encryption settings

Setting          Description
Disabled         Network encryption isn't used.
Enabled          Network encryption is used on all DAG networks for replication and seeding.
InterSubnetOnly  Network encryption is used on DAG networks when replicating across different subnets. This is the default setting.
SeedOnly         Network encryption is used on all DAG networks for seeding only.
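
For illustration, one of the settings in the preceding table could be applied as follows. DAG1 is an assumed example name.

```powershell
# Assumed example name: DAG1.
# Encrypt replication and seeding traffic on all DAG networks.
Set-DatabaseAvailabilityGroup -Identity DAG1 -NetworkEncryption Enabled

# Confirm the current setting.
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List Name,NetworkEncryption
```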

DAG network compression

DAGs support built-in compression. When compression is enabled, DAG network communication uses XPRESS, which is the Microsoft implementation of the LZ77 algorithm. For details, see An Explanation of the Deflate Algorithm and section 3.1.4.11.1.2.1 "LZ77 Compression Algorithm" of Wire Format Protocol Specification. This is the same type of compression used in many Microsoft protocols, in particular, MAPI RPC compression between Microsoft Outlook and Exchange.

As with network encryption, network compression is also a property of the DAG and not a DAG network. You configure DAG network compression by using the Set-DatabaseAvailabilityGroup cmdlet in the Shell. The possible compression settings for DAG network communications are shown in the following table.

DAG network communication compression settings

Setting          Description
Disabled         Network compression isn't used.
Enabled          Network compression is used on all DAG networks for replication and seeding.
InterSubnetOnly  Network compression is used on DAG networks when replicating across different subnets. This is the default setting.
SeedOnly         Network compression is used on all DAG networks for seeding only.
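
As a sketch, the following limits compression to seeding operations and then shows both DAG-level network settings together. DAG1 is an assumed example name.

```powershell
# Assumed example name: DAG1.
# Compress seeding traffic only; log shipping stays uncompressed.
Set-DatabaseAvailabilityGroup -Identity DAG1 -NetworkCompression SeedOnly

# Compression and encryption are both DAG-level properties.
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List Name,NetworkCompression,NetworkEncryption
```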

DAG networks

A DAG network is a collection of one or more subnets used for either replication traffic or MAPI traffic. Each DAG contains a maximum of one MAPI network and zero or more replication networks. In a single network adapter configuration, the network is used for both MAPI and replication traffic. Although a single network adapter and path is supported, we recommend that each DAG have a minimum of two DAG networks. In a two-network configuration, one network is typically dedicated for replication traffic, and the other network is used primarily for MAPI traffic. You can also add network adapters to each DAG member and configure additional DAG networks as replication networks.

Note:

When using multiple replication networks, there's no way to specify an order of precedence for network use. Exchange randomly selects a replication network from the group of replication networks to use for log shipping.

In Exchange 2010, manual configuration of DAG networks was necessary in many scenarios. By default in Exchange 2013, DAG networks are automatically configured by the system. Before you can create or modify DAG networks, you must first enable manual DAG network control by running the following command:

Set-DatabaseAvailabilityGroup <DAGName> -ManualDagNetworkConfiguration $true

After you've enabled manual DAG network configuration, you can use the New-DatabaseAvailabilityGroupNetwork cmdlet in the Shell to create a DAG network. For detailed steps about how to create a DAG network, see Create a Database Availability Group Network.

You can use the Set-DatabaseAvailabilityGroupNetwork cmdlet in the Shell to configure DAG network properties. For detailed steps about how to configure DAG network properties, see Configure Database Availability Group Network Properties. Each DAG network has required and optional parameters to configure:

- Network name: A unique name for the DAG network of up to 128 characters.
- Network description: An optional description for the DAG network of up to 256 characters.
- Network subnets: One or more subnets entered using a format of IPAddress/Bitmask (for example, 192.168.1.0/24 for Internet Protocol version 4 (IPv4) subnets; 2001:DB8:0:C000::/64 for Internet Protocol version 6 (IPv6) subnets).
- Enable replication: In the EAC, select the check box to dedicate the DAG network to replication traffic and block MAPI traffic. Clear the check box to prevent replication from using the DAG network and to enable MAPI traffic. In the Shell, use the ReplicationEnabled parameter in the Set-DatabaseAvailabilityGroupNetwork cmdlet to enable and disable replication.

Note:

Disabling replication for the MAPI network doesn't guarantee that the system won't use the MAPI network for replication. When all configured replication networks are offline, failed, or otherwise unavailable, and only the MAPI network remains (which is configured as disabled for replication), the system uses the MAPI network for replication.

The initial DAG networks (for example, MapiDagNetwork and ReplicationDagNetwork01) created by the system are based on the subnets enumerated by the Cluster service. Each DAG member must have the same number of network adapters, and each network adapter must have an IPv4 address (and optionally, an IPv6 address as well) on a unique subnet. Multiple DAG members can have IPv4 addresses on the same subnet, but each network adapter and IP address pair in a specific DAG member must be on a unique subnet. In addition, only the adapter used for the MAPI network should be configured with a default gateway. Replication networks shouldn't be configured with a default gateway.

For example, consider DAG1, a two-member DAG where each member has two network adapters (one dedicated for the MAPI network and the other for a replication network). Example IP address configuration settings are shown in the following table.

Example network adapter settings

Server-network adapter  IP address/subnet mask  Default gateway
EX1-MAPI                192.168.1.15/24         192.168.1.1
EX1-Replication         10.0.0.15/24            Not applicable
EX2-MAPI                192.168.1.16/24         192.168.1.1
EX2-Replication         10.0.0.16/24            Not applicable

In the preceding configuration, there are two subnets configured in the DAG: 192.168.1.0 and 10.0.0.0. When EX1 and EX2 are added to the DAG, two subnets will be enumerated and two DAG networks will be created: MapiDagNetwork (192.168.1.0) and ReplicationDagNetwork01 (10.0.0.0). These networks will be configured as shown in the following table.

Enumerated DAG network settings for a single-subnet DAG

Name                     Subnets         Interfaces                              MAPI access enabled  Replication enabled
MapiDagNetwork           192.168.1.0/24  EX1 (192.168.1.15), EX2 (192.168.1.16)  True                 True
ReplicationDagNetwork01  10.0.0.0/24     EX1 (10.0.0.15), EX2 (10.0.0.16)        False                True
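
To compare the networks that the system actually enumerated with a table like the preceding one, you could list them in the Shell. This is a sketch that assumes the DAG is named DAG1, as in the example.

```powershell
# Assumed example name: DAG1.
# List each DAG network with its subnets, interfaces, and traffic settings.
Get-DatabaseAvailabilityGroupNetwork -Identity DAG1 | Format-List Name,Subnets,Interfaces,MapiAccessEnabled,ReplicationEnabled
```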

To complete the configuration of ReplicationDagNetwork01 as the dedicated replication network, disable replication for MapiDagNetwork by running the following command.

Set-DatabaseAvailabilityGroupNetwork -Identity DAG1\MapiDagNetwork -ReplicationEnabled:$false

After replication is disabled for MapiDagNetwork, the Microsoft Exchange Replication service uses ReplicationDagNetwork01 for continuous replication. If ReplicationDagNetwork01 experiences a failure, the Microsoft Exchange Replication service reverts to using MapiDagNetwork for continuous replication. This is done intentionally by the system to maintain high availability.

DAG networks and multiple subnet deployments

In the preceding example, even though there are two different subnets in use by the DAG (192.168.1.0 and 10.0.0.0), the DAG is considered a single-subnet DAG because each member uses the same subnet to form the MAPI network. When DAG members use different subnets for the MAPI network, the DAG is referred to as a multi-subnet DAG. In a multi-subnet DAG, the proper subnets are automatically associated with each DAG network.

For example, consider DAG2, a two-member DAG where each member has two network adapters (one dedicated for the MAPI network and the other for a replication network), and each DAG member is located in a separate Active Directory site, with its MAPI network on a different subnet. Example IP address configuration settings are shown in the following table.

Example network adapter settings for a multi-subnet DAG

Server-network adapter  IP address/subnet mask  Default gateway
EX1-MAPI                192.168.0.15/24         192.168.0.1
EX1-Replication         10.0.0.15/24            Not applicable
EX2-MAPI                192.168.1.15/24         192.168.1.1
EX2-Replication         10.0.1.15/24            Not applicable

In the preceding configuration, there are four subnets configured in the DAG: 192.168.0.0, 192.168.1.0, 10.0.0.0, and 10.0.1.0. When EX1 and EX2 are added to the DAG, four subnets will be enumerated, but only two DAG networks will be created: MapiDagNetwork (192.168.0.0, 192.168.1.0) and ReplicationDagNetwork01 (10.0.0.0, 10.0.1.0). These networks will be configured as shown in the following table.

Enumerated DAG network settings for a multi-subnet DAG

Name                     Subnets                         Interfaces                              MAPI access enabled  Replication enabled
MapiDagNetwork           192.168.0.0/24, 192.168.1.0/24  EX1 (192.168.0.15), EX2 (192.168.1.15)  True                 True
ReplicationDagNetwork01  10.0.0.0/24, 10.0.1.0/24        EX1 (10.0.0.15), EX2 (10.0.1.15)        False                True

DAG networks and iSCSI networks

By default, DAGs perform discovery of all networks detected and configured for use by the underlying cluster. This includes any Internet SCSI (iSCSI) networks in use as a result of using iSCSI storage for one or more DAG members. As a best practice, iSCSI storage should use dedicated networks and network adapters. These networks shouldn't be managed by the DAG or its cluster, or used as DAG networks (MAPI or replication). Instead, these networks should be manually disabled from use by the DAG, so they can be dedicated to iSCSI storage traffic. To disable iSCSI networks from being detected and used as DAG networks, configure the DAG to ignore any currently detected iSCSI networks using the Set-DatabaseAvailabilityGroupNetwork cmdlet, as shown in this example:

Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork02 -ReplicationEnabled:$false -IgnoreNetwork:$true

This command will also disable the network for use by the cluster. Although the iSCSI networks will continue to appear as DAG networks, they won't be used for MAPI or replication traffic after running the above command.

Configuring DAG members

Mailbox servers that are members of a DAG have some properties specific to high availability that should be configured as described in the following sections:

- Automatic database mount dial
- Database copy automatic activation policy
- Maximum active databases

Automatic database mount dial

The AutoDatabaseMountDial parameter specifies the automatic database mount behavior after a database failover. You can use the Set-MailboxServer cmdlet to configure the AutoDatabaseMountDial parameter with any of the following values:

- BestAvailability: If you specify this value, the database automatically mounts immediately after a failover if the copy queue length is less than or equal to 12. The copy queue length is the number of logs recognized by the passive copy that need to be replicated. If the copy queue length is more than 12, the database doesn't automatically mount. When the copy queue length is less than or equal to 12, Exchange attempts to replicate the remaining logs to the passive copy and mounts the database.
- GoodAvailability: If you specify this value, the database automatically mounts immediately after a failover if the copy queue length is less than or equal to six. The copy queue length is the number of logs recognized by the passive copy that need to be replicated. If the copy queue length is more than six, the database doesn't automatically mount. When the copy queue length is less than or equal to six, Exchange attempts to replicate the remaining logs to the passive copy and mounts the database.
- Lossless: If you specify this value, the database doesn't automatically mount until all logs generated on the active copy have been copied to the passive copy. This setting also causes the Active Manager best copy selection algorithm to sort potential candidates for activation based on the database copy's activation preference value and not its copy queue length.

The default value is GoodAvailability. If you specify either BestAvailability or GoodAvailability, and all the logs from the active copy can't be copied to the passive copy being activated, you may lose some mailbox data. However, the Safety Net feature (which is enabled by default) helps protect against most data loss by resubmitting messages that are in the Safety Net queue.

In addition to the preceding values, you can also configure the AutoDatabaseMountDial parameter with a custom value by using ADSI Edit or Ldp.exe to modify the attribute directly in Active Directory. The AutoDatabaseMountDial parameter is represented by the msExchDataLossForAutoDatabaseMount attribute of the Mailbox server object. The whole number numeric value for this attribute represents the maximum number of transaction log files you are willing to lose to mount a database without human intervention. If you configure the AutoDatabaseMountDial parameter with a custom value greater than 12, we recommend that you also increase the duration of the Safety Net retention period to enable increased protection against a greater number of lost logs.
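
As a sketch of the recommendation above, you might check the current mount dial setting and then lengthen the Safety Net retention period. EX1 is an assumed server name, and the three-day hold time is an arbitrary illustrative value, not a recommended one.

```powershell
# Assumed example name: EX1. The 3-day value is illustrative only.
# View the current mount dial setting for the server.
Get-MailboxServer -Identity EX1 | Format-List Name,AutoDatabaseMountDial

# Extend the Safety Net retention period (format is dd.hh:mm:ss).
Set-TransportConfig -SafetyNetHoldTime 3.00:00:00

# Confirm the new retention period.
Get-TransportConfig | Format-List SafetyNetHoldTime
```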

Example: configuring automatic database mount dial

The following example configures a Mailbox server with an AutoDatabaseMountDial setting of GoodAvailability.

Set-MailboxServer -Identity EX1 -AutoDatabaseMountDial GoodAvailability

Database copy automatic activation policy

The DatabaseCopyAutoActivationPolicy parameter specifies the type of automatic activation available for mailbox database copies on the selected Mailbox servers. You can use the Set-MailboxServer cmdlet to configure the DatabaseCopyAutoActivationPolicy parameter with any of the following values:

- Blocked: If you specify this value, databases can't be automatically activated on the selected Mailbox servers.
- IntrasiteOnly: If you specify this value, the database copy is allowed to be activated on servers in the same Active Directory site. This prevents cross-site failover or activation. This property is for incoming mailbox database copies (for example, a passive copy being made an active copy). Databases can't be activated on this Mailbox server for database copies that are active in another Active Directory site.
- Unrestricted: If you specify this value, there are no special restrictions on activating mailbox database copies on the selected Mailbox servers.

Example: configuring database copy automatic activation policy

The following example configures a Mailbox server with a DatabaseCopyAutoActivationPolicy setting of Blocked.

Set-MailboxServer -Identity EX1 -DatabaseCopyAutoActivationPolicy Blocked

Maximum active databases

The MaximumActiveDatabases parameter (also used with the Set-MailboxServer cmdlet) specifies the number of databases that can be mounted on a Mailbox server. You can configure Mailbox servers to meet your deployment requirements by ensuring that an individual Mailbox server doesn't become overloaded.

The MaximumActiveDatabases parameter is configured with a whole number numeric value. When the maximum number is reached, the database copies on the server won't be activated if a failover or switchover occurs. If that many copies are already active on the server, the server won't allow additional databases to be mounted.

Example: configuring maximum active databases

The following example configures a Mailbox server to support a maximum of 20 active databases.

Set-MailboxServer -Identity EX1 -MaximumActiveDatabases 20

Performing maintenance on DAG members

Before performing any type of software or hardware maintenance on a DAG member, you should first remove the DAG member from service by using the StartDagServerMaintenance.ps1 script. This script moves all the active databases off the server and blocks active databases from moving to that server. The script also ensures that all critical DAG support functionality that may be on the server (for example, the Primary Active Manager (PAM) role) is moved to another server and blocked from moving back to the server. Specifically, the StartDagServerMaintenance.ps1 script performs the following tasks:

- Runs Suspend-MailboxDatabaseCopy with the ActivationOnly parameter to suspend each database copy hosted on the DAG member for activation.
- Pauses the node in the cluster, which prevents the node from being and becoming the PAM.
- Sets the value of the DatabaseCopyAutoActivationPolicy parameter on the DAG member to Blocked.
- Moves all active databases currently hosted on the DAG member to other DAG members.
- If the DAG member currently owns the default cluster group, the script moves the default cluster group (and therefore the PAM role) to another DAG member.

If any of the preceding tasks fails, all operations, except for successful database moves, are undone.

After the maintenance is complete and the DAG member is ready to return to service, you can use the StopDagServerMaintenance.ps1 script to take the DAG member out of maintenance mode and put it back into production. Specifically, the StopDagServerMaintenance.ps1 script performs the following tasks:

- Runs the Resume-MailboxDatabaseCopy cmdlet for each database copy hosted on the DAG member.
- Resumes the node in the cluster, which enables full cluster functionality for the DAG member.
- Sets the value of the DatabaseCopyAutoActivationPolicy parameter on the DAG member to Unrestricted.

Both scripts accept the -ServerName parameter (which can be either the host name or the fully qualified domain name (FQDN) of the DAG member) and the -WhatIf parameter. Both scripts can be run locally or remotely. The server on which the scripts are executed must have the Windows Failover Cluster Management tools installed (RSAT-Clustering).
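
A minimal sketch of a maintenance cycle using these scripts follows. EX1 is an assumed server name, and the sketch assumes it is run from the Exchange Management Shell, where the $exscripts variable points to the Exchange Scripts folder.

```powershell
# Assumed example name: EX1. Run from the Exchange Management Shell.
Set-Location $exscripts

# Preview the actions without making changes.
.\StartDagServerMaintenance.ps1 -ServerName EX1 -WhatIf

# Take the DAG member out of service.
.\StartDagServerMaintenance.ps1 -ServerName EX1

# ... perform the software or hardware maintenance ...

# Return the DAG member to production.
.\StopDagServerMaintenance.ps1 -ServerName EX1
```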

Shutting down DAG members

The Exchange 2013 high availability solution is integrated with the Windows shutdown process. If an administrator or application initiates a shutdown of a Windows server in a DAG that has a mounted database that's replicated to one or more DAG members, the system attempts to activate another copy of the mounted database prior to allowing the shutdown process to complete.

However, this new behavior doesn't guarantee that all of the databases on the server being shut down will experience a lossless activation. As a result, it's a best practice to perform a server switchover prior to shutting down a server that's a member of a DAG.
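
One way to perform such a switchover in the Shell is sketched below. EX1 is an assumed server name; the system selects the target copies unless you specify otherwise.

```powershell
# Assumed example name: EX1.
# Move every active database off EX1 to other DAG members.
Move-ActiveMailboxDatabase -Server EX1 -Confirm:$false

# Verify that EX1 no longer hosts any active (mounted) copies.
Get-MailboxDatabaseCopyStatus -Server EX1 | Format-Table Name,Status,ActiveDatabaseCopy
```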

Installing update rollups on DAG members

Installing Microsoft Exchange Server 2013 update rollups on a server that's a member of a DAG is a relatively straightforward process. When you install an update rollup on a server that's a member of a DAG, several services are stopped during the installation, including all Exchange services and the Cluster service. The general process for applying update rollups to a DAG member is as follows:

1. Use the StartDagServerMaintenance.ps1 script to put the DAG member in maintenance mode.
2. Install the update rollup.
3. Use the StopDagServerMaintenance.ps1 script to take the DAG member out of maintenance mode and put it back into production.
4. Use the RedistributeActiveDatabases.ps1 script to rebalance the active database copies across the DAG.
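
The final rebalancing step might look like the following sketch. DAG1 is an assumed DAG name, and balancing by activation preference is only one of the options the script offers; check the script's own help for the options that fit your deployment.

```powershell
# Assumed example name: DAG1. Run from the Exchange Management Shell.
Set-Location $exscripts

# Rebalance active copies according to each copy's activation preference.
.\RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -Confirm:$false
```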

You can download the latest update rollup for Exchange 2013 from the Microsoft Download Center.

Managing Mailbox Database Copies

Applies to: Exchange Server 2013

Topic Last Modified: 2012-12-10

After a database availability group (DAG) has been created, configured, and populated with Mailbox server members, you can use the Exchange Administration Center (EAC) or the Exchange Management Shell to add mailbox database copies in a flexible and granular way.

Managing database copies

After multiple copies of a database are created, you can use the EAC or the Shell to monitor the health and status of each copy and to perform other management tasks associated with database copies. Some of the management tasks you may need to perform include suspending or resuming a database copy, seeding a database copy, monitoring database copies, configuring database copy settings, and removing a database copy.

Suspending and resuming database copies

For a variety of reasons, such as performing planned maintenance, it may be necessary to suspend and resume continuous replication activity for a database copy. In addition, some administrative tasks, such as seeding, require you to first suspend a database copy. We recommend that all replication activity be suspended when the path for the database or its log files is being changed. You can suspend and resume database copy activity by using the EAC, or by running the Suspend-MailboxDatabaseCopy and Resume-MailboxDatabaseCopy cmdlets in the Shell. For detailed steps about how to suspend or resume continuous replication activity for a database copy, see Suspend or Resume a Mailbox Database Copy.

Log truncation doesn't occur on the active mailbox database copy when one or more passive copies are suspended. If planned maintenance activities are going to take an extended period of time (for example, several days), you may have considerable log file buildup. To prevent the log drive from filling up with transaction logs, you can remove the affected passive database copy instead of suspending it. When the planned maintenance is completed, you can re-add the passive database copy.
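
Both approaches can be sketched in the Shell as follows. DB1 and MBX2 are example names, and the suspend comment text is arbitrary.

```powershell
# Assumed example names: database DB1 with a passive copy on MBX2.
# Short maintenance: suspend the copy, then resume it afterward.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -SuspendComment "Planned maintenance" -Confirm:$false
Resume-MailboxDatabaseCopy -Identity DB1\MBX2

# Extended maintenance: remove the passive copy so that logs on the active
# copy can truncate, then re-add the copy when maintenance is complete.
Remove-MailboxDatabaseCopy -Identity DB1\MBX2 -Confirm:$false
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX2
```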

Seeding a database copy

Seeding, also known as updating, is the process in which a database, either a blank database or a copy of the production database, is added to the target copy location on another Mailbox server in the same DAG as the active database. This becomes the baseline database for the copy maintained by that server.

Depending on the situation, seeding can be an automatic process or a manual process that you initiate. When a database copy is added, the copy will be automatically seeded, provided that the target server and its storage are properly configured. If you want to manually seed a database copy and don't want automatic seeding to occur when creating the copy, you can use the SeedingPostponed parameter when running the Add-MailboxDatabaseCopy cmdlet.

Database copies rarely need to be reseeded after the initial seeding has occurred. But if reseeding is necessary, or if you want to manually seed a database copy instead of having the system automatically seed the copy, these tasks can be performed by using the Update Mailbox Database Copy wizard in the EAC or by using the Update-MailboxDatabaseCopy cmdlet in the Shell. Before seeding a database copy, you must first suspend the mailbox database copy. For detailed steps about how to seed a database copy, see Update a Mailbox Database Copy.

After a manual seed operation has completed, replication for the seeded mailbox database copy is automatically resumed. If you don't want replication to automatically resume, you can use the ManualResume parameter when running the Update-MailboxDatabaseCopy cmdlet.
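
A manual reseed of an existing copy might be sketched as follows. DB1 and MBX2 are example names; the DeleteExistingFiles switch is included on the assumption that stale database and log files remain on the target.

```powershell
# Assumed example names: database DB1 with a passive copy on MBX2.
# Suspend the copy before seeding it.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -Confirm:$false

# Reseed the copy, removing stale files on the target, and leave
# replication suspended when the seed completes.
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -DeleteExistingFiles -ManualResume

# Resume replication when you're ready.
Resume-MailboxDatabaseCopy -Identity DB1\MBX2
```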

Choosing what to seed

When performing a seed operation, you can choose to seed the mailbox database copy, the content index catalog for the mailbox database copy, or both the database copy and the content index catalog copy. The default behavior of the Update Mailbox Database Copy wizard and the Update-MailboxDatabaseCopy cmdlet is to seed both the mailbox database copy and the content index catalog copy. To seed just the mailbox database copy without seeding the content index catalog, use the DatabaseOnly parameter when running the Update-MailboxDatabaseCopy cmdlet. To seed just the content index catalog copy, use the CatalogOnly parameter when running the Update-MailboxDatabaseCopy cmdlet.
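
For illustration, the two selective seeding options look like this in the Shell. DB1 and MBX2 are example names, and you would run only the command that matches what needs to be seeded.

```powershell
# Assumed example names: database DB1 with a passive copy on MBX2.
# Seed only the database file, not the content index catalog.
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -DatabaseOnly

# Seed only the content index catalog.
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -CatalogOnly
```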

Selecting the seeding source

Any healthy database copy can be used as the seeding source for an additional copy of that database. This is particularly useful when you have a DAG that has been extended across multiple physical locations. For example, consider a four-member DAG deployment, where two members (MBX1 and MBX2) are located in Portland, Oregon and two members (MBX3 and MBX4) are located in New York, New York. A mailbox database named DB1 is active on MBX1 and there are passive copies of DB1 on MBX2 and MBX3. When adding a copy of DB1 to MBX4, you have the option of using the copy on MBX3 as the source for seeding. In doing so, you avoid seeding over the wide area network (WAN) link between Portland and New York.

To use a specific copy as a source for seeding when adding a new database copy, you would do the following:

- Use the SeedingPostponed parameter when running the Add-MailboxDatabaseCopy cmdlet to add the database copy. If the SeedingPostponed parameter isn't used, the database copy will be explicitly seeded using the active copy of the database as the source.
- You can specify the source server you want to use as part of the Update Mailbox Database Copy wizard in the EAC, or you can use the SourceServer parameter when running the Update-MailboxDatabaseCopy cmdlet to specify the desired source server for seeding. In the preceding example, you would specify MBX3 as the source server. If the SourceServer parameter isn't used, the database copy will be explicitly seeded from the active copy of the database.
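
Using the server and database names from the preceding example, the two steps could be sketched as follows.

```powershell
# Names are taken from the example above: DB1, MBX3 (source), MBX4 (target).
# Add the new copy without seeding it from the active copy on MBX1.
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX4 -SeedingPostponed

# Seed the new copy from the passive copy in the same location.
Update-MailboxDatabaseCopy -Identity DB1\MBX4 -SourceServer MBX3
```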

Seeding and networks

In addition to selecting a specific source server for seeding a mailbox database copy, you can also use the Shell to specify which DAG networks to use, and optionally override the DAG network's compression and encryption settings during the seed operation.

To specify the networks you want to use for seeding, use the Network parameter when running the Update-MailboxDatabaseCopy cmdlet and specify the DAG networks that you want to use. If you don't use the Network parameter, the system uses the following default behavior for selecting a network to use for the seeding operation:

- If the source server and target server are on the same subnet and a replication network has been configured that includes the subnet, the replication network will be used.
- If the source server and target server are on different subnets, even if a replication network that contains those subnets has been configured, the client (MAPI) network will be used for seeding.
- If the source server and target server are in different datacenters, the client (MAPI) network will be used for seeding.

At the DAG level, DAG networks are configured for encryption and compression. The default settings are to use encryption and compression only for communications on different subnets. If the source and target are on different subnets and the DAG is configured with the default values for NetworkCompression and NetworkEncryption, you can override these values by using the NetworkCompressionOverride and NetworkEncryptionOverride parameters, respectively, when running the Update-MailboxDatabaseCopy cmdlet.
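
The following sketch combines a specific seeding network with both overrides. The names DB1, MBX3, MBX4, and DAG1\ReplicationDagNetwork01 are example values, and the override value shown (On) is an assumption that should be confirmed against the cmdlet help in your environment.

```powershell
# Assumed example names; the override value "On" is an assumption to verify
# with: Get-Help Update-MailboxDatabaseCopy -Parameter NetworkCompressionOverride
Update-MailboxDatabaseCopy -Identity DB1\MBX4 -SourceServer MBX3 -Network DAG1\ReplicationDagNetwork01 -NetworkCompressionOverride On -NetworkEncryptionOverride On
```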

Seeding process

When you initiate a seeding process by using the Add-MailboxDatabaseCopy or Update-MailboxDatabaseCopy cmdlets, the following tasks are performed:

1. Database properties from Active Directory are read to validate the specified database and servers, and to verify that the source and target servers are running Exchange 2013, they are both members of the same DAG, and that the specified database isn't a recovery database. The database file paths are also read.
2. Preparations occur for reseed checks from the Microsoft Exchange Replication service on the target server.
3. The Microsoft Exchange Replication service on the target server checks for the presence of database and transaction log files in the file directories read by the Active Directory checks in step 1.
4. The Microsoft Exchange Replication service returns the status information from the target server to the administrative interface from where the cmdlet was run.
5. If all preliminary checks have passed, you're prompted to confirm the operation before continuing. If you confirm the operation, the process continues. If an error is encountered during the preliminary checks, the error is reported and the operation fails.
6. The seed operation is started from the Microsoft Exchange Replication service on the target server.
7. The Microsoft Exchange Replication service suspends database replication for the active database copy.
8. The state information for the database is updated by the Microsoft Exchange Replication service to reflect a status of Seeding.
9. If the target server doesn't already have the directories for the target database and log files, they are created.
10. A request to seed the database is passed from the Microsoft Exchange Replication service on the target server to the Microsoft Exchange Replication service on the source server using TCP. This request and the subsequent communications for seeding the database occur on a DAG network that has been configured as a replication network.
11. The Microsoft Exchange Replication service on the source server initiates an Extensible Storage Engine (ESE) streaming backup via the Microsoft Exchange Information Store service interface.
12. The Microsoft Exchange Information Store service streams the database data to the Microsoft Exchange Replication service.
13. The database data is moved from the source server's Microsoft Exchange Replication service to the target server's Microsoft Exchange Replication service.
14. The Microsoft Exchange Replication service on the target server writes the database copy to a temporary directory located in the main database directory called temp-seeding.
15. The streaming backup operation on the source server ends when the end of the database is reached.
16. The write operation on the target server completes, and the database is moved from the temp-seeding directory to the final location. The temp-seeding directory is deleted.
17. On the target server, the Microsoft Exchange Replication service proxies a request to the Microsoft Exchange Search service to mount the content index catalog for the database copy, if it exists. If there are existing out-of-date catalog files from a previous instance of the database copy, the mount operation fails, which triggers the need to replicate the catalog from the source server. Likewise, if the catalog doesn't exist on a new instance of the database copy on the target server, a copy of the catalog is required. The Microsoft Exchange Replication service directs the Microsoft Exchange Search service to suspend indexing for the database copy while a new catalog is copied from the source.
18. The Microsoft Exchange Replication service on the target server sends a seed catalog request to the Microsoft Exchange Replication service on the source server.
19. On the source server, the Microsoft Exchange Replication service requests the directory information from the Microsoft Exchange Search service and requests that indexing be suspended.
20. The Microsoft Exchange Search service on the source server returns the search catalog directory information to the Microsoft Exchange Replication service.
21. The Microsoft Exchange Replication service on the source server reads the catalog files from the directory.
22. The Microsoft Exchange Replication service on the source server moves the catalog data to the Microsoft Exchange Replication service on the target server using a connection across the replication network. After the read is complete, the Microsoft Exchange Replication service sends a request to the Microsoft Exchange Search service to resume indexing of the source database.
23. If there are any existing catalog files on the target server in the directory, the Microsoft Exchange Replication service on the target server deletes them.
24. The Microsoft Exchange Replication service on the target server writes the catalog data to a temporary directory called CiSeed.Temp until the data is completely transferred.
25. The Microsoft Exchange Replication service moves the complete catalog data to the final location.
26. The Microsoft Exchange Replication service on the target server resumes search indexing on the target database.
27. The Microsoft Exchange Replication service on the target server returns a completion status.
28. The final result of the operation is passed to the administrative interface from which the cmdlet was called.

Configuring database copies

After a database copy is created, you can view and modify its configuration settings when needed. You can view some configuration information by examining the Properties page for a database copy in the EAC. You can also use the Get-MailboxDatabase and Set-MailboxDatabaseCopy cmdlets in the Shell to view and configure database copy settings, such as replay lag time, truncation lag time, and activation preference order. For detailed steps about how to view and configure database copy settings, see Configure Mailbox Database Copy Properties.

Using replay lag and truncation lag options

Mailbox database copies support the use of a replay lag time and a truncation lag time, both of which are configured in minutes. Setting a replay lag time enables you to take a database copy back to a specific point in time. Setting a truncation lag time enables you to use the logs on a passive database copy to recover from the loss of log files on the active database copy. Because both of these features result in the temporary buildup of log files, using either of them will affect your storage design.

Replay lag time

Replay lag time is a property of a mailbox database copy that specifies the amount of time, in minutes, to delay log replay for the database copy. The replay lag timer starts when a log file has been replicated to the passive copy and has successfully passed inspection. By delaying the replay of logs to the database copy, you have the capability to recover the database to a specific point in time in the past. A mailbox database copy configured with a replay lag time greater than 0 is referred to as a lagged mailbox database copy, or simply, a lagged copy.

A strategy that uses database copies and the litigation hold features in Exchange 2013 can provide protection against a range of failures that would ordinarily cause data loss. However, these features can't provide protection against data loss in the event of logical corruption, which, although rare, can cause data loss. Lagged copies are designed to prevent loss of data in the case of logical corruption. Generally, there are two types of logical corruption:

Database logical corruption: The database pages checksum matches, but the data on the pages is wrong logically. This can occur when ESE attempts to write a database page and even though the operating system returns a success message, the data is either never written to the disk or it's written to the wrong place. This is referred to as a lost flush. To prevent lost flushes from losing data, ESE includes a lost flush detection mechanism in the database along with a page patching feature (single page restore).
Store logical corruption: Data is added, deleted, or manipulated in a way that the user doesn't expect. These cases are generally caused by third-party applications. It's generally only corruption in the sense that the user views it as corruption. The Exchange store considers the transaction that produced the logical corruption to be a series of valid MAPI operations. The litigation hold feature in Exchange 2013 provides protection from store logical corruption (because it prevents content from being permanently deleted by a user or application). However, there may be scenarios where a user mailbox becomes so corrupted that it would be easier to restore the database to a point in time prior to the corruption, and then export the user mailbox to retrieve uncorrupted data.

The combination of database copies, hold policy, and ESE single page restore leaves only the rare but catastrophic store logical corruption case. Your decision on whether to use a database copy with a replay lag (a lagged copy) will depend on which third-party applications you use and your organization's history with store logical corruption.

If you choose to use lagged copies, be aware of the following implications for their use:

The replay lag time is an administrator-configured value, and by default, it's disabled.
The replay lag time setting has a default setting of 0 days, and a maximum setting of 14 days.
Lagged copies aren't considered highly available copies. Instead, they are designed for disaster recovery purposes, to protect against store logical corruption.
The greater the replay lag time set, the longer the database recovery process. Depending on the number of log files that need to be replayed during recovery, and the speed at which your hardware can replay them, it may take several hours or more to recover a database.
We recommend that you determine whether lagged copies are critical for your overall disaster recovery strategy. If using them is critical to your strategy, we recommend using multiple lagged copies, or using a redundant array of independent disks (RAID) to protect a single lagged copy, if you don't have multiple lagged copies. If you lose a disk or if corruption occurs, you don't lose your lagged point in time.
Lagged copies aren't patchable with the ESE single page restore feature. If a lagged copy encounters database page corruption (for example, a -1018 error), it will have to be reseeded (which will lose the lagged aspect of the copy).

Activating and recovering a lagged mailbox database copy is an easy process if you want the database to replay all log files and make the database copy current. If you want to replay log files up to a specific point in time, it's a more difficult operation because you manually manipulate log files and run Exchange Server Database Utilities (Eseutil.exe).

For detailed steps about how to activate a lagged mailbox database copy, see Activate a Lagged Mailbox Database Copy.

Truncation lag time

Truncation lag time is a property of a mailbox database copy that specifies the amount of time, in minutes, to delay log deletion for the database copy after the log file has been replayed into the database copy. The truncation lag timer starts when a log file has been replicated to the passive copy, successfully passed inspection, and has been successfully replayed into the copy of the database. By delaying the truncation of log files from the database copy, you have the capability to recover from failures that affect the log files for the active copy of the database.

Database copies and log truncation

Log truncation works the same in Exchange 2013 as it did in Exchange 2010. Truncation behavior is determined by the replay lag time and truncation lag time settings for the copy.

The following criteria must be met for a database copy's log file to be truncated when lag settings are left at their default values of 0 (disabled):

The log file must have been successfully backed up, or circular logging must be enabled.
The log file must be below the checkpoint (the minimum log file required for recovery) for the database.
All other lagged copies must have inspected the log file.
All other copies (not lagged copies) must have replayed the log file.

The following criteria must be met for truncation to occur for a lagged database copy:

The log file must be below the checkpoint for the database.
The log file must be older than ReplayLagTime + TruncationLagTime.
The log file must have been truncated on the active copy.
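The three criteria for a lagged copy can be read as a single predicate. The following Python sketch is illustrative only: the function name and all of its arguments are hypothetical and don't correspond to any Exchange API. It assumes that "below the checkpoint" means a lower log generation number, and that the age of a log file is measured from a timestamp supplied by the caller.

```python
from datetime import datetime, timedelta


def lagged_copy_log_can_truncate(log_generation, checkpoint_generation,
                                 log_time, now,
                                 replay_lag, truncation_lag,
                                 truncated_on_active):
    """Return True only when all three lagged-copy criteria hold.

    Hypothetical model of the documented rules, not an Exchange API.
    """
    # Criterion 1: the log file is below the checkpoint for the database.
    below_checkpoint = log_generation < checkpoint_generation
    # Criterion 2: the log file is older than ReplayLagTime + TruncationLagTime.
    old_enough = (now - log_time) > (replay_lag + truncation_lag)
    # Criterion 3: the log file has already been truncated on the active copy.
    return below_checkpoint and old_enough and truncated_on_active


if __name__ == "__main__":
    now = datetime(2013, 1, 10, 12, 0)
    eligible = lagged_copy_log_can_truncate(
        log_generation=100, checkpoint_generation=200,
        log_time=now - timedelta(days=8), now=now,
        replay_lag=timedelta(days=7), truncation_lag=timedelta(hours=12),
        truncated_on_active=True)
    print("eligible for truncation:", eligible)
```

With a 7-day replay lag and a 12-hour truncation lag, a log file in this model has to be more than 7.5 days old before it can be removed, which is why lag settings directly affect storage sizing.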

Database activation policy

There are scenarios in which you may want to create a mailbox database copy and prevent the system from automatically activating that copy in the event of a failure, for example:

If you deploy one or more mailbox database copies to an alternate or standby datacenter.
If you configure a lagged database copy for recovery purposes.
If you are performing maintenance or an upgrade of a server.

In each of the preceding scenarios, you have database copies that you don't want the system to activate automatically. To prevent the system from automatically activating a mailbox database copy, you can configure the copy to be blocked (suspended) for activation. This allows the system to maintain the currency of the database through log shipping and replay, but prevents the system from automatically activating and using the copy. Copies blocked for activation must be manually activated by an administrator. You can configure the database activation policy for an entire server by using the Set-MailboxServer cmdlet or an individual database copy by using the Set-MailboxDatabaseCopy cmdlet to set the DatabaseCopyAutoActivationPolicy parameter to Blocked.

For more information about configuring database activation policy, see Configure Activation Policy for a Mailbox Database Copy.

Effect of mailbox moves on continuous replication

On a very busy mailbox database with a high log generation rate, there is a greater chance for data loss if replication to the passive database copies can't keep up with log generation. One scenario that can introduce a high log generation rate is mailbox moves. Exchange 2013 includes a Data Guarantee API that's used by services such as the Microsoft Exchange Mailbox Replication service (MRS) to check the health of the database copy architecture based on the value of the DataMoveReplicationConstraint parameter that was set by the system or an administrator. Specifically, the Data Guarantee API can be used to:

Check replication health: Confirms that the prerequisite number of database copies is available.
Check replication flush: Confirms that the required log files have been replayed against the prerequisite number of database copies.

When executed, the API returns the following status information to the calling application:

Retry: Signifies that there are transient errors that prevent a condition from being checked against the database.
Satisfied: Signifies that the database meets the required conditions or the database isn't replicated.
NotSatisfied: Signifies that the database doesn't meet the required conditions. In addition, information is provided to the calling application as to why the NotSatisfied response was returned.

The value of the DataMoveReplicationConstraint parameter for the mailbox database determines how many database copies should be evaluated as part of the request. The DataMoveReplicationConstraint parameter has the following possible values:

None: When you create a mailbox database, this value is set by default. When this value is set, the Data Guarantee API conditions are ignored. This setting should be used only for mailbox databases that aren't replicated.
SecondCopy: This is the default value when you add the second copy of a mailbox database. When this value is set, at least one passive database copy must meet the Data Guarantee API conditions.
SecondDatacenter: When this value is set, at least one passive database copy in another Active Directory site must meet the Data Guarantee API conditions.
AllDatacenters: When this value is set, at least one passive database copy in each Active Directory site must meet the Data Guarantee API conditions.
AllCopies: When this value is set, all copies of the mailbox database must meet the Data Guarantee API conditions.

Check Replication Health

When the Data Guarantee API is executed to evaluate the health of the database copy infrastructure, several items are evaluated.

DataMoveReplicationConstraint value and requirement

SecondCopy: At least one passive database copy for a replicated database must meet the conditions listed below.
SecondDatacenter: At least one passive database copy in another Active Directory site must meet the conditions listed below.
AllDatacenters: The active copy must be mounted, and a passive copy in each Active Directory site must meet the conditions listed below.
AllCopies: The active copy must be mounted, and all passive database copies must meet the conditions listed below.

Conditions

The passive database copy must:
Be healthy.
Have a replay queue within 10 minutes of the replay lag time.
Have a copy queue length less than 10 logs.
Have an average copy queue length less than 10 logs. The average copy queue length is computed based on the number of times the application has queried the database status.
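The health conditions and the per-constraint requirements above can be modeled in a few lines. The following Python sketch is a simplified illustration, not the Data Guarantee API itself: the PassiveCopy class, its field names, and both functions are hypothetical. It assumes that "within 10 minutes of the replay lag time" means the replay backlog doesn't exceed the configured lag by more than 10 minutes, that "each Active Directory site" means each site hosting a passive copy, and it returns a plain boolean rather than the Retry, Satisfied, and NotSatisfied statuses.

```python
from dataclasses import dataclass


@dataclass
class PassiveCopy:
    """Hypothetical snapshot of one passive database copy."""
    site: str
    healthy: bool
    replay_backlog_minutes: float   # how far replay trails the active copy
    replay_lag_minutes: float       # configured replay lag time
    copy_queue_length: int
    avg_copy_queue_length: float


def copy_meets_conditions(copy):
    """Apply the four documented conditions to a single passive copy."""
    return (copy.healthy
            and copy.replay_backlog_minutes <= copy.replay_lag_minutes + 10
            and copy.copy_queue_length < 10
            and copy.avg_copy_queue_length < 10)


def replication_health(constraint, active_site, active_mounted, passives):
    """Return True when the constraint's requirement is met (simplified)."""
    good = [c for c in passives if copy_meets_conditions(c)]
    if constraint == "None":
        return True  # conditions are ignored for non-replicated databases
    if constraint == "SecondCopy":
        return len(good) >= 1
    if constraint == "SecondDatacenter":
        return any(c.site != active_site for c in good)
    if constraint == "AllDatacenters":
        sites = {c.site for c in passives}
        return active_mounted and all(
            any(g.site == s for g in good) for s in sites)
    if constraint == "AllCopies":
        return active_mounted and len(good) == len(passives)
    raise ValueError("unknown constraint: " + constraint)


if __name__ == "__main__":
    local = PassiveCopy("SiteA", True, 5, 0, 2, 1.5)
    remote = PassiveCopy("SiteB", False, 5, 0, 2, 1.5)
    for value in ("SecondCopy", "SecondDatacenter", "AllCopies"):
        print(value, replication_health(value, "SiteA", True, [local, remote]))
```

In this example only the copy in the active copy's own site is healthy, so a SecondCopy constraint is met while the stricter SecondDatacenter and AllCopies constraints are not.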

Check Replication Flush

The Data Guarantee API can also be used to validate that a prerequisite number of database copies have replayed the required transaction logs. This is verified by comparing the last log replayed timestamp with that of the calling service's commit timestamp (in most cases, this is the timestamp of the last log file that contains required data) plus an additional five seconds (to deal with system time clock skews or drift). If the replay timestamp is greater than the commit timestamp, the DataMoveReplicationConstraint parameter is satisfied. If the replay timestamp is less than the commit timestamp, the DataMoveReplicationConstraint isn't satisfied.
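The timestamp comparison described above is small enough to sketch directly. The following Python example is a hedged illustration with hypothetical function names; it assumes the five-second allowance is added to the commit timestamp before the comparison, and that the caller supplies the number of copies the constraint requires.

```python
from datetime import datetime, timedelta

# Allowance for clock skew or drift between servers, as described in the text.
CLOCK_SKEW_ALLOWANCE = timedelta(seconds=5)


def copy_has_flushed(last_log_replayed_time, commit_time):
    """True when the copy has replayed past the commit time plus allowance."""
    return last_log_replayed_time > commit_time + CLOCK_SKEW_ALLOWANCE


def flush_satisfied(replay_times, commit_time, required_copies):
    """True when enough copies have replayed the required logs."""
    flushed = sum(1 for t in replay_times if copy_has_flushed(t, commit_time))
    return flushed >= required_copies


if __name__ == "__main__":
    commit = datetime(2013, 1, 1, 12, 0, 0)
    replay_times = [commit + timedelta(seconds=10),
                    commit - timedelta(seconds=1)]
    print("one copy required:", flush_satisfied(replay_times, commit, 1))
    print("two copies required:", flush_satisfied(replay_times, commit, 2))
```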

Before moving large numbers of mailboxes to or from replicated databases within a DAG, we recommend that you configure the DataMoveReplicationConstraint parameter on each mailbox database according to the following:

Mailbox databases that don't have any database copies: None
A DAG within a single Active Directory site: SecondCopy
A DAG in multiple datacenters using a stretched Active Directory site: SecondCopy
A DAG that spans two Active Directory sites, and you will have highly available database copies in each site: SecondDatacenter
A DAG that spans two Active Directory sites, and you will have only lagged database copies in the second site: SecondCopy. This is because the Data Guarantee API won't guarantee data being committed until the log file is replayed into the database copy, and due to the nature of the database copy being lagged, this constraint will fail the move request, unless the lagged database copy ReplayLagTime value is less than 30 minutes.
A DAG that spans three or more Active Directory sites, and each site will contain highly available database copies: AllDatacenters

Balancing database copies

Due to the inherent nature of DAGs, as the result of database switchovers and failovers, active mailbox database copies will change hosts several times throughout a DAG's lifetime. As a result, DAGs can become unbalanced in terms of active mailbox database copy distribution. The following table shows an example of a DAG that has 16 databases with four copies of each database (for a total of 16 database copies on each server) with an uneven distribution of active database copies.

DAG with unbalanced active copy distribution

Server  Active databases  Passive databases  Mounted databases  Dismounted databases  Preference count list
EX1     5                 11                 5                  0                     4, 4, 3, 5
EX2     1                 15                 1                  0                     1, 8, 6, 1
EX3     12                4                  12                 0                     13, 2, 1, 0
EX4     1                 15                 1                  0                     1, 1, 5, 9

In the preceding example, there are four copies of each database, and therefore, only four possible values for activation preference (1, 2, 3, or 4). The Preference count list column shows the count of the number of databases with each of these values. For example, on EX3, there are 13 database copies with an activation preference of 1, two copies with an activation preference of 2, one copy with an activation preference of 3, and no copies with an activation preference of 4.

As you can see, this DAG isn't balanced in terms of the number of active databases hosted by each DAG member, the number of passive databases hosted by each DAG member, or the activation preference count of the hosted databases.
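The preference count list is simply a tally of activation preference values for the copies hosted on one server. As a minimal sketch (the function name is hypothetical, and the input is a plain list of integers rather than anything returned by an Exchange cmdlet), it could be computed like this in Python:

```python
from collections import Counter


def preference_count_list(activation_preferences, copies_per_database):
    """Count how many hosted copies have each activation preference value.

    activation_preferences: one integer per database copy on the server.
    copies_per_database: the highest possible preference value.
    """
    counts = Counter(activation_preferences)
    return [counts[p] for p in range(1, copies_per_database + 1)]


if __name__ == "__main__":
    # EX3 from the example: 13 copies with preference 1, two with 2, one with 3.
    ex3 = [1] * 13 + [2] * 2 + [3]
    print(preference_count_list(ex3, 4))
```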

You can use the RedistributeActiveDatabases.ps1 script to balance the active mailbox database copies across a DAG. This script moves databases between their copies in an attempt to have an equal number of mounted databases on each server in the DAG. If required, the script also attempts to balance active databases across sites.

The script provides two options for balancing active database copies within a DAG:

BalanceDbsByActivationPreference: When this option is specified, the script attempts to move databases to their most preferred copy (based on activation preference) without regard to the Active Directory site.
BalanceDbsBySiteAndActivationPreference: When this option is specified, the script attempts to move active databases to their most preferred copy, while also trying to balance active databases within each Active Directory site.

After running the script with the first option, the preceding unbalanced DAG becomes balanced, as shown in the following table.

DAG with balanced active copy distribution

Server  Active databases  Passive databases  Mounted databases  Dismounted databases  Preference count list
EX1     4                 12                 4                  0                     4, 4, 4, 4
EX2     4                 12                 4                  0                     4, 4, 4, 4
EX3     4                 12                 4                  0                     4, 4, 4, 4
EX4     4                 12                 4                  0                     4, 4, 4, 4

As shown in the preceding table, this DAG is now balanced in terms of number of active and passive databases on each server and activation preference across the servers.

The following table lists the available parameters for the RedistributeActiveDatabases.ps1 script.

RedistributeActiveDatabases.ps1 script parameters

Parameter: Description
DagName: Specifies the name of the DAG you want to rebalance. If this parameter is omitted, the DAG of which the local server is a member is used.
BalanceDbsByActivationPreference: Specifies that the script should move databases to their most preferred copy without regard to the Active Directory site.
BalanceDbsBySiteAndActivationPreference: Specifies that the script should attempt to move active databases to their most preferred copy, while also trying to balance active databases within each Active Directory site.
ShowFinalDatabaseDistribution: Specifies that a report of current database distribution be displayed after redistribution is complete.
AllowedDeviationFromMeanPercentage: Specifies the allowed variation of active databases across sites, expressed as a percentage. The default is 20%. For example, if there were 99 databases distributed between three sites, the ideal distribution would be 33 databases in each site. If the allowed deviation is 20%, the script attempts to balance the databases so that each site has no more than 10% more or less than this number. 10% of 33 is 3.3, which is rounded up to 4. Therefore, the script attempts to have between 29 and 37 databases in each site.
ShowDatabaseCurrentActives: Specifies that the script produce a report for each database detailing how the database was moved and whether it's now active on its most-preferred copy.
ShowDatabaseDistributionByServer: Specifies that the script produce a report for each server showing its database distribution.
RunOnlyOnPAM: Specifies that the script run only on the DAG member that currently has the PAM role. The script verifies it's being run from the PAM. If it isn't being run from the PAM, the script exits.
LogEvents: Specifies that the script logs an event (MsExchangeRepl event 4115) containing a summary of the actions.
IncludeNonReplicatedDatabases: Specifies that the script should include non-replicated databases (databases without copies) when determining how to redistribute the active databases. Although non-replicated databases can't be moved, they may affect the distribution of the replicated databases.
Confirm: The Confirm switch can be used to suppress the confirmation prompt that appears by default when this script is run. To suppress the confirmation prompt, use the syntax -Confirm:$False. You must include a colon ( : ) in the syntax.
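The worked example for AllowedDeviationFromMeanPercentage (99 databases, three sites, 20%) can be checked with a short calculation. The Python sketch below reproduces only the arithmetic in that description; the function name is hypothetical, and the halving of the percentage and the rounding up are inferred from the worked example rather than taken from the script's source.

```python
import math


def allowed_site_range(total_databases, site_count, allowed_deviation_pct=20):
    """Return the (low, high) number of active databases targeted per site.

    Assumes the allowed deviation is split evenly above and below the mean,
    and that the resulting margin is rounded up to a whole database.
    """
    ideal = total_databases / site_count
    margin = math.ceil(ideal * (allowed_deviation_pct / 2) / 100)
    return ideal - margin, ideal + margin


if __name__ == "__main__":
    low, high = allowed_site_range(99, 3, 20)
    print("target range per site:", low, "to", high)
```

For 99 databases in three sites, the ideal is 33 per site, 10% of 33 is 3.3, and rounding up gives a margin of 4, which matches the range of 29 to 37 given in the parameter description.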

RedistributeActiveDatabases.ps1 examples

This example shows the current database distribution for a DAG, including preference count list.

RedistributeActiveDatabases.ps1 -DagName DAG1 -ShowDatabaseDistributionByServer | Format-Table

This example redistributes and balances the active mailbox database copies in a DAG using activation preference without prompting for input.

RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -Confirm:$False

This example redistributes and balances the active mailbox database copies in a DAG using activation preference, and produces a summary of the distribution.

RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -ShowFinalDatabaseDistribution

Monitoring database copies

A database copy is your first defense if a failure occurs that affects the active copy of a database. It's therefore critical to monitor the health and status of database copies to ensure that they will be available when needed. You can view a variety of information, including copy queue length, replay queue length, status, and content index state information, by examining the details of a database copy in the EAC. You can also use the Get-MailboxDatabaseCopyStatus cmdlet in the Shell to view a variety of status information for a database copy.

For more information about monitoring database copies, see Monitoring Database Availability Groups.

Removing a database copy

A database copy can be removed at any time by using the EAC or by using the Remove-MailboxDatabaseCopy cmdlet in the Shell. After removing a database copy, you must manually delete any database and transaction log files from the server from which the database copy is being removed. For detailed steps about how to remove a database copy, see Remove a Mailbox Database Copy.

Database switchovers

The Mailbox server that hosts the active copy of a database is referred to as the mailbox database master. The process of activating a passive database copy changes the mailbox database master for the database and turns the passive copy into the new active copy. This process is called a database switchover. In a database switchover, the active copy of a database is dismounted on one Mailbox server and a passive copy of that database is mounted as the new active mailbox database on another Mailbox server. When performing a switchover, you can optionally override the database mount dial setting on the new mailbox database master.

You can quickly identify which Mailbox server is the current mailbox database master by reviewing the right-hand column under the Database Copies tab in the EAC. You can perform a switchover by using the Activate link in the EAC, or by using the Move-ActiveMailboxDatabase cmdlet in the Shell.

There are several internal checks that will be performed before activating a passive copy:

The status of the database copy is checked. If the database copy is in a failed state, the switchover is blocked. You can override this behavior and bypass the health check by using the SkipHealthChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows you to move the active copy to a database copy in a failed state.
The active database copy is checked to see if it's currently a seeding source for any passive copies of the database. If the active copy is currently being used as a source for seeding, the switchover is blocked. You can override this behavior and bypass the seeding source check by using the SkipActiveCopyChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows you to move an active copy that's being used as a seeding source. Using this parameter will cause the seeding operation to be cancelled and considered failed.
The copy queue and replay queue lengths for the database copy are checked to ensure their values are within the configured criteria. Also, the database copy is verified to ensure that it isn't currently in use as a source for seeding. If the values for the queue lengths are outside the configured criteria, or if the database is currently used as a source for seeding, the switchover is blocked. You can override this behavior and bypass these checks by using the SkipLagChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows a copy to be activated that has replay and copy queues outside of the configured criteria.
The state of the search catalog (content index) for the database copy is checked. If the search catalog isn't up to date, is in an unhealthy state, or is corrupt, the switchover is blocked. You can override this behavior and bypass the search catalog check by using the SkipClientExperienceChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter causes the switchover to skip the search catalog health check. If the search catalog for the database copy you're activating is in an unhealthy or unusable state and you use this parameter to skip the catalog health check and activate the database copy, you will need to either crawl or seed the search catalog again.
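The relationship between the four pre-activation checks and the parameters that bypass them can be summarized in a small model. This Python sketch is purely illustrative: the function and its inputs are hypothetical, the keyword arguments are named after the Move-ActiveMailboxDatabase parameters mentioned above, and the real cmdlet's internal logic isn't reproduced here.

```python
def switchover_blockers(copy_failed, active_is_seeding_source,
                        queues_within_criteria, catalog_healthy,
                        skip_health_checks=False,
                        skip_active_copy_checks=False,
                        skip_lag_checks=False,
                        skip_client_experience_checks=False):
    """Return the list of checks that would block a switchover.

    Each check blocks the switchover unless its matching skip flag is set.
    """
    blockers = []
    if copy_failed and not skip_health_checks:
        blockers.append("copy status")
    if active_is_seeding_source and not skip_active_copy_checks:
        blockers.append("seeding source")
    if not queues_within_criteria and not skip_lag_checks:
        blockers.append("queue lengths")
    if not catalog_healthy and not skip_client_experience_checks:
        blockers.append("search catalog")
    return blockers


if __name__ == "__main__":
    # A copy with a stale search catalog is blocked until the check is skipped.
    print(switchover_blockers(False, False, True, False))
    print(switchover_blockers(False, False, True, False,
                              skip_client_experience_checks=True))
```

An empty list means nothing in this model prevents activation. Skipping a check doesn't repair the underlying condition, so, for example, a skipped catalog check still leaves a catalog that has to be crawled or reseeded afterward.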

When performing a database switchover, you also have the option of overriding the mount dial settings configured for the server that hosts the passive database copy being activated. Using the MountDialOverride parameter of the Move-ActiveMailboxDatabase cmdlet instructs the target server to override its own mount dial settings and use those specified by the MountDialOverride parameter.

For detailed steps about how to perform a switchover of a database copy, see Activate a Mailbox Database Copy.


Monitoring Database Availability Groups

Applies to: Exchange Server 2013

Topic Last Modified: 2012-11-16

Making sure that servers are operating reliably and that database copies are healthy are key daily objectives for messaging administrators. To help ensure the availability and reliability of your Microsoft Exchange Server 2013 organization, the hardware, Windows operating system, and Exchange 2013 services and protocols must be actively monitored.

Historically, monitoring Exchange has meant using an external application, such as System Center 2012 Operations Manager, to collect performance and event log data, and to react or provide recovery action for problems that are detected as a result of analyzing the collected data. Exchange 2010 and previous versions included health manifests and correlation engines in the form of management packs. These correlation engines would analyze the collected data and make a determination as to whether a particular component was healthy or unhealthy. In addition, System Center 2012 Operations Manager was also able to leverage the built-in test cmdlet infrastructure to run synthetic transactions against various aspects of the system to ensure the system was available.

In Exchange 2013, native, built-in monitoring and recovery actions are included in a feature called Managed Availability.

You can use the details in this topic for monitoring the health and status of mailbox database copies for database availability groups (DAGs).

Contents

Managed availability

Get-MailboxDatabaseCopyStatus cmdlet

Test-ReplicationHealth cmdlet

Crimson channel event logging

CollectOverMetrics.ps1 script

CollectReplicationMetrics.ps1 script

Managed availability

Managed availability is the integration of built-in monitoring and recovery actions with the Exchange built-in high availability platform. It's designed to detect and recover from problems as soon as they occur and are discovered by the system. Unlike previous external monitoring solutions for Exchange, managed availability doesn't try to identify or communicate the root cause of an issue. It's instead focused on recovery aspects that address three key areas of the user experience:

Availability: Can users access the service?
Latency: How is the experience for users?
Errors: Are users able to accomplish what they want?

The new architecture in Exchange 2013 makes each Exchange server an island where services on that island only service the active databases located on that server. The architectural changes in Exchange 2013 require a new approach to the availability model used by Exchange. The Mailbox and Client Access server architecture implies that any Mailbox server with an active database is in production for all services, including all protocol services. As a result, this fundamentally changes the model used to manage the protocol services.

Managed availability was conceived to address this change and to provide a native health monitoring and recovery solution. The integration of the building block architecture into a unified framework provides a powerful capability to detect failures and recover from them. Managed availability moves away from monitoring individual separate slices of the system to monitoring the end-to-end user experience, and protecting the end user's experience through recovery-oriented computing.

In Exchange 2013, client access protocols for a specific mailbox are always served from the protocol instance that's local to the active database copy. As a result, it's important that managed availability's monitoring and recovery actions take into account more than just the health of the database.

Managed availability is an internal process that runs on every Exchange 2013 server. It's implemented in the form of two services:

Exchange Health Manager Service (MSExchangeHMHost.exe): This is a controller process used to manage worker processes. It's used to build, execute, and start and stop the worker process, as needed. It's also used to recover the worker process in case that process fails, to prevent the worker process from being a single point of failure.
Exchange Health Manager Worker process (MSExchangeHMWorker.exe): This is the worker process responsible for performing the run-time tasks.

Managed availability uses persistent storage to perform its functions:

XML configuration files are used to initialize the work item definitions during startup of the worker process.
The Windows registry is used to store run-time data, such as bookmarks.
The Windows crimson channel event log infrastructure is used to store the work item results.

As illustrated in the following drawing, managed availability includes three main asynchronous components that are constantly doing work.

Managed availability

The first component is the probe engine, which is responsible for taking measurements on the server and collecting data. The results of those measurements flow into the second component, the monitor. The monitor contains all of the business logic used by the system based on what is considered healthy on the data collected. Similar to a pattern recognition engine, the monitor looks for the various different patterns on all the collected measurements, and then it decides whether something is considered healthy. Finally, there is the responder engine, which is responsible for recovery actions. When something is unhealthy, the first action is to attempt to recover that component. This could include multi-stage recovery actions; for example, the first attempt may be to restart the application pool, the second may be to restart the service, the third attempt may be to restart the server, and the subsequent attempt may be to take the server offline so that it no longer accepts traffic. If the recovery actions are unsuccessful, the system escalates the issue to a human through event log notifications.


The probe engine contains probes, checks, and notification logic. Probes are synthetic transactions performed by the system to test the end-to-end user experience. Checks are the infrastructure that performs the collection of performance data, including user traffic, and measures the collected data against thresholds that are set to determine spikes in user failures. This enables the checks infrastructure to become aware when users are experiencing issues. Finally, the notification logic enables the system to take action immediately based on a critical event, without having to wait for the results of the data collected by a probe. These are typically exceptions or conditions that can be detected and recognized without a large sample set.

Monitors query the data collected by probes to determine whether action needs to be taken based on a predefined rule set. Depending on the rule or the nature of the issue, a monitor can either initiate a responder or escalate the issue to a human via an event log entry. In addition, monitors define how long after a failure a responder is executed, as well as the workflow of the recovery action. Monitors have various states. From a system state perspective, monitors have two states:

- Healthy: The monitor is operating properly and all collected metrics are within normal operating parameters.
- Unhealthy: The monitor isn't healthy and has either initiated recovery through a responder or notified an administrator through escalation.

From an administrative perspective, monitors have additional states that appear in the Shell:

- Degraded: When a monitor is in an unhealthy state from 0 through 60 seconds, it's considered Degraded. If a monitor is unhealthy for more than 60 seconds, it is considered Unhealthy.
- Disabled: The monitor has been explicitly disabled by an administrator.
- Unavailable: The Microsoft Exchange Health service periodically queries each monitor for its state. If it doesn't get a response to the query, the monitor state becomes Unavailable.
- Repairing: An administrator sets the Repairing state to indicate to the system that corrective action is in process by a human, which allows the system and humans to differentiate between other failures that may occur at the same time corrective action is being taken (such as a database copy reseed operation).
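The 60-second rule that separates Degraded from Unhealthy can be expressed as a small function. This is only an illustration of the timing rule stated above; Get-MonitorAdminState and its parameters are hypothetical and are not part of Exchange.

```powershell
# Hypothetical helper: maps how long a monitor has been unhealthy to the
# administrative state described above. Not an Exchange cmdlet.
function Get-MonitorAdminState {
    param(
        [Parameter(Mandatory = $true)]
        [bool]$IsHealthy,

        # Seconds the monitor has been in an unhealthy state.
        [double]$SecondsUnhealthy = 0
    )

    if ($IsHealthy) { return 'Healthy' }

    # Unhealthy for 0 through 60 seconds is reported as Degraded.
    if ($SecondsUnhealthy -le 60) { return 'Degraded' }

    # Unhealthy for more than 60 seconds is reported as Unhealthy.
    return 'Unhealthy'
}

$after45 = Get-MonitorAdminState -IsHealthy $false -SecondsUnhealthy 45
$after90 = Get-MonitorAdminState -IsHealthy $false -SecondsUnhealthy 90
$after45
$after90
```

The Disabled, Unavailable, and Repairing states are left out of the sketch because they are set by an administrator or by the health service rather than derived from elapsed time.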


Get-MailboxDatabaseCopyStatus cmdlet

You can use the Get-MailboxDatabaseCopyStatus cmdlet to view status information about mailbox database copies. This cmdlet enables you to view information about all copies of a particular database, information about a specific copy of a database on a specific server, or information about all database copies on a server. The following table describes possible values for the copy status of a mailbox database copy.

Database copy status

| Database copy status | Description |
| --- | --- |
| Failed | The mailbox database copy is in a Failed state because it isn't suspended, and it isn't able to copy or replay log files. While in a Failed state and not suspended, the system will periodically check whether the problem that caused the copy status to change to Failed has been resolved. After the system has detected that the problem is resolved, and barring any other issues, the copy status will automatically change to Healthy. |
| Seeding | The mailbox database copy is being seeded, the content index for the mailbox database copy is being seeded, or both are being seeded. Upon successful completion of seeding, the copy status should change to Initializing. |
| SeedingSource | The mailbox database copy is being used as a source for a database copy seeding operation. |
| Suspended | The mailbox database copy is in a Suspended state as a result of an administrator manually suspending the database copy by running the Suspend-MailboxDatabaseCopy cmdlet. |
| Healthy | The mailbox database copy is successfully copying and replaying log files, or it has successfully copied and replayed all available log files. |
| ServiceDown | The Microsoft Exchange Replication service isn't available or running on the server that hosts the mailbox database copy. |
| Initializing | The mailbox database copy is in an Initializing state when a database copy has been created, when the Microsoft Exchange Replication service is starting or has just been started, and during transitions from Suspended, ServiceDown, Failed, Seeding, or SinglePageRestore to another state. While in this state, the system is verifying that the database and log stream are in a consistent state. In most cases, the copy status will remain in the Initializing state for about 15 seconds, and it should generally not be in this state for longer than 30 seconds. |
| Resynchronizing | The mailbox database copy and its log files are being compared with the active copy of the database to check for any divergence between the two copies. The copy status will remain in this state until any divergence is detected and resolved. |
| Mounted | The active copy is online and accepting client connections. Only the active copy of a mailbox database can have a copy status of Mounted. |
| Dismounted | The active copy is offline and not accepting client connections. Only the active copy of a mailbox database can have a copy status of Dismounted. |
| Mounting | The active copy is coming online and not yet accepting client connections. Only the active copy of a mailbox database can have a copy status of Mounting. |
| Dismounting | The active copy is going offline and terminating client connections. Only the active copy of a mailbox database can have a copy status of Dismounting. |
| DisconnectedAndHealthy | The mailbox database copy is no longer connected to the active database copy, and it was in the Healthy state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy. |
| DisconnectedAndResynchronizing | The mailbox database copy is no longer connected to the active database copy, and it was in the Resynchronizing state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy. |
| FailedAndSuspended | The Failed and Suspended states have been set simultaneously by the system because a failure was detected, and because resolution of the failure explicitly requires administrator intervention. An example is if the system detects unrecoverable divergence between the active mailbox database and a database copy. Unlike the Failed state, the system won't periodically check whether the problem has been resolved, and automatically recover. Instead, an administrator must intervene to resolve the underlying cause of the failure before the database copy can be transitioned to a healthy state. |
| SinglePageRestore | This state indicates that a single page restore operation is occurring on the mailbox database copy. |

The Get-MailboxDatabaseCopyStatus cmdlet also includes a parameter called ConnectionStatus, which returns details about the in-use replication networks. If you use this parameter, two additional output fields, IncomingLogCopyingNetwork and SeedingNetwork, will be populated in the task's output.
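Because a copy that needs no attention is normally either Healthy (passive) or Mounted (active), the status values in the table lend themselves to a simple filter. The sketch below is one possible way to do that. Select-ProblemCopy is a hypothetical helper, and the sample objects only stand in for cmdlet output so that the block runs without an Exchange server; it assumes the output objects expose Name and Status properties.

```powershell
# Hypothetical helper: passes through only the copies whose status is not
# one of the expected values.
function Select-ProblemCopy {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Copy,

        # Statuses that normally need no attention.
        [string[]]$ExpectedStatus = @('Healthy', 'Mounted')
    )
    process {
        if ($ExpectedStatus -notcontains [string]$Copy.Status) { $Copy }
    }
}

# Sample objects standing in for Get-MailboxDatabaseCopyStatus output.
$sample = @(
    [pscustomobject]@{ Name = 'DB1\MBX1'; Status = 'Mounted' }
    [pscustomobject]@{ Name = 'DB1\MBX2'; Status = 'Healthy' }
    [pscustomobject]@{ Name = 'DB2\MBX2'; Status = 'FailedAndSuspended' }
)

$problems = @($sample | Select-ProblemCopy)
$problems | Format-Table Name, Status

# On a Mailbox server, the same filter could be applied to live data:
# Get-MailboxDatabaseCopyStatus -Server MBX2 | Select-ProblemCopy
```

Transient states such as Initializing or Resynchronizing will also be reported by this filter, so a single hit is a prompt to look again rather than proof of a fault.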

Get-MailboxDatabaseCopyStatus examples

The following examples use the Get-MailboxDatabaseCopyStatus cmdlet. Each example pipes the results to the Format-List cmdlet to display the output in list format.

This example returns status information for all copies of the database DB2.

Get-MailboxDatabaseCopyStatus -Identity DB2 | Format-List

This example returns the status for all database copies on the Mailbox server MBX2.

Get-MailboxDatabaseCopyStatus -Server MBX2 | Format-List

This example returns the status for all database copies on the local Mailbox server.

Get-MailboxDatabaseCopyStatus -Local | Format-List

This example returns status, log shipping, and seeding network information for the database DB3 on the Mailbox server MBX1.

Get-MailboxDatabaseCopyStatus -Identity DB3\MBX1 -ConnectionStatus | Format-List

For more information about using the Get-MailboxDatabaseCopyStatus cmdlet, see Get-MailboxDatabaseCopyStatus.

Test-ReplicationHealth cmdlet

You can use the Test-ReplicationHealth cmdlet to view continuous replication status information about mailbox database copies. This cmdlet can be used to check all aspects of the replication and replay status to provide a complete overview of a specific Mailbox server in a DAG.

The Test-ReplicationHealth cmdlet is designed for the proactive monitoring of continuous replication and the continuous replication pipeline, the availability of Active Manager, and the health and status of the underlying cluster service, quorum, and network components. It can be run locally on or remotely against any Mailbox server in a DAG. The Test-ReplicationHealth cmdlet performs the tests listed in the following table.

Test-ReplicationHealth cmdlet tests

| Test name | Description |
| --- | --- |
| ClusterService | Verifies that the Cluster service is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server. |
| ReplayService | Verifies that the Microsoft Exchange Replication service is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server. |
| ActiveManager | Verifies that the instance of Active Manager running on the specified DAG member, or if no DAG member is specified, the local server, is in a valid role (primary, secondary, or stand-alone). |
| TasksRpcListener | Verifies that the tasks remote procedure call (RPC) server is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server. |
| TcpListener | Verifies that the TCP log copy listener is running and reachable on the specified DAG member, or if no DAG member is specified, on the local server. |
| DagMembersUp | Verifies that all DAG members are available, running, and reachable. |
| ClusterNetwork | Verifies that all cluster-managed networks on the specified DAG member, or if no DAG member is specified, the local server, are available. |
| QuorumGroup | Verifies that the default cluster group (quorum group) is in a healthy and online state. |
| FileShareQuorum | Verifies that the witness server and witness directory and share configured for the DAG are reachable. |
| DBCopySuspended | Checks whether any mailbox database copies are in a state of Suspended on the specified DAG member, or if no DAG member is specified, on the local server. |
| DBCopyFailed | Checks whether any mailbox database copies are in a state of Failed on the specified DAG member, or if no DAG member is specified, on the local server. |
| DBInitializing | Checks whether any mailbox database copies are in a state of Initializing on the specified DAG member, or if no DAG member is specified, on the local server. |
| DBDisconnected | Checks whether any mailbox database copies are in a state of Disconnected on the specified DAG member, or if no DAG member is specified, on the local server. |
| DBLogCopyKeepingUp | Verifies that log copying and inspection by the passive copies of databases on the specified DAG member, or if no DAG member is specified, on the local server, are able to keep up with log generation activity on the active copy. |
| DBLogReplayKeepingUp | Verifies that replay activity for the passive copies of databases on the specified DAG member, or if no DAG member is specified, on the local server, is able to keep up with log copying and inspection activity. |

Test-ReplicationHealth example

This example uses the Test-ReplicationHealth cmdlet to test the health of replication for the Mailbox server MBX1.

Test-ReplicationHealth -Identity MBX1
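When many tests are run, it can help to show only those that did not pass. The following sketch illustrates one way to do that under stated assumptions: Select-FailedCheck is a hypothetical helper, and the property names Server, Check, and Result, as well as the value Passed, are assumptions about the shape of the cmdlet's output that should be confirmed on a real server. The sample objects make the block runnable without Exchange.

```powershell
# Hypothetical helper: passes through only the checks that did not pass.
# Assumes each result object has a Result property that renders as 'Passed'
# when the test succeeds.
function Select-FailedCheck {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $CheckResult
    )
    process {
        if ([string]$CheckResult.Result -ne 'Passed') { $CheckResult }
    }
}

# Sample objects standing in for Test-ReplicationHealth output.
$sampleChecks = @(
    [pscustomobject]@{ Server = 'MBX1'; Check = 'ClusterService'; Result = 'Passed' }
    [pscustomobject]@{ Server = 'MBX1'; Check = 'ReplayService';  Result = 'Passed' }
    [pscustomobject]@{ Server = 'MBX1'; Check = 'DBCopyFailed';   Result = 'Failed' }
)

$failedChecks = @($sampleChecks | Select-FailedCheck)
$failedChecks | Format-Table Server, Check, Result

# On a DAG member, the same filter might be applied as:
# Test-ReplicationHealth -Identity MBX1 | Select-FailedCheck
```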


Crimson channel event logging

Windows includes two categories of event logs: Windows logs, and Applications and Services logs. The Windows logs category includes the event logs available in previous versions of Windows: Application, Security, and System event logs. It also includes two new logs: the Setup log and the ForwardedEvents log. Windows logs are intended to store events from legacy applications and events that apply to the entire system.

Applications and Services logs are a new category of event logs. These logs store events from a single application or component rather than events that might have system-wide impact. This new category of event logs is referred to as an application's crimson channel.

The Applications and Services logs category includes four subtypes: Admin, Operational, Analytic, and Debug logs. Events in Admin logs are of particular interest if you use event log records to troubleshoot problems. Events in the Admin log should provide you with guidance about how to respond to the events. Events in the Operational log are also useful, but may require more interpretation. Analytic and Debug logs aren't as user friendly. Analytic logs (which by default are hidden and disabled) store events that trace an issue, and often a high volume of events are logged. Debug logs are used by developers when debugging applications.

Exchange 2013 logs events to crimson channels in the Applications and Services logs area. You can view these channels by performing these steps:

1. Open Event Viewer.
2. In the console tree, navigate to Applications and Services Logs > Microsoft > Exchange.
3. Under Exchange, select a crimson channel: HighAvailability or MailboxDatabaseFailureItems.

The HighAvailability channel contains events related to startup and shutdown of the Microsoft Exchange Replication service, and the various components that run within the Microsoft Exchange Replication service, such as Active Manager, the third-party synchronous replication API, the tasks RPC server, TCP listener, and Volume Shadow Copy Service (VSS) writer. The HighAvailability channel is also used by Active Manager to log events related to Active Manager role monitoring and database action events, such as a database mount operation and log truncation, and to record events related to the DAG's underlying cluster.

The MailboxDatabaseFailureItems channel is used to log events associated with any failures that affect a replicated mailbox database.
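The same channels can also be read from the Shell with the standard Get-WinEvent cmdlet. The following is a minimal sketch: Get-ExchangeHaEvent is a hypothetical wrapper, and the log name pattern Microsoft-Exchange-<channel>/Operational is an assumption about how the channels are registered; Get-WinEvent -ListLog can be used on the server to confirm the exact names.

```powershell
# Hypothetical wrapper around Get-WinEvent for the two crimson channels
# named above. Not an Exchange cmdlet.
function Get-ExchangeHaEvent {
    param(
        [ValidateSet('HighAvailability', 'MailboxDatabaseFailureItems')]
        [string]$Channel = 'HighAvailability',

        # Number of most recent events to return.
        [int]$MaxEvents = 20
    )

    # Assumption: the channel is exposed as an Operational log under this name.
    $logName = "Microsoft-Exchange-$Channel/Operational"

    Get-WinEvent -LogName $logName -MaxEvents $MaxEvents |
        Select-Object TimeCreated, Id, LevelDisplayName, Message
}

# Example, run on a DAG member:
# Get-ExchangeHaEvent -Channel MailboxDatabaseFailureItems -MaxEvents 50
```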


CollectOverMetrics.ps1 script

Exchange 2013 includes a script called CollectOverMetrics.ps1, which can be found in the Scripts folder. CollectOverMetrics.ps1 reads DAG member event logs to gather information about database operations (such as database mounts, moves, and failovers) over a specific time period. For each operation, the script records the following information:

- Identity of the database
- Time at which the operation began and ended
- Servers on which the database was mounted at the start and finish of the operation
- Reason for the operation
- Whether the operation was successful, and if the operation failed, the error details

The script writes this information to .csv files with one operation per row. It writes a separate .csv file for each DAG.

The script supports parameters that allow you to customize the script's behavior and output. For example, the results can be restricted to a specified subset by using the Database or ReportFilter parameters. Only the operations that match these filters will be included in the summary HTML report. The available parameters are listed in the following table.

CollectOverMetrics.ps1 script parameters

| Parameter | Description |
| --- | --- |
| DatabaseAvailabilityGroup | Specifies the name of the DAG from which you want to collect metrics. If this parameter is omitted, the DAG of which the local server is a member will be used. Wildcard characters can be used to collect information from and report on multiple DAGs. |
| Database | Provides a list of databases for which the report needs to be generated. Wildcard characters are supported, for example, -Database:"DB1","DB2" or -Database:"DB*". |
| StartTime | Specifies the start of the time period to report on. The script gathers only the events logged during this period. As a result, the script may capture partial operation records (for example, only the end of an operation at the start of the period or vice versa). If neither StartTime nor EndTime is specified, the script defaults to the past 24 hours. If only one parameter is specified, the period will be 24 hours, either beginning or ending at the specified time. |
| EndTime | Specifies the end of the time period to report on. The script gathers only the events logged during this period. As a result, the script may capture partial operation records (for example, only the end of an operation at the start of the period or vice versa). If neither StartTime nor EndTime is specified, the script defaults to the past 24 hours. If only one parameter is specified, the period will be 24 hours, either beginning or ending at the specified time. |
| ReportPath | Specifies the folder used to store the results of event processing. If this parameter is omitted, the Scripts folder will be used. When specified, the script takes a list of .csv files generated by the script and uses them as the source data to generate a summary HTML report. The report is the same one that's generated with the -GenerateHtmlReport option. The files can be generated across multiple DAGs at many different times, or even with overlapping times, and the script will merge all of their data together. |
| GenerateHtmlReport | Specifies that the script gather all the information it has recorded, group the data by the operation type, and then generate an HTML file that includes statistics for each of these groups. The report includes the total number of operations in each group, the number of operations that failed, and statistics for the time taken within each group. The report also contains a breakdown of the types of errors that resulted in failed operations. |
| ShowHtmlReport | Specifies that the generated HTML report should be displayed in a Web browser after it's generated. |
| SummariseCsvFiles | Specifies that the script read the data from existing .csv files that were previously generated by the script. This data is then used to generate a summary report similar to the report generated by the GenerateHtmlReport parameter. |
| ActionType | Specifies the type of operational actions the script should collect. The values for this parameter are Move, Mount, Dismount, and Remount. The Move value refers to any time that the database changes its active server, whether by controlled moves or by failovers. The Mount, Dismount, and Remount values refer to times that the database changes its mounted status without moving to another computer. |
| ActionTrigger | Specifies which administrative operations should be collected by the script. The values for this parameter are Admin or Automatic. Automatic actions are those performed automatically by the system (for example, a failover when a server goes offline). Admin actions are any actions that were performed by an administrator using either the Exchange Management Shell or the Exchange Administration Center. |
| RawOutput | Specifies that the script writes the results that would have been written to .csv files directly to the output stream, as would happen with Write-Output. This information can then be piped to other commands. |
| IncludedExtendedEvents | Specifies that the script collects the events that provide diagnostic details of times spent mounting databases. This can be a time-consuming stage if the Application event log on the servers is large. |
| MergeCSVFiles | Specifies that the script takes all the .csv files containing data about each operation and merges them into a single .csv file. |
| ReportFilter | Specifies that a filter should be applied to the operations using the fields as they appear in the .csv files. This parameter uses the same format as a Where operation, with each element set to $_ and returning a Boolean value. For example: {$_.DatabaseName -notlike "Mailbox Database*"} can be used to exclude the default databases from the report. |

CollectOverMetrics.ps1 examples

The following example collects metrics for all databases that match DB* (which includes a wildcard character) in the DAG DAG1. After the metrics are collected, an HTML report is generated and displayed.

CollectOverMetrics.ps1 -DatabaseAvailabilityGroup DAG1 -Database:"DB*" -GenerateHTMLReport -ShowHTMLReport

The following examples demonstrate ways that the summary HTML report may be filtered. The first uses the Database parameter, which takes a list of database names. The summary report then contains data only about those databases. The next two examples use the ReportFilter option. The second example filters out all the default databases, and the third keeps only the operations in which the database was active on a server matching ServerXYZ* at the start and on a different server at the end.

CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -Database MailboxDatabase123,MailboxDatabase456
CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -ReportFilter { $_.DatabaseName -notlike "Mailbox Database*" }
CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -ReportFilter { ($_.ActiveOnStart -like "ServerXYZ*") -and ($_.ActiveOnEnd -notlike "ServerXYZ*") }

CollectReplicationMetrics.ps1 script

CollectReplicationMetrics.ps1 is another health metric script included in Exchange 2013. This script provides an active form of monitoring because it collects metrics in real time, while the script is running. CollectReplicationMetrics.ps1 collects data from performance counters related to database replication. The script gathers counter data from multiple Mailbox servers, writes each server's data to a .csv file, and then reports various statistics across all of this data (for example, the amount of time each copy was failed or suspended, the average copy or replay queue length, or the amount of time that copies were outside of their failover criteria).

You can either specify the servers individually, or you can specify entire DAGs. You can either run the script to first collect the data and then generate the report, or you can run it to just gather the data or to only report on data that's already been collected. You can specify the frequency at which data should be sampled and the total duration to gather data.

The data collected from each server is written to a file named CounterData.<ServerName>.<TimeStamp>.csv. The summary report will be written to a file named HaReplPerfReport.<DAGName>.<TimeStamp>.csv, or HaReplPerfReport.<TimeStamp>.csv if you didn't run the script with the DagName parameter.

The script starts Windows PowerShell jobs to collect the data from each server. These jobs run for the full period in which data is being collected. If you specify a large number of servers, this process can use a considerable amount of memory. The final stage of the process, when data is processed into a summary report, can also be quite time consuming for large amounts of data. It's possible to run the collection stage on one computer, and then copy the data elsewhere for processing.

The CollectReplicationMetrics.ps1 script supports parameters that allow you to customize the script's behavior and output. The available parameters are listed in the following table.

CollectReplicationMetrics.ps1 script parameters

| Parameter | Description |
| --- | --- |
| DagName | Specifies the name of the DAG from which you want to collect metrics. If this parameter is omitted, the DAG of which the local server is a member will be used. |
| DatabaseNames | Provides a list of databases for which the report needs to be generated. Wildcard characters are supported, for example, -DatabaseNames:"DB1","DB2" or -DatabaseNames:"DB*". |
| ReportPath | Specifies the folder used to store the results of event processing. If this parameter is omitted, the Scripts folder will be used. |
| Duration | Specifies the amount of time the collection process should run. Typical values would be one to three hours. Longer durations should be used only with long intervals between each sample or as a series of shorter jobs run by scheduled tasks. |
| Frequency | Specifies the frequency at which data metrics are collected. Typical values would be 30 seconds, one minute, or five minutes. Under normal circumstances, intervals that are shorter than these won't show significant changes between each sample. |
| Servers | Specifies the identity of the servers from which to collect statistics. You can specify any value, including wildcard characters or GUIDs. |
| SummariseFiles | Specifies a list of .csv files to generate a summary report. These files are the files named CounterData.<ServerName>* and are generated by the CollectReplicationMetrics.ps1 script. |
| Mode | Specifies the processing stages that the script executes. You can use the following values. CollectAndReport: This is the default value. The script both collects the data from the servers and then processes it to produce the summary report. CollectOnly: The script just collects the data and doesn't produce the report. ProcessOnly: The script imports data from a set of .csv files and processes it to produce the summary report. The SummariseFiles parameter is used to provide the script with the list of files to process. |
| MoveFilestoArchive | Specifies that the script should move the files to a compressed folder after processing. |
| LoadExchangeSnapin | Specifies that the script should load the Shell commands. This parameter is useful when the script needs to run from outside the Shell, such as in a scheduled task. |

CollectReplicationMetrics.ps1 examples

The following example gathers one hour's worth of data from all the servers in the DAG DAG1, sampled at one minute intervals, and then generates a summary report. In addition, the ReportPath parameter is used, which causes the script to place all the files in the current directory.

CollectReplicationMetrics.ps1 -DagName DAG1 -Duration "01:00:00" -Frequency "00:01:00" -ReportPath

The following example reads the data from all the files matching CounterData* and then generates a summary report.

CollectReplicationMetrics.ps1 -SummariseFiles (dir CounterData*) -Mode ProcessOnly -ReportPath
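Because collection and processing can run on different computers, the two stages can be driven with separate parameter sets. The sketch below uses only parameters named in this section and passes them by splatting; the folder C:\Metrics and the variable names are hypothetical, and the script invocations are left as comments because they require the Exchange Scripts folder.

```powershell
# Hypothetical folder for the collected .csv files and the summary report.
$reportPath = 'C:\Metrics'

# Stage 1 (on a DAG member): collect counter data only, one sample per
# minute for one hour, without producing the summary report.
$collectArgs = @{
    DagName    = 'DAG1'
    Duration   = '01:00:00'
    Frequency  = '00:01:00'
    ReportPath = $reportPath
    Mode       = 'CollectOnly'
}

# Stage 2 (on the computer the files were copied to): process existing
# CounterData files into the summary report.
$processArgs = @{
    ReportPath = $reportPath
    Mode       = 'ProcessOnly'
}

# Run from the Exchange Scripts folder:
# .\CollectReplicationMetrics.ps1 @collectArgs
# .\CollectReplicationMetrics.ps1 @processArgs -SummariseFiles (dir "$reportPath\CounterData*")
```

Splitting the run this way keeps the memory cost of the collection jobs on the DAG member and moves the time-consuming report generation elsewhere.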
