Troubleshoot Exchange 2007 SCR

1. Determine the replication status.

From the SCR source:

Get-StorageGroupCopyStatus -StandbyMachine exchdrms01

From the SCR target:

Get-StorageGroupCopyStatus -Server exchccrms01 -StandbyMachine exchdrms01

Test-ReplicationHealth  (add the optional -Verbose switch for more detail)
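
To see the fields discussed in step 2 one per line, the status output can be piped to Format-List. This is a minimal sketch using the same example server names. The property names are the ones Get-StorageGroupCopyStatus is expected to report in Exchange 2007 SP1; check them with Get-Member if your build differs.

```powershell
# Example server names from this page; change them to match your environment.
Get-StorageGroupCopyStatus -Server exchccrms01 -StandbyMachine exchdrms01 |
    Format-List Identity, SummaryCopyStatus, CopyQueueLength, ReplayQueueLength, LastInspectedLogTime
```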

2. Interpret the results.

·         Do not confuse CCR replication results with SCR results. This tends to happen when cmdlets are run in a clustered environment without full awareness of the context they run in (for example, Get-StorageGroupCopyStatus reports on SCR only when -StandbyMachine is specified). Keep the scope and context of each cmdlet in mind when interpreting its results.

·         Note status and error messages for later reference. Pay particular attention to where the problem is, i.e. the specific database or storage group.

·         Use your judgement. Interpretation cannot be fully documented, but here are some pointers:

o   Healthy is the expected result. Everything is probably fine. Unless you have background knowledge to go on, such as normal queue lengths for the environment, this is the best interpretation you can hope for.

o   Initializing probably means you just need to be patient (e.g. after a reboot or unexpected shutdown). Trying to fix replication that is not actually broken is likely to break it, and then you really will need to fix it. If there has been no recent initialisation or shutdown, however, you have a problem that needs fixing.

o   Not Configured means that SCR is not configured for the storage group, the standby node is switched off, or you have a typo in your cmdlet.

o   Failed means something is wrong with replication. Check the application event log etc. for clues. If everything else seems OK and you cannot find any reason for the failure, or you know there was an unexpected shutdown somewhere, try the old stop/start technique (step 3).

Most replication issues can be resolved by stopping and starting replication (step 3).

Some more serious problems require re-seeding the database copies (step 4).
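
When there are many storage groups, it can help to list only those that are not Healthy. This is a sketch under the same assumptions as the commands in step 1 (example server names, and SummaryCopyStatus as the status property):

```powershell
# Show only storage group copies whose summary status is something other than Healthy.
Get-StorageGroupCopyStatus -Server exchccrms01 -StandbyMachine exchdrms01 |
    Where-Object { $_.SummaryCopyStatus -ne 'Healthy' } |
    Format-Table Identity, SummaryCopyStatus -AutoSize
```

An empty result means every copy reported Healthy at the moment the command ran.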


 For information only.

 It is unwise to attempt to interpret the following data without background information and knowledge of what constitutes normal operation in your environment.

CopyQueueLength is the number of transaction logs waiting to be copied to the SCR target. If this number keeps growing, your WAN connection may not have sufficient bandwidth to cope with the current load.

ReplayQueueLength is the number of logs in the SCR target's log directory waiting to be replayed into the database copy on the target server. Expect this number to be well above zero: SCR delays replay by the ReplayLagTime set when the copy was enabled (24 hours by default), and on top of that there is a hard-coded lag of 50 log files that cannot be changed.

LastInspectedLogTime shows the date and time of the last log inspected on the SCR target.
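
Whether CopyQueueLength is growing can only be judged from several readings taken over time. The sketch below is one way to do that. Get-QueueTrend is a hypothetical helper, not an Exchange cmdlet, and it only compares the first and last samples. The sample count and interval are arbitrary assumptions.

```powershell
# Hypothetical helper: classify a series of queue-length samples
# by comparing the last sample with the first.
function Get-QueueTrend {
    param([long[]]$Samples)
    if ($Samples.Count -lt 2) { return 'Unknown' }
    $delta = $Samples[-1] - $Samples[0]
    if ($delta -gt 0) { return 'Growing' }
    if ($delta -lt 0) { return 'Shrinking' }
    return 'Stable'
}

# Only attempt sampling where the Exchange cmdlet is available.
if (Get-Command Get-StorageGroupCopyStatus -ErrorAction SilentlyContinue) {
    $samples = @()
    # Five samples, one minute apart, for the example storage group "IT".
    1..5 | ForEach-Object {
        $status = Get-StorageGroupCopyStatus -Identity "EXCHCCRMS01\IT" -StandbyMachine EXCHDRMS01
        $samples += $status.CopyQueueLength
        Start-Sleep -Seconds 60
    }
    Get-QueueTrend -Samples $samples
}
```

A single Growing result is not proof of a bandwidth problem. Compare it with what is normal for your environment, as noted above.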

 

3. Suspend then Resume replication.

Suspend-StorageGroupCopy -Identity "EXCHCCRMS01\IT" -StandbyMachine EXCHDRMS01

 

Resume-StorageGroupCopy -Identity "EXCHCCRMS01\IT" -StandbyMachine EXCHDRMS01

 

Change the storage group name as required; the example uses a storage group named "IT".

Change the server names as required.
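
The suspend and resume commands can be wrapped so that the names are set in one place. This is a minimal sketch using the same example names. The variable names are illustrative only, and the final status check is an addition rather than part of the procedure above.

```powershell
# Illustrative variables; change the values to match your environment.
$source       = "EXCHCCRMS01"
$storageGroup = "IT"
$target       = "EXCHDRMS01"
$identity     = "$source\$storageGroup"

Suspend-StorageGroupCopy -Identity $identity -StandbyMachine $target
Resume-StorageGroupCopy  -Identity $identity -StandbyMachine $target

# Check the status of the copy after resuming.
Get-StorageGroupCopyStatus -Identity $identity -StandbyMachine $target |
    Format-List Identity, SummaryCopyStatus, CopyQueueLength, ReplayQueueLength
```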

4. Re-seed databases.

Suspend-StorageGroupCopy -Identity "EXCHCCRMS01\IT" -StandbyMachine EXCHDRMS01

 

Update-StorageGroupCopy -Identity "EXCHCCRMS01\IT" -StandbyMachine EXCHDRMS01 -DeleteExistingFiles

This re-seeds the database. Replication should resume automatically when seeding completes.

-DeleteExistingFiles deletes the existing database copy and log files on the SCR target.
Update-StorageGroupCopy must be run from the SCR target.