I have just completed a project in which I had to install and configure VMware vSphere Site Recovery Manager (SRM). Storage was provided by NetApp FAS and V-Series filers, so I had to use the NetApp-provided Storage Replication Adapter (SRA). At the time of writing, the latest version was 2.0.1. True to form, I ran into a couple of bugs that took a bit of figuring out.
Unable to add a controller: “Error: SRA command ‘discoverArrays’ failed”
Execute the following commands on your filers so that the SRA can reach them over HTTP:
- options httpd.admin.enable on
- options httpd.enable on
- options httpd.admin.ssl.enable off
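If you have several filers to touch, the three options above can be pushed out in one pass. What follows is only a minimal sketch: it assumes SSH access is enabled on the filers, and the hostnames, the `root` user and the helper name `build_option_commands` are hypothetical placeholders rather than anything from my environment. It only builds and prints the commands, so you can review them before running anything.

```python
import shlex

# The three options from the list above, in the same order.
FILER_OPTIONS = [
    ("httpd.admin.enable", "on"),
    ("httpd.enable", "on"),
    ("httpd.admin.ssl.enable", "off"),
]


def build_option_commands(filers, user="root"):
    """Return one ssh argument list per (filer, option) pair.

    Assumption: the filers accept SSH logins for `user`.
    Nothing is executed here; the caller decides what to do with the result.
    """
    commands = []
    for filer in filers:
        for name, value in FILER_OPTIONS:
            commands.append(["ssh", f"{user}@{filer}", f"options {name} {value}"])
    return commands


if __name__ == "__main__":
    # Hypothetical hostnames for the protected and recovery site filers.
    for argv in build_option_commands(["filer-prod.example.com", "filer-dr.example.com"]):
        print(shlex.join(argv))
```

Each printed line can be pasted into a shell, or the argument lists can be handed to `subprocess.run` once you are happy with them.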
Error when adding an Array Pair: “Internal error: std::exception 'class Dr::Xml::XmlValidateException' "Element 'SourceDevices' is not valid for content model: '(SourceDevice,)”
There are two solutions to this issue:
- Downgrade back to NetApp SRA version 2.0.0
- Manually list the volumes you want the SRA to discover. You’ll need to do this on both controllers in the pair.
This is a documented bug
Reprotect Job fails after recovering to Disaster Recovery Site
The SRM / SRA timeouts seem a bit aggressive to me. This shows up when you reprotect a failed-over Protection Group. Part of the task sequence is to reverse the direction of replication, but this fails consistently because SRM does not wait long enough for the reversal to complete.
You can kludge it by:
- Re-running the reprotect until it works
- Manually refreshing the Array Manager while the reprotect job is running
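The "re-run until it works" approach is just a retry loop with a pause between attempts. Here is a minimal sketch of that idea. The `run_reprotect` callable is hypothetical: it stands in for however you trigger the reprotect (by hand or through whatever automation you have) and is assumed to return True on success. The attempt count and delay are arbitrary illustration values, not SRM settings.

```python
import time


def retry_reprotect(run_reprotect, attempts=5, delay_seconds=60, sleep=time.sleep):
    """Call run_reprotect() until it returns True or the attempts run out.

    Returns the number of the attempt that succeeded, or None if all failed.
    `run_reprotect` is a hypothetical callable supplied by the caller.
    """
    for attempt in range(1, attempts + 1):
        if run_reprotect():
            return attempt
        # Give the replication reversal time to settle before trying again.
        if attempt < attempts:
            sleep(delay_seconds)
    return None
```

The `sleep` parameter exists only so the wait can be swapped out when trying the function without real delays.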
Recovered DataStores have snap-xxx prefixes
This is more of a cosmetic irritant than a true bug, but I wanted it fixed nonetheless.
- Within SRM, right-click your site and select Advanced Settings
- Click StorageProvider
- Select the storageProvider.fixRecoveredDatastoreNames check box
I would suggest increasing your SRM SAN provider timeout settings to something a bit more sane, such as double the defaults. Instructions can be found here.
Also make sure that the ALUA settings on your igroups are the same in both the protected and recovery sites.
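The ALUA check is easy to get wrong by eye when there are many igroups. Below is a minimal sketch of the comparison, assuming you have already collected the ALUA flag for each igroup on each site by hand (for example from the igroup listing on each filer) into plain dictionaries. The function name and the igroup names are hypothetical.

```python
def alua_is_consistent(protected, recovery):
    """Return True when every igroup on both sites has the same ALUA setting.

    `protected` and `recovery` map igroup name -> ALUA enabled (True/False).
    The data is assumed to be gathered manually; nothing is queried here.
    """
    values = set(protected.values()) | set(recovery.values())
    return len(values) <= 1


if __name__ == "__main__":
    # Hypothetical igroup names for illustration only.
    protected_site = {"esx_prod_01": True, "esx_prod_02": True}
    recovery_site = {"esx_dr_01": True, "esx_dr_02": False}
    print(alua_is_consistent(protected_site, recovery_site))
```

With the sample data above, one recovery-site igroup has ALUA off while the rest have it on, so the check reports a mismatch.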