AZ Failover Configuration Prerequisites
Introduction
This article covers all of the requirements and recommendations for performing an Access Zone Failover with Superna DR. It's important to review this guide before attempting to execute an Access Zone Failover.
Access Zone Failover Requirements
Cluster Version Requirements - May Block Failover
Clusters participating in an Access Zone Failover must be running a PowerScale OneFS version that is supported for this feature. See the Feature Release Compatibility matrix in the release notes specific to your appliance version.
Access Zone Requirements - Blocks Failover
For an Access Zone failover with Superna DR Edition, the Access Zone (AZ) must meet the following requirements.
- The AZ must exist on the Target Cluster with the same name and authentication providers. (AZ Sync Jobs in Superna DR Edition can be used to create Access Zones automatically.)
- The AZ must be associated with at least one subnet IP pool.
- The AZ must be associated with one or more SyncIQ policies.
- For SyncIQ policies associated with the AZ by path, the SyncIQ policy source path must be at or below the AZ base path.
- In an Active-Active data replication topology, there must be a dedicated Access Zone for each replication direction.
Superna DR Edition does not support failover of an Access Zone that is shared by SyncIQ policies on both clusters in an Active-Active data replication topology, because individual SyncIQ policies cannot be partially failed back. For supported failover and failback, separate the data into distinct Access Zones, each with its own set of SyncIQ policies, so that a specific zone can be failed over and failed back independently without affecting the others.
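The dedicated-zone rule for Active-Active topologies can be checked mechanically. The following is a minimal sketch, assuming policy records are available as (zone, source cluster, target cluster) tuples; that record shape and the function name are hypothetical and are not an Eyeglass or OneFS data format.

```python
from collections import defaultdict

def zones_with_mixed_sources(policies):
    """Return Access Zone names whose policies originate on more than one cluster.

    `policies` is an iterable of (zone, source_cluster, target_cluster) tuples.
    This record shape is hypothetical, chosen only for illustration.
    """
    sources = defaultdict(set)
    for zone, source_cluster, _target_cluster in policies:
        sources[zone].add(source_cluster)
    # A zone with policies sourced from both clusters is shared across
    # replication directions, which the requirement above disallows.
    return sorted(zone for zone, clusters in sources.items() if len(clusters) > 1)

policies = [
    ("zone-a", "cluster1", "cluster2"),
    ("zone-a", "cluster2", "cluster1"),  # opposite direction in the same zone
    ("zone-b", "cluster2", "cluster1"),
]
print(zones_with_mixed_sources(policies))  # ['zone-a']
```

Only the source cluster is compared, so a zone whose policies all start on one cluster but replicate to more than one target is not flagged.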
SyncIQ Policy Requirements - Blocks Failover
For an Access Zone failover with Superna DR Edition, the SyncIQ policy(s) identified as part of the Access Zone (based on the Access Zone base path and the SyncIQ Policy source path) must meet the following requirements:
- All PowerScale OneFS SyncIQ Policy(s) must have the same Target Cluster provisioned (release 1.5.4 and earlier). In release 1.6 and later, multi-site replication allows an Access Zone policy to have a third cluster target for multi-site failover requirements. This can be a simple policy-based failover or a fully automated Access Zone multi-site failover policy.
- PowerScale OneFS SyncIQ Policy(s) target host SmartConnect Zone must be associated with a pool that is not going to be failed over.
- SyncIQ Policy(s) source root directory must be at or below the Access Zone base directory.
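The "at or below the Access Zone base directory" rule is a path-containment test. The sketch below shows one way to express it with Python's standard library; the function name and the example paths are illustrative assumptions, and the comparison is purely textual (it does not resolve symlinks or ".." components).

```python
from pathlib import PurePosixPath

def is_at_or_below(base_path, candidate):
    """Return True if `candidate` equals `base_path` or lies beneath it."""
    base = PurePosixPath(base_path)
    cand = PurePosixPath(candidate)
    # Comparing whole path components avoids the prefix trap where
    # "/ifs/data/zone10" would wrongly match "/ifs/data/zone1".
    return cand == base or base in cand.parents

# Hypothetical Access Zone base path and SyncIQ source paths.
zone_base = "/ifs/data/zone1"
for source in ("/ifs/data/zone1", "/ifs/data/zone1/projects", "/ifs/data/zone10"):
    print(source, is_at_or_below(zone_base, source))
```

A plain string `startswith` test would accept the third path, which is why the comparison is done on path components.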
DFS Mode Requirements
In DFS mode, failover does not require SmartConnect zone names. However, if your DFS mode SyncIQ policies include paths that fall within the Access Zone root, you must have a dedicated subnet and pool with SmartConnect Zones for the UNC paths used as DFS folder targets.
DFS-enabled policies that also protect NFS exports will require a separate SmartConnect Zone for NFS export data access to utilize Access Zone automation. Alternatively, manual steps or post-failover scripting will be needed to update NFS client mounts.
Shares / Exports / NFS Alias Requirements
Exports with multiple paths, where not all paths are associated with the same SyncIQ Policy, are not supported for Access Zone failover. This will result in an "A unique readiness data ..." error in Zone Readiness, preventing the Overall Status from being displayed.
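To find multi-path exports that would trip this check, each export path can be matched to the SyncIQ policy whose source path contains it. This is a rough sketch under stated assumptions: policies are given as a hypothetical name-to-source-path dictionary, and when several policies contain a path, the one with the deepest source path is taken as the owner.

```python
from pathlib import PurePosixPath

def covering_policy(path, policies):
    """Return the name of the policy with the deepest source path at or above `path`, or None."""
    target = PurePosixPath(path)
    best_name, best_depth = None, -1
    for name, source in policies.items():
        src = PurePosixPath(source)
        if (target == src or src in target.parents) and len(src.parts) > best_depth:
            best_name, best_depth = name, len(src.parts)
    return best_name

def paths_share_one_policy(export_paths, policies):
    """Return True only if every export path is covered by the same single policy."""
    owners = {covering_policy(p, policies) for p in export_paths}
    return len(owners) == 1 and None not in owners

# Hypothetical policies (name -> source path) and exports.
policies = {"pol-a": "/ifs/data/zone1/a", "pol-b": "/ifs/data/zone1/b"}
print(paths_share_one_policy(["/ifs/data/zone1/a/x", "/ifs/data/zone1/a/y"], policies))  # True
print(paths_share_one_policy(["/ifs/data/zone1/a/x", "/ifs/data/zone1/b/y"], policies))  # False
```

An export with a path that no policy covers also returns False here, since its paths are not all associated with one policy.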
SmartConnect Requirements - Blocks Failover
For an Access Zone failover with Eyeglass, the SmartConnect Zone FQDN must not exceed 50 characters. Additionally, the Pool's "SmartConnect service subnet" must be provisioned with the same subnet in which the Pool was originally created.
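The 50-character limit is easy to verify ahead of a failover. The snippet below is a small sketch; the zone names are made-up examples and the function name is hypothetical.

```python
MAX_FQDN_LENGTH = 50  # limit stated in the requirement above

def fqdns_over_limit(fqdns, limit=MAX_FQDN_LENGTH):
    """Return the names whose character count exceeds the limit."""
    return [name for name in fqdns if len(name) > limit]

# Hypothetical SmartConnect Zone names for illustration only.
names = [
    "prod.example.com",
    "a-very-long-smartconnect-zone-name.storage.datacenter.example.com",
]
for name in fqdns_over_limit(names):
    print(f"{name} is {len(name)} characters (limit {MAX_FQDN_LENGTH})")
```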
Failover Mapping Hint Requirements - Blocks Failover
For an Access Zone failover with Eyeglass, a mapping between subnet pools on the Source and Target clusters is necessary to ensure data is accessed from the correct SmartConnect Zone IP and node pool on the Target cluster after failover.
This mapping supports the failover and failback of SmartConnect zones from one IP pool to another.
Eyeglass mapping hints must meet specific requirements. Although this setup is a one-time process, it needs to be repeated if new IP pools are created.
- Eyeglass Mapping Hints are simple SmartConnect aliases, created with the isi command line or the UI, that map IP Pools and SmartConnect Names for failover between cluster pairs.
- Every subnet IP pool associated with the Access Zone being failed over must have a corresponding mapping hint. The DR Dashboard will display an error if mapping hints are missing or incorrectly configured.
- Eyeglass mapping hints on both the Source and Target cluster IP pools must be created in unique pairs. Incorrectly mapped pools will trigger an alert in the Access Zone Readiness Tab of the DR Dashboard.
- In DFS mode, SmartConnect zone names are not required for failover. However, if you have DFS mode SyncIQ policies in the Access Zone, a dedicated subnet and pool with an Eyeglass igls-ignore hint applied is needed to retain SmartConnect zones on both the Source and Target clusters.
- The Subnet Pool used in the SyncIQ Policy Restrict Source Nodes option must not have an Eyeglass mapping hint and must have an igls-ignore hint applied. Misconfiguration could cause the SmartConnect zone used by SyncIQ to fail over, negatively affecting failback operations and SyncIQ replication.
- The Subnet Pool associated with the SyncIQ Policy Target Host property must not have an Eyeglass mapping hint and must have an igls-ignore hint applied. This pool on the target cluster is used for SyncIQ replication. Failure to apply this hint can affect failback operations and SyncIQ replication if the SmartConnect name is failed over by Eyeglass.
- Ignore hints are simply aliases named “igls-ignore.” To ensure uniqueness, it's best to use a naming format that includes the cluster name, such as “igls-ignore-clustername.” This allows Eyeglass to match on igls-ignore while keeping the hint unique, avoiding SPN collisions in Active Directory if the SmartConnect alias is added to AD with the check or repair ISI command on the cluster. Although these hints are SmartConnect aliases and can be added to the AD machine account, they are not required for Kerberos authentication since they are not used to mount shares. See the section on Active Directory Machine Account Service Principal Name (SPN) Delegation for more details.
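The hint rules above can be summarized as a consistency check across the two clusters. The sketch below is illustrative only. It assumes pool aliases are available as a hypothetical pool-name-to-alias-list dictionary, that a mapping hint is any alias starting with "igls-" other than an ignore hint (the text only specifies the "igls-ignore" naming), and that "unique pairs" means one mapping hint per pool, used once on each cluster.

```python
IGNORE_PREFIX = "igls-ignore"  # from the text: Eyeglass matches on "igls-ignore"
HINT_PREFIX = "igls-"          # assumption: mapping hints share this alias prefix

def classify_pool(aliases):
    """Return ('ignore', None), ('mapped', hint), or ('error', reason) for one pool."""
    ignore = [a for a in aliases if a.startswith(IGNORE_PREFIX)]
    mapped = [a for a in aliases
              if a.startswith(HINT_PREFIX) and not a.startswith(IGNORE_PREFIX)]
    if ignore and mapped:
        return ("error", "pool has both an ignore hint and a mapping hint")
    if ignore:
        return ("ignore", None)
    if not mapped:
        return ("error", "pool has no hint")
    if len(mapped) > 1:
        return ("error", "pool has more than one mapping hint")
    return ("mapped", mapped[0])

def check_hint_pairs(source_pools, target_pools):
    """Return a list of problems; an empty list means every mapped pool has one partner."""
    problems = []
    hints = {"source": {}, "target": {}}
    for side, pools in (("source", source_pools), ("target", target_pools)):
        for pool, aliases in pools.items():
            kind, value = classify_pool(aliases)
            if kind == "error":
                problems.append(f"{side} {pool}: {value}")
            elif kind == "mapped":
                if value in hints[side]:
                    problems.append(
                        f"{side} {pool}: hint {value} already used by {hints[side][value]}")
                else:
                    hints[side][value] = pool
    # A hint present on only one cluster has no failover partner.
    for hint in sorted(set(hints["source"]) ^ set(hints["target"])):
        problems.append(f"hint {hint} has no matching pool on the other cluster")
    return problems

# Hypothetical pools: one data pool pair and one replication pool per cluster.
source = {"subnet0:prod": ["igls-prod"], "subnet0:repl": ["igls-ignore-cluster1"]}
target = {"subnet0:dr": ["igls-prod"], "subnet0:repl": ["igls-ignore-cluster2"]}
print(check_hint_pairs(source, target))  # []
```

The DR Dashboard remains the authoritative readiness check; a script like this is only a convenience for reviewing alias lists before new pools are put into service.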
Failover Target Cluster Requirements
For an Access Zone failover with Eyeglass, the PowerScale OneFS Cluster targeted for the failover must be IP reachable by Eyeglass with the necessary ports open.
Quota Job Requirements
For an Access Zone failover with Eyeglass, there are no Eyeglass Quota Job state requirements. Quotas will be failed over whether the Eyeglass Quota Job is in the Enabled or Disabled state.
Unsupported Data Replication Topology
A replication topology in which shares or NFS aliases have the same name on both clusters but are protected by different SyncIQ policies is not supported. Configuration Replication will overwrite the path of the share or alias on one cluster, leaving two SyncIQ policies on the same cluster with the same source path and causing failover to fail. This is an invalid DR configuration because it creates duplicate shares pointing to different data, and successful failover will not be possible with or without Eyeglass.
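A quick way to spot this topology is to compare share and alias names across the two clusters along with the policy protecting each. The following is a minimal sketch, assuming a hypothetical name-to-policy dictionary per cluster; it is not an Eyeglass report format.

```python
def conflicting_names(cluster_a, cluster_b):
    """Return names present on both clusters that are protected by different policies.

    Each argument maps a share or NFS alias name to the name of the SyncIQ
    policy protecting it. This input shape is hypothetical.
    """
    shared = cluster_a.keys() & cluster_b.keys()
    return sorted(name for name in shared if cluster_a[name] != cluster_b[name])

# Hypothetical inventory: "home" exists on both clusters under different policies.
cluster1 = {"home": "pol-east-home", "projects": "pol-projects"}
cluster2 = {"home": "pol-west-home", "projects": "pol-projects"}
print(conflicting_names(cluster1, cluster2))  # ['home']
```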
Recommendations for Access Zone Failover
The conditions outlined in the following section are highly recommended to ensure that all automated Access Zone failover steps can be completed. If any one of these conditions is not met, it will result in a warning.
Warnings will not block an assisted failover, but the failover process will probably require additional manual steps. Errors, however, will block a Superna DR Assisted failover from starting.
Shares / Exports / NFS Alias Recommendations
- All shares, exports, and aliases should be created within the Access Zone that is being failed over. It is not supported to have shares, exports, or aliases with a path located outside or above the Access Zone base path in the file system.
- Impact - Data Access Outage: If the path does not match the Access Zone base path, the policy will not be selected for failover, resulting in data that will not be included in the Access Zone failover.
- The Access Zones Readiness section in the DR Dashboard shows which policies have been matched to the Access Zone and should be checked to ensure that all expected SyncIQ policies are present.
- Eyeglass Configuration Replication Jobs for the SyncIQ Policies in the Access Zone being failed over should have been completed without error.
- Impact - Data Access Outage: Any missing or incorrect share / export / NFS alias information will prevent client access to data on the Target cluster. These configuration items will have to be corrected manually on the Target Cluster.
Service Principal Name Recommendations
For optimal Access Zone failover with Eyeglass, where the Access Zone contains SMB shares directly mounted using SmartConnect Zones, it is recommended to set up delegation for SPN add and delete. This allows Eyeglass to automatically update SPNs based on SmartConnect changes made during failover (refer to "How to Configure Delegation of Cluster Machine Accounts with Active Directory Users and Computers Snap-in").
Impact: Without SPN updates, SMB share client authentication may fail. SPNs will need to be manually updated for both the Source and Target clusters to re-enable Kerberos authentication. Additionally, NTLM fallback authentication should be verified with Active Directory.
SyncIQ Policy Recommendations - Does Not Block Failover
The last run of the SyncIQ Policy should have been successful:
- PowerScale OneFS SyncIQ Jobs should have been run at least once.
- The most recent PowerScale OneFS SyncIQ Job should have been successful.
- PowerScale OneFS SyncIQ Jobs should not be in a Paused or Cancelled state.
Impact: If these recommendations have not been met, Eyeglass-assisted failover may be unable to run the SyncIQ Policy, depending on the policy's current status. If the policy does not run, data loss may occur during failover.
Example 1: If the SyncIQ Policy is in an error state and cannot be run from PowerScale OneFS, it will also not be able to run from Eyeglass.
Example 2: If the SyncIQ Policy is paused, Eyeglass failover cannot resume the paused policy. It must be resumed from PowerScale OneFS.
You must investigate these errors and understand their impact on your failover solution.
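The three SyncIQ job recommendations above can be expressed as a simple pre-failover check. This is a sketch under stated assumptions: the `PolicyStatus` record and the state strings ("finished", "paused", "cancelled", "failed") are hypothetical placeholders, not values guaranteed to match what OneFS or Eyeglass reports.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PolicyStatus:
    name: str
    last_job_state: Optional[str]  # None means the policy has never been run

# Hypothetical state names used only for this illustration.
SUCCESS_STATE = "finished"
HELD_STATES = {"paused", "cancelled"}

def readiness_warnings(policy):
    """Return a list of warning strings for one policy; empty means no warning."""
    if policy.last_job_state is None:
        return [f"{policy.name}: has never been run"]
    if policy.last_job_state in HELD_STATES:
        return [f"{policy.name}: last job is {policy.last_job_state}"]
    if policy.last_job_state != SUCCESS_STATE:
        return [f"{policy.name}: last job was not successful ({policy.last_job_state})"]
    return []

statuses = [
    PolicyStatus("pol-a", "finished"),
    PolicyStatus("pol-b", "paused"),
    PolicyStatus("pol-c", None),
]
for status in statuses:
    for warning in readiness_warnings(status):
        print(warning)
```

Consistent with the section above, these are warnings to investigate and do not block the failover.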
SyncIQ Policy Exclusions and Inclusions
PowerScale OneFS does not support SyncIQ Policies with excludes or includes for failover.
Impact: This configuration is not supported for failback.
Restricting Source Nodes in SyncIQ Policies
PowerScale OneFS best practices recommend that SyncIQ Policies utilize the Restrict Source Nodes option to control which nodes replicate between clusters.
Impact: Without controlling the subnet pool used for data replication, all nodes in the cluster can replicate data from all IP pools. This complicates bandwidth management and requires that all nodes have access to the WAN.
Handling Disabled SyncIQ Policies
Eyeglass failover will skip failover for any SyncIQ policies in the Access Zone that are in a disabled state.
Impact - Data Access Outage: If SyncIQ Policies are disabled, the associated filesystem will remain writable on the source and will not fail over. As a result, data on the source cluster will likely be inaccessible to clients because the necessary networking and SmartConnect Zone for data access will have been failed over, and the source SmartConnect zone will be renamed to prevent client access.
See Also
After reviewing all of the requirements and recommendations for an Access Zone assisted failover, review the Access Zone Failover Configuration Procedures to configure your environment properly.