Storage Failover Steps
Introduction
The storage layer failover process is a critical part of disaster recovery (DR). A complete DR plan should also address application shutdown and restart procedures to ensure seamless failover. The storage layer provides the foundation for all higher-level failover processes, and Eyeglass streamlines this step while enabling error detection during failover.
For complex or application-layer failover scenarios, Superna Professional Services offers support through Proof of Concept (POC) engagements, recommendations, and assessments. Examples include:
- VMware SRM with externally mounted storage for virtual machines.
- Oracle RAC Data Guard with file system dependencies for applications.
Once you have identified the Failover Mode suitable for your environment, use the next section for the high-level steps of each mode.
Failover Modes and High-Level Steps
Ordered Steps for Non-DFS and DFS Mode
This step-by-step process details failover actions for SyncIQ Policy or Microsoft DFS Mode. Each item specifies its purpose, describes the action performed, and explains how the action is initiated based on the selected failover mode.
-
Ensure There Is No Live Access to Data (Source)
Verify that no active operations are accessing the data to prevent inconsistencies during failover.
Perform a manual check for open files. If open files are found, determine whether to wait for their closure or proceed with the failover.
- How Action Is Initiated:
- SyncIQ Mode: Manual
- DFS Mode: Manual
Note: The DR Assistant Data Integrity failover option (version 2.0 or later) blocks IO to SMB shares before failover starts.
Tip: Disable SMB and NFS protocols on the source cluster. This is a cluster-wide operation designed to eliminate data loss.
-
Enable Data Integrity Failover (SMB Only)
Applies the "Deny Everyone" setting to SMB shares before the failover process begins.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Cache Schedule for SyncIQ Policies Being Failed Over
Retrieves the schedule for SyncIQ policies being failed over on OneFS and sets policies to manual to ensure they do not run again during failover.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Begin Failover With DR Assistant (Eyeglass)
Initiates the failover process from Eyeglass.
- How Action Is Initiated:
- SyncIQ Mode: Manual or Eyeglass REST API
- DFS Mode: Manual or Eyeglass REST API
-
Validation of Failover Job (Eyeglass)
Verifies all warnings before submitting the failover job to ensure successful execution.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Validation - Block on Warning Enabled
Prevents the failover process from continuing if warnings, such as quota scan requirements, are detected.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Set Eyeglass Config Jobs to User-Disabled
Sets Eyeglass configuration jobs to a user-disabled state to prevent failed steps from running. Jobs can only run if the user re-enables them post-failover.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Synchronize Data (Run SyncIQ Policies)
Runs all OneFS SyncIQ policy jobs related to the Access Zone being failed over.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Synchronize Configuration
Executes Eyeglass configuration replication, including shares, export aliases, snapshot schedules, and dedupe paths.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass (configuration exists on source and target)
- DFS Mode: Automated by Eyeglass
-
Renaming Shares for DFS Mode to Redirect DFS Clients
For DFS failover, shares are renamed on the source and target clusters to redirect clients with dual DFS target paths to the target cluster.
- How Action Is Initiated:
- SyncIQ Mode: Not Applicable
- DFS Mode: Automated by Eyeglass
Info: DFS Mode: Special handling renames shares on the source and target so that only one DFS target UNC is reachable and active, which causes DFS clients to switch over.
-
Provide Write Access to Data on Target
Allows writes on the target for the SyncIQ policies related to the failover.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Disable SyncIQ on Source and Activate on Target
Prepare for failover by creating a mirror policy on the target. Disable the source cluster policy and enable the target cluster policy on OneFS.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Reset SyncIQ Schedule on Target Mirror Policy
Apply the schedule cached earlier, in the Cache Schedule for SyncIQ Policies Being Failed Over step, to the target mirror policy.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Failover Quotas (Optional)
Eyeglass DR Assistant automatically fails over quotas by running quota jobs related to the SyncIQ policies being failed over.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
Info: Quotas are deleted on the source cluster and created on the target cluster.
- DFS Mode: Automated by Eyeglass
Info: Quotas are deleted on the source cluster and created on the target cluster so that post-failover quotas are applied.
-
Remove Quotas on Directories That Are Targets of SyncIQ
Eyeglass deletes all quotas on the source cluster for the specified policies.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Run Mirror Policy
Executes the mirror policy to resync data in the reverse direction.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Set Eyeglass Config Jobs to Enabled
Enables configuration synchronization only if the Resync Prep step completes successfully for the policy.
- How Action Is Initiated:
- SyncIQ Mode: Automated by Eyeglass
- DFS Mode: Automated by Eyeglass
-
Change SmartConnect Zone on Source
Rename SmartConnect zones and aliases on the source cluster to prevent names from being resolved by clients.
- How Action Is Initiated:
- SyncIQ Mode: Manual
- DFS Mode: Not Required (source and destination clusters can use existing SmartConnect zones)
-
Avoid SPN Collision
Sync SPNs in all Active Directory (AD) providers to match the current SmartConnect zone names and aliases on the source cluster.
- How Action Is Initiated:
- SyncIQ Mode: Manual (deletes SmartConnect SPN from the source cluster machine account)
- DFS Mode: Not Applicable (DFS SPNs are not changed during failover)
-
Move SmartConnect Zone to Target
Add source SmartConnect zones as aliases on the target cluster.
- How Action Is Initiated:
- SyncIQ Mode: Manual
- DFS Mode: Not Required (source and destination clusters can use existing SmartConnect zones)
-
Create SPNs for Kerberos Authentication on Target for SMB Shares
Sync SPNs in all Active Directory (AD) providers to match current SmartConnect zone names and aliases on the target cluster.
- How Action Is Initiated:
- SyncIQ Mode: Manual (adds new SmartConnect alias SPNs to the target cluster machine account)
- DFS Mode: Not Applicable (DFS SPNs are not changed or registered to cluster machine accounts)
-
Repoint DNS to the Target Cluster IP Address
Update DNS delegations for all SmartConnect zones that are members of the access zone.
- How Action Is Initiated:
- SyncIQ Mode: Manual
- DFS Mode: Not Applicable (no updates are needed because DFS resolution in DNS has not changed; only the target UNC with an active share has changed)
-
Refresh Session to Pick Up DNS Change
Remount the SMB/NFS shares.
- How Action Is Initiated:
- SyncIQ Mode: Manual on clients
- DFS Mode: Automatic (Windows 7 or later with DFS support)
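As a planning aid, the initiation details of the steps above can be captured as data to list what an operator must do by hand in each mode. The sketch below is illustrative only: the `STEPS` table and the `manual_steps` helper are hypothetical names, not part of Eyeglass, and the table includes only the steps that are manual in at least one mode.

```python
# Hypothetical planning table: (step, SyncIQ Mode, DFS Mode).
# Values are copied from the step list above; fully automated steps are omitted.
STEPS = [
    ("Ensure there is no live access to data (source)", "Manual", "Manual"),
    ("Begin failover with DR Assistant",
     "Manual or Eyeglass REST API", "Manual or Eyeglass REST API"),
    ("Change SmartConnect zone on source", "Manual", "Not Required"),
    ("Avoid SPN collision", "Manual", "Not Applicable"),
    ("Move SmartConnect zone to target", "Manual", "Not Required"),
    ("Create SPNs for Kerberos authentication on target", "Manual", "Not Applicable"),
    ("Repoint DNS to the target cluster IP address", "Manual", "Not Applicable"),
    ("Refresh session to pick up DNS change", "Manual on clients", "Automatic"),
]


def manual_steps(mode):
    """Return the names of steps that need operator action for 'SyncIQ' or 'DFS'."""
    column = {"synciq": 1, "dfs": 2}[mode.lower()]
    return [row[0] for row in STEPS if row[column].startswith("Manual")]


if __name__ == "__main__":
    for mode in ("SyncIQ", "DFS"):
        print(f"{mode} mode, operator actions:")
        for name in manual_steps(mode):
            print("  -", name)
```

A table like this makes the difference between the modes easy to see: in DFS mode the SmartConnect, SPN, and DNS steps drop out of the operator checklist.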
Ordered Steps for Access Zone or IP Pool Failover
This section outlines the step-by-step process for failover involving Access Zones or IP Pools. Each step describes the action performed, its purpose (if necessary), and how the action is initiated.
-
Ensure That There Is No Live Access to Data (Source)
Perform a manual check for open files. If open files are found, decide whether to wait for them to close or proceed with the failover.
Tip: Disable SMB and NFS protocols on the source cluster before failover. This is a cluster-wide operation required to prevent data loss.
- How Action Is Initiated:
- Access Zone: Manual
-
Enable Data Integrity Failover (SMB Only)
Applies the "Deny Everyone" setting to SMB shares before the failover process begins.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Cache Schedule for SyncIQ Policies
Retrieves the schedule associated with SyncIQ policies being failed over on OneFS. Sets policies to manual to ensure they do not run again during failover.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Begin Failover With DR Assistant (Eyeglass)
Initiate the failover process from Eyeglass.
- How Action Is Initiated:
- Access Zone: Manual or Eyeglass REST API
-
Validation of Failover Job (Eyeglass)
Verify all warnings before submitting the failover job.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Validation - Block on Warning Enabled
Prevents the failover process from continuing if warnings are detected, such as quota scan requirements.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Set Eyeglass Config Jobs to User-Disabled
Configures Eyeglass jobs to a user-disabled state to prevent failed steps from running unless re-enabled manually after the failover.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Synchronize Data (Run SyncIQ Policies)
Run all OneFS SyncIQ policy jobs related to the Access Zone being failed over.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
Info: All policies in the Access Zone.
-
Synchronize Configuration
Run Eyeglass configuration replication, including shares, export aliases, snapshot schedules, and dedupe paths.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
Info: Based on matching Access Zone base path.
-
Note: DFS-protected data can be integrated into an Access Zone failover to protect shares, exports, and DFS data as part of the Access Zone failover process.
If DFS is configured, the share rename (redirect) steps are executed at this point.
-
Change SmartConnect Zone on Source
Rename SmartConnect zones and aliases on the source cluster to prevent client resolution. Dual delegation eliminates the need for DNS updates.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
Info: Based on matching Access Zone base path.
-
Avoid SPN Collision
Sync SPNs in all Active Directory (AD) providers to match current SmartConnect zone names and aliases. The proxy is configured through the target cluster.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
Info: AD delegation must be completed as per install documentation.
-
Move SmartConnect Zone to Target
Add source SmartConnect zones and aliases on the target cluster.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Update SPN to Allow Authentication Against Target
Sync SPNs in all Active Directory (AD) providers to match current SmartConnect zone names and aliases. Proxy configuration is handled through the target cluster.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Repoint DNS to the Target Cluster IP Address
Perform DNS dual delegation for all SmartConnect zones that are members of the Access Zone.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Remove Quotas on Directories That Are Targets of SyncIQ
Eyeglass deletes all quotas on the source cluster for the specified policies.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Run Mirror Policy
Execute the mirror policy to resync data in the reverse direction.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Set Eyeglass Config Jobs to Enabled
Enables configuration synchronization only if the Resync Prep step completes successfully for the policy.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Disable SyncIQ on Source and Make Active on Target
Perform the SyncIQ resync prep step for the failover. This creates a mirror policy on the target, disables the source cluster policy, and enables the target cluster policy in OneFS.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Set Proper SyncIQ Schedule on Target
Set the schedule on the mirror policy (target) in OneFS for the policies related to the failover, using the schedule cached earlier in the Cache Schedule for SyncIQ Policies step.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Synchronize Quotas
Run Eyeglass quota jobs related to the SyncIQ policy or Access Zone being failed over.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Repoint DNS to the Target Cluster IP Address
Dual delegation with Eyeglass avoids DNS steps during failover for all SmartConnect zones that are failed over.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
-
Refresh Session to Pick Up DNS Change
Remount the SMB/NFS shares or remount exports if data integrity remount is not required for SMB.
- How Action Is Initiated:
- Access Zone: Automated by Eyeglass
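The schedule handling in the steps above follows a cache-and-restore pattern: the schedule is read, the source policy is set to manual, and the cached schedule is later applied to the mirror policy on the target. The sketch below models that pattern with plain dictionaries. It is an illustration under stated assumptions, not Eyeglass or OneFS code: the field names (`name`, `schedule`, `mirror_of`), the example schedule strings, and both functions are hypothetical, and an empty schedule string stands in for "manual".

```python
def cache_schedules(policies):
    """Record each policy's schedule, then set the policy to manual.

    Assumption: an empty schedule string represents a manual policy.
    """
    cached = {}
    for policy in policies:
        cached[policy["name"]] = policy["schedule"]
        policy["schedule"] = ""  # manual: the policy will not run on its own
    return cached


def apply_cached_schedules(cached, mirror_policies):
    """Apply each cached schedule to the mirror policy of the same source policy."""
    for mirror in mirror_policies:
        source_name = mirror["mirror_of"]
        if source_name in cached:
            mirror["schedule"] = cached[source_name]


if __name__ == "__main__":
    # Hypothetical policies and schedule strings, for illustration only.
    source = [{"name": "policy-a", "schedule": "every 1 hours"},
              {"name": "policy-b", "schedule": "every 15 minutes"}]
    cached = cache_schedules(source)

    # After resync prep, mirror policies exist on the target (modeled here by hand).
    mirrors = [{"name": "policy-a_mirror", "mirror_of": "policy-a", "schedule": ""},
               {"name": "policy-b_mirror", "mirror_of": "policy-b", "schedule": ""}]
    apply_cached_schedules(cached, mirrors)
    print(source)
    print(mirrors)
```

Caching before the policy is set to manual matters because the original schedule is no longer visible on the source policy once it has been cleared.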
Additional Information
-
Initiates Eyeglass Configuration Replication task for all Eyeglass jobs.
-
SyncIQ does NOT modify the ACL (Access control settings on the file system); it locks the file system.
-
A system.xml change is required to enable parallel step mode.
Note: All policy steps are attempted. On failure, the failover job will continue to attempt all steps and skip downstream steps per policy if the previous SyncIQ step failed.
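The failure handling described in the note can be sketched as follows: every policy is attempted, and once a step fails for a policy, the remaining steps for that policy are skipped while other policies continue. This is a minimal sketch, assuming a hypothetical `run_step(policy, step)` callable that returns True on success. The step names are placeholders, policies are processed sequentially, and neither parallel step mode nor the system.xml setting is modeled.

```python
def run_policy_steps(policies, steps, run_step):
    """Run ordered steps per policy; skip a policy's later steps after a failure.

    Returns {policy: {step: "ok" | "failed" | "skipped"}}.
    """
    results = {}
    for policy in policies:
        outcome = {}
        failed = False
        for step in steps:
            if failed:
                # A previous step failed for this policy: do not run downstream steps.
                outcome[step] = "skipped"
                continue
            ok = run_step(policy, step)
            outcome[step] = "ok" if ok else "failed"
            failed = not ok
        results[policy] = outcome
    return results


if __name__ == "__main__":
    # Hypothetical step runner: resync prep fails for policy-a only.
    def fake_run_step(policy, step):
        return not (policy == "policy-a" and step == "resync_prep")

    report = run_policy_steps(
        ["policy-a", "policy-b"],
        ["allow_write", "resync_prep", "run_mirror"],
        fake_run_step,
    )
    for policy, outcome in report.items():
        print(policy, outcome)
```

Tracking the outcome per policy keeps one failed policy from blocking the others and shows which downstream steps still need attention after the job ends.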