Version: 2.9.0

Failover Planning with Superna

Introduction

Checklist to Plan for Failover

Use this checklist to ensure that your failover process is organized, documented, and ready when needed.

Steps Before Failover Day

  1. Document A DR Runbook Plan

    • Include a clear sequence of steps
    • List key contacts
    • Define the order of tasks and which contacts to involve
  2. Submit Support Logs For Failover Readiness Audit (7 days before the planned event)

    • When submitting the case, set Failover Case Type to Planned
  3. Complete Failover Training Labs

  4. Review DR Design Best Practices And Failover Release Notes

  5. Upgrade Eyeglass To The Latest Version

    • Each Eyeglass release includes failover rules engine updates to improve reliability and avoid known failover issues.
    tip

    Review the Eyeglass PowerScale Edition Upgrade Guide for instructions.

  6. Test DR Procedures

    • Set up the runbook robot feature for continuous DR testing
    • Test failover with Superna Eyeglass
    • Repeat testing multiple times
    • Perform failback
    • Review logs to understand steps Eyeglass executes
    • Consult documentation on the failover mode you plan to implement
    • Execute the test plan before failover day to validate procedures
  7. Benchmark Failover (Access Zone)

    • Prepare a test policy or runbook robot access zone for multi-policy testing
    • Perform test failovers with one, then two, then three policies
    • Record make writable times and calculate the average to estimate production failover times
    note
    • Fully configure the test access zone (SPNs, shares, exports, quotas) for accurate estimates.
    • If change rate is zero, skip creating changed data before failover.
    • Create as many shares under each policy as in production to benchmark rename step times.
    warning

    Failover logs include steps that run after failover, so the failover job time does not represent the actual failover time. Calculate the duration of the make writable step from the logs instead.

  8. Benchmark Failover (DFS Mode)

    • Use an Access Zone with a DFS mode policy or create a test DFS mode policy
    • Copy test data into the path
    • Create shares as in production
    • Test failover with one, then two, then three policies
    • Calculate the average make writable time to estimate production failover times
    note

    Creating as many shares as in production helps benchmark the rename step, which runs in parallel.

    warning

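    Failover logs include steps that run after failover, so the failover job time does not represent the actual failover time. Calculate the duration of the make writable step from the logs instead.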

  9. Create A Contact List For Failover Day

    • Active Directory administrator
    • DNS administrator
    • Cluster storage administrator
    • Workstation and server administrators
    • Application team for dependent applications
    • Change management team for the outage window
  10. Reduce Failover And Failback Time With Domain Mark

    • Run manual domain mark jobs on all SyncIQ policy paths
    • This speeds up failover by reducing domain mark time
    • Domain Mark Details
  11. Count Shares, Exports, NFS Alias, Quotas On Source And Target

    • Validate that configuration items are synced correctly
    • Ensure no quotas are synced on the target (only shares, exports, NFS alias)
    • Use OneFS UI and Superna Eyeglass DR Dashboard to verify
  12. Verify Dual Delegation In DNS Before Failover

    • Confirm DNS is pre-configured for failover across all SmartConnect Zones in the Access Zone
  13. Prepare For DFS Failover

    • Use dfsutil diag viewdfspath and dfsutil cache referral to verify that both referral paths are present and that the correct path is active
    • Confirm DFS mounts have both referrals
    • Download the dfsutil tool for your OS type from the Microsoft Docs Archive
  14. Communicate With Application Teams And Business Units

    • Schedule a maintenance window
    • Inform them that data loss will occur if data is written after the maintenance window start time
  15. Adjust SyncIQ Policy Schedules One Day Before Failover

    • Set all policies to run every 15 minutes or less
    • Avoid enabling run on change
    • This keeps data in sync and avoids long-running jobs that extend failover
    warning

    Reviewing DR design best practices and failover release notes, running the latest Eyeglass version, testing DR procedures, and benchmarking failover (both Access Zone and DFS Mode) are mandatory. DR Assistant requires you to accept these requirements before you continue.
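The benchmarking steps above (Access Zone and DFS mode) come down to averaging make writable times across test runs. The following is a minimal sketch of that arithmetic, not an Eyeglass tool: the sample values are hypothetical, the times are assumed to be read by hand from the failover logs, and the linear scaling to production is an assumption that your own benchmarks should confirm or correct.

```python
from statistics import mean

# Hypothetical measurements: policies in the test run -> make writable time
# in seconds, read manually from the failover log of each run.
make_writable_seconds = {1: 40.0, 2: 70.0, 3: 115.0}


def average_per_policy(samples):
    """Average make writable time per policy across all test runs."""
    return mean(seconds / policies for policies, seconds in samples.items())


def estimate_production_seconds(samples, production_policies):
    """Rough estimate; ASSUMES time grows in proportion to policy count."""
    return average_per_policy(samples) * production_policies


if __name__ == "__main__":
    # 25 is a placeholder for the number of production policies.
    estimate = estimate_production_seconds(make_writable_seconds, 25)
    print(f"Estimated make writable time: {estimate:.0f} s")
```

Treat the result as a planning figure for the maintenance window only. Change rate, share counts, and cluster load on failover day can all move the real number.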

Steps On The Failover Day

  1. Pause SMB And NFS IO Before Failover Starts

    • For SMB (2.0 or later), use the failover option to block IO on shares.
    • This option applies a deny read permission before failover and removes it after failover completes.
    • For NFS, disable the protocol or unmount exports to ensure no IO during failover.
  2. Force Run All SyncIQ Policies One Hour Before Planned Failover

    • Run each SyncIQ policy so that the failover run has less data to sync.
  3. Execute Failover

  4. Monitor Failover

  5. If Required, Refer To Data Recovery Guide

  6. Ensure Active Directory Administrator Is Available

    • ADSIedit recovery steps may be required.
    • Make sure the Active Directory administrator can access cluster machine accounts.
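The time-based items in both checklists (support logs 7 days before, SyncIQ schedule changes one day before, a forced policy run one hour before) can be derived from the planned failover start. The sketch below shows one way to lay them out. The function name and the example date are hypothetical, and only the offsets stated in this checklist are used.

```python
from datetime import datetime, timedelta


def failover_milestones(failover_start):
    """Return (deadline, task) pairs derived from the planned failover start."""
    return [
        (failover_start - timedelta(days=7),
         "Submit support logs for the failover readiness audit"),
        (failover_start - timedelta(days=1),
         "Set SyncIQ policy schedules to every 15 minutes or less"),
        (failover_start - timedelta(hours=1),
         "Force run all SyncIQ policies"),
        (failover_start, "Execute failover"),
    ]


if __name__ == "__main__":
    planned = datetime(2030, 1, 12, 22, 0)  # hypothetical maintenance window start
    for when, task in failover_milestones(planned):
        print(f"{when:%Y-%m-%d %H:%M}  {task}")
```

The resulting list can be pasted into the DR runbook so that each contact knows the deadline for their task.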

After Failover: Test Data Access

Failover Readiness

Quota Failover Options

  • Skip Quotas During Short Failovers
    During failover, use the skip quota option by unchecking the quota checkbox. This skips the quota step and leaves quotas on the source cluster. Use this option for short failovers, such as over a weekend, to avoid interference from quota scans and SyncIQ. Quotas are not required for failover testing and are safer to leave on the source cluster.

    note

    On failback, make sure to uncheck the quota failover option.

  • Enable Pre-Synchronization of Quotas
    In version 2.5.3 or later, you can follow the CLI guide to enable the quota inventory job, which collects quotas in a separate job on a default schedule of twice per day. You can also pre-synchronize quotas with a quota sync schedule job. With pre-synchronization, quotas are not failed over during the failover itself because they are already pre-staged on the target DR cluster.

    note

    Use the skip quota option in DR Assistant if pre-syncing quotas.

Access Zone Failover

IP Pool Failover

Eyeglass supports IP pools as a failover unit within an Access Zone. Selecting the IP pool as the failover unit simplifies DR readiness calculations and failover operations. With this mode, shares, exports, and quotas are included in failover.

IP pool failover manages networking failover for SmartConnect Zones and any existing SmartConnect Zone aliases. Eyeglass fails over all policies mapped to the pool using the IP pool policy mapping UI in the DR Dashboard. All SmartConnect names and aliases configured on the pool, along with mapped SyncIQ policies, shares, exports, and quotas, fail over together. During failover, Eyeglass renames (rather than deletes) source cluster zone names to avoid SPN collisions in Active Directory and to prevent clients from mounting the source cluster.

info
  • Planning and mapping IP pools from source to target clusters are required before marking the pools as ready for failover.

  • An Access Zone must be converted to IP pool failover as a whole: every pool in the Access Zone must have a policy mapped to it before any pool in the zone can fail over.
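The mapping rule above can be expressed as a simple check. This is an illustrative sketch only: Eyeglass evaluates readiness itself in the DR Dashboard, and the function name, pool names, and policy names here are hypothetical, with no Eyeglass API involved.

```python
def zone_ready_for_pool_failover(pool_policies):
    """Check the all-pools-mapped rule for one Access Zone.

    pool_policies maps each IP pool name in the zone to the list of
    SyncIQ policies mapped to it (hypothetical, hand-entered data).
    Returns (ready, unmapped_pools).
    """
    unmapped = sorted(pool for pool, policies in pool_policies.items() if not policies)
    return (not unmapped, unmapped)


if __name__ == "__main__":
    zone = {"pool-smb": ["policy-home"], "pool-nfs": []}
    ready, unmapped = zone_ready_for_pool_failover(zone)
    print("ready" if ready else f"not ready, unmapped pools: {unmapped}")
```

A single unmapped pool blocks failover for every pool in the zone, which is why the check reports the whole zone rather than individual pools.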

SMB authentication relies on the AD machine account having correct SPN values for SmartConnect Zones. SPN registration must occur with a writable cluster for authentication and failover. Eyeglass automates SPN management and creates SmartConnect Zone aliases, enabling access with a DNS update that delegates the SmartConnect Zone to the PowerScale cluster.

note

DFS mode does not require DNS, SPN, or SmartConnect Zone changes during failover. You can fail over DFS IP pools using the Pool Failover feature.

Eyeglass DR Assistant - IP Pool Failover Summary

  1. Verify that no live access exists to data, or enable the Data Integrity failover option to disable access to SMB shares before failover.
  2. Start the failover process (automated by Eyeglass).
  3. Validate the failover (automated by Eyeglass).
  4. Set configuration replication policies to USERDISABLED (automated by Eyeglass).
  5. Enable write access to data on the target (automated by Eyeglass).
  6. Move the SmartConnect Zone to the target (automated by Eyeglass).
  7. Update the SPN to enable authentication on the target (automated by Eyeglass).
  8. Repoint DNS to the target cluster's IP address using a post-failover script (automated by Eyeglass with scripting).
  9. Refresh the session to apply the DNS changes using a post-failover script (automated by Eyeglass with scripting).
tip

For more information on this failover mode, refer to Failover Configuration and review the IP Pool Failover section.
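After steps 8 and 9 in the summary above, a quick client-side check can confirm that a SmartConnect Zone name now resolves to the target cluster. The sketch below uses only the Python standard library. The zone name and IP addresses are placeholders, and the check is an assumption about what a useful validation looks like, not part of Eyeglass.

```python
import socket


def resolves_to_target(zone_name, target_ips, resolve=socket.gethostbyname_ex):
    """Return True if every address returned for zone_name is a target IP.

    resolve defaults to the system resolver and can be replaced for testing.
    """
    _, _, addresses = resolve(zone_name)
    return bool(addresses) and set(addresses) <= set(target_ips)


if __name__ == "__main__":
    # Placeholder zone name and target pool addresses; replace with your own.
    zone = "data.example.com"
    targets = {"192.0.2.10", "192.0.2.11"}
    try:
        print(zone, "->", "target" if resolves_to_target(zone, targets) else "NOT target")
    except socket.gaierror as err:
        print(zone, "did not resolve:", err)
```

Client and resolver caches can keep returning the old address for a while, so a negative result shortly after failover may only mean that a cached record has not yet expired.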

SyncIQ DFS Mode with Eyeglass

SyncIQ Mode with Eyeglass

DR Rehearsal Mode

FAQs