Assessing Readiness - DR Dashboard
Introduction
The DR Dashboard is a vital tool for monitoring and validating the readiness of your disaster recovery environment. It provides a real-time overview of key components such as SyncIQ policies, configuration replication jobs, and network settings. By consolidating critical information in one place, the dashboard allows administrators to quickly assess overall readiness, identify potential issues, and take corrective actions to ensure that all elements are properly configured for failover or failback.
What is the DR Dashboard
The DR Dashboard is a centralized interface designed to monitor and assess the readiness of disaster recovery (DR) systems. It provides real-time validation checks on critical components, such as SyncIQ policies and configuration replication jobs, ensuring that they are properly configured for failover or failback operations. The dashboard categorizes the status of these components into "OK," "Warning," or "Error," enabling administrators to quickly identify and resolve potential issues in their DR setup.
How to Assess Readiness
The DR Dashboard serves as the main status screen for checking overall cluster readiness. It consolidates key information across multiple areas (DFS, Policy, and Access Zone readiness) and sends critical alarms when issues are detected. Below is a guide on how to assess readiness for each mode, as well as a set of general steps that apply to all modes of operation.
Overall Steps for Assessing Readiness
The following high-level steps apply to assessing DR readiness regardless of whether you're dealing with DFS-enabled SyncIQ policies, SyncIQ policies, IP Pools, or Access Zones:
- Log in to the Eyeglass appliance and open the DR Dashboard. The DR Dashboard provides a centralized view of readiness status across various components.
- Check the Readiness Tabs. Depending on your needs, review the specific readiness tabs for DFS, SyncIQ Policies, IP Pools, or Access Zones to evaluate failover preparedness.
- Verify Overall DR Status. The status indicators (green, red, or orange) give you an immediate view of whether the component is ready for failover. If an error is detected, Eyeglass will trigger a system alarm, allowing you to take corrective action before a failover is attempted.
- Review Detailed Status. Expand individual policies or zones to get detailed information about any issues or actions required to resolve them.
- Fix Critical Issues. Investigate any error or warning states and resolve them to ensure that the system is fully prepared for DR events.
Assessing DFS Readiness
Go to the DR Dashboard and select the DFS Readiness tab. The status is "OK" when:
- Your SyncIQ Policy is enabled.
- The Last Started and Last Success timestamps of your SyncIQ Policy are identical.
- The Eyeglass configuration replication job is enabled.
- The Last Run and Last Success timestamps of the Eyeglass configuration replication job are the same.
- The audit status of the Eyeglass configuration replication job is OK.
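The five conditions above can be expressed as a single boolean check. The following is a minimal Python sketch for illustration only: the `DfsPolicyState` class and its field names are hypothetical and are not part of any Eyeglass API. It assumes the values have been read manually from the DFS Readiness tab.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DfsPolicyState:
    """Hypothetical snapshot of the values shown for one policy on the DFS Readiness tab."""
    synciq_enabled: bool
    synciq_last_started: datetime
    synciq_last_success: datetime
    config_rep_enabled: bool
    config_rep_last_run: datetime
    config_rep_last_success: datetime
    config_rep_audit_ok: bool


def dfs_status_is_ok(state: DfsPolicyState) -> bool:
    """Return True only when all five listed conditions hold."""
    return (
        state.synciq_enabled
        # Identical timestamps mean the most recent SyncIQ run succeeded.
        and state.synciq_last_started == state.synciq_last_success
        and state.config_rep_enabled
        # Same idea for the Eyeglass configuration replication job.
        and state.config_rep_last_run == state.config_rep_last_success
        and state.config_rep_audit_ok
    )
```

If any one condition fails, for example a Last Started timestamp that is newer than Last Success, the function returns False, which corresponds to a status other than "OK".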
The DR Status is displayed per policy. It also indicates which pair of clusters are used in the DFS configuration and the associated SyncIQ policy.
Expand a policy to see the details for the SyncIQ Policy and the Eyeglass Configuration Replication status:
- Last Run time of the SyncIQ policy and the status of the last run.
- Last Run time of the Eyeglass Configuration Replication job and the status of the last run and audit.
If new shares are created on the DFS mode policy, the next run of the Eyeglass Configuration Replication job in DFS mode will be aware of the new shares and ready to fail them over. It is important to check this job's status after creating more DFS shares under a policy.
Assessing Policy Readiness
Go to the DR Dashboard and select the Policy Readiness tab to view a summary of the SyncIQ Policy statuses. The DR Status for each policy combines the SyncIQ PowerScale OneFS data replication status and the Eyeglass Configuration Replication job status, providing an overview of the readiness for failover.
This combined status is updated every time the Eyeglass Configuration Replication task runs and is the most reliable indicator of DR readiness for failover. Administrators can use this information to review the status of each component, identify errors, and resolve them to ensure that all SyncIQ policies are properly configured for failover.
- A green status means that all SyncIQ Policy Failover Recommendations have passed validation, indicating the policy is safe to fail over.
- A red status indicates that one or more SyncIQ Policy Failover Recommendations have failed, and the policy is not ready to fail over. In this case, Eyeglass will trigger a system alarm to notify you of the issue.
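The green/red rule can be sketched as a small function that also reports which recommendations failed, since those are what need to be resolved. This is an illustrative Python sketch, not Eyeglass code: the function name and the recommendation names passed to it are hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def policy_dr_status(recommendations: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Map recommendation results (name -> passed?) to a status and a list of failures.

    Assumption: any single failed recommendation makes the policy red.
    """
    failed = [name for name, passed in recommendations.items() if not passed]
    status = "red" if failed else "green"
    return status, failed
```

For example, calling it with one passing and one failing entry returns `"red"` together with the name of the failing entry, which is the item to investigate before attempting a failover.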
If you make any changes to your environment, the following Eyeglass job must run before the Policy Readiness status is updated:
- Configuration Replication
Assessing Access Zone Readiness
Go to the DR Dashboard and select the Zone Readiness tab to view a summary of key networking, SmartConnect, and Kerberos Service Principal Name (SPN) configurations. The statuses of these components are combined to provide an overall DR Status.
- A green status means all requirements are met, and the Access Zone is ready for failover.
- An orange status indicates that all requirements are met, but some recommendations are not met. In this state, the Access Zone can be failed over, but there may be some additional manual steps. The DR Assistant will allow you to start failover in this state.
- A red status indicates unmet requirements, and the zone is not ready to fail over. A system alarm is triggered, and the DR Assistant will block you from starting the failover.
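The three states differ in whether requirements or only recommendations are unmet. The following Python sketch models that distinction and the resulting DR Assistant behavior. It is illustrative only: the names `ZoneStatus`, `zone_status`, and `failover_allowed` are hypothetical, and the inputs are assumed to be simple pass/fail results.

```python
from enum import Enum
from typing import Iterable


class ZoneStatus(Enum):
    GREEN = "green"
    ORANGE = "orange"
    RED = "red"


def zone_status(requirements_met: Iterable[bool],
                recommendations_met: Iterable[bool]) -> ZoneStatus:
    """Combine requirement and recommendation results into one status."""
    if not all(requirements_met):
        return ZoneStatus.RED      # an unmet requirement: not ready
    if not all(recommendations_met):
        return ZoneStatus.ORANGE   # ready, but manual steps may be needed
    return ZoneStatus.GREEN


def failover_allowed(status: ZoneStatus) -> bool:
    """Failover can be started in green or orange, and is blocked in red."""
    return status is not ZoneStatus.RED
```

Requirements are checked first, so a zone with both an unmet requirement and an unmet recommendation is reported as red rather than orange.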
If you make a change to your environment, the following Eyeglass jobs must run before the Zone Readiness status is updated:
- Configuration Replication
- Failover Readiness
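One practical consequence is that a Zone Readiness status shown shortly after a change may be stale. The sketch below checks whether both jobs have run since a given change. It is a hypothetical helper, not an Eyeglass function, and it assumes you have noted the time of your change and the last run time of each job yourself.

```python
from datetime import datetime
from typing import Mapping

# Job names as listed above.
REQUIRED_JOBS = ("Configuration Replication", "Failover Readiness")


def zone_readiness_is_current(change_time: datetime,
                              last_job_runs: Mapping[str, datetime]) -> bool:
    """Return True only if every required job has run after the change."""
    return all(
        job in last_job_runs and last_job_runs[job] > change_time
        for job in REQUIRED_JOBS
    )
```

If the function returns False, treat the displayed Zone Readiness status as reflecting the environment before your change.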
Readiness is not assessed for an Access Zone that has already failed over. The DR Dashboard only displays the readiness status for failover from the current active (primary) cluster to the DR target cluster. The readiness status for "failback" (returning operations to the original primary cluster) is not checked until after the failover to the target cluster has been completed.
Assessing IP Pool Readiness
About Validations
Failover Validations are a core component of ensuring your system is ready for disaster recovery. These checks evaluate the status of key components like SyncIQ policies, Access Zones, and network configurations. The results are consolidated in the DR Dashboard to provide a clear picture of whether your environment is safe to failover.
Failover Validations are performed regularly and include checks for data replication, network readiness, configuration consistency, and more. Each validation either confirms that the system is ready or flags critical issues that need to be addressed before failover.
For more details on specific validations, refer to the Failover Validation Reference.
See Also
After assessing the readiness of your failover solution, see Executing a Failover with DR Assistant for instructions on how to run a failover.