Version: 2.9.0

Advanced DR Robot Configuration

This option exercises all Eyeglass failover automation and follows the steps of an Access Zone failover and failback operation more closely than the basic configuration. It takes more time to set up but offers the highest level of confidence that everything required for failover is in place for production Access Zones.

Prerequisites

  • A dedicated IP pool added as a member of the Robot Access Zone.
  • A SmartConnect Zone name assigned to the IP pool on both the source and target clusters.
  • A SyncIQ policy in the Access Zone whose name carries the Runbook Robot prefix. Only one Runbook Robot prefixed policy can exist in this Access Zone. If any other SyncIQ policies are detected, the Robot disables itself and stops functioning. This ensures that no production data is failed over in the Access Zone.
  • A share or export created in the path of the Robot SyncIQ policy. These are failed over as well by the normal configuration sync jobs, which makes this a good way to test the whole configuration failover. The shares and exports can have any name.

Note: Eyeglass will modify the description, Root Client, and Map Root User settings as required for Robot operation on an existing export with the same path as the Robot SyncIQ Policy root path.

  • If quotas have been created within the path of the SyncIQ policy used for the Robot, they are failed over during Robot execution and automatically deleted on the source.
  • The Zone Readiness task must be enabled (it is disabled by default). Enable it from the Jobs window or from the web shell CLI with: igls admin schedules set --id Readiness --enabled true
  • The Zone Readiness task must have run, and the Zone Readiness status for the Robot zone must be OK. After that, Configuration Replication for the Robot job must have run successfully.
  • The Eyeglass hostname in the Network Card Setup and in the Network Settings (set using yast during initial Eyeglass appliance configuration) must be identical.
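
The single-policy rule in the prerequisites can be checked before the Robot ever runs. The Python sketch below is illustrative only and is not part of Eyeglass. It assumes the SyncIQ policy names for the Robot Access Zone have already been collected by some other means, and the function name robot_zone_policy_check is hypothetical. The prefix EyeglassRunbookRobot is the one used in the configuration steps of this guide.

```python
ROBOT_PREFIX = "EyeglassRunbookRobot"  # policy name prefix used in this guide


def robot_zone_policy_check(policy_names):
    """Return (ok, reason) for the SyncIQ policies found in one Access Zone.

    Hypothetical helper: mirrors the documented rule that the zone must hold
    exactly one Robot-prefixed policy and no other SyncIQ policies.
    """
    robot = [n for n in policy_names if n.startswith(ROBOT_PREFIX)]
    other = [n for n in policy_names if not n.startswith(ROBOT_PREFIX)]
    if other:
        # Any non-Robot policy would cause the Robot to disable itself.
        return False, "non-Robot policies present: " + ", ".join(sorted(other))
    if len(robot) != 1:
        return False, f"expected exactly 1 Robot policy, found {len(robot)}"
    return True, "ok"


if __name__ == "__main__":
    # Example policy names are placeholders, not taken from a real cluster.
    print(robot_zone_policy_check(["EyeglassRunbookRobot-01"]))
    print(robot_zone_policy_check(["EyeglassRunbookRobot-01", "prod-data"]))
```

A check like this is only a pre-flight convenience. The authoritative result is still the Zone Readiness status reported by Eyeglass.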

Failover Readiness


Preparation and planning instructions for Zone Readiness can be found in the Eyeglass Access Zone Failover Guide.

Configuration Steps for the Advanced DR Robot

  1. Create an Access Zone with a name beginning with EyeglassRunbookRobot on both the source and target clusters to be tested for DR. (Note: more than one pair of clusters can be tested with Runbook Robot.)

    Network Mapping

  2. Create a SyncIQ policy with the well-known name EyeglassRunbookRobot-xxx, where xxx represents a number or string. Once the policy is created, run it.

    • Use a path that is a child of the Access Zone base path. Any path under the base path will do.
    • See Basic DR Config for detailed steps.

    Configuration Replication

  3. Create a subnet pool (for example, Robotpool) and make it a member of the Access Zone created in step 1 (for example, EyeglassRunbookRobot).


Note: The IP address space used should be reachable by Eyeglass so that it can mount the NFS export. Create this pool on both the source and target clusters.

Example: subnet0:Robotpool

  • Now create a mapping alias (hint) on each pool so that the Robot can move SmartConnect Zones from one pool to the other. (See the section on hints in the Access Zone guide, which explains the syntax and how to make the aliases unique to avoid SPN collisions.)
    • Example on source cluster: isi network modify pool --name=subnet0:Robotpool --add-zone-aliases=igls-01-prod
    • Example on target cluster: isi network modify pool --name=subnet0:Robotpool --add-zone-aliases=igls-01-dr

NOTE: In this release, the Access Zone Robot only shows up in the Runbook Robot jobs if some configuration data exists in the Access Zone. We recommend creating a share with no permissions (for example, Robotshare) at any location under the path of the SyncIQ policy created in this Access Zone.
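
To make the intent of the alias hints concrete, the Python sketch below pairs source and target pools by their hints. It is not Eyeglass code. It assumes a hint has the form igls-<id>-<suffix>, where <id> is what pairs the two pools and <suffix> only keeps the two aliases unique. That matching rule is an assumption made for illustration; the authoritative syntax is in the Access Zone guide. The helper names hint_key and pair_pools are hypothetical.

```python
def hint_key(alias):
    """Return the pairing part of an igls hint, or None if alias is not a hint.

    Assumption: hints look like igls-<id>-<suffix>; only <id> is compared.
    """
    parts = alias.split("-")
    if len(parts) < 3 or parts[0] != "igls":
        return None
    return "-".join(parts[1:-1])


def pair_pools(source_pools, target_pools):
    """Map source pool name -> target pool name when their hint keys match.

    Both arguments are dicts of pool name -> list of SmartConnect aliases.
    """
    target_by_key = {}
    for pool, aliases in target_pools.items():
        for alias in aliases:
            key = hint_key(alias)
            if key is not None:
                target_by_key[key] = pool

    mapping = {}
    for pool, aliases in source_pools.items():
        for alias in aliases:
            key = hint_key(alias)
            if key in target_by_key:
                mapping[pool] = target_by_key[key]
    return mapping


if __name__ == "__main__":
    # Pool and alias names follow the examples in this guide.
    source = {"subnet0:Robotpool": ["igls-01-prod"]}
    target = {"subnet0:Robotpool": ["igls-01-dr"]}
    print(pair_pools(source, target))
```

A source pool that ends up with no entry in the returned mapping is the kind of gap that would be expected to surface as a Zone Readiness mapping error.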

  1. After the inventory runs (6-hour default interval), the Runbook Robot Jobs section will display the SyncIQ policy created in this section.

    Runbook Robot Jobs

  2. To update Zone Readiness without waiting for its schedule (default interval 6 hours), run the Zone Readiness job manually. (See the igls commands in the Eyeglass CLI documentation to change the schedule of any job.)

  3. After the Zone Readiness job has completed, wait for the next Configuration Replication cycle to complete.

    Configuration Replication Cycle

  4. Open the DR Dashboard and select Zone Readiness to view the Robot Zone Readiness status.

    Zone Readiness Status

  5. In case of errors related to the Robot zone, click the link to identify which areas are not correctly set up. Correct the errors by reviewing the documentation.

    Zone Readiness Errors

  6. To view the SmartConnect to pool mapping, click the View Map link. A zone that is correctly mapped with alias hints looks like the image below.

    Robot Access Zone with mapping example:

    Robot Access Zone Mapping

  7. Confirm that every Zone Readiness Status indicator for the Robot zone is green.

    Zone Readiness Green

  8. The default Robot job runs every day at midnight. It executes the Access Zone based failover Robot policies, which move all networking, update SPNs (delete, add), and move aliases on subnet pools using the mapping hints. It also executes any Runbook Robot policies in other Access Zones, such as System.


Robot jobs are configured without Continue on Error, so the initial validation check must complete without any errors.
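
Because a single failing indicator is enough to stop a Robot run, readiness can be treated as an all-or-nothing gate. The Python sketch below shows that logic under stated assumptions: the statuses have been exported as strings, "OK" is the only passing value, and the function names are hypothetical. It does not call any Eyeglass API.

```python
def readiness_blockers(indicators):
    """Return the names of Zone Readiness indicators that are not OK.

    indicators: dict of indicator name -> status string (assumed format).
    """
    return sorted(name for name, status in indicators.items() if status != "OK")


def can_run_robot(indicators):
    """True only when at least one indicator was reported and all are OK."""
    return bool(indicators) and not readiness_blockers(indicators)


if __name__ == "__main__":
    # Indicator names are the two mentioned in this guide; statuses are made up.
    status = {
        "Eyeglass Configuration Replication": "OK",
        "Zone Configuration Replication": "ERROR",
    }
    print(can_run_robot(status))
    print(readiness_blockers(status))
```

An empty result set is treated as not ready here, on the assumption that it means the Zone Readiness job has not run yet.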

Configuration Steps and Verification for the Advanced DR Robot

  1. The default Robot jobs run every day at midnight and execute the failover Robot policies. For each policy, the job:

    • Creates an export.
    • Mounts the cluster and writes test data using the export created.
    • Fails over the policy.
    • Runs the policy.
    • Deletes SPNs on the source.
    • Renames the SmartConnect alias on the source with the igls-original prefix.
    • Creates the SmartConnect alias on the target based on the mapping set up beforehand.
    • Creates new SPNs against the new cluster machine account.
    • Mounts the data on the synced export on the target.
    • Unmounts.
    • Moves the schedule on the policy to the target.
    • Runs resync prep on the source.
    • Goes to sleep until it is time to fail back.
  2. Methods to verify it was successful:

    • Check the DR Dashboard policies tab and verify it’s green.
    • Check the cluster SyncIQ policy status on both clusters and make sure the policy moves from one cluster to the other each day.
      • Make sure the test file exists with a current date and time stamp on the active cluster (Hint: look in the policy path root file system for the test file).
    • If a quota was applied to the Robot policy path, make sure the quota moved to the target cluster AND was deleted on the source cluster (Eyeglass moves quotas on failover).
    • Review the Failover Log (DR Assistant / Failover History / Open log file). Failover log may also be downloaded from the Failover Log Viewer using the Download File link.
    • Check the DR Dashboard Zone Readiness screen to make sure the Robot ran successfully and shows Failover or Green, depending on which cluster the policies are active on.

Initially after failover, the Zone Readiness will show an error for Eyeglass Configuration Replication and Zone Configuration Replication until the next Configuration Replication task and Zone Readiness task have completed.
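
The test file check in the verification list can be scripted from any host that mounts the Robot export. The Python sketch below is one way to do it and makes several assumptions: the export is already mounted locally, the test file path is supplied by the operator (this guide does not name the file, so the path in the example is a placeholder), and "current" means modified within the last 24 hours because the Robot runs daily. The function name is hypothetical.

```python
import os
import time


def robot_test_file_is_current(path, max_age_hours=24.0, now=None):
    """True if path is a file modified within the last max_age_hours.

    The 24-hour default is an assumption based on the daily Robot schedule.
    """
    if not os.path.isfile(path):
        return False
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(path)
    return age_seconds <= max_age_hours * 3600


if __name__ == "__main__":
    # Placeholder path: replace with the real mount point and test file name.
    print(robot_test_file_is_current("/mnt/robot/path/to/testfile"))
```

If the Robot schedule is changed from the default, max_age_hours would need to be adjusted to match.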

  1. Check the SmartConnect Zone alias on each cluster and look for the igls-original prefix on the SmartConnect Zone name that was failed over.

    SmartConnect Zone Alias

  2. The image above shows that this cluster's SmartConnect pool zone name was renamed on failover. It also indicates that the dr.ad1.test zone has been moved to the production cluster based on the alias hints mapping.

  3. Another way to check status is to see which cluster has the Robot zone's SyncIQ policy enabled.
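
Both post-failover checks above reduce to simple tests once the data has been collected from each cluster. The Python sketch below assumes the SmartConnect names and the policy enabled flags have already been gathered by the operator. The function names, cluster names, and the example zone names are hypothetical; only the igls-original prefix comes from this guide.

```python
ORIGINAL_PREFIX = "igls-original"  # prefix applied to the failed-over name


def failed_over_zones(smartconnect_names):
    """Return the SmartConnect names that carry the igls-original prefix."""
    return [n for n in smartconnect_names if n.startswith(ORIGINAL_PREFIX)]


def active_cluster(policy_enabled):
    """Return the one cluster whose Robot SyncIQ policy is enabled.

    policy_enabled: dict of cluster name -> bool.
    Raises ValueError when zero or several clusters report it enabled,
    since either case suggests the failover did not complete cleanly.
    """
    enabled = [cluster for cluster, on in policy_enabled.items() if on]
    if len(enabled) != 1:
        raise ValueError(f"expected one active cluster, found {len(enabled)}")
    return enabled[0]


if __name__ == "__main__":
    # Example values are invented for illustration.
    print(failed_over_zones(["igls-original-prod.example.test", "igls-01-prod"]))
    print(active_cluster({"prod-cluster": False, "dr-cluster": True}))
```

Run daily, the value returned by active_cluster would be expected to alternate between the two clusters as the Robot fails over and fails back.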

See Also

  • Advanced Settings: Detailed instructions on advanced disaster recovery configurations, such as manually creating NFS exports for Runbook Robot, setting up multi-site replication topologies, and pool mapping hints for a robust failover strategy.

  • Basic DR Config: Quick and easy setup for disaster recovery testing using non-production data, including SyncIQ policy failover, export creation, quota failover, and failback operations for DR validation.

  • How to Use Runbook Robot: Automate testing of failover infrastructure by using the Runbook Robot. It tests SyncIQ policies and access zones, records pass/fail results, and supports SMB protocol for automated failover.