Troubleshooting AirGap
Introduction
This guide provides steps to open and close maintenance windows and to stop replication in an emergency. It also covers solutions to common problems, such as job failures caused by active ransomware events or by the job timeout.
Troubleshooting Procedures
Opening and Closing Maintenance Window - Enterprise AirGap
To manage the maintenance window for Enterprise AirGap, follow these steps:
- Preparation:
  - Ensure that the environment variable EYEGLASS_OPEN_VAULT_ENABLED=true is set on the Vault Agent VM by checking /opt/superna/eca/eca-env-common.conf.
  - Collect the Vault Agent ID (EVA_ID) from either /opt/superna/eca/eca-env-common.conf or /opt/superna/sca/data/airgap/AirGapVaultAgents.json.
- Opening the Maintenance Window:
  - Use the following command to set the request for opening the vault, where <DURATION> is in minutes:
    igls airgap vaultaccessrequest --interval=<DURATION> --vault=<EVA_ID>
- Check Vault Access Requests:
  - To view existing vault access requests:
    igls airgap vaultaccessview [--vault=<EVA_ID>]
- Closing the Maintenance Window:
  - You can cancel any active vault access request using:
    igls airgap vaultaccesscancel --vault=<EVA_ID>
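
The preparation checks for Enterprise AirGap can be scripted rather than done by eye. The following is a minimal sketch, not an official tool: it assumes that eca-env-common.conf holds plain KEY=value lines (optionally prefixed with "export") and that the Vault Agent ID is stored under the key EVA_ID. The function names conf_value and open_vault_enabled are hypothetical helpers introduced here for illustration.

```shell
# Sketch only: assumes the conf file holds KEY=value lines,
# optionally prefixed with "export". Adjust if your file differs.

# Print the value assigned to KEY in FILE (last assignment wins);
# return non-zero if the key is not present.
conf_value() {
    local key="$1" file="$2" line
    line=$(grep -E "^[[:space:]]*(export[[:space:]]+)?${key}=" "$file" | tail -n 1)
    [ -n "$line" ] || return 1
    line="${line#*=}"      # keep everything after the first "="
    line="${line%\"}"      # strip an optional trailing double quote
    line="${line#\"}"      # strip an optional leading double quote
    printf '%s\n' "$line"
}

# Succeed only if EYEGLASS_OPEN_VAULT_ENABLED is set to true in FILE.
open_vault_enabled() {
    [ "$(conf_value EYEGLASS_OPEN_VAULT_ENABLED "$1")" = "true" ]
}

# Example usage on the Vault Agent VM:
#   open_vault_enabled /opt/superna/eca/eca-env-common.conf && echo "open vault enabled"
#   conf_value EVA_ID /opt/superna/eca/eca-env-common.conf
```

If the key is missing or the file layout differs from the assumption above, both helpers return a non-zero status, in which case inspect the file manually as described in the Preparation step.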
Opening and Closing Maintenance Window - Basic AirGap
For Basic AirGap, follow these steps to open and close the maintenance window:
- Opening the Maintenance Window:
  - Collect the job name (JOB_NAME) from AirGap > Job History or from /opt/superna/sca/data/airgap/syncAirGap.json.
  - You will also need the policy name (POLICY) and the source name (SOURCE_NAME) from the Isilon source.
  - Run one of the following commands to open the maintenance window:
    - Using job name:
      igls airgap connect --job=<JOB_NAME> --timeout=<DURATION>[m|h]
    - Using policy and source:
      igls airgap connect --policy=<POLICY> --source=<SOURCE_NAME> --timeout=<DURATION>[m|h]
- Closing the Maintenance Window:
  - To close the window, run the respective command:
    - Using job name:
      igls airgap disconnect --job=<JOB_NAME>
    - Using policy and source:
      igls airgap disconnect --policy=<POLICY> --source=<SOURCE_NAME>
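
To reduce typing mistakes when opening a Basic AirGap maintenance window, the connect command can be assembled by a small helper. This is a sketch under stated assumptions: it reads <DURATION>[m|h] as a whole number followed by m (minutes) or h (hours), which is an interpretation of the synopsis above and should be confirmed on your system. The helper name airgap_connect_cmd is hypothetical, and it only prints the command; it does not run it.

```shell
# Sketch only: prints the connect command instead of executing it.
# Assumes a timeout is a whole number followed by m or h (e.g. 30m, 2h).
# Job names containing spaces would need extra quoting.
airgap_connect_cmd() {
    local job="$1" timeout="$2"
    if ! printf '%s\n' "$timeout" | grep -Eq '^[0-9]+[mh]$'; then
        echo "invalid timeout: $timeout (expected a form such as 30m or 2h)" >&2
        return 1
    fi
    printf 'igls airgap connect --job=%s --timeout=%s\n' "$job" "$timeout"
}

# Example usage:
#   airgap_connect_cmd <JOB_NAME> 30m
```

Printing rather than executing lets you review the exact command before running it on Eyeglass.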
Stop AirGap Replication in an Emergency
In case of an emergency where AirGap replication must be stopped to protect the replicated data, follow these steps:
- Disable AirGap Replication:
  - SSH into Eyeglass as the admin user.
  - Run the command:
    igls airgap disable
    This command quickly disables all AirGap policies, halting any syncing activity.
- Re-enable AirGap Replication:
  - When the situation is resolved, re-enable replication by running:
    igls airgap enable
Common Issues
AirGap Job Failed Due to Active RSW Events
AirGap jobs might fail to start while ransomware (RSW) events are active, because active events can block AirGap jobs from proceeding.
Troubleshooting Steps:
- Identify the Failure Point:
  - Check AirGap > Job History to determine at which step the AirGap job failed.
- Resolve Active RSW Events:
  - Review any active RSW events and resolve them.
- Retry the AirGap Job:
  - Once the RSW events are resolved, either wait for the next scheduled job or manually start the job.
AirGap Job Failed After Default Timer Ran Out
Eyeglass has a timeout setting, airgapJobTimeout, which stops the AirGap SIQ job if it exceeds the specified time. By default, this is set to 240 minutes (4 hours). If jobs are failing due to this limit, you can extend the timeout.
- Find the Current Timeout Value
  Check the file /opt/superna/sca/data/system.xml to find the current airgapJobTimeout setting.
- Increase the Time Until Timeout
  Follow these steps to modify the timeout value:
  a. Switch to the root user:
     sudo su
  b. Open the file with a text editor:
     vi /opt/superna/sca/data/system.xml
  c. Locate the line with <airgapJobTimeout>240</airgapJobTimeout>, replace 240 with the desired timeout in minutes, then save the file.
  d. Restart the service to apply changes:
     systemctl restart sca
This procedure will update the timeout value, allowing longer AirGap jobs to complete without failure.
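
As an alternative to editing the file by hand, the timeout can be read and changed with sed. The following is a minimal sketch, not a supported utility: it assumes the element appears on a single line exactly in the form <airgapJobTimeout>240</airgapJobTimeout>, and the function names get_airgap_timeout and set_airgap_timeout are hypothetical helpers introduced for illustration.

```shell
# Sketch only: assumes the element sits on one line, in the form
# <airgapJobTimeout>240</airgapJobTimeout>.

# Print the current timeout value (in minutes) found in FILE.
get_airgap_timeout() {
    sed -n 's:.*<airgapJobTimeout>\([0-9][0-9]*\)</airgapJobTimeout>.*:\1:p' "$1"
}

# Set the timeout in FILE to MINUTES, keeping the original as FILE.bak.
set_airgap_timeout() {
    local file="$1" minutes="$2"
    case "$minutes" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of minutes" >&2
            return 1
            ;;
    esac
    # Copy first, then rewrite from the backup so the original survives.
    cp "$file" "$file.bak" &&
    sed "s:<airgapJobTimeout>[0-9][0-9]*</airgapJobTimeout>:<airgapJobTimeout>${minutes}</airgapJobTimeout>:" \
        "$file.bak" > "$file"
}

# Example usage (as root), here doubling the default of 240 minutes to 8 hours:
#   get_airgap_timeout /opt/superna/sca/data/system.xml
#   set_airgap_timeout /opt/superna/sca/data/system.xml 480
```

As with the manual procedure, the new value only takes effect after the service is restarted with systemctl restart sca. Compare system.xml against system.xml.bak before restarting to confirm that only the timeout line changed.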