Alarm Codes
Introduction
Alarms
Abbreviations
Abbreviation | Project |
---|---|
SCA | Eyeglass |
AG | AIRGAP |
RSW | Ransomware Defender |
EAU | Easy Auditor |
ECA | Eyeglass |
ES | Eyeglass Search |
DRSDEdge | Eyeglass for virtual clusters |
SETUP | Setup issues |
LM | License Manager |
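As a rough illustration of how the prefixes in the table relate to the alarm codes listed below, the following Python sketch splits a code such as SCA0003 into its prefix and number and looks up the project. The function name `parse_alarm_code` and the assumption that every code is a known prefix followed directly by digits are illustrative only and are not taken from the product.

```python
import re

# Prefix-to-project mapping copied from the abbreviation table above.
ALARM_PREFIXES = {
    "SCA": "Eyeglass",
    "AG": "AIRGAP",
    "RSW": "Ransomware Defender",
    "EAU": "Easy Auditor",
    "ECA": "Eyeglass",
    "ES": "Eyeglass Search",
    "DRSDEdge": "Eyeglass for virtual clusters",
    "SETUP": "Setup issues",
    "LM": "License Manager",
}

# Assumption: a code is letters (the prefix) followed directly by digits.
_CODE_RE = re.compile(r"^([A-Za-z]+)(\d+)$")


def parse_alarm_code(code):
    """Return (prefix, number, project) for a code, or raise ValueError."""
    match = _CODE_RE.match(code.strip())
    if match is None:
        raise ValueError(f"not an alarm code: {code!r}")
    prefix, digits = match.groups()
    if prefix not in ALARM_PREFIXES:
        raise ValueError(f"unknown alarm prefix: {prefix!r}")
    return prefix, int(digits), ALARM_PREFIXES[prefix]
```

For example, `parse_alarm_code("SCA0003")` yields the prefix `SCA`, the number 3, and the project Eyeglass.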
SCA0001
Title: GENERIC_ALARM
Description: Error within the SCA service.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm covers all unknown errors that are not reported by a more specific alarm.
Resolution:
- Review system logs to identify any related issues or patterns that could indicate the source of the error.
- Check for recent updates or patches that may affect the SCA service.
- Contact technical support if unknown errors continue to occur after troubleshooting steps have been followed.
SCA0002
Title: INVALID_SRC_DST_CONFIGURATION
Description: Found a replication job where either the source or destination is not a managed network element.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm will occur when an Eyeglass Configuration Replication Job fails to run because it is associated with a PowerScale cluster that is not provisioned in Eyeglass. The Eyeglass Configuration Replication jobs are created based on the SyncIQ policies discovered during the Eyeglass inventory task. It is possible that a PowerScale cluster provisioned in Eyeglass could have SyncIQ policies that include PowerScale cluster targets that are not provisioned in Eyeglass.
Resolution:
- Provision Eyeglass with all PowerScale clusters related to Eyeglass configuration replication jobs.
- If not all PowerScale clusters associated with Eyeglass configuration replication jobs can be managed by Eyeglass, disable Eyeglass configuration replication jobs associated with PowerScale clusters not managed by Eyeglass to avoid the error.
SCA0003
Title: INVENTORY_FAILED
Description: Failed to retrieve inventory.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm occurs when the Eyeglass Inventory task, which discovers the configuration information on a PowerScale cluster, fails to run.
Resolution:
- Ensure that the PowerScale clusters are reachable from the Eyeglass server.
- This problem can also occur if the Inventory task attempts to start while another instance of the Inventory task is still running. In this case, the problem is transient and clears on its own. If the problem persists, contact support.superna.net for assistance.
- If the info for the alarm shows “Error fetching data from ssh. Inventory may be incomplete,” there is an issue with retrieving the Inventory information that requires SSH access.
  - Ensure that the PowerScale cluster user role in Eyeglass has Network Read/Write permissions.
  - Ensure that the PowerScale cluster user role in Eyeglass does not contain a hyphen (-) or other special characters. These can cause an issue with the sudo command that is used. To troubleshoot this condition:
    - Log in to the PowerScale cluster CLI as the PowerScale cluster user that has been provisioned in Eyeglass.
    - Execute the following command; it must run without error:
      sudo isi networks list pools
      Example of successful command execution:
      % sudo isi networks list pools
      Subnet Pool SmartConnect Zone Ranges Alloc
      int-a-subnet int-a-pool 192.168.4.145-192.1... Static
      Example of failed command execution:
      sudo isi networks list pools
      sudo: >>> /usr/local/etc/sudoers: syntax error near line 168 <<<
      sudo: parse error in /usr/local/etc/sudoers near line 168
      sudo: no valid sudoers sources found, quitting
      sudo: unable to initialize policy plugin
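When this check is repeated across several clusters, the captured command output can be screened automatically. The following Python sketch flags output that contains the sudoers errors shown in the failed example above. The function name `sudoers_is_broken` is hypothetical, and the marker strings are taken only from that example, so other failure modes would not be detected.

```python
# Error fragments copied from the failed-command example above.
SUDOERS_ERROR_MARKERS = (
    "syntax error near line",
    "parse error in",
    "no valid sudoers sources found",
    "unable to initialize policy plugin",
)


def sudoers_is_broken(output):
    """Return True if any 'sudo:' line in the output contains a known marker."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("sudo:") and any(m in line for m in SUDOERS_ERROR_MARKERS):
            return True
    return False
```

A result of `False` only means that none of these specific messages appeared; the output should still be compared against the successful example.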
SCA0004
Title: REPLICATION_JOB_FAILED
Description: Replication job failed to run.
Severity: CRITICAL
Help on this Alarm:
Problem:
The following causes of a replication job failure are commonly seen. Check the info related to the alarm to confirm the cause:
- AEC_NOT_FOUND Zone <Zone Name> not found.
  - Problem: This error is issued when the Eyeglass configuration replication job runs and attempts to replicate a share or export whose associated Zone does not exist on the target.
  - Resolution: Ensure all Zones associated with shares and exports exist on the target. Once the Zones exist, the next configuration replication job will succeed, and the alarm will be cleared.
- AEC_NOT_FOUND "Path 'x/y/z' not found: No such file or directory".
  - Problem: This error occurs when the Eyeglass configuration replication job runs and attempts to replicate a share, export, or quota whose associated directory does not exist on the target.
  - Possible Cause for Missing Path (Group 1):
    - SyncIQ Policy associated with the path has not been run.
    - Path is on the SyncIQ Policy Excluded list.
    - SyncIQ Policy has paths in included or excluded lists, and the path that was not found is protected by the policy but is not in either list.
  - Resolution (Group 1): Ensure that all directories associated with shares, exports, and quotas exist on the target. Once the directories exist, the next configuration replication job will succeed, and the alarm will be cleared.
  - Possible Cause for Missing Path (Group 2):
    - Share path has a trailing "/", for example /ifs/home/
  - Resolution (Group 2): Remove the trailing "/" from the share path to resolve the error.
- AEC_EXCEPTION bad hostname <host name>
  - This occurs when the NFS Export Clients field has a hostname entry that cannot be resolved when the Export is replicated.
- AEC_EXCEPTION Cannot set security descriptor on <share path> Read only file system
  - This error occurs when Eyeglass is unable to sync share properties such as ABE (Access Based Enumeration).
  - In this case, compare all properties of the active cluster share with the target cluster share and make the target identical to the source.
NOTE: Failure for any property of a share, export, or quota to replicate will stop all replication for that share/export/quota.
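The trailing "/" cause described for the missing-path error is easy to screen for before a replication job runs. The Python sketch below assumes the share paths have already been exported to a plain list of strings; the function names are hypothetical and no Eyeglass or OneFS API is involved.

```python
def shares_with_trailing_slash(share_paths):
    """Return the share paths that end with "/" (the bare root "/" is ignored)."""
    return [p for p in share_paths if len(p) > 1 and p.endswith("/")]


def without_trailing_slash(path):
    """Return the path with trailing "/" characters removed, keeping a bare "/"."""
    stripped = path.rstrip("/")
    return stripped if stripped else "/"
```

For the example path from this section, `without_trailing_slash("/ifs/home/")` gives `/ifs/home`. The corrected path still has to be applied to the share on the cluster itself.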
SCA0005
Title: FAILED_TO_CONNECT
Description: Failed to connect to the designated target.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm will occur when Eyeglass cannot establish a connection with the PowerScale cluster it is managing.
Resolution:
- Verify that there have been no changes in PowerScale credentials, which would cause Eyeglass provisioning to be outdated.
- Verify that there have been no networking or firewall changes that would result in network connectivity loss between the Eyeglass server and the PowerScale clusters.
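A basic reachability test from the Eyeglass server can help separate network or firewall problems from credential problems. The Python sketch below only checks whether a TCP connection can be opened; the function name `can_connect` is hypothetical, and the host and port to test are assumptions that must be replaced with the values used in your environment (port 8080 is commonly used for the OneFS platform API, but confirm this for your deployment).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A successful connection does not prove that the provisioned credentials are still valid; it only rules out basic connectivity loss.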
SCA0006
Title: BACKUP_JOB_FAILED
Description: Failed to create a backup archive.
Severity: MAJOR
SCA0007
Title: JOB_AUDIT_FAILURE
Description: Replication job audit failed.
Severity: MAJOR
Help on this Alarm:
Problem:
The Eyeglass audit task compares source and destination configuration items after replication to confirm that the replication task succeeded and that the configuration items on the source and target are identical. The replication job audit failed alarm occurs when:
- The audit job fails to run.
- The audit job finds configuration items where the source and target are different.
Resolution:
Use the info related to the alarm to aid in resolving the issue.
- For audit job failure to run:
  - Info: Audit of shares for the current job is inconclusive because the source and target network elements are identical.
    - Resolution: The Eyeglass configuration replication job has the same PowerScale cluster configured as source and target. Eyeglass cannot perform configuration replication or the associated audit with this configuration. Disable this job to avoid this alarm.
  - Info: Audit of job 'xxx' has failed: Source and/or target network element were not found in the provided list of network elements.
    - Resolution: Source and target PowerScale clusters in Job 'xxx' must be managed by Eyeglass for the audit to run.
- For audit job finding configuration items where source and target are different:
  - Info: Audit failed: Values for key 'xxx' do not match: source: share or export; target: share or export.
    - Key 'xxx' refers to a property of share or export where a mismatch was found.
  - Info: Auditable source (type: 'xxx', zone: 'yyy', name: 'zzz') and target (type: 'xxx', zone: 'yyy', name: 'zzz') objects not found on source or target cluster, hence audit fails.
    - Auditable source refers to a configuration item.
  - Resolution:
    - Check that the associated replication job ran successfully. If not, the source and destination may be different, and the replication job issue must be resolved.
    - It is possible that a change to the configuration item was made via OneFS after the replication job was completed and before the audit task ran. The next replication task will resolve this mismatch.
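The key-by-key comparison behind the "Values for key 'xxx' do not match" message can be pictured with a small Python sketch. It is not the Eyeglass audit implementation: the function name `mismatched_keys` and the property names in the usage note are hypothetical, and each configuration item is assumed to be available as a flat dictionary of properties.

```python
_MISSING = object()  # Sentinel so that an absent key differs from a None value.


def mismatched_keys(source, target):
    """Return the sorted property names whose values differ between the two items."""
    keys = set(source) | set(target)
    return sorted(
        key for key in keys
        if source.get(key, _MISSING) != target.get(key, _MISSING)
    )
```

For instance, comparing `{"path": "/ifs/data", "abe": True}` with `{"path": "/ifs/data", "abe": False}` reports only `abe`, which corresponds to the key named in the alarm info.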
SCA0008
Title: QUOTA_FAILOVER_JOB_FAILED
Description: Quota failover job failed.
Severity: CRITICAL
SCA0009
Title: BASE_LICENSE_TO_EXPIRE
Description: Base license to expire.
Severity: WARNING
Help on this Alarm:
A base license is about to expire. Visit support.superna.net and open a case with your appliance ID.
SCA0010
Title: BASE_LICENSE_HAS_EXPIRED
Description: Base license has expired.
Severity: MAJOR
Help on this Alarm:
The base license has expired. Visit support.superna.net and open a case with your appliance ID.
SCA0011
Title: DISCOVERY_LICENSE_TO_EXPIRE
Description: Discovery license to expire.
Severity: WARNING
Help on this Alarm:
A discovery license is about to expire. Visit support.superna.net and open a case with your appliance ID.
SCA0012
Title: DISCOVERY_LICENSE_HAS_EXPIRED
Description: Discovery license has expired.
Severity: MAJOR
Help on this Alarm:
The discovery license has expired. Visit support.superna.net and open a case with your appliance ID.
SCA0013
Title: FEATURE_LICENSE_TO_EXPIRE
Description: Feature license to expire.
Severity: WARNING
Help on this Alarm:
A feature license is about to expire. Visit support.superna.net and open a case with your appliance ID.
SCA0014
Title: FEATURE_LICENSE_HAS_EXPIRED
Description: Feature license has expired.
Severity: MAJOR
Help on this Alarm:
The feature license has expired. Visit support.superna.net and open a case with your appliance ID.
SCA0015
Title: MANAGEDOBJECT_LICENSE_TO_EXPIRE
Description: Managed object license to expire.
Severity: WARNING
Help on this Alarm:
A managed object license is about to expire. Visit support.superna.net and open a case with your appliance ID.
SCA0016
Title: MANAGEDOBJECT_LICENSE_HAS_EXPIRED
Description: Managed object license has expired.
Severity: MAJOR
Help on this Alarm:
The managed object license has expired. Visit support.superna.net and open a case with your appliance ID.
SCA0017
Title: SUPPORT_LICENSE_TO_EXPIRE
Description: Support license to expire.
Severity: WARNING
Help on this Alarm:
A support license is about to expire. Visit support.superna.net and open a case with your appliance ID.
SCA0018
Title: SUPPORT_LICENSE_HAS_EXPIRED
Description: Support license has expired.
Severity: MAJOR
Help on this Alarm:
The support license has expired. Visit support.superna.net and open a case with your appliance ID.
SCA0019
Title: TRIAL_KEY_LIMITS_REACHED
Description: Replication functionality limited by trial.
Severity: WARNING
Help on this Alarm:
The trial key has limits applied, restricting the replication functionality. Visit support.superna.net and open a case to request higher limits.
SCA0020
Title: AUDITTRUSTEE_ISSUE
Description: Replication audit issue with the trustee(s).
Severity: CRITICAL
SCA0021
Title: SINGLE_SHARE_MIGRATION_FAILURE
Description: Single Share Migration Job failed.
Severity: MAJOR
Help on this Alarm:
Problem:
The migration job has encountered an error related to the folder migration feature.
Resolution:
- Check the SyncIQ policy and the running jobs window for details on the failed migration job.
- Locate the error code and click on the info button for more information regarding the failure.
- Raise a case with support using the details obtained from the error information.
SCA0022
Title: NO_SUPPORT_LICENSE_INSTALLED
Description: No support license has been installed. Patches cannot be applied.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm indicates that no support license key is installed. A support license key is required to raise a case for a production appliance, and without a valid support license, patches cannot be applied.
Resolution:
- Install a valid support license key to enable case creation and apply patches.
- If your support keys have expired, a new order is required to purchase a new support contract.
SCA0023
Title: REPLICATION_SOURCE_MATCHES_DESTINATION
Description: Found a replication job where the source and destination are the same, which replicates shares.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm is triggered when a SyncIQ Job is discovered where the source and target refer to the same cluster. This configuration is not valid for replication purposes.
Resolution:
- Review the SyncIQ Job settings to verify the source and destination configuration.
SCA0024
Title: PREVIOUS_JOB_STILL_RUNNING
Description: A scheduled task could not run as another instance was already running.
Severity: WARNING
Help on this Alarm:
Problem:
This alarm occurs when a replication task is in progress at the time a new replication task is scheduled to begin. Eyeglass attempts to initiate a new replication task every 5 minutes by default. If the ongoing configuration replication takes longer than 5 minutes, this alarm will trigger for each blocked attempt to start a new replication.
Resolution:
- No action is required at this time. The in-progress configuration replication job will complete, and the next scheduled configuration replication task will begin automatically.
- Monitor the duration of replication tasks and adjust the schedule if recurring conflicts arise.
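To judge whether the schedule needs adjusting, it can help to estimate how many scheduled attempts a long-running replication blocks. The Python sketch below is a simplified model, not Eyeglass behavior: it assumes the running task starts at time zero, that attempts are made at exact multiples of the interval, and that an attempt is blocked only while the earlier task is still running. The function name is hypothetical; the 5-minute default comes from the text above.

```python
import math

DEFAULT_INTERVAL_SECONDS = 5 * 60  # Default replication interval described above.


def blocked_attempts(duration_seconds, interval_seconds=DEFAULT_INTERVAL_SECONDS):
    """Estimate how many scheduled attempts fall while the task is still running."""
    if duration_seconds <= 0:
        return 0
    # Attempts occur at interval, 2 * interval, ...; count those before the task ends.
    return max(0, math.ceil(duration_seconds / interval_seconds) - 1)
```

Under these assumptions, a replication that takes 12 minutes blocks the attempts at 5 and 10 minutes, so two alarms would be expected.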
SCA0025
Title: NETWORK_ELEMENT_TOO_BUSY
Description: The target is too busy to respond to all requests. Consider reducing the number of parallel operations.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm is triggered when the API call to the cluster returns an error indicating that the cluster node is busy and refusing to answer API calls. This situation may arise if too many parallel operations are issued to the cluster using PAPI, or if the cluster is heavily utilized or experiencing resource issues, such as high CPU utilization.
Resolution:
- Raise a support request to obtain instructions on reducing the number of parallel API calls made to the cluster.
- Monitor the resource usage of the cluster nodes and consider optimizing operations to prevent overload.
SCA0026
Title: DR_DASHBOARD_STATUS_CHANGE_ALARM
Description: The job status for the job changed to error.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm occurs when there is an error state with either a Configuration Replication Job or a SyncIQ Policy Job, resulting in the overall DR Dashboard status related to that policy being marked as an Error.
Resolution:
- Log in to the Eyeglass web page and open the DR Dashboard window.
- Find the job that is reporting an error and expand it to identify whether the issue is with the SyncIQ Policy Job or the Eyeglass Configuration Replication Job.
- If the problem is related to the SyncIQ Job, further troubleshooting should be performed from OneFS.
- If the issue pertains to the Eyeglass Configuration Replication Job, open the Alarms window and look for an alarm that has this Configuration Replication Job as the source. The Info in this alarm will provide additional details regarding the cause of the issue.
SCA0027
Title: DR_DASHBOARD_STATUS_CHANGE_WARNING
Description: The job status for the job changed to either 'pending' or 'disabled.'
Severity: WARNING
Help on this Alarm:
Problem:
This alarm occurs when the DR Dashboard status for a Job changes to either Pending or Disabled.
Resolution:
- For a status of Pending, no action is required. This is an indication that the Job has not yet been run.
- A status of Disabled may occur if the SyncIQ Policy has been disabled in OneFS or if the Configuration Replication Job has been User Disabled in Eyeglass. If Disabled is not the correct status for this Job:
  - Log in to the Eyeglass web page and open the DR Dashboard window.
  - Expand the Job related to the alarm to determine whether the SyncIQ Policy or the Eyeglass Configuration Replication Job has been disabled.
  - If the SyncIQ Policy should not be disabled, enable it using the OneFS interface.
  - If the Eyeglass Configuration Replication Job should not be disabled, open the Jobs window, select the Job in the list, open the "Select a bulk action" menu, and select Enable/Disable.
SCA0028
Title: RUNBOOK_ROBOT_JOB_FAILED
Description: The Runbook Robot job failed for the job.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm occurs when the Runbook Robot job fails. It indicates that either the robot is misconfigured or the robot run failed, and that your cluster is not ready for a real DR event. The robot is the best indication of readiness for failover.
Resolution:
- Review the configuration of the Runbook Robot to identify any errors or misconfigurations.
- Ensure that the cluster is properly configured and operational for disaster recovery scenarios.
- If necessary, rerun the Runbook Robot job after making the necessary corrections to the configuration or resolving any identified issues.
SCA0029
Title: POLICY_FAILOVER_FAILURE
Description: Policy Failover Job failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm will occur when a DR Assistant failover job has been created and submitted to failover a policy but fails to complete successfully.
Resolution:
- Open the DR Assistant icon on the Eyeglass desktop and select the failover history.
- Review the failover log for the failover job to find which step in the failover did not complete, so you know where to begin manual step recovery.
- Use the Failover Recovery Procedures for guidance on manual recovery.
SCA0030
Title: ACCESS_ZONE_FAILOVER_FAILURE
Description: Access Zone Failover Job failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm will occur when a DR Assistant failover job has been created and submitted to failover an Access Zone but fails to complete successfully.
Resolution:
- Open the DR Assistant icon on the Eyeglass desktop and select the failover history.
- Review the failover log for the failover job to find which step in the failover did not complete, so you know where to begin manual step recovery.
- Use the Failover Recovery Procedures for guidance on manual recovery.
SCA0031
Title: DFS_FAILOVER_FAILURE
Description: DFS Failover Job failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm will occur when a DR Assistant DFS failover job has failed.
Resolution:
- Open the DR Assistant history window and review the log to identify which step failed.
- Take corrective action based on the point of failure to recover the failed steps manually.
- Use the Failover Recovery Procedures for guidance on manual recovery.
SCA0032
Title: READINESS_JOB_FAILED
Description: Readiness job failed to run.
Severity: MAJOR
Help on this Alarm:
Problem:
The Failover Readiness job encountered unexpected errors when running.
Resolution:
- Run Failover Readiness manually from the Eyeglass Jobs window.
- Ensure the Access Zone Prerequisites have been met as outlined in the Planning and Procedures for Eyeglass Access Zone Failover.
SCA0033
Title: READINESS_CHECK_ERRORS
Description: Readiness job execution found errors - not ready for failover.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm occurs when a Zone readiness job checks various parameters, settings, and mappings for SmartConnect Zones and finds issues. Possible causes include:
- SPN values on cluster machine accounts are not registered with Active Directory. This requires delegation steps to the SPN property on the machine account and should be checked first, following the documentation procedures.
- Network mapping has not been completed.
- The SyncIQ job associated with the Zone has failed (based on its path and the Access Zone root path).
- Configuration replication has failed for any policy in an Access Zone.
Resolution:
- Go to the Eyeglass DR Dashboard and select the Zone Readiness tab.
- Choose the zone name to check the status.
- Click on mapping to verify all hints are in place.
- Then click on status to ensure it is green. If not, look for the failed item in the list.
- Take action to resolve any identified issues related to SyncIQ failures, configuration replication job failures, policies, and SPN failures.
SCA0034
Title: SPN_PROCESSING_FAILED
Description: SPN processing (either checking or repairing) has failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm will occur when Eyeglass is unable to create or delete SPNs during the regular Inventory task.
Resolution:
- Manually inspect existing SPNs using the ADSI Edit tool.
- Ensure that there is an SPN for each Smartconnect Zone and Smartconnect Zone alias used for SMB share access.
- Refer to the Failover Readiness Validations DR Dashboard - SPN Delegation Example for further guidance on SPN delegation.
SCA0035
Title: NODE_COUNT_VIOLATION
Description: The node count limitation has been exceeded.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm will occur when you have an Eyeglass Node-based license (License Type SEL EYEGLASS DR MANAGER ADVANCED or Ent VAPP in the Eyeglass Managed Licenses window) and the number of nodes in the PowerScale clusters being managed by Eyeglass exceeds the number of nodes that you have licensed.
Resolution:
- Purchase additional node licenses to match the total number of managed nodes.
- Review your current licensing and compare it to the number of nodes being managed to ensure compliance.
- Note that no product support is provided for under-licensed clusters.
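The licensing comparison described above is simple arithmetic. The following Python sketch assumes that the node count of each managed cluster and the licensed node total are already known; the function name and the cluster names in the usage note are hypothetical, and the values are not read from Eyeglass.

```python
def node_license_shortfall(cluster_node_counts, licensed_nodes):
    """Return how many managed nodes exceed the licensed total (0 means compliant)."""
    managed_nodes = sum(cluster_node_counts.values())
    return max(0, managed_nodes - licensed_nodes)
```

For example, two clusters with 6 and 4 nodes against a license for 8 nodes leave a shortfall of 2 nodes.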
SCA0036
Title: RUNBOOK_ROBOT_COUNT_EXCEEDED
Description: The runbook robot job count has been exceeded.
Severity: WARNING
Help on this Alarm:
Problem:
This alarm indicates that more than one Eyeglass Runbook Robot job has been configured on the PowerScale clusters managed by Eyeglass. Eyeglass is only capable of executing one Runbook Robot job at a time.
Resolution:
- Remove extra Runbook Robot jobs by deleting the SyncIQ policies with the special name, ensuring that only one job exists.
SCA0037
Title: ACCESS_ZONE_FAILOVER_SPN_FAILURE
Description: Failed to delete or repair SPNs during failover.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm occurs when Eyeglass is unable to create or delete SPNs (Service Principal Names) related to SmartConnect Zone changes made during an Access Zone failover.
Resolution:
- Manually inspect existing SPNs using the ADSI Edit tool:
  - Failover Source Cluster - An SPN for the SmartConnect Zone prefixed with “igls-original” SHOULD exist. If it does not, add it.
  - Failover Target Cluster - An SPN for the SmartConnect Zone name or Alias added during failover SHOULD exist. If it does not, add it.
  - If the above conditions are not met, use ADSI Edit to update the SPN to be on the correct cluster. Note that you cannot create a missing SPN on the Active Cluster if it still exists for the Failover Cluster. You must first remove the SPN from the Failover Cluster and then add it to the Active Cluster.
  - For detailed instructions, refer to the guide on How to Configure Delegation of Cluster Machine Accounts with Active Directory Users and Computers Snapin.
SCA0038
Title: POLICY_CONTAINS_EXCLUDES
Description: Found a policy with excluded directories. This is not a supported SyncIQ configuration for failback.
Severity: WARNING
Help on this Alarm:
Problem:
This alarm indicates that Eyeglass has detected a SyncIQ Policy with excludes (or includes) configured, which is not supported for failback.
Resolution:
- Review the SyncIQ Policy and remove any excluded directories.
- Ensure that only supported configurations are used for failback to avoid issues.
SCA0039
Title: DISASTER_RECOVERY_TESTING_FAILED
Description: The disaster recovery testing has failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm indicates that Eyeglass executed a disaster recovery (DR) test mode job, which resulted in an error.
Resolution:
- Retry enabling or disabling the DR test job.
- Check the running jobs window for details on the failover to determine the cause of the failure.
SCA0040
Title: FAILOVER_SUCCEEDED
Description: Failover succeeded.
Severity: INFORMATIONAL
Help on this Alarm:
This alarm indicates that Eyeglass has successfully executed a failover without any errors.
No action is required; this is a notification alarm sent to log that a failover has occurred successfully.
SCA0041
Title: DISASTER_RECOVERY_TEMPLATE_FAILED
Description: The disaster recovery template replication has failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm indicates that the Eyeglass PowerScaleSD product has attempted to deploy an access zone template to an edge cluster, but the job has failed.
Resolution:
- Check the running job for specific errors or steps that failed before attempting to retry the operation.
SCA0042
Title: DISASTER_RECOVERY_EDGE_REPLICATION_FAILED
Description: The disaster recovery replication to SD Edge has failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm occurs when the Eyeglass PowerScaleSD edition has executed a failover from the core to the edge cluster and then failed to set up the edge site after configuring the failover.
Resolution:
- Consult the running jobs window for specific errors and the steps that failed before attempting to retry the job.
SCA0043
Title: DISASTER_RECOVERY_EDGE_DEPLOYMENT_FAILED
Description: The disaster recovery deployment to SD Edge has failed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm indicates that the Eyeglass PowerScaleSD edition product has failed to deploy the core access zone template and sync shares and exports to the edge cluster. This setup requires policies to sync the folder structure before shares and exports can be created.
Resolution:
- Check the policy run status on the PowerScale and the running jobs window to verify the specific step and error code that blocked the job from completing.
- Address any identified issues before retrying the job.
SCA0044
Title: PROBE_UNDER_LICENSED
Description: The Probe is under-licensed.
Severity: MAJOR
Help on this Alarm:
Problem:
This alarm indicates that Eyeglass DR manages more clusters than there are installed Eyeglass CA UIM Probe license keys. Eyeglass will randomly select which clusters are managed for the CA UIM alarm monitoring.
Resolution:
- Contact sales@superna.net to get a quote on additional probe license keys to remove the alarm and achieve a fully supported installation.
SCA0045
Title: INVENTORY_DEGRADED
Description: Inventory is degraded/incomplete.
Severity: MINOR
Help on this Alarm:
Problem:
This alarm indicates that the inventory process, which runs during the cluster configuration replication job (default 5-minute intervals), encountered API failure responses from the cluster node to which the API request was sent. This condition can clear on its own if the cluster node resumes processing API REST calls. If the configuration replication job fails completely, it means that the cluster has stopped answering API calls.
Resolution:
- Monitor how frequently this alarm occurs. If it is persistent, open a case with support and upload the relevant logs to the case. The logs may be requested if not attached.
- If the condition persists, an EMC support case should be opened, as the node may be overloaded and unable to respond to API queries.
NOTE: The OneFS UI also uses the REST API, so ignoring this event could affect the OneFS UI on the node that is experiencing API failures.
SCA0046
Title: CREATE_SD_EDGE_DEPLOYMENT_JOB
Description: Creation of SD Edge deployment job failed.
Severity: MINOR
Help on this Alarm:
Problem:
This alarm applies only to the Branch Office configuration and data protection licensed solution. The deployment job failed when attempting to build a new edge site. Possible causes include:
- An unreachable cluster.
- A SyncIQ command failing to return a status code from the cluster.
- A timeout while waiting for a cluster command to complete.
Resolution:
- Check the SyncIQ reports for the temporary deployment policy that Eyeglass creates to identify errors on the source cluster where the template access zone was created.
- Verify cluster connectivity and command execution status to resolve the issue.
SCA0047
Title: NE_REMOVAL_ERROR
Description: Network element removal failed.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm is raised and logged when a cluster previously under management has been deleted through user action. This alarm will not be generated without a manual deletion from the Inventory icon on the Eyeglass desktop. If this alarm is unexpected, it indicates that someone deleted the cluster and re-added it. This is not a recommended procedure for addressing any errors encountered with Eyeglass.
Resolution:
- Before deleting a cluster from inventory while active DR monitoring is in production for this cluster, consult with support unless you are certain of the procedure you are performing. Test clusters can be deleted if necessary using the following procedure:
  - Open the Inventory window.
  - Select the cluster.
  - Right-click and select "Delete NE".
SCA0049
Title: JOB_AUDIT_WARNING
Description: Replication job audit resulted in a warning.
Severity: WARNING
Help on this Alarm:
Problem:
This alarm is raised when the cluster sync audit step runs as part of the configuration replication job. The audit collects inventory from both clusters and performs comparisons on shares, exports, quotas, snapshot schedules, deduplication settings, NFS aliases, object attributes, and more.
The alarm typically indicates that the policy configuration data between the source and target clusters is not fully synchronized. The error message in the alarms window provides information about the specific fields or objects that do not match exactly.
Resolution:
- Check with support regarding the error message to determine the appropriate resolution.
- Verify that the default settings for shares and exports on both clusters match, as differences may lead to modifications by the target cluster that conflict with the source cluster settings.
- Investigate if any settings on a share or export on the target cluster were manually changed, which could have caused the audit discrepancy.
- In general, ensure that both the source and target clusters maintain identical configurations for optimal synchronization.
SCA0050
Title: QUOTA_INVENTORY_FAILED
Description: Failed to retrieve quotas.
Severity: CRITICAL
Help on this Alarm:
Problem:
This alarm indicates a failure to retrieve quotas during the configuration sync job. Quotas are collected through REST API calls, which retrieve them in pages. On clusters with thousands or tens of thousands of quotas, this process requires numerous API calls to gather each quota and any notifications set up for each quota (which are retrieved through separate API calls). If an API call fails during this extended multi-query process, this error may be triggered.
Resolution:
- Verify that Eyeglass has the minimum permissions set correctly for retrieving quotas, as insufficient permissions can lead to this error.
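To see why large quota counts make this alarm more likely, the number of API calls and the chance of at least one failure can be estimated. The Python sketch below is back-of-the-envelope arithmetic only: the page size, the per-call failure rate, and the assumption of exactly one notification call per quota are hypothetical values chosen for illustration, not figures from Eyeglass or OneFS.

```python
import math


def quota_api_calls(quota_count, page_size):
    """Estimate calls: one per page of quotas plus one notification call per quota."""
    pages = math.ceil(quota_count / page_size)
    return pages + quota_count


def chance_of_any_failure(calls, per_call_failure_rate):
    """Probability that at least one of `calls` independent calls fails."""
    return 1 - (1 - per_call_failure_rate) ** calls
```

With 10,000 quotas and an assumed page size of 1,000, the estimate is 10,010 calls. Even at an assumed failure rate of 1 in 10,000 per call, the chance of at least one failure in a single run is roughly 63 percent, which is why a transient API problem can surface as this alarm on large clusters.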
SCA0051
Title: DEDUPE_REPLICATION_FAILED
Description: Deduplication replication job failed to run.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm is raised when the deduplication replication job fails to execute. As of Eyeglass version 1.6.3 and later, dedupe paths are synchronized between clusters. Configuration replication jobs are responsible for syncing these paths, and an API failure to set all the paths on the target cluster will trigger this alarm. This issue may be due to insufficient permissions with the Eyeglass service account or an API failure. -
Resolution:
- Expand the configuration replication job logs to find the information text that identifies the specific API error that triggered the failure.
- Consult the documentation site for guidance on how to find and identify sync errors.
- Ensure that the Eyeglass service account has the necessary permissions to execute the deduplication replication job.
SCA0052
Title: DUPLICATE_INVENTORY_ITEM
Description: Found duplicate inventory items.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This error occurs when the inventory function, which runs as part of the configuration job, identifies two entries that are duplicates and cannot be saved to the database. -
Resolution:
- If this error appears, follow the instructions for resetting the database to remove duplicate items.
- If further assistance is needed, raise a support case.
- In version 1.7.0 and later, the database can be rediscovered safely, ensuring that all job settings will be restored.
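As a rough illustration of what a duplicate entry means, the following Python sketch counts inventory items by an identity key. The `(cluster, type, name)` key and the sample records are hypothetical. The real inventory schema and uniqueness rules are not described here.

```python
from collections import Counter


def find_duplicates(items):
    """Return the identity keys that appear more than once in the inventory."""
    counts = Counter((i["cluster"], i["type"], i["name"]) for i in items)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical inventory records; the first two share the same identity key.
inventory = [
    {"cluster": "prod", "type": "share", "name": "home"},
    {"cluster": "prod", "type": "share", "name": "home"},
    {"cluster": "prod", "type": "export", "name": "home"},
]
print(find_duplicates(inventory))
```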
SCA0053
Title: CONTINUOUS_OPERATION_STATUS_ERROR
Description: Continuous operation status is ERROR.
Severity: MAJOR
Help on this Alarm:
-
Problem:
The new continuous operations dashboard shows the overall status of snapshot sync and dedupe sync. This alarm is raised when the dashboard status cannot be updated because configuration sync jobs failed or because sync status fails on SyncIQ policy paths. This may occur if snapshot schedules or dedupe path settings related to SyncIQ jobs are not in sync between the source and target clusters, which means that readiness for operations on the target cluster is impacted. -
Resolution:
- Review the dashboard to identify the affected policy, which will help pinpoint the area where snapshots or dedupe are in a sync error state. This can be achieved using the running jobs window and expanding the last errored job to find the policy and section for snapshot schedules and dedupe sync.
- Identify the API error that is the root cause of the dashboard's error state by checking for specifics in the sync errors.
- Search the documentation site for guidance on how to find and identify sync errors.
SCA0054
Title: MIGRATION_JOB_SUCCEEDED
Description: Migration job succeeded.
Severity: INFORMATIONAL
Help on this Alarm:
This alarm indicates that an Access Zone Migration job has completed successfully. No further action is required.
SCA0055
Title: ARCHIVE_UPLOAD_FAILED
Description: Archive upload failed.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
This error indicates that the upload to support via the Eyeglass backup tab has failed, likely due to firewall settings blocking the Eyeglass appliance from directly accessing the support portal to upload the log file. -
Resolution:
- If this error occurs, support did not receive the log file. Download the backup archive, log in to the support site, and upload the archive manually.
- Ensure that the Eyeglass appliance can establish a direct upload connection over port 443 (HTTPS) to the Internet to facilitate future uploads automatically.
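A quick way to test outbound reachability on port 443 is a plain TCP connection attempt, sketched below in Python. The `can_connect` helper is hypothetical, and the host to test is whatever support portal address applies to your environment. A successful TCP connection does not prove that an HTTPS upload will pass through a proxy or inspection device.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("support.example.com")` (a placeholder hostname) returns `False` when a firewall silently drops the connection and the timeout expires.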
SCA0056
Title: DEDUPE_AUDIT_FAILED
Description: Deduplication audit job failed to run.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that the audit of both clusters' deduplication settings (which runs automatically during configuration sync jobs) was unable to complete the comparison between the clusters. -
Resolution:
- Run the deduplication audit manually again and check if the error persists.
- If the error continues to be present, open a case with support for further assistance.
SCA0057
Title: QUOTA_SYNCHRONIZATION_FAILED
Description: Quota Synchronization Job Failed.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This error occurs when a quota job is run during failover or manually (which should not be done). It can happen if the cluster returns a create or delete quota error, most commonly due to a domain lock from a quota scan blocking the quota create APIs. Additionally, unlinked Everyone quotas can also cause quota create or delete operations to fail. -
Resolution:
- Use the information text in the running jobs screen to collect the error code.
- Submit a case to Support with the collected error code for further assistance.
SCA0058
Title: ERROR_RETRIEVING_CLUSTER_VERSION
Description: Error retrieving cluster version.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This error occurs when a heartbeat task checks the cluster OneFS version, and the API fails. It is commonly seen after a cluster upgrade to OneFS 8 if the SCA process was not restarted afterward. -
Resolution:
- Follow the steps outlined in the provided guide to ensure Eyeglass can detect the new cluster version after a major upgrade from OneFS 7 to 8. Guide here.
SCA0060
Title: ERROR_GENERATING_DATASET
Description: Error generating data sets.
Severity: MINOR
Help on this Alarm:
-
Problem:
This alarm is raised when the ServiceNow integration API tries to build a data set that is exposed via an XML URL (https://eyeglass_ip_address/servicenow/servicenow.xml). The error occurs if not all API data is available to generate the dataset and expose it through the XML interface. -
Resolution:
- This alarm does not affect DR functionality and can be ignored if you are not using the ServiceNow integration feature.
SCA0061
Title: DISCOVERY_RETRIEVAL_ERROR
Description: Failed to retrieve NE data.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates a failure to retrieve data from the Unity REST API. -
Resolution:
- Investigate the connection and authentication settings for the Unity REST API to ensure they are correct.
- Check for network connectivity issues that may be preventing data retrieval from the API.
- Review system logs for any specific error messages related to this failure and address those issues accordingly.
SCA0062
Title: REST_API_ERROR
Description: Failed API REST request.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates a failure in making a request to the Unity REST API, which may prevent data retrieval or other operations. -
Resolution:
- Check the API endpoint and ensure that it is correctly specified in the request.
- Verify the authentication credentials being used for the REST API call to ensure they are valid.
- Look for any network connectivity issues that may interfere with the API request.
- Review system logs for specific error messages related to the failed API request to help diagnose the problem.
SCA0063
Title: POLICY_FAILOVER_CANCEL
Description: Policy Failover Job cancelled.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that a running failover job was cancelled. This functionality is available in release 2.0 and later, and the failover log records the cancellation. -
Resolution:
- Since all recovery steps from a failed failover are manual, review the failover log and proceed with the appropriate manual recovery steps.
- Investigate the reason for the cancellation to ensure it aligns with operational procedures, and take necessary actions to prevent unintended cancellations in the future.
SCA0064
Title: ACCESS_ZONE_FAILOVER_CANCEL
Description: Access Zone Failover Job cancelled.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that a running Access Zone failover job was cancelled. This functionality is available in release 2.0 and later, and the failover log records the cancellation. -
Resolution:
- Since all recovery steps from a failed failover are manual, review the failover log and proceed with the appropriate manual recovery steps.
- Investigate the reason for the cancellation to ensure it aligns with operational procedures and take necessary actions to prevent unintended cancellations in the future.
SCA0065
Title: DFS_FAILOVER_CANCEL
Description: DFS Failover Job cancelled.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that a running DFS failover job was cancelled. This functionality is available in release 2.0 and later, and the failover log confirms that the failover was cancelled. -
Resolution:
- Since all recovery steps from a failed failover are manual, review the failover log and proceed with the necessary manual recovery steps.
- Investigate the reason for the cancellation to ensure compliance with operational procedures and to mitigate future unintentional cancellations.
SCA0066
Title: RUNBOOK_ROBOT_JOB_CANCELED
Description: The Runbook Robot job was cancelled.
Severity: MAJOR
Help on this Alarm:
-
Problem:
In release 2.0 and later, a failover job can be cancelled. This alarm indicates that someone cancelled a running failover. The failover log also indicates that the failover was cancelled. -
Resolution:
- All recovery steps from a failed failover are manual.
SCA0067
Title: REHEARSAL_FAILOVER_CANCEL
Description: Rehearsal Failover Job canceled.
Severity: MAJOR
Help on this Alarm:
-
Problem:
The user has cancelled a DR Assistant failover that was running in rehearsal mode. Cancelling can leave the system in an inconsistent state and should never be done without consulting support. -
Resolution:
- Open the DR Assistant, view the running failover, and download the failover log.
- Open a support case for assistance and provide the failover log.
- DR Rehearsal mode allows a rollback to the previous state on some failed steps.
- Start consulting the recovery guide for the state where you left the system: Guide here.
Impact: Data access failure can occur from this alarm.
SCA0068
Title: REHEARSAL_FAILOVER_FAILURE
Description: Rehearsal Failover Job failed.
Severity: MAJOR
Help on this Alarm:
-
Problem:
The DR rehearsal failover job has failed. -
Resolution:
- Open the DR Assistant, go to running failovers, and download the failover log. Find the failed step.
- Open a support case and attach the failover log along with the failed step from the log.
- Create a support backup archive, upload it to the support site, and update the case that logs have been uploaded.
- Manual recovery steps may be required; consult the failover recovery guide: Guide here.
Impact: Data access failure can occur from this alarm.
SCA0069
Title: POLICY_POOL_MAPPING_ERROR
Description: Error mapping policy to a pool.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This validation alarm is raised by the DR Readiness process, which verifies that each policy in an Access zone has been mapped to an IP pool. The alarm will be triggered if the IP pool configuration has been initiated on an access zone but not completed, or if a new policy is created for an access zone and has not yet been mapped to a pool. -
Resolution:
Consult the Access Zone Design guide on IP pool-supported configurations for detailed instructions.
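The validation described above can be sketched in Python as a check for zones where pool mapping has started but is incomplete. The data structures, names, and the rule that a zone is only checked once at least one of its policies is mapped are assumptions for illustration, not the DR Readiness implementation.

```python
def unmapped_policies(zone_policies, pool_mappings):
    """Return {zone: [policy, ...]} for zones whose pool mapping is incomplete.

    zone_policies: {access_zone: [policy_name, ...]}
    pool_mappings: {policy_name: ip_pool_name}
    Assumption: a zone is only checked once at least one of its policies is mapped.
    """
    incomplete = {}
    for zone, policies in zone_policies.items():
        mapped = [p for p in policies if p in pool_mappings]
        missing = [p for p in policies if p not in pool_mappings]
        if mapped and missing:
            incomplete[zone] = missing
    return incomplete


# Hypothetical zones: zone-a has started pool mapping, zone-b has not.
zones = {"zone-a": ["policy1", "policy2"], "zone-b": ["policy3"]}
mappings = {"policy1": "pool-dr-1"}
print(unmapped_policies(zones, mappings))
```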
SCA0070
Title: DISK_SPACE_FULL_WARNING
Description: Disk usage has reached 75%.
Severity: WARNING
Help on this Alarm:
-
Problem:
This alarm is raised when disk space on Eyeglass or ECA nodes exceeds 75%, indicating that the logs are consuming more space than required. The logs are configured for circular logging and should not fill the hard disk space. A usage level greater than 75% may indicate that another process is consuming space or that memory dump files are present. -
Resolution:
Open a case with support if you receive this alarm.
SCA0071
Title: DISK_SPACE_FULL_ERROR
Description: Disk usage has reached 90%.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
This alarm is raised when disk space on Eyeglass or ECA nodes exceeds 90%. The VM is likely to fail soon if not corrected and will be unable to log or perform normal operations. -
Resolution:
Open a case immediately with support.
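The two disk thresholds (75% for SCA0070 and 90% for SCA0071) can be checked ahead of time with a short Python sketch. The function names are hypothetical, and the strict greater-than comparison is an assumption based on the word "exceeds" in the problem descriptions.

```python
import shutil

WARNING_PERCENT = 75   # SCA0070 threshold
CRITICAL_PERCENT = 90  # SCA0071 threshold


def classify_usage(percent_used):
    """Map a disk usage percentage to the alarm severity it would correspond to."""
    if percent_used > CRITICAL_PERCENT:
        return "CRITICAL"
    if percent_used > WARNING_PERCENT:
        return "WARNING"
    return "OK"


def disk_percent_used(path="/"):
    """Return the percentage of used space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


print(classify_usage(disk_percent_used(".")))
```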
SCA0073
Title: CSM_GROUP_QUOTAS_MANAGER_FAILURE
Description: Saving AD trustees to the database failed.
Severity: WARNING
Help on this Alarm:
-
Problem:
This error indicates that the AD user and membership have failed to be stored in the database. This function is used for quota creation automation as part of the Cluster Storage Monitor product. -
Resolution:
Open a support case and upload a support backup.
SCA0074
Title: REHEARSAL_FAILOVER_SUCCEEDED
Description: Rehearsal Failover Job succeeded.
Severity: INFORMATIONAL
Help on this Alarm:
-
Problem:
A DR rehearsal mode failover has been completed successfully. This alarm indicates that all steps were completed successfully. -
Resolution:
- Test data access as per the guidelines provided in the testing data access guide: Guide here.
Impact: No impact on data access.
SCA0075
Title: DISK_SIZE_MONITOR_ALARM
Description: The Disk Size Monitor has detected that a folder or file(s) exceeded a threshold.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that disk space has crossed the designated threshold. -
Resolution:
- Open a support case to remove unnecessary files.
Impact: There is no immediate impact unless the disk reaches 100% usage.
SCA0076
Title: DISK_SIZE_MONITOR_INACTIVE
Description: The Disk Size Monitor is inactive.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that the monitoring process for disk space is no longer active. -
Resolution:
- Open a support case to get this process restarted.
Impact: No functional impact other than the monitoring being inactive.
SCA0077
Title: DISK_SIZE_MONITOR_ERROR
Description: The Disk Size Monitor identified an error.
Severity: MAJOR
Help on this Alarm:
-
Problem:
The disk monitor process has failed. -
Resolution:
- Open a support case to get this proactive monitoring process fixed.
Impact: No functional impact to product functionality other than the monitoring of disk usage not being active.
SCA0078
Title: REST_API_CALL_FAILURE
Description: Rest API call failed.
Severity: MINOR
Help on this Alarm:
-
Problem:
The REST API is used to automate failover tasks. When a REST API call is issued from an external application, a return code of HTTP 200 is sent on a successful API request. This alarm will be raised if an internal error occurs while processing an API request. -
Resolution:
- Open a support case and provide the API call that the application was using.
- Create a support archive and upload it to the support site.
- All of this information will be requested by support to assist with the root cause of the failure.
SCA0079
Title: NEW_POLICY_DISCOVERED
Description: Create a new job for the discovered policy.
Severity: INFORMATIONAL
Help on this Alarm:
-
Problem:
This alarm indicates that a new SyncIQ policy has been discovered. -
Resolution:
- The administrator needs to set the mode and enable the policy to bring this policy into production service within Eyeglass.
SCA0080
Title: UNCONFIGURED_JOB_PRESENT
Description: Unconfigured job present.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that a job in the Jobs icon has a SyncIQ policy in an unconfigured state. -
Resolution:
- The administrator needs to set the mode (auto, DFS, or skip config) and then enable the job before this policy can be enabled for DR protection.
SCA0081
Title: READINESS_CHECK_WARNINGS
Description: Readiness job execution found warnings.
Severity: WARNING
Help on this Alarm:
-
Problem:
The failover readiness job detected a validation warning that affects your failover readiness. -
Resolution:
- Log in to Eyeglass and open the DR Dashboard.
- Review the validation tree for the configured failover mode and expand the tree to find the failed validation.
- The warning description is available in the DR Dashboard by selecting the validation in the tree and reading the description at the bottom of the window.
SCA0087
Title: SYNCIQ_MONITOR_ERRORS
Description: SyncIQ Monitor Job errors.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm is raised when the SyncIQ data integrity monitor feature is enabled. This feature creates a test file on the source policy path; each time SyncIQ runs, this file is audited on the target cluster to verify that the file was synced correctly. If the file audit fails, this alarm will be raised to warn the administrator that a data sync failed. -
Resolution:
- Check the SyncIQ logs for detailed information about the sync failure.
- Verify the configuration of the source and target policies.
- Ensure that the network connections between the source and target clusters are functioning properly.
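The idea of auditing a test file on the target can be approximated manually with a checksum comparison, sketched below in Python. This is an assumption about what such an audit could look like, not a description of how the SyncIQ data integrity monitor works. The `marker_file_synced` helper and the file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def marker_file_synced(source_path, target_path):
    """Return True if the target copy exists and matches the source copy."""
    target = Path(target_path)
    return target.is_file() and file_digest(source_path) == file_digest(target)


# Demonstration with temporary files standing in for the two policy paths.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "source_marker.txt")
    dst = Path(tmp, "target_marker.txt")
    src.write_text("sync-check")
    missing = marker_file_synced(src, dst)   # target copy not present yet
    dst.write_text("sync-check")
    synced = marker_file_synced(src, dst)    # target copy now matches
print(missing, synced)
```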
SCA0088
Title: CONFIG_BACKUP_JOB_FAILURE
Description: Configuration backup job failed.
Severity: WARNING
Help on this Alarm:
-
Problem:
The appliance backup zip creation process has failed. -
Resolution:
- Open the jobs icon for running jobs, click on the job, and review the reason for the failure.
SCA0094
Title: RAM_UNDERSIZED
Description: The RAM size exceeded the published limits.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
The cluster count, object count (SMB, NFS, quota), policy count, access zone count, or a combination of these necessitates additional RAM on the Eyeglass VM to maintain support as per the product documentation: here. Any use of the product that is below product specifications voids support. -
Resolution:
- Increase the RAM on the Eyeglass VM to meet the published limits.
- Review the product documentation for recommended configurations and specifications.
AG0001
Title: AIRGAP_JOB_FAILED
Description: Ransomware Defender AirGap job <Job_Name> failed - needs attention.
Severity: CRITICAL
Help on this Alarm:
- Resolution:
- Open a case with support if you receive this alarm.
AG0002
Title: AIRGAP_JOB_SUCCEEDED
Description: Ransomware Defender AirGap job <JOB_NAME> succeeded.
Severity: INFORMATIONAL
Help on this Alarm:
Each time the AirGap SyncIQ policies run, the AirGap is opened to execute the policies. This alarm confirms that the AirGap was opened, the sync jobs ran successfully, and the AirGap was closed after the sync jobs finished.
AG0003
Title: AIRGAP_ROUTE_CLOSED
Description: AirGap network is now DISCONNECTED for policy <POLICY_NAME>
Severity: INFORMATIONAL
Help on this Alarm:
- Problem:
This alarm is raised after an AirGap sync job is completed and confirms the AirGap was closed at the end of the sync to the vault.
AG0004
Title: AIRGAP_ROUTE_OPEN
Description: AirGap network is now CONNECTED for policy <POLICY_NAME>
Severity: INFORMATIONAL
Help on this Alarm:
- Problem:
Each time the AirGap jobs execute, the AirGap network is opened. This alarm indicates the network is open before the sync job starts.
AG0005
Title: AIRGAP_JOB_DISABLED_FOR_ACTIVE_EVENT
Description: AirGap job on policy <POLICY_NAME> - cancelled for active RSW event
Severity: WARNING
Help on this Alarm:
-
Problem:
Each time an AirGap sync is scheduled to run, Ransomware Defender active alarms are checked for any active threat to the production data. If a threat is active, the sync is skipped and the AirGap remains closed to protect the Vault data integrity. This alarm indicates an active threat alarm is raised, and the scheduled data sync will be skipped. -
Resolution:
- The Ransomware Defender administrator must validate and then clear the active alarm.
- Until active alarms are cleared, no data sync will occur, and each scheduled data sync will generate a new alarm to warn the AirGap administrator that no data sync will occur.
AG0006
Title: VAULT_AIRGAP_JOB_FAILED
Description: Vault AirGap job <JOB_NAME> failed - needs attention
Severity: MAJOR
Help on this Alarm:
- Resolution:
- Open a case with support if you get this alarm.
AG0007
Title: VAULT_AIRGAP_JOB_SUCCEEDED
Description: Vault AirGap job <JOB_NAME> succeeded
Severity: INFORMATIONAL
AG0008
Title: VAULT_OPENED
Description: Vault is opened
Severity: CRITICAL
Help on this Alarm:
- Problem:
This alarm indicates the secure AirGap vault is currently open.
AG0009
Title: AIRGAP_EVENT_RETRIEVE_JOB_FAILED
Description: Ransomware Defender Event Retriever AirGap job <JOB_NAME> failed - needs attention
Severity: CRITICAL
Help on this Alarm:
- Resolution:
- Open a case with support if you get this alarm.
AG0010
Title: AIRGAP_EVENT_RETRIEVE_JOB_SUCCEEDED
Description: Ransomware Defender AirGap Event Retriever job <JOB_NAME> succeeded
Severity: INFORMATIONAL
Help on this Alarm:
The alarms were successfully retrieved from the vault cluster.
AG0011
Title: AIRGAP_SCHEDULES_QUERY
Description: Ransomware Defender AirGap Schedules query
Severity: INFORMATIONAL
Help on this Alarm:
This means the Enterprise vault opened the vault and queried Eyeglass for new policies or schedule changes.
AG0012
Title: AIRGAP_JOB_NOT_STARTED
Description: Ransomware Defender AirGap Job did not run according to schedule
Severity: CRITICAL
Help on this Alarm:
- Resolution:
- Open a case with support if you get this alarm.
AG0013
Title: AIRGAP_JOB_STARTED
Description: Ransomware Defender AirGap Job Started.
Severity: INFORMATIONAL
Help on this Alarm:
- Problem:
This alarm informs you that the AirGap job has started execution.
AG0014
Title: AIRGAP_POLICY_CHANGED
Description: AirGap policy changed
Severity: CRITICAL
Help on this Alarm:
- Problem:
This alarm indicates that an administrator has modified an AirGap policy. Investigate this event to confirm that the change was approved and is not a security breach. The information on this event lists the change that was detected.
RSW0001
Title: RANSOMWARE_DEFENDER_EVENT
Description: Ransomware signal received
Severity: CRITICAL
Help on this Alarm:
-
Problem:
This alarm is raised for each security event raised for a user. Review the event in the Active Events tab of the Ransomware Defender icon to determine the affected shares or the lockout status for this event. -
Resolution:
- Actions possible are lockout, recover, and initiate self-recovery options.
RSW0002
Title: RANSOMWARE_USER_LOCKED
Description: Locked user access
Severity: CRITICAL
Help on this Alarm:
-
Problem:
A user was locked out with a Major or Critical detection security event. Consult the Ransomware Defender GUI to identify the user ID and IP address of the machine where the user was logged in. -
Resolution:
- Files affected can be exported to a CSV to inspect and remediate compromised files.
- The data recovery initiated option can be used to recover files from DR and snapshots.
- The user can be restored using the restore user access option.
RSW0003
Title: RANSOMWARE_USER_LOCK_FAILED
Description: Failed to lock user access after ransomware events received
Severity: CRITICAL
Help on this Alarm:
-
Problem:
A user account lockout was not successful. Consult the log from the actions menu to see the shares or clusters on which the lockout job failed. These shares are not locked out for this user. -
Resolution:
- Manually lock out the share for the affected user, because the lockout is not retried automatically.
- Alternatively, retry the lockout from the actions menu.
RSW0004
Title: RANSOMWARE_ECA_IGLS_SERVICE_FAILURE
Description: ECA Service not forwarding all security events
Severity: MAJOR
Help on this Alarm:
-
Problem:
One or more nodes on an ECA cluster are not successfully communicating with the Eyeglass appliance. -
Resolution:
- Click on the Manage Services icon on the Eyeglass desktop, and use the status of each ECA node and its sub-components to determine which ECA node is unhealthy.
RSW0005
Title: RANSOMWARE_ECA_HBASE_FAILURE
Description: ECA Service not scanning Hbase for events
Severity: MAJOR
Help on this Alarm:
-
Problem:
If alarm RSW0004 is raised, Eyeglass will periodically attempt to scan the ransomware signals database for new signals. If this scanning cannot happen, this alarm will be raised. -
Resolution:
- Restore Eyeglass to Hbase connectivity.
RSW0006
Title: RANSOMWARE_ECA_COMM_FAILURE
Description: ECA Service is unreachable to scan for events
Severity: MAJOR
Help on this Alarm:
-
Problem:
The Eyeglass VM cannot reach the ECA cluster to scan the analytics database. This impacts the detection of Ransomware events. -
Resolution:
- Check networking between Eyeglass and the ECA cluster, and use the ecactl CLI commands in the Ransomware Defender admin guide for troubleshooting.
RSW0008
Title: RANSOMWARE_ENTER_MONITOR_MODE
Description: Ransomware: Entered monitor-only mode
Severity: MAJOR
Help on this Alarm:
- Problem:
When the Eyeglass Ransomware settings have monitor mode enabled, no lockout will occur. This alarm serves as a reminder that no data protection is enabled with monitor mode.
RSW0009
Title: RANSOMWARE_LEAVE_MONITOR_MODE
Description: Ransomware: Left monitor-only mode
Severity: MAJOR
Help on this Alarm:
- Problem:
When monitor mode is disabled in the Eyeglass Ransomware Defender settings, this alarm indicates that data protection is now active.
RSW0010
Title: RANSOMWARE_ECA_VERSION
Description: Ransomware: ECA node version does not match the Eyeglass version.
Severity: MAJOR
Help on this Alarm:
-
Problem:
If the ECA cluster version does not match the Eyeglass version, this alarm is raised. The versions should match for proper functionality. -
Resolution:
- Ensure that the ECA and Eyeglass versions are upgraded to match each other.
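Before or after an upgrade, a simple comparison like the Python sketch below can flag nodes that are out of step. The version strings, node names, and the `mismatched_nodes` helper are hypothetical. Collect the real versions from your Eyeglass appliance and ECA nodes by whatever means your environment provides.

```python
def mismatched_nodes(eyeglass_version, eca_node_versions):
    """Return the ECA nodes whose version string differs from the Eyeglass version."""
    expected = eyeglass_version.strip()
    return sorted(
        node
        for node, version in eca_node_versions.items()
        if version.strip() != expected
    )


# Hypothetical version strings for illustration only.
nodes = {"eca-node-1": "2.5.8-21100", "eca-node-2": "2.5.7-21050"}
print(mismatched_nodes("2.5.8-21100", nodes))
```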
RSW0011
Title: RSW_RESTORE_ACCESS_SUCCESS
Description: User access restored.
Severity: INFORMATIONAL
Help on this Alarm:
-
Problem:
The user's permissions on the security event were restored for all shares with a lockout. -
Resolution:
- Check the security event history to verify the list of shares where access was restored.
RSW0012
Title: RSW_RESTORE_ACCESS_FAILED
Description: Failed to restore user access
Severity: CRITICAL
Help on this Alarm:
-
Problem:
The user's permissions on the security event were not all restored. -
Resolution:
- Open the security event history to review the list of shares that were not successfully restored and manually edit the share permissions to remove the deny read permission for the user named in the security event.
RSW0013
Title: RSW_SNAPSHOT_WARNING
Description: Not all snapshots were created.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
This alarm is raised by the Ransomware Defender when a snapshot creation failed on one or more shares detected as part of a security event for a user. -
Resolution:
- Check the Active Security Event tab and review the list of shares for the security event. Compare this with the snapshot list column to identify which cluster or path does not have a snapshot created.
- Open the Job icon, navigate to the Running Jobs tab, find the Snapshot Creation Job, and identify the cluster and path that failed to create the snapshot.
- Alternatively, select the Security Action menu and run "Create Snapshot" to retry the snapshot creation.
Possible reasons for failure: licensing on PowerScale, or the Snapshot Create permission is missing from the Eyeglass service account. To fix the latter, add the Snapshot Create permission to the Eyeglass service account.
RSW0014
Title: RSW_SNAPSHOT_FAILED
Description: Failed to create snapshots.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
See the explanation for alarm RSW0013 for details and steps to attempt another snapshot creation or debug the issue. -
Resolution:
- Refer to alarm RSW0013 for instructions on resolving snapshot creation failures.
RSW0015
Title: RSW_SNAPSHOT_DELETE_WARNING
Description: Not all snapshots were deleted.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that an attempt was made to delete a security event snapshot from the GUI, but the deletion failed. Snapshots have a 48-hour expiry limit. If an attempt is made to delete the snapshot sooner from the Ransomware Defender security event action menu, it results in this error code. -
Resolution:
- Check that Eyeglass role permissions are set correctly to allow deletion of snapshots.
RSW0016
Title: RSW_SNAPSHOT_DELETE_FAILED
Description: Failed to delete snapshots.
Severity: MAJOR
Help on this Alarm:
-
Problem:
See the explanation for alarm RSW0015. -
Resolution:
- Refer to alarm RSW0015 for instructions and steps to debug the issue or attempt another deletion of the snapshots.
RSW0017
Title: HBASE_UPGRADE_FAILURE
Description: HBASE upgrade failed.
Severity: INFORMATIONAL
Help on this Alarm:
-
Problem:
The upgrade of the database stored on PowerScale failed to complete. -
Resolution:
- Contact support for assistance regarding the failed HBASE upgrade.
RSW0018
Title: RANSOMWARE_DEFENDER_UNDER_LICENSED
Description: The Ransomware Defender is under-licensed.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm is raised when more clusters have been added to Eyeglass than the installed Ransomware Defender agent licenses cover. Each monitored cluster requires an agent license. -
Resolution:
- Contact sales@superna.net to get a quote for agent licenses.
RSW0020
Title: BACKUP_INGESTION_FAILURE
Description: Turboaudit backup ingestion failed.
Severity: WARNING
Help on this Alarm:
-
Problem:
This error means the historical event ingestion did not process all files. -
Resolution:
- Please open a support case for instructions to resolve this issue.
RSW0021
Title: SECURITY_EVENT_FAILED
Description: Security Event Received but Failed to Process.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
The Ransomware Defender or Easy Auditor products use the ECA cluster to process audit events. When a security event is detected, a REST message is sent to Eyeglass to process the event. This error indicates Eyeglass received a security event but could not connect to the ECA cluster to retrieve specifics about the event. This is a critical event indicating networking or reachability issues between Eyeglass and the ECA cluster. -
Resolution:
- This issue should be debugged immediately as it is possible that further security events will not be reported in the Eyeglass UI or processed with automated response actions until corrected.
RSW0022
Title: SECURITY_EVENT_RETURNED_ZERO
Description: Security event received but failed to return affected file list.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates a high rate of false positive detections. -
Resolution:
- A support case is required to adjust the settings to mitigate this issue.
EAU0009
Title: ROBO_AUDIT_SUCCESS
Description: Robo Audit succeeded.
Severity: INFORMATIONAL
Help on this Alarm:
-
Problem:
This alarm confirms that the RoboAudit schedule health check completed successfully and the audit data is being processed and stored. -
Resolution:
- No action required; this is an informational alarm indicating successful completion of the audit.
ECA0001
Title: ECA_SERVICE_INFO
Description: Eyeglass Clustered Agent information.
Severity: INFORMATIONAL
Help on this Alarm:
-
Problem:
This alarm serves to provide information about the Eyeglass Clustered Agent. It does not indicate any issues. -
Resolution:
- No action required; this is an informational alarm that does not require user intervention.
ECA0002
Title: ECA_CRITICAL_ERROR
Description: Eyeglass Clustered Agent unexpected error.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
The clustered agent has encountered an unexpected error that needs to be investigated. -
Resolution:
- Open a support case and provide the relevant support logs for investigation.
ECA0003
Title: ECA_SERVICE_WARNING
Description: Eyeglass Clustered Agent warning.
Severity: WARNING
Help on this Alarm:
-
Problem:
The Eyeglass clustered agent is in a warning state, which can potentially impact its operational state. -
Resolution:
- Open the managed service icon to identify the warning and the component name.
- Open a support case, providing the component name and error found in managed services for further resolution procedures.
ECA0004
Title: ECA_MINOR_ERROR
Description: Eyeglass Clustered Agent unexpected minor error.
Severity: MINOR
Help on this Alarm:
-
Problem:
The Eyeglass clustered agent has a component in a minor errored state; this will not negatively affect operations. -
Resolution:
- Open the managed service icon to identify the error and the component name.
- Open a support case, providing the component name and error found in managed services for further resolution procedures.
ECA0005
Title: ECA_MAJOR_ERROR
Description: Eyeglass Clustered Agent unexpected major error.
Severity: MAJOR
Help on this Alarm:
-
Problem:
The Eyeglass clustered agent has a component in a major errored state, which will negatively affect operations. This alarm can also indicate that ingestion of audit data is slowing down due to a buildup of archived GZ audit files on the cluster. -
Resolution:
- Open the managed service icon to identify the error and the component name.
- Open a support case, providing the component name and error found in managed services for further resolution procedures.
NOTE: If this alarm includes the following information:
{
"info": "Turboaudit is lagging behind 1396 seconds processing events.",
"service": "Turboaudit Lag Monitor",
"severity": "MAJOR",
"source": "ecapdcl2_2"
}
You can address the slowdown using one of two options:
-
Prune Old GZ Archived Files:
- Using OneFS 9.2 or later, enable the option to prune old GZ archived files to save space on the cluster and set prune settings for the number of days to retain GZ files.
- NOTE: GZ files contain historical audit data that can be deleted. If long-term storage of audit data is required, follow Dell's procedure to move GZ files to an alternate location from the default location under /ifs/.ifsvar. Open a Dell SR for instructions on moving old GZ files to a long-term storage location.
- Easy Auditor customers already have a long-term copy of the GZ data in the Archive database and can purge old GZ files. It's recommended to keep 6 months of GZ files.
-
Enable REST API NFSv4 Audit Data Ingestion:
- To keep GZ files in place on the cluster, enable REST API NFSv4 audit data ingestion.
- NOTE: Keeping GZ files in place has no value as the isi_audit_viewer command cannot read through significant numbers of GZ files. They should be archived using Option 1 above.
- For configuration, refer to the Installation Guide.
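When triaging this alarm, it can help to check whether the alarm details match the Turboaudit lag case described in the NOTE above. The following is a minimal Python sketch under stated assumptions: the helper name turboaudit_lag_seconds is hypothetical, the alarm details are assumed to be available as a JSON string shaped like the sample above, and the lag value is assumed to appear in the "info" field with the wording shown. It is not part of Eyeglass or its API.

```python
import json
import re

# Assumption: the lag is reported in the "info" text as "lagging behind N seconds",
# matching the sample payload shown in this section.
LAG_PATTERN = re.compile(r"lagging behind (\d+) seconds")


def turboaudit_lag_seconds(alarm_json):
    """Return the reported lag in seconds, or None if this is not a Turboaudit lag alarm.

    Hypothetical helper for illustration only.
    """
    alarm = json.loads(alarm_json)
    # Only the "Turboaudit Lag Monitor" service is treated as the lag case.
    if alarm.get("service") != "Turboaudit Lag Monitor":
        return None
    match = LAG_PATTERN.search(alarm.get("info", ""))
    return int(match.group(1)) if match else None


# Sample payload copied from the alarm details above.
sample = """{
  "info": "Turboaudit is lagging behind 1396 seconds processing events.",
  "service": "Turboaudit Lag Monitor",
  "severity": "MAJOR",
  "source": "ecapdcl2_2"
}"""

lag = turboaudit_lag_seconds(sample)
if lag is not None:
    print(f"Turboaudit lag detected: {lag} seconds")
```

If the helper returns a number, the two options above (pruning old GZ files or enabling REST API audit data ingestion) are the relevant next steps; if it returns None, follow the general resolution steps for this alarm.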
ECA0006
Title: ECA_FATAL_ERROR
Description: Eyeglass Clustered Agent unexpected fatal error.
Severity: FATAL
Help on this Alarm:
-
Problem:
The Eyeglass clustered agent has a component in a crashed state, which will negatively affect operations. This alarm can also indicate that ingestion of audit data is slowing down due to a buildup of archived GZ audit files on the cluster. -
Resolution:
- Open the managed service icon to identify the error and the component name.
- Open a support case and provide the component name and error found in managed services for further resolution procedures.
NOTE: If this alarm includes the following information:
{
"info": "Turboaudit is lagging behind 1396 seconds processing events.",
"service": "Turboaudit Lag Monitor",
"severity": "MAJOR",
"source": "ecapdcl2_2"
}
You can address the slowdown using one of two options:
-
Prune Old GZ Archived Files:
- Using OneFS 9.2 or later, enable the option to prune old GZ archived files to save space on the cluster and set prune settings for the number of days to retain GZ files.
- NOTE: GZ files contain historical audit data that can be deleted. If long-term storage of audit data is required, follow Dell's procedure to move GZ files to an alternate location from the default location under /ifs/.ifsvar. Open a Dell SR for instructions on moving old GZ files to a long-term storage location.
- Easy Auditor customers already have a long-term copy of the GZ data in the Archive database and can purge old GZ files. It's recommended to keep 6 months of GZ files.
-
Enable REST API NFSv4 Audit Data Ingestion:
- To keep GZ files in place on the cluster, enable REST API NFSv4 audit data ingestion.
- NOTE: Keeping GZ files in place has no value as the isi_audit_viewer command cannot read through significant numbers of GZ files. They should be archived using Option 1 above.
- For configuration, refer to the Installation Guide.
ES001
Description: Eyeglass Search - Trace Notification
Help on this Alarm:
Debug level messages.
ES002
Description: Eyeglass Search - Informational Notification
Help on this Alarm:
Not an error, indicates a task completed.
ES003
Description: Eyeglass Search - Warning Notification
Help on this Alarm:
Warning condition on a task.
ES004
Description: Eyeglass Search - Minor Notification
Help on this Alarm:
An alarm that has a minor functional impact.
ES005
Description: Eyeglass Search - Major Notification
Help on this Alarm:
An alarm that has a major functional impact.
ES006
Description: Critical error in the Eyeglass Search subsystem
Severity: CRITICAL
Help on this Alarm:
An alarm indicating that product functionality is impacted.
ES007
Description: Successfully initialized module
Help on this Alarm:
Low-impact state change.
ES008
Description: Failed to initialize module
Help on this Alarm:
The module could not start a task.
ES009
Description: Consecutive service failure
Help on this Alarm:
A repeating failure condition.
ES010
Description: Retrieved subsystem successfully
ES011
Description: Created subsystem successfully
ES012
Description: Deleted subsystem successfully
ES013
Description: Modified subsystem successfully
ES014
Description: Failed to retrieve subsystem
ES015
Description: Failed to create a subsystem
ES016
Description: Failed to delete subsystem
ES017
Description: Failed to modify subsystem
ES018
Description: Added configuration successfully
ES019
Description: Deleted configuration successfully
ES020
Description: Modified configuration successfully
ES021
Description: Job started successfully
ES022
Description: Job completed successfully
ES023
Description: Job failed to run
ES024
Description: Login successful
ES025
Description: Login failed. Case
DRSDEDGE0001
Title: DRSDEDGE_NODE_COUNT_VIOLATION
Description: The node count limitation has been exceeded for the application DRSDEdge.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
The number of PowerScaleSD clusters added to Eyeglass exceeds the PowerScaleSD cluster count allowed by the license key. -
Resolution:
- Please open a support case to determine the next steps and impact on your environment.
SETUP0001
Title: SETUP_REDISCOVER_REQUEST
Description: Database is in a corrupt, unrepairable state. Please schedule a time to execute a rediscover operation (igls appliance rediscover).
Severity: CRITICAL
Help on this Alarm:
-
Problem:
This alarm is raised when an inventory update to the database fails. Eyeglass attempts to correct any failure in the process, but this is not always possible. For example, shutting Eyeglass down incorrectly or hard powering it off can corrupt the database content. -
Resolution:
- Rebuild the database with the command igls appliance rediscover. This command will delete the database and rebuild it from API calls to the cluster. This operation is safe to run.
- If this error occurs more than once, open a support case with the error code and upload support logs.
Impact: Blocking saves to the database prevents failover readiness validations from completing, so an accurate view of DR readiness is not available until the issue is addressed.
SETUP0002
Title: SETUP_INVENTORY_DUPLICATES_CRITICAL
Description: Duplicate inventory found; inventory not saved to the database as requested in file system.xml (tag manageinventoryduplicates). Please correct this manually on your cluster or set the tag to 'DELETE' to allow the system to ignore those entries.
Severity: CRITICAL
Help on this Alarm:
-
Problem:
This alarm indicates that duplicate cluster objects have been detected, which blocks saving to the database. This is normally an issue with cluster configuration that should be fixed on the cluster. -
Resolution:
- Open a support case to identify the corrupt cluster configuration data.
- As a workaround, set the manageinventoryduplicates tag to 'DELETE' in the system.xml file to allow duplicates to be removed before saving. Consult with support to apply this setting.
Impact: Blocking saves to the database prevents failover readiness validations from completing, so an accurate view of DR readiness is not available until the issue is addressed.
SETUP0003
Title: SETUP_INVENTORY_DUPLICATES_WARNING
Description: Duplicate inventory found; duplicate inventory has been ignored, and data is saved to the database. Please correct this manually on your cluster.
Severity: WARNING
Help on this Alarm:
-
Problem:
This alarm indicates that duplicate data was detected and removed before saving to the database. The duplicate skip flag is set in system.xml so that saving continues after duplicate inventory data is removed. Duplicate inventory indicates an issue with the cluster configuration data, which should be fixed on the cluster. Duplicates can be SyncIQ policy objects, share permissions, and other configuration data. -
Resolution:
- Correct the issues with duplicate inventory manually on your cluster.
Impact: No functional impact to Eyeglass functions; this alarm indicates an issue was found in the cluster configuration data.
LM0001
Title: LM_LICENSE_TO_EXPIRE
Description: License(s) to expire.
Severity: WARNING
Help on this Alarm:
-
Problem:
This alarm indicates that the support license will expire soon and a renewal order is required. -
Resolution:
- Initiate the renewal order for the support license.
LM0002
Title: LM_LICENSE_HAS_EXPIRED
Description: License(s) expired.
Severity: MAJOR
Help on this Alarm:
-
Problem:
This alarm indicates that maintenance has expired, and access to support and software will not be possible until the expired key is replaced with a new one. -
Resolution:
- Replace the expired license with a new key to restore access to support and software.