Configure Bulk Ingest
Requirements and Limitations
Requirements
- Targeted Date Event: The system supports data ingestion only for a specific, targeted event date. It does not support ingesting long time ranges spanning multiple days, weeks, months, or years of data. Users can select files for submission only within a window of the previous 3 days.
- Limited File Handling: A single job can ingest at most 20 .gz files. For initial testing, use only one file.
- Job Scheduling: Schedule a cron job to run during non-peak hours to avoid performance issues, because the system prioritizes active audit data.
- Job Queueing: If you submit more than one ingestion job, the jobs are queued and only one executes at a time.
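As a rough pre-flight check for the per-job file limit listed above, the sketch below counts the .gz files in a directory and compares the count against the limit of 20. The function names `count_gz_files` and `check_gz_count`, and the idea of collecting candidate files in one directory first, are assumptions made for illustration; they are not part of Eyeglass or ECA.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; it does not ship with Eyeglass or ECA.
MAX_GZ_FILES=20  # per-job limit stated in the requirements above

# count_gz_files DIR: print the number of .gz files directly inside DIR.
count_gz_files() {
  find "$1" -maxdepth 1 -type f -name '*.gz' | wc -l | tr -d ' '
}

# check_gz_count DIR: succeed only if DIR holds between 1 and MAX_GZ_FILES .gz files.
check_gz_count() {
  local n
  n=$(count_gz_files "$1")
  if [ "$n" -ge 1 ] && [ "$n" -le "$MAX_GZ_FILES" ]; then
    echo "OK: $n .gz file(s)"
  else
    echo "Out of range: $n .gz file(s), expected 1 to $MAX_GZ_FILES"
    return 1
  fi
}
```

Running `check_gz_count /path/to/candidate/files` before submitting a job gives an early warning if the selection would exceed the limit; the path here is a placeholder.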
Limitations
- No Bulk Support: Bulk ingestion is a background task and is not covered by the standard support contract.
- Performance Impact: The time needed to ingest audit data cannot be predicted, because processing of active audit data takes precedence over bulk ingestion tasks.
- Concurrency Limitation: The system runs only one job at a time; any additional jobs wait in a queue.
- Non-changeable Priority: The priority of bulk ingestion jobs cannot be changed. They always run at a lower priority than active audit data processing.
Configuration Steps
Set Up NFS Ingestion
Before using the Bulk Ingest feature, you must set up NFS ingestion on ECA clusters.
1. Create a Directory for Bulk Ingestion in the ECA Cluster
   On the ECA nodes, create the directory for Bulk Ingestion by running:
   ecactl cluster exec "sudo mkdir -p /opt/superna/mnt/bulkingestion/<cluster-guid>/<cluster-name>"
   Replace <cluster-guid> with your Isilon cluster’s GUID and <cluster-name> with its name.
   If the directory already exists, skip this step.
2. Configure Bulk Ingestion Path (Non-Default Only)
   To use a non-default path for .gz files, configure it via the Eyeglass CLI:
   igls config settings set --tag=bulkingestpath --value=<PATH>
   Replace <PATH> with the full directory path on PowerScale where audit logs (.gz) will be stored.
   Note (Directory Structure Requirement): When using a non-default path, organize files as:
   /PATH/node-name/protocol
   Each node must have its own subdirectory under <PATH>, containing a protocol folder for .gz files.
3. Create an NFS Export in PowerScale (Non-Default Only)
   If the export already exists (e.g., for default paths), skip this step.
   For non-default paths, create the NFS export:
   isi nfs exports create <PATH> --root-clients="<eca-ips>" --clients="<eca-ips>" --read-only=true -f --description "Bulk ingest export" --all-dirs true
   - Replace <PATH> with your custom path.
   - Include all ECA node IPs in <eca-ips> (comma-separated).
4. Edit the Auto NFS Configuration on ECA Nodes
   On each ECA node, edit /opt/superna/eca/data/audit-nfs/auto.nfs and add:
   /opt/superna/mnt/bulkingestion/<cluster-guid>/<cluster-name> --fstype=nfs,nfsvers=4,ro,soft <FQDN>:<PATH>
   - Default Path: Replace <PATH> with /ifs/.ifsvar/audit/logs.
   - Non-Default Path: Use the custom <PATH> from Step 2.
   - Replace <cluster-guid>, <cluster-name>, and <FQDN> with your cluster’s details.
5. Apply Configuration and Restart Services
   Push the updated configuration and restart autofs:
   ecactl cluster push-config
   ecactl cluster exec "sudo systemctl restart autofs"
6. Mount the NFS Export on ECA Nodes
   Mount the export across all nodes using:
   ecactl cluster exec "sudo /opt/superna/eca/scripts/manual-mount.sh"
   Alternatively, restart the cluster to trigger automatic mounting.
7. Test Bulk Ingest Feature
   Verify the setup by testing Bulk Ingest through the Eyeglass UI.
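Two parts of the procedure above are easy to get wrong by hand: the auto.nfs entry (Edit the Auto NFS Configuration on ECA Nodes) and the per-node directory layout (Directory Structure Requirement). The following is a minimal sketch of shell helpers for both. The function names `render_auto_nfs_entry` and `missing_protocol_dirs` are hypothetical and not part of any Superna tooling, and the layout check assumes you run it somewhere the custom path is readable.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; neither ships with Eyeglass or ECA.

# render_auto_nfs_entry GUID NAME FQDN [PATH]
# Print the auto.nfs line in the format shown in the configuration steps.
# PATH falls back to the default audit log location named there.
render_auto_nfs_entry() {
  local guid=$1 name=$2 fqdn=$3 path=${4:-/ifs/.ifsvar/audit/logs}
  printf '%s --fstype=nfs,nfsvers=4,ro,soft %s:%s\n' \
    "/opt/superna/mnt/bulkingestion/${guid}/${name}" "$fqdn" "$path"
}

# missing_protocol_dirs PATH
# Print every node directory under PATH that lacks a "protocol" subfolder.
# Exit status is 0 when none are missing, 1 otherwise.
missing_protocol_dirs() {
  local root=$1 node status=0
  for node in "$root"/*/; do
    [ -d "$node" ] || continue   # unmatched glob: no node directories at all
    if [ ! -d "${node}protocol" ]; then
      echo "${node%/}"
      status=1
    fi
  done
  return "$status"
}
```

For example, `render_auto_nfs_entry <cluster-guid> <cluster-name> <FQDN>` prints a line for the default path that can be compared against the entry in auto.nfs, and `missing_protocol_dirs <PATH>` lists any node directory that would break the required /PATH/node-name/protocol layout.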