Version: 2.14.1

DASM Administration Guide

Administration and Feature Usage

Typical Use Cases

  1. Identify high risk hosts for rapid remediation, patching and hardening.
    • Features needed: The base feature that produces an AI predicted data attack surface
  2. Protect high-security data. Set a minimum security baseline that hosts or users must meet before they can access data.
    • Features needed: Dynamic DataShield autonomous enforcement policies

Data Attack Surface Manager Main Dashboard

The main dashboard shows the data attack surface. Each row represents a host/user combination and the columns indicate the contributing factors to the Data Risk Score shown in the Vulnerability Level column:

  • PII Score — If Data Classification is enabled, rates the level of PII data detected from this host and user combination.
  • OS Type — The OS of the client machine accessing the storage.
  • Access List — The number of SMB or NFS exports the host has access to by user or by IP address.
  • SMB/NFS Shares (last 7 days) — Shows share usage alongside the permissions view. This over-exposure view lets remediation focus on users and hosts that hold many permissions but show very little usage.
  • Open Ports — The listening ports on the client that could be used by an attacker to compromise the host.
  • Anomaly — Indicates if the user behavior profile shows any abnormal deviations of reads or writes to data that would indicate suspicious user behavior.
  • Data Integrity Score — Shows the linguistic coherence of the file. This technology reads the content of files and can detect subtle data encryption of a slow attack or GenAI attack on the meaning of the data. It scores the probability the content is low quality or modified and has lost its semantic meaning.
  • CVE Score — An average of all CVEs found in the scan results.
  • CVE Description — A list of some of the CVEs found on the host.
  • CVE Last Scan — Indicates how current the scan results are when used to predict the Data Risk Score.

Data Attack Surface Manager Offensive Actions

Offensive Remediation

DVM supports remediation for high risk hosts and users with host level blocking. This allows high risk hosts to be blocked from accessing data until remediation can be completed. Unblocking restores the host's access to data. The dashboard provides a block and unblock action button per host row.

Risk Discovery

Identify Data Attack Surface hosts with no vulnerability scan results. Unscanned hosts expose data and user accounts to breach from zero-day attacks or from a malicious actor that has joined a machine to the network. SOC analysts can block hosts without a completed vulnerability scan assessment directly from the dashboard.

Autonomous Data Shield (DDS) Policies

DDS policies automatically block hosts and users with a high Data Risk Score (DRS) from accessing data until a minimum security assessment is complete or the DRS drops below a specified threshold.

Creating a policy requires the following fields. The three matching criteria (source host IP, Data Risk Score, and path) are combined with AND logic:

  1. name — text field — describes the policy purpose.
  2. source host IP — IP address range — x.x.x.x/yy syntax to identify source computers by network. Default * means any source IP.
  3. Data Risk Score (mandatory) — drop-down of risk score thresholds. The policy matches a DRS greater than or equal to the selected value. Example: medium means medium and above.
  4. Path — absolute path, example /ifs/marketing (treated as /ifs/marketing/* — all paths under the marketing folder). Device name is ignored, so a path applies to any device under management. Default is any path.
  5. Action
    • Once a policy is created, DDS monitors incoming audit data to match criteria and enforce host lockouts.
    • Unblock logic — DDS reassesses on an hourly basis. If the AI model assessment indicates the DRS has dropped to or below the policy's DRS threshold, the host-level unblock logic executes and a webhook alert is sent.
    • Host lockouts update the policy dashboard to display status.
    • The main dashboard shows a column for policies under "Autonomous data protection policy name".
    • Alerting — Each time a policy acts with a lockout response, a webhook using the Superna Zero Trust payload schema is sent so that Incident Response teams are informed that autonomous security policies have been activated or deactivated.
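
The matching criteria above can be sketched as follows. This is a minimal illustration, not the product's implementation: the `Policy` class, the `policy_matches` helper, and the ordering of risk levels in `DRS_LEVELS` are assumptions made for the example (only "medium" is named in this guide).

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical ordering of DRS levels; the product's actual level names may differ.
DRS_LEVELS = ["low", "medium", "high", "critical"]

@dataclass
class Policy:
    name: str
    source_cidr: str = "*"   # "*" means any source IP
    min_drs: str = "medium"  # matches this level and above
    path: str = "*"          # "*" means any path

def policy_matches(policy: Policy, host_ip: str, drs: str, accessed_path: str) -> bool:
    """Return True only when source IP, DRS, and path all match (AND logic)."""
    ip_ok = policy.source_cidr == "*" or (
        ipaddress.ip_address(host_ip)
        in ipaddress.ip_network(policy.source_cidr, strict=False)
    )
    drs_ok = DRS_LEVELS.index(drs) >= DRS_LEVELS.index(policy.min_drs)
    # /ifs/marketing is treated as /ifs/marketing/* (the folder and everything under it).
    prefix = policy.path.rstrip("/")
    path_ok = (
        policy.path == "*"
        or accessed_path == prefix
        or accessed_path.startswith(prefix + "/")
    )
    return ip_ok and drs_ok and path_ok
```

For example, a policy with `source_cidr="10.1.0.0/16"`, `min_drs="medium"`, and `path="/ifs/marketing"` would match a high-risk host at 10.1.2.3 touching `/ifs/marketing/q3/plan.docx`, but not the same host touching `/ifs/marketing2/x`.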

Dynamic DataShield Policy Operations

When DDS policies activate, the dashboard shows locked status and the policy name that triggered the lockout on the host. The policy dashboard displays:

  • A list of all active policies with their matching criteria (name, source IP range, DRS threshold, path).
  • The current status of each policy (active or inactive).
  • For each locked host: the host IP, the policy name that triggered the lockout, and the timestamp of the lockout event.

Each hour, the Superna Data AI inferencing runs to re-evaluate the Data Risk Score of the attack surface, including any hosts to which a lockout has been applied. If the DRS falls below the threshold in the policy definition, the lockout is removed.

Data Classification - How it Works

Overview

This feature adds valuable data to the Superna AI model to influence the Data Risk Score based on the type of data a user is accessing. The classifier detects sensitive entities using the following methods:

  • Pretrained spaCy models (for names, locations, etc.)
  • Regex patterns (e.g., SSNs, credit card numbers)
  • Checksum validation (for entities like credit cards and IBANs)
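
As a rough illustration of how a regex match can be followed by checksum validation, the sketch below finds SSN-like and card-like strings and keeps only card candidates that pass the Luhn check. The patterns, the `find_entities` helper, and the `US_SSN` and `CREDIT_CARD` labels are simplified assumptions for this example, not the product's classifiers.

```python
import re

# Illustrative patterns only; real classifiers apply stricter rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits, optional separators

def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return bool(digits) and total % 10 == 0

def find_entities(text: str):
    """Return (label, matched_text) pairs; card candidates must pass the checksum."""
    found = [("US_SSN", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        if luhn_valid(m.group()):
            found.append(("CREDIT_CARD", m.group()))
    return found
```

The checksum step is what separates a real card number from an arbitrary 16-digit string, which is why regex-only detection tends to produce more false positives.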

How it works

Users detected on high-risk data attack surface hosts have their most recent activity sampled. By default, 10% of the files touched by a user are sampled. The results are used to compute a PII score between 0 and 10, with 10 being the highest, and this score influences the AI model predictions.

The current solution takes a snapshot of the file system and reads the data over a read-only NFS mount of that snapshot. The snapshot is deleted once the user files have been processed, so it does not consume space when it is no longer required.

How To Configure Classification Modes

Edit the environment variable CLASSIFICATION_MODE on the host:

export CLASSIFICATION_MODE=x

Where x is:

  • 1 = regex classifiers only
  • 2 = spaCy with en-core-web-sm (small model) — person entity classifiers
  • 3 = spaCy with en-core-web-lg (large model) — person entity classifiers
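
A minimal sketch of how a process could interpret this variable is shown below. The `resolve_mode` helper and the choice to reject unset or unknown values are assumptions for illustration; the guide does not state what the product does when the variable is missing. Note that spaCy's Python package names use underscores (`en_core_web_sm`, `en_core_web_lg`).

```python
import os

# Mode meanings come from the list above; the lookup table itself is hypothetical.
MODES = {
    "1": ("regex", None),
    "2": ("spacy", "en_core_web_sm"),
    "3": ("spacy", "en_core_web_lg"),
}

def resolve_mode(env=os.environ):
    """Map CLASSIFICATION_MODE to (engine, model_name); reject anything else."""
    raw = env.get("CLASSIFICATION_MODE", "")
    if raw not in MODES:
        raise ValueError(f"CLASSIFICATION_MODE must be 1, 2 or 3, got {raw!r}")
    return MODES[raw]
```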

Data Classification Categories

Identity & Demographics (6)

Entity Type | Description
PERSON | Full names or identifiable persons
AGE | Age mentions
GENDER | Gender expressions
DATE_TIME | Date and time values
US_PASSPORT | U.S. passport numbers
US_DRIVER_LICENSE | U.S. driver's license numbers

Medical - Healthcare (3)

Custom Superna classifiers.

Entity Type | Description
MRN - Medical Record Number | A Medical Record Number (MRN) is a unique identifier assigned to a patient's medical record within a specific healthcare organization.
National Provider Identifier (NPI) | Detects National Provider Identifier (NPI) numbers, which are 10-digit numeric identifiers used in U.S. healthcare for uniquely identifying providers. Applies Luhn checksum validation with the required "80840" prefix during calculation. Ensures only valid NPIs are returned (invalid check digits are rejected).
UUID | 128-bit identifiers used to uniquely tag objects. In healthcare data, UUIDs represent PATIENT_ID values.
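
The NPI check described above can be sketched as follows: the "80840" prefix is prepended to the 10-digit NPI and the standard Luhn algorithm is applied to the resulting 15 digits. The `npi_valid` function name is hypothetical; this is an illustration of the published check-digit scheme, not the Superna classifier itself.

```python
def npi_valid(npi: str) -> bool:
    """Validate a 10-digit NPI using Luhn over the '80840'-prefixed number."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(c) for c in "80840" + npi]
    total = 0
    # Double every second digit counting from the right; subtract 9 if above 9.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

With this check, `1234567893` is accepted, while `1234567890` is rejected because its check digit is wrong.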

Sensitive Numbers & Codes (7)

Digital Identifiers & Network Data (6)

Entity Type | Description
IP_ADDRESS | IPv4 or IPv6 addresses
EMAIL_ADDRESS | Email addresses
URL | Web URLs
DOMAIN_NAME | Domain names
AWS_ACCESS_KEY | Amazon Web Services access key
AZURE_STORAGE_KEY | Microsoft Azure storage keys

Location-Based Identifiers (2)

Entity Type | Description
LOCATION | Cities, countries, etc.
US_ZIP_CODE | U.S. postal codes

Professional & Organizational (2)

Entity Type | Description
ORGANIZATION | Company or org names
MEDICAL_LICENSE | Medical license identifiers

National IDs (Non-US) (1)

Entity Type | Description
NRP | Singapore National Registration ID

Summary Table

Category | Count
Identity & Demographics | 6
Sensitive Numbers & Codes | 7
Digital Identifiers & Network | 6
Location-Based Identifiers | 2
Professional & Organizational | 2
National IDs (Non-US) | 1
Total | 24

Data Classification Exposure Analytics

This reporting area offers insight into your PII exposure through two dashboards that summarize where your risks lie and which users and computers are connected to them.

SMB/NFS PII Exposure Detection

This dashboard shows users that create PII content and the host each user works from, together with a summary of the PII types found in that user's activity and a score from 1 to 10, where 10 represents a high level of PII exposure. The Data Risk Score for the host/user is also shown to help prioritize remediation.

The dashboard displays the following columns per row:

  • User — the AD user account that touched PII data
  • Host — the client machine used to access the data
  • PII Types Detected — a summary of the PII categories found in the files accessed by this user
  • PII Score — a score from 1–10 indicating the level of PII exposure
  • Data Risk Score — the overall host/user risk score from the AI model

Clicking the user or client number drills into the users or clients that touched PII data on a specific SMB share in the last 7 days.

How this capability helps reduce risk

  1. Pinpoint Risk Hotspots — Discover where sensitive data is most vulnerable by mapping actual user interactions with PII across SMB shares and NFS exports. Prioritize protection efforts based on real usage, not assumptions.
  2. Quantify Exfiltration Risk in Real Terms — Move beyond generic risk scoring by measuring the true threat: who accessed what, when, and how much sensitive data was involved. Turn unstructured data sprawl into a targeted risk profile.
  3. Enable Targeted Remediation — Focus remediation efforts on shares or exports with high PII concentration and high user interaction, reducing false positives and maximizing the impact of security operations.
  4. Bridge Data Security with Compliance Monitoring — Deliver auditable insights into how and where sensitive data is exposed — empowering compliance teams with contextual, actionable evidence tied to user behavior.

The SMB/NFS view shows where PII data is concentrated and displays the number of users and hosts involved with PII data per share. Each share row shows:

  • Share name — the SMB share or NFS export path
  • PII File Count — number of files containing PII data
  • User Count — number of users that accessed PII data on this share
  • Host Count — number of client machines that accessed PII data on this share

Permission vs. Usage Over-Exposure Analysis

This dashboard monitors user activity and expands share-level permissions to compare what each user can access with what the user actually accesses. To reduce risk and data exposure, the least-privilege model requires that users who do not access data be removed from the ACL or share-level permissions, which shrinks the Data Attack Surface.

The dashboard shows per user:

  • Permissions — the number of shares the user has permission to access
  • Usage — the number of shares the user actually accessed in the last 7 days
  • Over-Exposure Ratio — the ratio of permissions to usage, where a high ratio indicates over-privileged access
  • Recommended Action — whether to remove the user from the ACL of specific shares
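
The ratio in the list above can be illustrated with a short sketch. The `over_exposure_ratio` function and its handling of users with zero usage are assumptions for this example; the guide does not specify how the dashboard treats that case.

```python
def over_exposure_ratio(permitted_shares: int, used_shares: int) -> float:
    """Ratio of shares a user may access to shares actually accessed in the window."""
    if permitted_shares < 0 or used_shares < 0:
        raise ValueError("share counts must be non-negative")
    if used_shares == 0:
        # Assumption: permissions with no usage at all are treated as maximally over-exposed.
        return float("inf") if permitted_shares else 0.0
    return permitted_shares / used_shares
```

A user with permission to 40 shares who touched 4 of them in the last 7 days has a ratio of 10.0, which makes that user a stronger candidate for ACL cleanup than a user with a ratio near 1.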

How This Capability Helps Reduce Risk

  1. Shrink the Data Attack Surface with Precision — Identify users who have access to data they don't use. Reduce risk exposure from dormant or excessive permissions and enforce least-privilege access policies intelligently.
  2. Operationalize Zero Trust at the File System Layer — Go beyond static access control lists — use real-world activity to justify or revoke access. DASM aligns with Zero Trust principles by validating actual need-to-know.
  3. Turn Audit Logs into Proactive Access Governance — Convert audit trails into actionable access intelligence. Automate the detection of access drift and help teams surgically close privilege gaps before they're exploited.
  4. Drive Access Reviews with Usage Context — Equip IT and security teams with usage-based insights that make access reviews faster, smarter, and more defensible — especially in regulated environments.

Data Risk & Exposure Dashboard

The Data Risk & Exposure Dashboard provides a consolidated view of the data attack surface across all monitored hosts and users. It includes:

  • A risk heatmap showing which hosts and users have the highest combined Data Risk Score
  • A breakdown of contributing risk factors: CVE score, anomaly score, PII score, open ports, and over-exposure ratio
  • Trend charts showing how the Data Risk Score has changed over time for the top risk hosts
  • A drill-down view per host showing the full list of contributing factors and recommended remediation actions

Data Integrity Dashboard

Overview

This guide provides administrators with the information needed to configure, operate, and maintain Superna's linguistic coherence technology for real-time content security. Unlike traditional classification tools, which identify and label sensitive data, this solution extends protection by validating the integrity and authenticity of content.

The system uses advanced linguistic analysis and entropy-based detection to identify subtle changes that may indicate:

  • Unauthorized content modification
  • Hidden or encrypted data insertion
  • Semantic shifts introduced by attackers or AI-driven tools

By continuously monitoring files, audit logs, and structured content, the platform delivers early warning of potential tampering and provides actionable signals for Content Threat Exposure Management (CTEM).

How it Works

The technology samples created or modified data throughout the day and checks data integrity with a multi-layer assessment of the file contents. The first layer is a basic entropy test, which measures the randomness of the data and is only a weak indicator of encryption on its own. It is combined with Superna's linguistic coherence test, which scores the meaning of the data as high quality, low quality, or an indicator of tampering. This technology is multilingual and can detect sparse encryption of data.

The assessment is computed into a score from 1 to 10, with 10 indicating high confidence that data tampering has been detected. The score is mapped to the users and computers from which the modifications originated. The dashboard reports on the rollup of all data modifications under SMB shares and allows drill-down to the users or computers responsible for those modifications.
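
For readers unfamiliar with the entropy layer, the sketch below computes Shannon entropy in bits per byte. It only illustrates the general measure; the `shannon_entropy` helper is hypothetical, and the linguistic coherence scoring and the way the two signals are combined are proprietary and not shown.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, ranging from 0.0 to 8.0."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())
```

Plain text usually scores well below 8 bits per byte, while encrypted or compressed content approaches 8. Fluent text rewritten by an attacker keeps a normal entropy value, which is why entropy alone is described here as a weak indicator.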

Why Traditional Defenses Fall Short

Detecting modern content attacks is increasingly difficult because traditional security tools were not designed for the age of GenAI.

Entropy-based detection measures randomness and is effective at spotting obvious obfuscation like encryption, compression, or malware hidden in binary form. But GenAI-driven attacks don't need to raise entropy. They can generate fluent, natural-looking text that blends seamlessly into business documents, logs, or emails. Entropy scores remain normal — even though the meaning has been manipulated to mislead, hide fraud, or insert malicious instructions.

At the same time, backup-oriented defenses assume that restoring data from a "known good" copy will resolve an incident. This model fails against semantic attacks, where the attacker hasn't destroyed or encrypted data but has instead altered its meaning. In these cases:

  • The document looks normal.
  • The file opens without errors.
  • Backups faithfully preserve the corrupted content — because the "change" isn't corruption in the technical sense, but a subtle shift in semantics.

The result: restoration simply re-introduces the compromised version, leaving the organization blind to the integrity breach.

GenAI and semantic manipulation bypass both entropy-based detection and backup-centric recovery. To counter these threats, organizations need content-aware security that can evaluate not just the structure of data, but also its coherence and integrity of meaning. Without this layer of defense, attackers can operate undetected — quietly changing the story your data tells.

Dashboard with Data Integrity Scores

The Data Integrity Dashboard displays the linguistic coherence scores for all monitored files, organized by SMB share. It shows:

  • Share — the SMB share or NFS export path
  • Data Integrity Score — a score from 1–10 per share, where 10 indicates a high probability of data tampering
  • User — the AD user account responsible for the data modifications
  • Host — the client machine used to originate the modifications
  • File Count — the number of files with a high integrity risk score
  • Last Modified — the timestamp of the most recent modification flagged by the system

Clicking a share row drills into the individual files and users responsible for the modifications, showing the entropy score, linguistic coherence score, and the combined Data Integrity Score per file.

Operations

Local Login

  1. Open a new browser tab and navigate to https://<ip-address-of-dvm-vm>:5001/. It will open the initial login page.
    • The default user login is dasmadmin with default password Vuln3r@b1l1ty!

The login page shows a username and password field with the Superna DASM logo. After successful login, the main Data Attack Surface dashboard loads automatically.

Active Directory Login

The steps below configure AD authentication with role-based group support. This allows an AD group to control which users can log in to the Data Attack Surface Manager console.

  1. Edit the file below and enter the AD service account, key for session cookie, AD domain and remaining fields.

AD Authentication Configuration

Update the config file cvm_config.py with the variables based on your environment:

# cvm_config.py
# Configuration for DASM WebUI Login Authentication application

# ---------------------------
# Flask Application Settings
# ---------------------------
SECRET_KEY = "YOUR_FLASK_SECRET_KEY_HERE"

# -----------------------
# Local Admin Credentials
# -----------------------
LOCAL_USERS = {
    'dasmadmin': {'password': 'Vuln3r$b1l1ty!'}
}

# ---------------------------
# Active Directory (AD/LDAP)
# ---------------------------
AD_HOST = "ad.example.com"
AD_USE_SSL = True
AD_PORT = 636
AD_DOMAIN = "EXAMPLE"
AD_VALIDATE_CERT = True
AD_CA_BUNDLE = "/path/to/ca_chain.pem"

# ---------------------------
# Service Account for AD Lookup
# ---------------------------
AD_SERVICE_ACCOUNT = r"EXAMPLE\ldap-reader"
AD_SERVICE_PASSWORD = "service_account_password"

# ---------------------------
# LDAP Search Bases
# ---------------------------
AD_BASE_DN = "DC=example,DC=com"
AD_ALLOWED_GROUP_DN = "CN=WebUI Users,OU=Security Groups,DC=example,DC=com"
  2. Log in with domain\user or user@domain syntax.
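
The SECRET_KEY value in the configuration above should be a long random string, since Flask uses it to sign session cookies. One way to generate such a value with the Python standard library is sketched below; the `new_secret_key` helper name is hypothetical, and the 32-byte length is a common choice rather than a documented DASM requirement.

```python
import secrets

def new_secret_key(num_bytes: int = 32) -> str:
    """Return a random hex string suitable for use as a Flask SECRET_KEY."""
    return secrets.token_hex(num_bytes)

# Print a line that can be pasted into cvm_config.py.
print(f'SECRET_KEY = "{new_secret_key()}"')
```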

2FA with Google Authenticator

This solution supports login to a local user account or AD account with the addition of the One Time Password provided by the Google Authenticator application. Each user account must first be enrolled with Google Authenticator, which establishes the shared secret used to generate the codes.

Setup flow:

  1. After first login, the system presents a QR code on screen.
  2. Scan the QR code with the Google Authenticator app on your mobile device.
  3. The app will add a new entry for DASM and start generating 6-digit one-time passwords.
  4. On subsequent logins, enter your username, password, and the current 6-digit OTP from the app.

After valid credentials are entered, the login page shows a third field labeled One-Time Password, where the 6-digit code is entered to complete authentication.
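
Google Authenticator generates time-based one-time passwords (TOTP, RFC 6238), by default with HMAC-SHA1, a 30-second time step, and 6 digits. The sketch below shows how such a code is derived from a shared Base32 secret. It is a standalone illustration using only the standard library; the `totp` function is hypothetical and is not the DASM login code.

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, at=None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a Base32-encoded secret."""
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)  # tolerate missing padding
    key = base64.b32decode(padded)
    now = time.time() if at is None else at
    counter = int(now // step)  # number of time steps since the Unix epoch
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation per RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because both sides derive the code from the current time, the clock on the DASM VM and on the mobile device must be reasonably accurate for codes to validate.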

Alert and Webhook Configuration with Data Security Edition

To configure DVM to send webhooks to Eyeglass to leverage the integrations listed in the Integration Matrix, follow the configuration steps below. It is also possible to send webhooks directly to another endpoint assuming the endpoint can parse the payload.

  1. Log in to the DVM VM over SSH.
  2. Edit the configuration file:
nano /mnt/ml_data/ml-cvm/cvm_webui_rapid7.py
  3. Locate the section below and set the IP address to the Eyeglass VM IP address, if different from the DVM VM IP address:
DVM_WEBHOOK_IP = "x.x.x.x"      # Specify DVM Webhook Endpoint IP Address
DVM_WEBHOOK_PORT = "5000" # Specify DVM Webhook Endpoint Port Number
DVM_WEBHOOK_ENDPOINT = "/webhook" # Specify DVM Webhook Endpoint route
  4. Create an Eyeglass VM webhook integration to receive the DVM webhook alerts on port 5000 (or the value used if changed). Leave the endpoint route /webhook unless directed to change by support.
  5. Configure the integrations from the link above to configure which integration is used to send DVM alerts.
  6. If Eyeglass integrations already exist, you will need to change the port that DVM uses to send webhooks.
  7. Press Ctrl+X to save and exit.
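
To confirm that the configured endpoint is reachable, a test POST can be assembled from the same three settings. The sketch below is an assumption-heavy illustration: the `http` scheme, the `build_request` helper, the placeholder IP address, and the test body are all hypothetical, and the body is not the Superna Zero Trust payload schema.

```python
import json
import urllib.request

DVM_WEBHOOK_IP = "192.0.2.10"      # placeholder address; use the value from your config
DVM_WEBHOOK_PORT = "5000"
DVM_WEBHOOK_ENDPOINT = "/webhook"

def build_request(payload: dict, scheme: str = "http") -> urllib.request.Request:
    """Build (but do not send) a JSON POST to the configured webhook endpoint."""
    url = f"{scheme}://{DVM_WEBHOOK_IP}:{DVM_WEBHOOK_PORT}{DVM_WEBHOOK_ENDPOINT}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical test body, not the real alert schema.
req = build_request({"test": "connectivity check"})
# urllib.request.urlopen(req, timeout=5)  # uncomment to actually send the request
```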

Sample DVM Webhook JSON Payload for Data Containment Policy Enforcement

These payloads are sent when a data containment policy blocks or unblocks access to data because a host matched the criteria of a real-time policy. The application-specific payload is embedded in the extraparams section of the Zero Trust payload.

Sample JSON Payload for Missing or Stale Vulnerability Scans

This feature monitors the attack surface to ensure that each host has a vulnerability scan from within the last 7 days. This preemptive check reduces the chance that an undetected vulnerability puts critical data at risk, and it flags missing or stale scans to the SOC.
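
The freshness rule can be expressed as a small check. This is a sketch under stated assumptions: the `scan_status` function and its three status labels are hypothetical, and only the 7-day window comes from this guide.

```python
from datetime import datetime, timedelta, timezone

MAX_SCAN_AGE = timedelta(days=7)  # window stated in this guide

def scan_status(last_scan, now=None) -> str:
    """Classify a host's last vulnerability scan as missing, stale, or current."""
    now = now or datetime.now(timezone.utc)
    if last_scan is None:
        return "missing"
    return "stale" if now - last_scan > MAX_SCAN_AGE else "current"
```

A host whose last scan is 8 days old would be reported as stale, and a host with no scan record at all as missing.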

Software Start/Stop

To start or restart DVM processes:

cd /mnt/ml_data/ml-cvm
./cvm_check_restart.sh

To stop DVM processes:

cd /mnt/ml_data/ml-cvm
./cvm_stop_processes.sh

Log Gather For Support

cd /mnt/ml_data/ml-cvm
python3 cvm_loggather.py

This command generates a zip file that can be uploaded to support cases.

Data Classification Configuration

Data Processing

  1. Files are processed for each user on an hourly basis and the training model is built with this new classification risk data included in the data set.
  2. Only files smaller than 500 MB will be processed. Files on the exclusion list will be skipped.

Sensitive Tuning & File Type Filtering

  1. The data classification function will only register a file with a confidence level greater than 70%. This can be tuned higher to reduce false positives on low grade classification detections within files.
  2. The data classification feature supports over 1500 file types. Image file types are not supported and are filtered out by default. The list of file types that will be ignored can be customized.
    • Edit /mnt/ml_data/ml-cvm/cvm_ext_config.json to add additional file extensions to be ignored. You can use * to wildcard part of the extension type.

The default exclusion list in cvm_ext_config.json contains common binary and image formats such as .exe, .dll, .jpg, .png, .mp4, and similar file types that are not relevant for PII text classification. Add any additional extensions your environment requires to this file to prevent unnecessary processing.
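
The size limit, extension exclusions, and confidence threshold described in this section can be pictured with the sketch below. The function names, the in-code pattern list (including the `.tif*` wildcard entry), and the interpretation of 500 MB as 500 × 1024 × 1024 bytes are assumptions for illustration; the actual schema of cvm_ext_config.json is not shown in this guide.

```python
import fnmatch
import os

MAX_FILE_BYTES = 500 * 1024 * 1024  # "500 MB"; binary megabytes are an assumption
MIN_CONFIDENCE = 0.70               # detections must exceed this confidence level

# Hypothetical pattern list; ".tif*" demonstrates the * wildcard.
IGNORED_EXTENSIONS = [".exe", ".dll", ".jpg", ".png", ".mp4", ".tif*"]

def should_process(path: str, size_bytes: int) -> bool:
    """Skip files at or above the size limit or with an ignored extension."""
    if size_bytes >= MAX_FILE_BYTES:
        return False
    ext = os.path.splitext(path)[1].lower()
    return not any(fnmatch.fnmatchcase(ext, pat) for pat in IGNORED_EXTENSIONS)

def keep_detection(confidence: float) -> bool:
    """Register a classification only when confidence is greater than the threshold."""
    return confidence > MIN_CONFIDENCE
```

Raising `MIN_CONFIDENCE` trades recall for fewer false positives, which mirrors the tuning guidance above.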