Air-Gapped Mode: Client-Side Anonymization
For organizations with strict Data Loss Prevention (DLP) policies, CloudWise provides client-side anonymization scripts that replace sensitive AWS identifiers with deterministic hashes before data leaves your network.
| Tier | Retention | Persistent Salt |
|---|---|---|
| Free / Shield | 7 days | ❌ No (regenerated each upload) |
| Compliance | 365 days | ✅ Yes (consistent across uploads) |
Quarterly trend analysis and audit reports need identifiers that stay consistent across uploads, so consider upgrading to the Compliance tier for those use cases.
Overview
Client-side anonymization ensures that:
- AWS Account IDs (12-digit numbers) are replaced with acct_xxxxxxxxxxxx format
- Resource IDs (EC2 instances, EBS volumes, etc.) are replaced with res_xxxxxxxxxxxx format
- Tag Keys (custom tag names) are replaced with tag_xxxxxxxxxxxx format to prevent DLP leaks
- ARNs are preserved structurally but with hashed components
- Cost analysis remains accurate because hashing is deterministic
- No sensitive identifiers leave your network
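To illustrate what "preserved structurally" can mean for an ARN, here is a minimal Python sketch. It is not the CloudWise implementation: the helper names `short_hash` and `anonymize_arn` are hypothetical, the hash is assumed to be hex-encoded, and only the account component is rewritten.

```python
import hashlib
import re


def short_hash(value: str, salt: str, length: int = 12) -> str:
    """First `length` hex characters of SHA256(salt + value) (assumed encoding)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:length]


def anonymize_arn(arn: str, salt: str) -> str:
    """Hash the account component of an ARN and keep every other field as-is."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        return arn  # not an ARN, leave untouched
    if re.fullmatch(r"\d{12}", parts[4]):
        parts[4] = "acct_" + short_hash(parts[4], salt)
    return ":".join(parts)


print(anonymize_arn("arn:aws:ec2:us-east-1:123456789012:instance/i-0abc", "example-salt"))
```

ARNs without an account component (for example S3 bucket ARNs) pass through unchanged in this sketch.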
Quick Start
Option 1: Use the Export Script with --anonymize
The CloudWise export script supports anonymization directly:
# Download the script and verify its checksum
curl -sLO https://cloudcostwise.io/scripts/cloudwise-export.sh
curl -sL https://cloudcostwise.io/scripts/SHA256SUMS | grep cloudwise-export.sh | (sha256sum -c 2>/dev/null || shasum -a 256 -c)
# Run with anonymization enabled
bash cloudwise-export.sh --anonymize
# Or with a custom salt for reproducible hashes
bash cloudwise-export.sh --anonymize --salt "your-secret-salt"
Option 2: Anonymize an Existing CUR File
If you already have a CUR CSV file, use the standalone anonymization script:
# Download the anonymization script and verify its checksum
curl -sLO https://cloudcostwise.io/scripts/cloudwise-anonymize-cur.py
curl -sL https://cloudcostwise.io/scripts/SHA256SUMS | grep cloudwise-anonymize-cur.py | (sha256sum -c 2>/dev/null || shasum -a 256 -c)
# Anonymize your CUR file (requires Python 3)
python3 cloudwise-anonymize-cur.py your-cur-file.csv -o anonymized-cur.csv
# (Optional) Generate a mapping file for internal reference
python3 cloudwise-anonymize-cur.py your-cur-file.csv -o anonymized-cur.csv --output-mapping mapping.csv
Anonymization Details
What Gets Anonymized
| Original Format | Anonymized Format | Example |
|---|---|---|
| AWS Account ID (12-digit) | acct_[12-char-hash] | 123456789012 → acct_a1b2c3d4e5f6 |
| EC2 Instance ID | res_[12-char-hash] | i-1234567890abcdef0 → res_e5f6a7b8c9d0 |
| EBS Volume ID | res_[12-char-hash] | vol-0abc123def456789 → res_f9e8d7c6b5a4 |
| Other Resource IDs | res_[12-char-hash] | sg-0123456789abcdef0 → res_b2c3d4e5f6a7 |
| Tag Keys (custom) | tag_[12-char-hash] | Project-Manhattan → tag_b8c9d0e1f2a3 |
| ARN Account Component | acct_[12-char-hash] | arn:aws:ec2:us-east-1:123456789012:... → arn:aws:ec2:us-east-1:acct_a1b2c3d4e5f6:... |
As of version 1.7.0, hashes are 12 characters (up from 8) to reduce collision risk at enterprise scale (50k+ resources).
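The effect of the longer hash can be estimated with the birthday-bound approximation. The sketch below assumes hex-encoded, uniformly distributed hashes and a single shared identifier space; both are assumptions, not documented CloudWise behavior.

```python
import math


def collision_probability(n_ids: int, hex_chars: int) -> float:
    """Birthday-bound approximation: P ~ 1 - exp(-n(n-1) / (2N)), N = 16**hex_chars."""
    space = 16 ** hex_chars
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


for length in (8, 12):
    p = collision_probability(50_000, length)
    print(f"{length} hex chars, 50,000 identifiers: {p:.2e}")
```

Under these assumptions, 50,000 identifiers have roughly a 25% chance of at least one collision with 8 hex characters, and only a few in a million with 12.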
How Hashing Works
Anonymization uses SHA256 hashing with a user-provided (or auto-generated) salt:
hash = SHA256(salt + original_value)
anonymized_id = prefix + first_12_characters_of_hash
Because hashing is deterministic:
- The same account ID always produces the same hash (with the same salt)
- Cross-referencing between uploads works correctly
- Cost aggregation by account remains accurate
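The formula above can be sketched in a few lines of Python. This is an illustration only: the function name `anonymize` is hypothetical, and UTF-8 input with a hex-encoded digest are assumptions about details the formula leaves open.

```python
import hashlib


def anonymize(value: str, prefix: str, salt: str) -> str:
    """Return prefix + first 12 hex characters of SHA256(salt + value)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return prefix + digest[:12]


first = anonymize("123456789012", "acct_", "your-secret-salt")
second = anonymize("123456789012", "acct_", "your-secret-salt")
print(first == second)  # same salt and input give the same anonymized ID
```

Changing the salt changes every output, which is why correlating uploads depends on reusing the same salt.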
Salt Management
The salt is a secret value that makes your hashes unique:
- Auto-generated salt: If you don't provide a salt, one is generated automatically
- Custom salt: Use --salt "your-secret" for reproducible results across multiple exports
- Keep your salt secure: Anyone with your salt and hash could theoretically brute-force original values
If you lose your salt, you won't be able to correlate new uploads with previous anonymized data.
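One way to get a strong custom salt is to generate it once with a cryptographically secure random source and reuse it from your secrets store. The sketch below is an assumption about workflow, not a CloudWise feature: the environment variable name `CLOUDWISE_SALT` and the function `load_or_create_salt` are hypothetical.

```python
import os
import secrets


def load_or_create_salt(env_var: str = "CLOUDWISE_SALT") -> str:
    """Reuse a salt from the environment, or generate a new 256-bit one."""
    salt = os.environ.get(env_var)
    if salt:
        return salt
    # 32 random bytes, hex-encoded to 64 characters
    return secrets.token_hex(32)


salt = load_or_create_salt()
print(f"salt length: {len(salt)} characters")
```

A freshly generated value only helps with consistency if you store it (for example in a secrets manager) and pass the same value to --salt on every export.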
Upload Validation
When you upload files to CloudWise, the upload wizard automatically detects whether your data is anonymized:
Anonymization Status Badges
| Badge | Meaning |
|---|---|
| 🛡️ Anonymized Data (green) | All identifiers are anonymized. Safe for DLP compliance. |
| ⚠️ Raw Data Detected (amber) | File contains raw AWS identifiers. Consider anonymizing first. |
| 🔶 Mixed Data (orange) | File contains both raw and anonymized identifiers. |
| ❓ Unknown (gray) | Could not determine anonymization status. |
Validation Step
Before uploading, you'll see a validation step that shows:
- Detected mode (raw, anonymized, mixed)
- Sample identifiers found in your files
- Warning if raw data is detected
- Option to proceed or go back and anonymize
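If you want a rough local check before opening the upload wizard, the detection modes can be approximated with pattern matching. This is a simplified sketch, not the wizard's actual logic: the patterns cover only account IDs and a few common resource ID prefixes, and `detect_mode` is a hypothetical name.

```python
import re

# Patterns for raw AWS identifiers (deliberately incomplete)
RAW_PATTERNS = [
    re.compile(r"^\d{12}$"),                     # AWS account ID
    re.compile(r"^(i|vol|sg)-[0-9a-f]{8,17}$"),  # a few common resource IDs
]
# Anonymized identifiers: known prefix plus 12 lowercase alphanumerics
ANON_PATTERN = re.compile(r"^(acct|res|tag)_[0-9a-z]{12}$")


def detect_mode(identifiers):
    """Classify identifiers as 'raw', 'anonymized', 'mixed', or 'unknown'."""
    raw = sum(any(p.match(v) for p in RAW_PATTERNS) for v in identifiers)
    anon = sum(bool(ANON_PATTERN.match(v)) for v in identifiers)
    if raw and anon:
        return "mixed"
    if anon:
        return "anonymized"
    if raw:
        return "raw"
    return "unknown"


print(detect_mode(["123456789012", "acct_a1b2c3d4e5f6"]))
```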
Mapping Files
When anonymizing data, you can optionally generate a mapping file:
python3 cloudwise-anonymize-cur.py input.csv -o output.csv --output-mapping mapping.csv
The mapping file contains:
original_account_id,anonymized_account_id
123456789012,acct_a1b2c3d4e5f6
987654321098,acct_f9e8d7c6b5a4
Never upload the mapping file to CloudWise. Keep it secure in your internal systems for reference only.
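For internal investigations, the mapping file can be loaded into a reverse lookup table. The sketch below assumes the two-column layout shown above. The function name `load_reverse_mapping` is hypothetical, and other mapping columns (if the script emits any) are not handled.

```python
import csv


def load_reverse_mapping(path: str) -> dict:
    """Map anonymized account IDs back to the original account IDs."""
    with open(path, newline="", encoding="utf-8") as f:
        return {
            row["anonymized_account_id"]: row["original_account_id"]
            for row in csv.DictReader(f)
        }


# Usage (run only inside your own network):
# mapping = load_reverse_mapping("mapping.csv")
# print(mapping.get("acct_a1b2c3d4e5f6", "not found"))
```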
Best Practices
For DLP Compliance
- Always use --anonymize when exporting data for CloudWise
- Store your salt in a secure location (password manager, secrets manager)
- Verify the output before uploading by checking for acct_ and res_ prefixes
- Delete original files after anonymizing if required by your DLP policy
For Consistent Analysis
- Use the same salt across all exports to maintain identifier consistency
- Keep a local mapping file for internal incident investigation
- Document your anonymization process for audit purposes
Troubleshooting
"Raw Data Detected" Warning
If you see this warning during upload:
- Go back to the upload step
- Remove the current files
- Re-export using the --anonymize flag or run the anonymization script
- Upload the anonymized files
Hash Inconsistency Between Uploads
If the same account shows different hashes:
- Verify you're using the same salt value
- Check that the salt wasn't accidentally modified
- Consider re-anonymizing all historical data with a new consistent salt
Performance Issues with Large Files
For very large CUR files (>1GB):
# The script processes in streaming mode, but you can optimize:
# 1. Split your CUR file by month if possible
# 2. Run on a machine with more memory
# 3. Use the script directly (not piped through bash)
python3 cloudwise-anonymize-cur.py large-file.csv -o output.csv
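For context on what "streaming mode" means here, the following Python sketch processes a CSV one row at a time so memory use stays flat regardless of file size. It is an illustration, not the shipped script: `anonymize_stream` is a hypothetical name, only one column is rewritten, and the default column `lineItem/UsageAccountId` is an assumption about your CUR layout.

```python
import csv
import hashlib
import io


def anonymize_stream(src, dst, salt, column="lineItem/UsageAccountId"):
    """Copy CSV rows from src to dst, hashing one column row by row."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames or [])
    writer.writeheader()
    for row in reader:  # only the current row is held in memory
        value = row.get(column)
        if value:
            digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
            row[column] = "acct_" + digest[:12]
        writer.writerow(row)


# Small in-memory demonstration; real use would pass open file handles
source = io.StringIO("lineItem/UsageAccountId,cost\n123456789012,1.50\n")
target = io.StringIO()
anonymize_stream(source, target, "example-salt")
print(target.getvalue())
```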
FAQ
Q: Why does my data expire after 7 days?
Air-Gapped Mode is designed for evaluation purposes. The 7-day data retention helps us keep infrastructure costs manageable for this free evaluation feature. For permanent data retention with automated monitoring, connect your AWS account to unlock Connected Mode.
Q: Can I extend the 7-day retention period?
Not in Air-Gapped Mode. If you need persistent data storage and historical tracking, please connect your AWS account. Connected Mode provides unlimited data retention based on your subscription tier.
Q: Can CloudWise reverse the anonymization?
No. SHA256 is a one-way hash, and CloudWise never receives your salt or your original data, so it has no practical way to recover the original identifiers from the anonymized values.
Q: Do I lose any analysis capabilities with anonymized data?
No. All cost analysis, waste detection, and optimization recommendations work the same way. Only the displayed identifiers are different.
Q: What if I need to investigate a specific resource?
Use your local mapping file to translate the anonymized ID back to the original AWS resource ID.
Q: Is the salt stored by CloudWise?
No. The salt never leaves your system. CloudWise only sees the already-anonymized data.
Q: Are tag keys also anonymized?
Yes, as of version 1.7.0. Custom tag keys (like Project-Manhattan or Client-Acme) are now anonymized to tag_xxxxxxxxxxxx format to prevent DLP leaks. Standard tag keys like Name, Environment, Owner, Team, and CostCenter are preserved for usability.
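The allowlist behavior can be sketched as follows. This is an illustration under assumptions: the set below contains only the standard keys named in this answer (the real list may be longer), the hash is assumed to be hex-encoded, and `anonymize_tag_key` is a hypothetical name.

```python
import hashlib

# Standard keys named in this FAQ; the actual preserved list may differ
STANDARD_TAG_KEYS = {"Name", "Environment", "Owner", "Team", "CostCenter"}


def anonymize_tag_key(key: str, salt: str) -> str:
    """Keep standard tag keys readable and hash custom ones."""
    if key in STANDARD_TAG_KEYS:
        return key
    digest = hashlib.sha256((salt + key).encode("utf-8")).hexdigest()
    return "tag_" + digest[:12]


for key in ("Environment", "Project-Manhattan"):
    print(key, "->", anonymize_tag_key(key, "example-salt"))
```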
See Also
- Quick Start Guide - Get started with CloudWise
- AWS Setup Guide - Configure AWS for CloudWise integration
- FAQ - Common questions about data privacy and security