Waste Detection Guide
CloudWise Waste Detection automatically identifies AWS resources that are costing you money without providing value. This guide explains how each detector works, the confidence levels behind our recommendations, and the AWS pricing references that back up our calculations.
You can run Waste Detection in Air-Gapped Mode without granting CloudWise any access to your AWS account. Simply run our export script in AWS CloudShell, upload the results, and get instant analysis of your unused resources.
Our Philosophy: Accuracy Over Everything
We believe in transparent, verifiable savings. Every waste item we flag includes:
- Confidence Level: How certain we are about the finding
- Exact Savings Calculation: Based on actual AWS pricing, not estimates
- AWS Pricing Reference: Link to official AWS documentation
- Action Command: Ready-to-run AWS CLI command
We intentionally do not include detectors that make arbitrary assumptions (like "50% savings from downsizing") because misleading recommendations destroy trust.
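As a rough illustration of what such a finding could look like, the sketch below models the four fields listed above for an unattached EBS volume. The `WasteFinding` class, the `unattached_ebs_finding` helper, the volume ID, and the per-GB rate passed in are all hypothetical and are not CloudWise's actual data model; only the `aws ec2 delete-volume` command is a real AWS CLI call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WasteFinding:
    """One flagged resource (hypothetical shape, for illustration only)."""
    resource_id: str
    confidence: str             # "HIGH", "MEDIUM", or "LOW"
    monthly_savings_usd: float
    pricing_reference: str      # link to the relevant AWS pricing page
    action_command: str         # ready-to-run AWS CLI command


def unattached_ebs_finding(volume_id: str, size_gb: int,
                           usd_per_gb_month: float) -> WasteFinding:
    # Savings = provisioned size x published per-GB rate; no utilization guesswork.
    savings = round(size_gb * usd_per_gb_month, 2)
    return WasteFinding(
        resource_id=volume_id,
        confidence="HIGH",
        monthly_savings_usd=savings,
        pricing_reference="https://aws.amazon.com/ebs/pricing/",
        action_command=f"aws ec2 delete-volume --volume-id {volume_id}",
    )


# The rate below is an example input; always take it from the AWS pricing page.
finding = unattached_ebs_finding("vol-0123456789abcdef0", 100, 0.08)
print(finding.monthly_savings_usd, finding.action_command)
```

Because the rate is an explicit input rather than a built-in constant, the calculation can be checked against the linked pricing page for the volume's type and Region.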
Confidence Levels Explained
CloudWise uses three confidence levels to help you prioritize actions:
HIGH Confidence
These findings are 100% accurate with no assumptions:
- Uses actual resource sizes/counts from AWS APIs
- Uses fixed AWS pricing (published rates)
- Clear binary detection (e.g., "attached or not")
- You can act on these immediately
Examples: Unattached EBS volumes, Unattached Elastic IPs, Old snapshots
MEDIUM Confidence
These findings are accurate with caveats:
- Uses real CloudWatch metrics
- Thresholds are configurable (e.g., "idle" means < 5% CPU)
- May have legitimate exceptions (scheduled jobs, DR resources)
- Review before taking action
Examples: Idle EC2 instances, Idle RDS databases, Log groups without retention
LOW Confidence
These are informational only or for cleanup purposes:
- Minimal actual cost savings
- Detection is for housekeeping, not cost reduction
- Consider during maintenance windows
Examples: Unused Lambda functions (pay-per-use = $0 when unused)
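To make the "configurable threshold" idea behind MEDIUM confidence findings concrete, here is a minimal sketch of an idle check. It assumes one average CPU value per day and treats a resource as idle only if every day in the window is below the threshold; the `is_idle` function and its parameters are hypothetical, and the real detector may aggregate CloudWatch metrics differently.

```python
from typing import Sequence


def is_idle(daily_avg_cpu: Sequence[float],
            threshold_pct: float = 5.0,
            window_days: int = 14) -> bool:
    """Return True when every day in the most recent window is below the threshold."""
    if len(daily_avg_cpu) < window_days:
        return False  # not enough data to decide, so do not flag
    recent = daily_avg_cpu[-window_days:]
    return all(cpu < threshold_pct for cpu in recent)


quiet = [1.2] * 14
scheduled_job = [1.2] * 13 + [62.0]   # one busy day, e.g. a scheduled batch job
print(is_idle(quiet), is_idle(scheduled_job), is_idle(quiet, threshold_pct=1.0))
```

The second case shows why these findings need review: a single scheduled job is enough to change the verdict, and a stricter threshold changes it again.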
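Complete Detector Inventory
CloudWise includes 191 waste detectors spanning 42 AWS services, organized into 10 service categories. Each detector is designed to identify a specific cost optimization opportunity with a verified savings calculation. A few detectors (for example, unused_lambda) are listed under more than one category, so the per-category counts below add up to more than 191.
Detector Overview by Category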
| Category | Detectors | Description |
|---|---|---|
| Compute Optimization | 34 | EC2, Lambda, ECS, EKS (including Extended Support), WorkSpaces (5 detectors), Lightsail (6 detectors), Beanstalk (6 detectors) |
| Storage Optimization | 26 | EBS, S3 (lifecycle, growth, tiering, empty buckets), EFS, FSx, ECR, Snapshots, AWS Backup |
| Database Optimization | 40 | RDS (including Extended Support), Aurora (4 detectors), DynamoDB, ElastiCache (7 detectors including Extended Support), Redshift, OpenSearch (including Extended Support), Neptune (4 detectors), DocumentDB (including Extended Support), Timestream, QLDB |
| Network Optimization | 19 | EIP, NAT Gateway, Load Balancers (ALB/NLB/CLB), CloudFront, Route53, Global Accelerator (3 detectors), Transfer Family, VPC Endpoints |
| Serverless Optimization | 10 | Lambda, API Gateway, AppSync (3 detectors), Step Functions (4 detectors) |
| Analytics Optimization | 24 | EMR (6 detectors), Kinesis (6 detectors), MSK (idle + oversized), Glue (8 detectors), MQ (2 detectors), Firehose |
| ML/AI Optimization | 6 | SageMaker Notebooks, Endpoints, Oversized Endpoints, Stopped Notebook Storage, Previous-Gen Instances, SP Recommendations |
| Management & Operations | 7 | CloudWatch Logs (4 detectors), CloudTrail, CloudWatch Dashboards |
| Security & Compliance | 12 | Secrets Manager, KMS, Encryption-at-rest (EBS, RDS, EFS, S3, OpenSearch, DocumentDB), Deletion Protection (RDS, DynamoDB), Public Access (RDS), Backup Coverage |
| AWS Compute Optimizer | 4 | ML-backed rightsizing for EC2, Lambda, EBS, RDS |
| Reserved Instance/Savings Plans | 16 | RI & SP purchase recommendations, commitment utilization, expiry tracking, coverage analysis |
| IPv4 Optimization | 2 | EIP on stopped instance, Multiple EIPs per instance |
Note: 171 detectors are available in Air-Gapped Mode with full export. 20 detectors require Online Mode (Compute Optimizer, RI/SP recommendations, Commitment Risk Intelligence, and orphaned DNS record cross-referencing). See the Coverage Matrix for details.
Air-Gapped Mode vs Online Mode
| Feature | Air-Gapped Mode | Online Mode |
|---|---|---|
| Setup Time | ~5 minutes | ~10 minutes |
| AWS Connection | None required | IAM role required |
| Detectors Available | 171 of 191 | All 191 |
| Real-time Monitoring | No | Yes |
| Compute Optimizer | No | Yes |
| RI/SP Recommendations | No | Yes |
| Ideal For | Evaluation, compliance | Production monitoring |
Air-Gapped Mode is perfect for:
- Companies with strict security/compliance requirements
- Evaluating CloudWise before full integration
- One-time cost audits
- Teams that need approval before granting external access
Get started with Air-Gapped Mode →
Complete Detector Reference Table
The table below lists all implemented waste detectors with their detection criteria and status.
Compute Optimization (34 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| idle_ec2 | HIGH | CPU < 5% for 14 days |
| stopped_ec2_with_ebs | MEDIUM | Stopped instance with EBS volumes |
| unused_lambda | HIGH | Zero invocations for 30 days |
| over_provisioned_lambda | MEDIUM | Memory utilization < 50% |
| lambda_provisioned_concurrency_idle | HIGH | Provisioned Concurrency utilization < 10% for 14 days |
| lambda_excessive_timeout | MEDIUM | Timeout ≥ 10× average duration |
| lambda_arm64_migration | LOW | x86_64 function with ARM64-compatible runtime |
| oversized_ecs_task | MEDIUM | CPU/memory utilization analysis |
| idle_ecs_service | HIGH | Zero running tasks for 7 days |
| ecs_no_autoscaling | MEDIUM | Fargate service without Application Auto Scaling |
| ecs_container_insights_waste | LOW | Container Insights on dev/small/idle cluster |
| oversized_ecs_memory | MEDIUM | Memory over-provisioned relative to usage |
| idle_sagemaker_notebook | HIGH | InService with no activity |
| idle_sagemaker_endpoint | HIGH | Zero invocations for 7 days |
| stopped_sagemaker_notebook_storage | HIGH | Stopped notebook with EBS volume |
| previous_gen_sagemaker_instance | HIGH | Previous-gen instance type (ml.m4/c4/t2/r4/p2) |
| idle_workspace | HIGH | Zero connections for 30 days |
| oversized_workspace | MEDIUM | WorkSpace bundle over-provisioned relative to usage |
| workspaces_autostop_opportunity | HIGH | AlwaysOn WorkSpace with low usage; switch to AutoStop billing |
| workspaces_pool_overprovisioned_capacity | HIGH | WorkSpaces Pool with excess provisioned capacity above p95 demand |
| workspaces_windows_license_optimization | MEDIUM | Windows license-included desktops eligible for BYOL or Linux alternatives |
| idle_lightsail | HIGH | CPU < 5% for 14 days |
| lightsail_unattached_static_ip | LOW | Allocated but not attached to any instance |
| lightsail_unattached_disk | MEDIUM | Block storage disk not attached to any instance |
| lightsail_old_snapshot | MEDIUM | Manual snapshot older than 90 days |
| lightsail_idle_load_balancer | MEDIUM | Load balancer with zero healthy instances |
| lightsail_idle_database | HIGH | Database with < 1% CPU and zero connections for 14 days |
| idle_beanstalk | HIGH | Unhealthy environment (Grey/Red health) |
| beanstalk_idle_traffic | HIGH | Zero requests for 14 days |
| beanstalk_unnecessary_alb | HIGH | Load-balanced with single instance (min=max=1) |
| beanstalk_previous_gen_instances | HIGH | Using previous-generation instance types (t2, m4, c4, etc.) |
| beanstalk_over_provisioned | MEDIUM | CPU utilization < 25% for 14 days |
| beanstalk_orphaned_rds | HIGH | Orphaned or unused RDS from Beanstalk environment |
| eks_extended_support_cost | HIGH | EKS cluster version in warning window or active Extended Support surcharge state |
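The `lambda_excessive_timeout` rule in the table above compares two values that AWS reports in different units: a function's configured timeout is in seconds, while the CloudWatch `Duration` metric is in milliseconds. A minimal sketch of the comparison, assuming the average duration has already been fetched (the function name and `factor` parameter are hypothetical):

```python
def has_excessive_timeout(timeout_s: float, avg_duration_ms: float,
                          factor: float = 10.0) -> bool:
    """Flag a function whose timeout is at least `factor` times its average duration."""
    if avg_duration_ms <= 0:
        return False  # no invocations to compare against
    # Convert the timeout to milliseconds so both sides use the same unit.
    return timeout_s * 1000.0 >= factor * avg_duration_ms


print(has_excessive_timeout(timeout_s=900, avg_duration_ms=1200))  # 15 min timeout, 1.2 s runs
print(has_excessive_timeout(timeout_s=3, avg_duration_ms=1200))    # 3 s timeout, 1.2 s runs
```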
Storage Optimization (26 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| unattached_ebs | HIGH | Volume status = available |
| old_ebs_snapshot | HIGH | Snapshot age > 90 days |
| orphaned_ebs_snapshot | HIGH | Source volume deleted; snapshot serves no recovery purpose |
| ami_orphaned_snapshot | HIGH | AMI deregistered; backing snapshot no longer needed |
| gp2_migration | HIGH | Volume type = gp2 |
| over_provisioned_iops | MEDIUM | Provisioned IOPS underutilized |
| no_lifecycle_policy | HIGH | S3 bucket without lifecycle rules |
| incomplete_multipart | MEDIUM | Failed multipart uploads |
| s3_rapid_growth | MEDIUM | Bucket growth > 100% in 30 days, size > 1 GB, absolute growth > 10 GB, no expiration rules |
| s3_wrong_storage_class | MEDIUM | S3 Standard > 50 GB, > 90% of total storage, no Intelligent-Tiering or lifecycle transitions |
| s3_empty_bucket | HIGH | Empty bucket (0 objects, 0 bytes) older than 30 days, excluding infra buckets |
| s3_high_request_and_transfer_cost | LOW | Non-storage costs (transfer + API requests) exceed storage costs (CUR-based) |
| idle_efs | HIGH | Zero client connections for 14 days |
| no_lifecycle_efs | HIGH | EFS filesystem without lifecycle policy |
| idle_fsx | HIGH | Zero network I/O for 14 days |
| oversized_fsx | MEDIUM | Storage utilization below 40% of provisioned capacity |
| fsx_throughput_overprovisioned | MEDIUM | Throughput utilization below 30% of provisioned (Windows/ONTAP/OpenZFS) |
| old_fsx_backup | MEDIUM | Manual FSx backup older than 90 days |
| ecr_no_lifecycle_policy | MEDIUM | ECR repo without lifecycle policy |
| old_ecr_images | MEDIUM | Images > 90 days old |
| untagged_ecr_images | HIGH | ECR images without required tags |
| old_backup | MEDIUM | Recovery point past retention window (> 90 days) |
| redundant_backup | MEDIUM | Duplicate recovery points for same resource in same vault |
| backup_no_lifecycle_tiering | LOW | Recovery point not transitioned to cold storage |
| stale_backup_plan_assignment | LOW | Backup plan with selection rules matching zero resources |
| backup_copy_policy_overreach | LOW | Cross-region copy rules duplicating backups unnecessarily |
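The `s3_rapid_growth` rule above combines four conditions, all of which must hold. The sketch below shows one way to read them together. It assumes that "size" means the current bucket size and that 1 GB is 2^30 bytes; the function name and parameters are hypothetical.

```python
GB = 1024 ** 3  # assumption: binary gigabytes


def is_rapid_growth(size_now_bytes: int, size_30d_ago_bytes: int,
                    has_expiration_rule: bool) -> bool:
    """Apply the four s3_rapid_growth conditions from the table."""
    if has_expiration_rule or size_30d_ago_bytes <= 0:
        return False  # expiration rules already bound growth; no baseline means no ratio
    growth = size_now_bytes - size_30d_ago_bytes
    return (
        growth > size_30d_ago_bytes     # relative growth above 100%
        and size_now_bytes > 1 * GB     # bucket is not trivially small
        and growth > 10 * GB            # absolute growth is material
    )


print(is_rapid_growth(30 * GB, 12 * GB, has_expiration_rule=False))  # 150% and +18 GB
print(is_rapid_growth(5 * GB, 2 * GB, has_expiration_rule=False))    # 150% but only +3 GB
```

The absolute-growth condition keeps small buckets that merely doubled from being flagged.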
Database Optimization (40 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| idle_rds | HIGH | Zero connections for 14 days |
| old_rds_snapshot | MEDIUM | Manual snapshot > 90 days |
| aurora_io_optimization_opportunity | HIGH | Aurora Standard cluster where I/O charges > 25% of total spend |
| aurora_extended_support_cost | HIGH | Aurora cluster running EOL engine version under Extended Support |
| aurora_serverless_opportunity | MEDIUM | Low-utilization provisioned Aurora cluster suitable for Serverless v2 |
| aurora_to_rds_downgrade_opportunity | MEDIUM | Underutilized Aurora cluster where single-node RDS would be more cost-effective |
| rds_extended_support_cost | HIGH | Non-Aurora RDS MySQL/PostgreSQL instance in warning window or active Extended Support surcharge state |
| idle_dynamodb | HIGH | Zero consumed capacity for 14 days |
| dynamodb_no_autoscaling | HIGH | Provisioned mode without auto-scaling |
| over_provisioned_dynamodb | MEDIUM | Capacity utilization < 20% |
| idle_elasticache | HIGH | Zero connections for 7 days |
| oversized_elasticache | MEDIUM | ElastiCache node over-provisioned relative to usage |
| elasticache_extended_support_cost | HIGH | ElastiCache Redis/Memcached cluster in warning window or active Extended Support surcharge state |
| elasticache_replication_waste | HIGH | Non-production cluster with unnecessary replicas |
| elasticache_engine_migration | HIGH | Redis OSS / Memcached cluster eligible for 20% cheaper Valkey |
| elasticache_serverless_optimization | MEDIUM | Node-based cluster with spiky traffic better suited for Serverless |
| elasticache_data_tiering_opportunity | MEDIUM | Memory-only R6g/R7g cluster eligible for R6gd data tiering (up to 52% savings) |
| idle_redshift | HIGH | Zero connections for 14 days |
| underutilized_redshift | MEDIUM | CPU < 10% for 14 days, non-zero connections |
| oversized_redshift | MEDIUM | Redshift cluster over-provisioned relative to query load |
| redshift_no_pause | MEDIUM | No pause schedule, > 40% zero-connection hours |
| redshift_spectrum_heavy | LOW | Spectrum cost > 50% of compute cost |
| redshift_legacy_dc2 | LOW | DC2 node type, recommend RA3 migration |
| redshift_wlm_over_provisioned | MEDIUM | WLM queue near-empty, < 50% concurrency slots used |
| redshift_concurrency_scaling_waste | LOW | Concurrency scaling exceeds free 1h/day credit |
| idle_opensearch | HIGH | Zero requests for 14 days |
| opensearch_extended_support_cost | HIGH | OpenSearch/legacy Elasticsearch domain in warning window or active legacy support surcharge state |
| oversized_opensearch | MEDIUM | Avg CPU < 20%, max CPU < 40% with active search traffic |
| opensearch_ebs_overprovisioned | MEDIUM | Free storage > 60% with flat growth (< 0.1 GB/day) |
| ri_opportunity_opensearch | LOW | On-demand domain eligible for Reserved Instance savings |
| idle_neptune | HIGH | Zero requests for 14 days |
| neptune_serverless_opportunity | HIGH | Provisioned cluster with avg CPU < 20%, max < 50%, cost > Serverless floor |
| oversized_neptune | MEDIUM | Per-instance avg CPU < 20%, max < 40% with downsize path available |
| neptune_old_snapshot | HIGH | Manual snapshot > 90 days |
| idle_documentdb | HIGH | Zero connections for 14 days |
| old_documentdb_snapshot | MEDIUM | Manual snapshot > 90 days |
| overprovisioned_documentdb | MEDIUM | CPU/connection utilization < 20% for 14 days |
| documentdb_extended_support_cost | HIGH | DocumentDB cluster in warning window or active Extended Support surcharge state |
| idle_timestream | HIGH | Zero writes for 30 days |
| idle_qldb | HIGH | Zero requests for 30 days |
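As an example of how a share-of-time rule such as `redshift_no_pause` can be evaluated, the sketch below counts the fraction of hourly samples with zero connections. It assumes one connection-count sample per hour over the evaluation window; the function name and parameters are hypothetical.

```python
from typing import Sequence


def redshift_pause_candidate(hourly_connections: Sequence[int],
                             has_pause_schedule: bool,
                             min_idle_fraction: float = 0.40) -> bool:
    """Flag a cluster with no pause schedule that sits unused for >40% of hours."""
    if has_pause_schedule or not hourly_connections:
        return False
    idle_hours = sum(1 for c in hourly_connections if c == 0)
    return idle_hours / len(hourly_connections) > min_idle_fraction


# One week of samples: connections only during an 8-hour working day.
office_hours_only = ([0] * 16 + [5] * 8) * 7
always_busy = [3] * 168
print(redshift_pause_candidate(office_hours_only, has_pause_schedule=False))
print(redshift_pause_candidate(always_busy, has_pause_schedule=False))
```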
Network Optimization (19 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| unattached_eip | HIGH | No instance or ENI attached |
| eip_on_stopped_instance | HIGH | EIP attached to stopped instance |
| multiple_eips_per_instance | HIGH | Instance with > 1 EIP |
| idle_nat_gateway | HIGH | Zero bytes transferred for 7 days |
| idle_load_balancer | HIGH | Zero healthy targets or zero requests with healthy targets |
| low_traffic_alb | MEDIUM | ALB/NLB with < 100 requests in 14 days but healthy targets |
| high_lcu_cost_alb | LOW | ALB where LCU cost exceeds 2× base fee |
| classic_lb_migration | LOW | Classic Load Balancer; migrate to ALB/NLB |
| unused_distribution | HIGH | CloudFront with zero requests for 30 days |
| unused_hosted_zone | HIGH | Route53 zone with only NS/SOA records |
| unused_accelerator | HIGH | Global Accelerator with zero traffic |
| idle_global_accelerator | HIGH | Global Accelerator deployed with endpoints but zero processed bytes for 30 days |
| disabled_global_accelerator | MEDIUM | Global Accelerator disabled but still incurring fixed hourly charges |
| idle_transfer_server | HIGH | Transfer Family server with zero file transfers for 30 days |
| idle_transfer_no_activity | HIGH | Transfer Family server with zero file operations for 14 days |
| idle_transfer_web_app | HIGH | Transfer Family web app with no activity |
| unused_transfer_protocol | MEDIUM | Transfer server with unused enabled protocols |
| unused_vpc_endpoint | HIGH | VPC endpoint with zero traffic for 14 days |
| orphaned_dns_record | MEDIUM | DNS record pointing to non-existent resource |
Serverless Optimization (10 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| unused_lambda | HIGH | Zero invocations for 30 days |
| lambda_old_runtime | LOW | Function on deprecated/EOL runtime |
| unused_api_gateway | HIGH | Zero requests for 30 days |
| unused_appsync | HIGH | Zero queries for 30 days |
| appsync_idle_cache | MEDIUM | Cache with < 100 hits in 14 days (hourly cost: $0.044 to $6.78) |
| appsync_idle_subscriptions | MEDIUM | Active WebSocket connections with < 100 requests in 14 days |
| idle_state_machine | HIGH | Zero executions for 30 days |
| step_functions_retry_storm | HIGH | Retry ratio > 25% AND failure rate > 20% over 14 days |
| step_functions_high_transition_density | MEDIUM | Avg transitions per success > 50 (configurable) |
| step_functions_express_duration_waste | MEDIUM | Express p95 duration > 30s with high execution volume |
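The `step_functions_retry_storm` rule requires both conditions at once. The table does not define the two ratios, so the sketch below assumes that both are measured against the number of executions started in the 14-day window; the function name and parameters are hypothetical.

```python
def is_retry_storm(executions: int, failed: int, retries: int,
                   max_retry_ratio: float = 0.25,
                   max_failure_rate: float = 0.20) -> bool:
    """Flag a state machine that both retries heavily and still fails often."""
    if executions <= 0:
        return False
    retry_ratio = retries / executions    # assumption: retries per started execution
    failure_rate = failed / executions    # assumption: failed / started executions
    return retry_ratio > max_retry_ratio and failure_rate > max_failure_rate


print(is_retry_storm(executions=1000, failed=300, retries=400))  # retries are not helping
print(is_retry_storm(executions=1000, failed=50, retries=400))   # retries are recovering
```

Requiring both conditions separates workflows whose retries are paying for themselves from those that spend state transitions on retries and fail anyway.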
Analytics Optimization (24 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| idle_emr_cluster | HIGH | Running cluster with zero steps |
| long_running_emr | MEDIUM | EMR cluster running continuously without recent steps |
| emr_over_provisioned | MEDIUM | Instance group CPU/memory utilization < 20% for 14 days |
| emr_missing_auto_termination | HIGH | Running cluster without auto-termination policy configured |
| emr_previous_gen_instances | HIGH | Cluster using previous-generation instance types (m3 to m5, c3 to c5, etc.) |
| emr_spot_opportunity | MEDIUM | Task node instance groups running on On-Demand instead of Spot |
| idle_kinesis_stream | HIGH | Zero records for 14 days |
| over_provisioned_kinesis | MEDIUM | Kinesis stream with shard utilization < 20% |
| kinesis_on_demand_downgrade | MEDIUM | On-Demand stream with stable throughput (CV < 0.3); switch to Provisioned |
| kinesis_extended_retention_waste | HIGH | Extended retention (> 24h) with zero GetRecords in 14 days |
| kinesis_enhanced_fan_out_waste | HIGH | Enhanced fan-out consumer with zero reads for 14 days |
| kinesis_firehose_idle | HIGH | Firehose delivery stream with zero records for 14 days |
| idle_msk_cluster | HIGH | Zero messages for 7 days |
| oversized_msk_cluster | MEDIUM | CPU < 20% AND network < 50% for 7 days |
| idle_glue_dev_endpoint | HIGH | Dev endpoint in READY state |
| old_glue_job | MEDIUM | Job not run for 90 days |
| idle_glue_crawler | MEDIUM | Crawler not run for 90 days |
| oversized_glue_job | MEDIUM | JVM heap < 30% avg OR short duration with large DPU allocation |
| glue_job_missing_timeout | MEDIUM | Timeout ≥ 10× average execution duration |
| failed_glue_job_retry | HIGH | ≥ 50% failure rate across recent runs with retries configured |
| glue_dev_endpoint_migration | LOW | Any active dev endpoint (AWS recommends Interactive Sessions) |
| glue_catalog_bloat | LOW | Data Catalog objects exceed 1M free tier threshold |
| idle_mq_broker | HIGH | Zero connections for 14 days |
| oversized_mq_broker | MEDIUM | Over-provisioned MQ broker; downsize to a smaller instance type |
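The `kinesis_on_demand_downgrade` rule relies on the coefficient of variation (CV), which is the standard deviation of throughput divided by its mean. A small sketch using Python's `statistics` module follows. It assumes evenly spaced throughput samples and uses the population standard deviation; the function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(samples: Sequence[float]) -> float:
    """CV = population standard deviation / mean."""
    avg = mean(samples)
    if avg == 0:
        return float("inf")  # no traffic at all: treat as not stable
    return pstdev(samples) / avg


def is_stable_throughput(samples: Sequence[float], max_cv: float = 0.3) -> bool:
    """A stream is a Provisioned-mode candidate when its CV is below the limit."""
    return len(samples) > 1 and coefficient_of_variation(samples) < max_cv


steady = [100, 105, 95, 102, 98]   # records/sec, hypothetical samples
spiky = [10, 500, 20, 900, 15]
print(is_stable_throughput(steady), is_stable_throughput(spiky))
```

A low CV means provisioned shards can be sized close to the average load, which is why stable streams are the ones flagged for a switch away from On-Demand.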
ML/AI Optimization (6 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| idle_sagemaker_notebook | HIGH | InService with no activity |
| idle_sagemaker_endpoint | HIGH | Zero invocations for 7 days |
| oversized_sagemaker_endpoint | MEDIUM | Low CPU/memory utilization |
| stopped_sagemaker_notebook_storage | HIGH | Stopped notebook with EBS volume |
| previous_gen_sagemaker_instance | HIGH | Previous-gen instance type (ml.m4/c4/t2/r4/p2) |
| sp_opportunity_sagemaker | HIGH | AWS Cost Explorer SageMaker SP recommendation |
Management & Operations (7 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| no_retention_log_group | MEDIUM/HIGH | Log group without retention policy |
| old_log_group | MEDIUM | Log group with no recent logs |
| excessive_retention_log_group | MEDIUM | Log group with retention exceeding recommended baseline |
| empty_log_group | HIGH | Log group with zero log streams |
| duplicate_cloudtrail | HIGH | Multiple trails logging same events |
| cloudtrail_s3_no_lifecycle | MEDIUM | CloudTrail bucket without lifecycle |
| unused_dashboard | MEDIUM | CloudWatch dashboard with no recent views |
Security & Compliance (12 Detectors)
| Waste Type | Confidence | Detection |
|---|---|---|
| unused_secret | HIGH | Secret not accessed for 90 days |
| unused_kms_key | HIGH | KMS key not used for 90 days |
| unencrypted_ebs_volume | HIGH | EBS volume without encryption enabled |
| unencrypted_rds_instance | HIGH | RDS instance without encryption at rest |
| unencrypted_efs_filesystem | HIGH | EFS filesystem without encryption at rest |
| s3_no_default_encryption | MEDIUM | S3 bucket without default encryption configuration |
| opensearch_no_encryption_at_rest | HIGH | OpenSearch domain without encryption at rest |
| unencrypted_documentdb_cluster | HIGH | DocumentDB cluster without encryption at rest |
| rds_no_deletion_protection | HIGH | RDS instance without deletion protection enabled |
| dynamodb_no_deletion_protection | HIGH | DynamoDB table without deletion protection enabled |
| rds_publicly_accessible | HIGH | RDS instance with public accessibility enabled |
| resource_without_backup_coverage | MEDIUM | Critical resource not covered by any AWS Backup plan |
AWS Compute Optimizer Integration (4 Detectors)
Note: These detectors require Online Mode with live AWS API access to Compute Optimizer.
| Waste Type | Confidence | Detection |
|---|---|---|
| oversized_ec2_optimizer | HIGH | AWS ML-based rightsizing recommendation |
| oversized_ebs_optimizer | HIGH | AWS ML-based IOPS/throughput analysis |
| oversized_lambda_optimizer | HIGH | AWS ML-based memory optimization |
| oversized_rds_optimizer | HIGH | AWS ML-based RDS rightsizing recommendation |
Reserved Instance, Savings Plans & Commitment Risk Intelligence (16 Detectors)
Note: Purchase recommendation and commitment risk detectors require Online Mode with live AWS API access to Cost Explorer, EC2, or Savings Plans APIs. CUR-based commitment detectors (cur_unused_reservation, cur_savings_plan_waste) are available in all modes, including Air-Gapped.
| Waste Type | Confidence | Detection |
|---|---|---|
| ri_opportunity_ec2 | HIGH | AWS Cost Explorer EC2 RI recommendation |
| ri_opportunity_rds | HIGH | AWS Cost Explorer RDS RI recommendation |
| ri_opportunity_elasticache | HIGH | AWS Cost Explorer ElastiCache Reserved Node recommendation |
| ri_opportunity_opensearch | HIGH | AWS Cost Explorer OpenSearch RI recommendation |
| ri_opportunity_redshift | HIGH | AWS Cost Explorer Redshift Reserved Node recommendation |
| sp_opportunity_compute | HIGH | AWS Cost Explorer Compute SP recommendation |
| sp_opportunity_ec2 | HIGH | AWS Cost Explorer EC2 Instance SP recommendation |
| sp_opportunity_sagemaker | HIGH | AWS Cost Explorer SageMaker SP recommendation |
| unused_reserved_instance | HIGH | RI utilization < 20% over 30 days via Cost Explorer + EC2 API |
| unused_savings_plan | HIGH | SP utilization < 20% over 30 days via Cost Explorer |
| expiring_reserved_instance | HIGH | Active RI expiring within 90 days (URGENT/WARNING/NOTICE tiers) |
| expiring_savings_plan | HIGH | Active SP expiring within 90 days (URGENT/WARNING/NOTICE tiers) |
| convertible_ri_exchange_opportunity | MEDIUM | Convertible RI on previous-gen instance type eligible for free exchange |
| savings_plan_coverage_gap | MEDIUM | SP coverage < 50% with > $100/mo uncovered on-demand spend |
| cur_unused_reservation | MEDIUM | Unused RI hours detected in CUR billing data (all modes) |
| cur_savings_plan_waste | MEDIUM | Unused SP commitment detected in CUR billing data (all modes) |
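To illustrate the `savings_plan_coverage_gap` rule, the sketch below computes coverage as the share of eligible monthly spend that was covered by Savings Plans and then applies both thresholds from the table. This definition of coverage is an assumption, and the function name and parameters are hypothetical.

```python
def has_coverage_gap(covered_usd: float, on_demand_usd: float,
                     min_coverage: float = 0.50,
                     min_uncovered_usd: float = 100.0) -> bool:
    """Flag low Savings Plans coverage only when the uncovered spend is material."""
    total = covered_usd + on_demand_usd
    if total <= 0:
        return False
    coverage = covered_usd / total  # assumption: covered spend / total eligible spend
    return coverage < min_coverage and on_demand_usd > min_uncovered_usd


print(has_coverage_gap(covered_usd=300, on_demand_usd=700))  # 30% covered, $700 uncovered
print(has_coverage_gap(covered_usd=20, on_demand_usd=60))    # 25% covered, only $60 uncovered
```

The dollar floor prevents small accounts with low coverage percentages from generating findings that are not worth a commitment purchase.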
Complete Waste Type Coverage Matrix (191 Types)
This section provides the complete coverage matrix for all 191 unique waste types, showing availability across different detection modes. This is particularly useful for understanding what waste detection capabilities are available in Air-Gapped Mode vs Online Mode.
Mode Definitions
| Mode | Description | Requirements |
|---|---|---|
| CUR Only | Upload AWS Cost and Usage Report CSV | CUR file from S3 |
| + Describe Export | CUR + resource configuration data | CUR + export script (basic) |
| + CloudWatch Export | CUR + describe + utilization metrics | CUR + export script (full) |
| Online Mode | Full connected mode with live AWS access | IAM role connection |
Coverage Summary
| Mode | Waste Types Available | Coverage |
|---|---|---|
| Online Mode | 191 (all types) | 100% |
| Offline + Full Export | 171 waste types | 89% |
| Offline + Describe Only | 64 waste types | 50% |
| CUR Only | 4 insights | 3% |
Complete Coverage Matrix
Legend:
- ā Available
- ā Not available
- š Online Mode only (requires live AWS API)
Compute Optimization (29 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 1 | idle_ec2 | ā | ā | ā | ā |
| 2 | stopped_ec2_with_ebs | ā | ā | ā | ā |
| 3 | previous_gen_ec2 | ā | ā | ā | ā |
| 4 | unused_lambda | ā | ā | ā | ā |
| 5 | over_provisioned_lambda | ā | ā | ā | ā |
| 5a | lambda_provisioned_concurrency_idle | ā | ā | ā | ā |
| 5b | lambda_excessive_timeout | ā | ā | ā | ā |
| 5c | lambda_arm64_migration | ā | ā | ā | ā |
| 6 | oversized_ecs_task | ā | ā | ā | ā |
| 7 | idle_sagemaker_notebook | ā | ā | ā | ā |
| 8 | idle_sagemaker_endpoint | ā | ā | ā | ā |
| 9 | idle_workspace | ā | ā | ā | ā |
| 10 | idle_lightsail | ā | ā | ā | ā |
| 10a | lightsail_unattached_static_ip | ā | ā | ā | ā |
| 10b | lightsail_unattached_disk | ā | ā | ā | ā |
| 10c | lightsail_old_snapshot | ā | ā | ā | ā |
| 10d | lightsail_idle_load_balancer | ā | ā | ā | ā |
| 10e | lightsail_idle_database | ā | ā | ā | ā |
| 11 | idle_beanstalk | ā | ā | ā | ā |
| 11a | beanstalk_idle_traffic | ā | ā | ā | ā |
| 11b | beanstalk_unnecessary_alb | ā | ā | ā | ā |
| 11c | beanstalk_previous_gen_instances | ā | ā | ā | ā |
| 11d | beanstalk_over_provisioned | ā | ā | ā | ā |
| 11e | beanstalk_orphaned_rds | ā | ā | ā | ā |
| 12 | oversized_workspace | ā | ā | ā | ā |
| 12a | workspaces_autostop_opportunity | ā | ā | ā | ā |
| 12b | workspaces_pool_overprovisioned_capacity | ā | ā | ā | ā |
| 12c | workspaces_windows_license_optimization | ā | ā | ā | ā |
| 12d | eks_extended_support_cost | ā | ā | ā | ā |
Storage Optimization (26 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 12 | unattached_ebs | ā | ā | ā | ā |
| 13 | old_ebs_snapshot | ā | ā | ā | ā |
| 14 | orphaned_ebs_snapshot | ā | ā | ā | ā |
| 15 | ami_orphaned_snapshot | ā | ā | ā | ā |
| 16 | gp2_migration | ā | ā | ā | ā |
| 17 | no_lifecycle_policy | ā | ā | ā | ā |
| 18 | incomplete_multipart | ā | ā | ā | ā |
| 19 | s3_rapid_growth | ā | ā | ā | ā |
| 20 | s3_wrong_storage_class | ā | ā | ā | ā |
| 21 | s3_empty_bucket | ā | ā | ā | ā |
| 22 | s3_high_request_and_transfer_cost | ā | ā | ā | ā |
| 23 | idle_efs | ā | ā | ā | ā |
| 24 | idle_fsx | ā | ā | ā | ā |
| 24a | oversized_fsx | ā | ā | ā | ā |
| 24b | fsx_throughput_overprovisioned | ā | ā | ā | ā |
| 24c | old_fsx_backup | ā | ā | ā | ā |
| 25 | ecr_no_lifecycle_policy | ā | ā | ā | ā |
| 26 | old_ecr_images | ā | ā | ā | ā |
| 26a | no_lifecycle_efs | ā | ā | ā | ā |
| 26b | over_provisioned_iops | ā | ā | ā | ā |
| 26d | old_backup | ā | ā | ā | ā |
| 26e | redundant_backup | ā | ā | ā | ā |
| 26f | untagged_ecr_images | ā | ā | ā | ā |
| 26g | backup_no_lifecycle_tiering | ā | ā | ā | ā |
| 26h | stale_backup_plan_assignment | ā | ā | ā | ā |
| 26i | backup_copy_policy_overreach | ā | ā | ā | ā |
Database Optimization (39 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 26 | idle_rds | ā | ā | ā | ā |
| 27 | old_rds_snapshot | ā | ā | ā | ā |
| 27a | aurora_to_rds_downgrade_opportunity | ā | ā | ā | ā |
| 28 | idle_dynamodb | ā | ā | ā | ā |
| 29 | dynamodb_no_autoscaling | ā | ā | ā | ā |
| 30 | over_provisioned_dynamodb | ā | ā | ā | ā |
| 31 | idle_elasticache | ā | ā | ā | ā |
| 32 | idle_redshift | ā | ā | ā | ā |
| 33 | underutilized_redshift | ā | ā | ā | ā |
| 34 | redshift_no_pause | ā | ā | ā | ā |
| 35 | redshift_spectrum_heavy | ā | ā | ā | ā |
| 36 | redshift_legacy_dc2 | ā | ā | ā | ā |
| 37 | redshift_wlm_over_provisioned | ā | ā | ā | ā |
| 38 | redshift_concurrency_scaling_waste | ā | ā | ā | ā |
| 39 | idle_opensearch | ā | ā | ā | ā |
| 40 | idle_neptune | ā | ā | ā | ā |
| 40a | neptune_serverless_opportunity | ā | ā | ā | ā |
| 40b | oversized_neptune | ā | ā | ā | ā |
| 40c | neptune_old_snapshot | ā | ā | ā | ā |
| 41 | idle_documentdb | ā | ā | ā | ā |
| 41a | old_documentdb_snapshot | ā | ā | ā | ā |
| 41b | overprovisioned_documentdb | ā | ā | ā | ā |
| 42 | idle_timestream | ā | ā | ā | ā |
| 43 | idle_qldb | ā | ā | ā | ā |
| 43c | oversized_elasticache | ā | ā | ā | ā |
| 43d | oversized_redshift | ā | ā | ā | ā |
| 43e | oversized_opensearch | ā | ā | ā | ā |
| 43f | rds_extended_support_cost | ā | ā | ā | ā |
| 43g | elasticache_extended_support_cost | ā | ā | ā | ā |
| 43h | elasticache_replication_waste | ā | ā | ā | ā |
| 43i | elasticache_engine_migration | ā | ā | ā | ā |
| 43j | elasticache_serverless_optimization | ā | ā | ā | ā |
| 43k | elasticache_data_tiering_opportunity | ā | ā | ā | ā |
| 43l | opensearch_extended_support_cost | ā | ā | ā | ā |
| 43m | documentdb_extended_support_cost | ā | ā | ā | ā |
| 43n | aurora_io_optimization_opportunity | ā | ā | ā | ā |
| 43o | aurora_extended_support_cost | ā | ā | ā | ā |
| 43p | aurora_serverless_opportunity | ā | ā | ā | ā |
| 43q | opensearch_ebs_overprovisioned | ā | ā | ā | ā |
Network Optimization (17 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 38 | unattached_eip | ā | ā | ā | ā |
| 39 | eip_on_stopped_instance | ā | ā | ā | ā |
| 40 | multiple_eips_per_instance | ā | ā | ā | ā |
| 41 | idle_nat_gateway | ā | ā | ā | ā |
| 42 | idle_load_balancer | ā | ā | ā | ā |
| 43 | low_traffic_alb | ā | ā | ā | ā |
| 44 | high_lcu_cost_alb | ā | ā | ā | ā |
| 45 | classic_lb_migration | ā | ā | ā | ā |
| 46 | unused_distribution | ā | ā | ā | ā |
| 47 | unused_hosted_zone | ā | ā | ā | ā |
| 48 | unused_accelerator | ā | ā | ā | ā |
| 48e | idle_global_accelerator | ā | ā | ā | ā |
| 48f | disabled_global_accelerator | ā | ā | ā | ā |
| 48a | idle_transfer_no_activity | ā | ā | ā | ā |
| 48b | idle_transfer_web_app | ā | ā | ā | ā |
| 48c | unused_transfer_protocol | ā | ā | ā | ā |
| 48d | unused_vpc_endpoint | ā | ā | ā | ā |
Serverless Optimization (11 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 49 | unused_lambda | ā | ā | ā | ā |
| 49a | lambda_old_runtime | ā | ā | ā | ā |
| 50 | unused_api_gateway | ā | ā | ā | ā |
| 51 | unused_appsync | ā | ā | ā | ā |
| 51a | appsync_idle_cache | ā | ā | ā | ā |
| 51b | appsync_idle_subscriptions | ā | ā | ā | ā |
| 52 | idle_state_machine | ā | ā | ā | ā |
| 52a | step_functions_retry_storm | ā | ā | ā | ā |
| 52b | step_functions_high_transition_density | ā | ā | ā | ā |
| 52c | step_functions_express_duration_waste | ā | ā | ā | ā |
| 53 | idle_transfer_server | ā | ā | ā | ā |
Analytics Optimization (24 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 54 | idle_emr_cluster | ā | ā | ā | ā |
| 54a | emr_over_provisioned | ā | ā | ā | ā |
| 54b | emr_missing_auto_termination | ā | ā | ā | ā |
| 54c | emr_previous_gen_instances | ā | ā | ā | ā |
| 54d | emr_spot_opportunity | ā | ā | ā | ā |
| 55 | idle_kinesis_stream | ā | ā | ā | ā |
| 56 | idle_msk_cluster | ā | ā | ā | ā |
| 56a | oversized_msk_cluster | ā | ā | ā | ā |
| 57 | idle_glue_dev_endpoint | ā | ā | ā | ā |
| 58 | old_glue_job | ā | ā | ā | ā |
| 59 | idle_glue_crawler | ā | ā | ā | ā |
| 59a | oversized_glue_job | ā | ā | ā | ā |
| 59b | glue_job_missing_timeout | ā | ā | ā | ā |
| 59c | failed_glue_job_retry | ā | ā | ā | ā |
| 59d | glue_dev_endpoint_migration | ā | ā | ā | ā |
| 59e | glue_catalog_bloat | ā | ā | ā | ā |
| 60 | idle_mq_broker | ā | ā | ā | ā |
| 60a | oversized_mq_broker | ā | ā | ā | ā |
| 60b | over_provisioned_kinesis | ā | ā | ā | ā |
| 60c | kinesis_on_demand_downgrade | ā | ā | ā | ā |
| 60d | kinesis_extended_retention_waste | ā | ā | ā | ā |
| 60e | kinesis_enhanced_fan_out_waste | ā | ā | ā | ā |
| 60f | kinesis_firehose_idle | ā | ā | ā | ā |
| 60b | long_running_emr | ā | ā | ā | ā |
ML/AI Optimization (5 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 61 | idle_sagemaker_notebook | ā | ā | ā | ā |
| 62 | idle_sagemaker_endpoint | ā | ā | ā | ā |
| 63 | oversized_sagemaker_endpoint | ā | ā | ā | ā |
| 64 | stopped_sagemaker_notebook_storage | ā | ā | ā | ā |
| 65 | previous_gen_sagemaker_instance | ā | ā | ā | ā |
Management & Operations (7 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 66 | no_retention_log_group | ā | ā | ā | ā |
| 67 | old_log_group | ā | ā | ā | ā |
| 68 | excessive_retention_log_group | ā | ā | ā | ā |
| 69 | empty_log_group | ā | ā | ā | ā |
| 70 | duplicate_cloudtrail | ā | ā | ā | ā |
| 71 | cloudtrail_s3_no_lifecycle | ā | ā | ā | ā |
| 71a | unused_dashboard | ā | ā | ā | ā |
Security & Compliance (12 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 72 | unused_secret | ā | ā | ā | ā |
| 73 | unused_kms_key | ā | ā | ā | ā |
| 73a | unencrypted_ebs_volume | ā | ā | ā | ā |
| 73b | unencrypted_rds_instance | ā | ā | ā | ā |
| 73c | unencrypted_efs_filesystem | ā | ā | ā | ā |
| 73d | s3_no_default_encryption | ā | ā | ā | ā |
| 73e | opensearch_no_encryption_at_rest | ā | ā | ā | ā |
| 73f | unencrypted_documentdb_cluster | ā | ā | ā | ā |
| 73g | rds_no_deletion_protection | ā | ā | ā | ā |
| 73h | dynamodb_no_deletion_protection | ā | ā | ā | ā |
| 73i | rds_publicly_accessible | ā | ā | ā | ā |
| 73j | resource_without_backup_coverage | ā | ā | ā | ā |
IPv4 Address Optimization (3 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 74 | unattached_eip | ā | ā | ā | ā |
| 75 | eip_on_stopped_instance | ā | ā | ā | ā |
| 76 | multiple_eips_per_instance | ā | ā | ā | ā |
Additional Compute Detectors (8 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 77 | over_provisioned_ec2 | ā | ā | ā | ā |
| 78 | unoptimized_ebs_iops | ā | ā | ā | ā |
| 79 | idle_ecs_service | ā | ā | ā | ā |
| 79a | ecs_no_autoscaling | ā | ā | ā | ā |
| 79b | ecs_container_insights_waste | ā | ā | ā | ā |
| 79c | oversized_ecs_memory | ā | ā | ā | ā |
| 80 | idle_eks_nodegroup | ā | ā | ā | ā |
| 81 | unattached_eni | ā | ā | ā | ā |
Additional Storage Detectors (4 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 82 | s3_intelligent_tiering_candidate | ā | ā | ā | ā |
| 83 | ebs_snapshot_public | ā | ā | ā | ā |
| 84 | old_ami | ā | ā | ā | ā |
| 85 | unattached_ebs_iops | ā | ā | ā | ā |
Additional Network Detectors (3 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 86 | idle_vpn_connection | ā | ā | ā | ā |
| 87 | idle_direct_connect | ā | ā | ā | ā |
| 88 | orphaned_dns_record | ā | ā | ā | ā |
CUR-Derived Insights (4 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 89 | previous_gen_indicator | ā | ā | ā | ā |
| 90 | on_demand_candidate | ā | ā | ā | ā |
| 91 | untagged_resources | ā | ā | ā | ā |
| 92 | cost_anomaly | ā | ā | ā | ā |
AWS Compute Optimizer Integration (4 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 93 | oversized_ec2_optimizer | ā | ā | ā | ā |
| 94 | oversized_ebs_optimizer | ā | ā | ā | ā |
| 95 | oversized_lambda_optimizer | ā | ā | ā | ā |
| 95a | oversized_rds_optimizer | ā | ā | ā | ā |
Reserved Instance & Savings Plans (16 Types)
| # | Waste Type | CUR | + Describe | + CloudWatch | Online |
|---|---|---|---|---|---|
| 96 | ri_opportunity_ec2 | ā | ā | ā | ā |
| 97 | ri_opportunity_rds | ā | ā | ā | ā |
| 98 | sp_opportunity_compute | ā | ā | ā | ā |
| 99 | sp_opportunity_ec2 | ā | ā | ā | ā |
| 99a | ri_opportunity_elasticache | ā | ā | ā | ā |
| 99b | ri_opportunity_opensearch | ā | ā | ā | ā |
| 99c | ri_opportunity_redshift | ā | ā | ā | ā |
| 99d | sp_opportunity_sagemaker | ā | ā | ā | ā |
| 99e | unused_reserved_instance | ā | ā | ā | ā |
| 99f | unused_savings_plan | ā | ā | ā | ā |
| 99g | expiring_reserved_instance | ā | ā | ā | ā |
| 99h | expiring_savings_plan | ā | ā | ā | ā |
| 99i | convertible_ri_exchange_opportunity | ā | ā | ā | ā |
| 99j | savings_plan_coverage_gap | ā | ā | ā | ā |
| 99k | cur_unused_reservation | ā | ā | ā | ā |
| 99l | cur_savings_plan_waste | ā | ā | ā | ā |
Online-Only Waste Types (20)
The following waste types require live AWS API access and are not available in Air-Gapped Mode:
| Waste Type | Reason | AWS API Required |
|---|---|---|
| oversized_ec2_optimizer | AWS ML analysis | Compute Optimizer API |
| oversized_ebs_optimizer | AWS ML analysis | Compute Optimizer API |
| oversized_lambda_optimizer | AWS ML analysis | Compute Optimizer API |
| oversized_rds_optimizer | AWS ML analysis | Compute Optimizer API |
| ri_opportunity_ec2 | Purchase recommendations | Cost Explorer API |
| ri_opportunity_rds | Purchase recommendations | Cost Explorer API |
| ri_opportunity_elasticache | Purchase recommendations | Cost Explorer API |
| ri_opportunity_opensearch | Purchase recommendations | Cost Explorer API |
| ri_opportunity_redshift | Purchase recommendations | Cost Explorer API |
| sp_opportunity_compute | Purchase recommendations | Cost Explorer API |
| sp_opportunity_ec2 | Purchase recommendations | Cost Explorer API |
| sp_opportunity_sagemaker | Purchase recommendations | Cost Explorer API |
| unused_reserved_instance | Purchase commitment analysis | Cost Explorer + EC2 API |
| unused_savings_plan | Purchase commitment analysis | Cost Explorer API |
| expiring_reserved_instance | Real-time RI expiry tracking | EC2 API |
| expiring_savings_plan | Real-time SP expiry tracking | Savings Plans API |
| convertible_ri_exchange_opportunity | RI inventory + pricing | EC2 API + Pricing API |
| savings_plan_coverage_gap | Coverage analysis | Cost Explorer API |
| beanstalk_orphaned_rds | Cross-service EB/RDS correlation | RDS + CloudWatch API |
| orphaned_dns_record | DNS cross-reference requires live Route53 API | Route53 API |
Tip: For maximum Air-Gapped Mode coverage (89%), run the full export script including CloudWatch metrics. See our Air-Gapped Mode Guide for setup instructions.
Detector Reference
Compute Detectors
Idle EC2 Instance
Confidence: HIGH
Detection: CloudWatch CPUUtilization average < 5% over 14 days
Savings: Full instance cost (from AWS Pricing API)
How it works:
- We query CloudWatch for 14 days of CPU metrics
- If average CPU is below the threshold, the instance is flagged
- Monthly cost is calculated using the AWS Pricing API for the exact instance type
Example:
Instance: i-0abc123def456
Type: t3.large
Region: us-east-1
Avg CPU: 2.3% over 14 days
Monthly Savings: $60.74 (100% of instance cost)
AWS Reference: EC2 On-Demand Pricing
Why HIGH confidence: We use actual CloudWatch metrics and exact AWS pricing. No estimation.
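The check and the savings arithmetic above can be sketched in a few lines. This is a minimal illustration, not CloudWise's implementation: the function name, the input shape (a list of daily CPU averages), and the $0.0832/hour rate for a t3.large are assumptions made for the example.

```python
from statistics import mean

HOURS_PER_MONTH = 730  # month length used in this guide's hourly-rate examples


def idle_ec2_finding(cpu_datapoints, hourly_rate_usd, threshold_pct=5.0):
    """Return (is_idle, avg_cpu, monthly_savings) for one instance.

    cpu_datapoints: CPUUtilization averages (percent) covering the 14-day window.
    hourly_rate_usd: On-Demand rate for the instance type (assumed input).
    """
    avg_cpu = mean(cpu_datapoints)
    is_idle = avg_cpu < threshold_pct
    monthly_cost = round(hourly_rate_usd * HOURS_PER_MONTH, 2)
    return is_idle, round(avg_cpu, 1), monthly_cost if is_idle else 0.0


# Hypothetical daily averages for a t3.large, assuming $0.0832/hour.
daily_cpu = [2.0, 2.6] * 7
print(idle_ec2_finding(daily_cpu, 0.0832))
```

An instance at or above the threshold returns zero savings, which keeps the "flag or do not flag" decision separate from the pricing lookup.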
Stopped EC2 with EBS Storage
Confidence: MEDIUM
Detection: Instance state = stopped with attached EBS volumes
Savings: EBS storage costs for attached volumes
How it works:
- We identify stopped instances with BlockDeviceMappings
- EBS volumes continue to charge even when the instance is stopped
- We flag this for review with estimated storage costs
Example:
Instance: i-0abc123def456
State: stopped
EBS Volumes: 2 (100GB gp3 + 500GB gp2)
Monthly Savings: $58.00 ($8 + $50)
AWS Reference: EBS Pricing
Why MEDIUM confidence: Volume sizes require an additional API call; we estimate based on typical configurations.
Over-Provisioned Lambda
Confidence: MEDIUM
Detection: CloudWatch MemoryUtilization < 50% over 7 days
Savings: Proportional to memory reduction
How it works:
- We analyze Lambda memory utilization via CloudWatch metrics
- Functions using less than 50% of allocated memory are flagged
- Reducing memory allocation reduces cost per invocation
Example:
Function: my-api-handler
Memory Allocated: 1024MB
Memory Used (avg): 256MB (25%)
Monthly Savings: ~$15.00 (if reduced to 512MB)
AWS Reference: Lambda Pricing
Why MEDIUM confidence: Memory reduction may affect cold start times and CPU allocation.
Idle Provisioned Concurrency
Confidence: HIGH
Detection: Provisioned Concurrency utilization < 10% over 14 days
Savings: Based on wasted fraction of allocated Provisioned Concurrency
How it works:
- We list all Provisioned Concurrency configurations for each Lambda function
- CloudWatch ProvisionedConcurrencyUtilization is analyzed over 14 days
- Configurations with average utilization below 10% are flagged as idle
- Savings are calculated from the wasted fraction of allocated concurrency
Example:
Function: my-api-handler:prod
Allocated PC: 50
Avg Utilization: 3%
Wasted Fraction: 97%
Monthly Savings: $48.60
AWS Reference: Lambda Pricing - Provisioned Concurrency
Why HIGH confidence: Provisioned Concurrency has a fixed cost regardless of usage. Low utilization directly translates to waste.
Excessive Lambda Timeout
Confidence: MEDIUM
Detection: Timeout ≥ 10× average duration, with ≥ 100 invocations
Savings: Recommendation only (no direct cost savings from timeout change)
How it works:
- We compare the configured timeout against actual average execution duration
- Functions where the timeout is 10× or more of the average duration are flagged
- This helps avoid accidental runaway costs and improves error handling
- Recommended timeout is set to max(10, avg_duration × 3) seconds
Example:
Function: my-batch-processor
Timeout: 900s (15 min)
Avg Duration: 2s
Ratio: 450×
Recommended: 10s
Monthly Savings: N/A (recommendation only)
AWS Reference: Lambda Configuration
Why MEDIUM confidence: Timeout changes do not directly reduce cost, but prevent runaway invocations and improve operational hygiene.
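The rule above (ratio of at least 10×, at least 100 invocations, recommended timeout of max(10, avg_duration × 3) seconds) is small enough to express directly. The sketch below is illustrative only; the function name and the returned dictionary keys are assumptions, not part of the product.

```python
def excessive_timeout(timeout_s, avg_duration_s, invocations,
                      min_ratio=10, min_invocations=100):
    """Return a finding dict, or None when the function is not flagged."""
    # Too few invocations means the average duration is not trustworthy.
    if invocations < min_invocations or avg_duration_s <= 0:
        return None
    ratio = timeout_s / avg_duration_s
    if ratio < min_ratio:
        return None
    return {
        "ratio": ratio,
        # Leave 3x headroom over the average, but never go below 10 seconds.
        "recommended_timeout_s": max(10, avg_duration_s * 3),
    }


# The example above: 900 s timeout, 2 s average duration.
print(excessive_timeout(900, 2, invocations=5000))
```

The 10-second floor is what produces the 10 s recommendation in the example, since 2 s × 3 would otherwise give 6 s.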
ARM64 Migration Opportunity
Confidence: LOW
Detection: x86_64 function with ARM64-compatible runtime and monthly cost ≥ $1
Savings: ~20% cost reduction by migrating to Graviton2 (ARM64)
How it works:
- We check the function's architecture (x86_64) and runtime
- Functions on ARM64-compatible runtimes (Python, Node.js, Java, .NET, Ruby) are flagged
- Only functions with monthly cost ≥ $5 are included (to avoid noise)
- AWS offers a 20% price discount for Lambda on Graviton2
Example:
Function: my-data-pipeline
Runtime: python3.12
Architecture: x86_64
Monthly Cost: $45.00
Est. Savings: $9.00/month (20%)
AWS Reference: Lambda Graviton2
Why LOW confidence: Migration requires testing; some libraries may not support ARM64. Manual verification recommended.
Oversized ECS Task
Confidence: MEDIUM
Detection: CloudWatch CPU/Memory utilization analysis
Savings: Based on right-sized task definition
How it works:
- We analyze ECS task CPU and memory utilization metrics
- Tasks consistently using less than 50% of allocated resources are flagged
- Right-sizing reduces Fargate costs or EC2 capacity requirements
Example:
Service: my-web-service
Task CPU: 1024 units
Task Memory: 2048MB
Avg Utilization: 25% CPU, 30% Memory
Monthly Savings: ~$40.00
AWS Reference: ECS Pricing
Why MEDIUM confidence: Workload patterns may have periodic spikes not captured in averages.
Idle SageMaker Notebook
Confidence: HIGH
Detection: Notebook instance InService with no kernel activity
Savings: Full instance cost
How it works:
- We identify SageMaker notebook instances in InService status
- Check for kernel activity or user connections
- Idle notebooks continue to charge full instance cost
Example:
Notebook: my-ml-notebook
Instance Type: ml.t3.medium
Status: InService
Activity (7 days): None
Monthly Savings: $37.00
AWS Reference: SageMaker Pricing
Why HIGH confidence: SageMaker notebooks charge continuously when InService, regardless of activity.
Idle SageMaker Endpoint
Confidence: HIGH
Detection: Zero invocations for 7 days
Savings: Full endpoint cost
How it works:
- We query CloudWatch for endpoint invocation metrics
- Endpoints with zero invocations for 7+ days are flagged
- Idle endpoints charge full compute cost continuously
Example:
Endpoint: my-model-endpoint
Instance Type: ml.m5.large
Invocations (7 days): 0
Monthly Savings: $96.00
AWS Reference: SageMaker Pricing
Why HIGH confidence: Zero invocations is unambiguous. Endpoint is not being used.
Idle WorkSpaces
Confidence: HIGH
Detection: Zero connections for 30 days
Savings: Full WorkSpace cost
How it works:
- We check WorkSpaces connection status via CloudWatch
- WorkSpaces with no user connections for 30+ days are flagged
- Monthly running mode charges continuously whether used or not
Example:
WorkSpace: ws-abc123def
Bundle: Standard (Windows)
Connections (30 days): 0
Monthly Savings: $35.00
AWS Reference: WorkSpaces Pricing
Why HIGH confidence: No user connections means no one is using this virtual desktop.
Idle Lightsail Instance
Confidence: HIGH
Detection: CloudWatch CPUUtilization < 5% for 14 days
Savings: Full instance cost
How it works:
- We query CloudWatch for 14 days of CPU metrics
- Instances with consistently low CPU are flagged
- Lightsail charges a fixed monthly rate regardless of usage
Example:
Instance: my-lightsail-instance
Plan: $5/month (1GB RAM, 1 vCPU)
Avg CPU: 2.1% over 14 days
Monthly Savings: $5.00
AWS Reference: Lightsail Pricing
Why HIGH confidence: Actual CloudWatch metrics confirm low utilization.
Unattached Lightsail Static IP
Confidence: LOW
Detection: Static IP allocated but not attached to any instance
Savings: $3.65/month per unattached IP
How it works:
- We list all Lightsail static IPs
- IPs where isAttached is false are flagged
- AWS charges $3.65/month for unattached static IPs
Example:
Static IP: my-static-ip
IP Address: 52.1.2.3
Attached: No
Monthly Savings: $3.65
AWS Reference: Lightsail Pricing
Why LOW confidence: IP may be reserved for future use.
Unattached Lightsail Disk
Confidence: MEDIUM
Detection: Block storage disk not attached to any instance
Savings: $0.10/GB/month
How it works:
- We list all Lightsail block storage disks
- Disks where isAttached is false are flagged
- Storage charges apply regardless of attachment
Example:
Disk: my-data-disk
Size: 80 GB
Attached: No
Monthly Savings: $8.00
AWS Reference: Lightsail Pricing
Why MEDIUM confidence: Disk may contain important data. Remediation creates a snapshot before deletion.
Old Lightsail Snapshot
Confidence: MEDIUM
Detection: Manual snapshot older than 90 days
Savings: $0.05/GB/month
How it works:
- We list all Lightsail instance snapshots
- Manual snapshots older than 90 days are flagged
- Auto-snapshots are excluded (managed by Lightsail)
Example:
Snapshot: my-old-snapshot
Size: 20 GB
Age: 120 days
Type: Manual
Monthly Savings: $1.00
AWS Reference: Lightsail Pricing
Why MEDIUM confidence: Old snapshots may still be needed for compliance.
Idle Lightsail Load Balancer
Confidence: MEDIUM
Detection: Load balancer with zero healthy target instances
Savings: $18.00/month flat rate
How it works:
- We list all Lightsail load balancers
- LBs with zero healthy instances are flagged
- Lightsail charges a flat monthly rate regardless of traffic
Example:
Load Balancer: my-lb
Healthy Instances: 0
Monthly Savings: $18.00
AWS Reference: Lightsail Pricing
Why MEDIUM confidence: LB may be temporarily without instances during maintenance.
Idle Lightsail Database
Confidence: HIGH
Detection: CPU < 1% and zero database connections for 14 days
Savings: $15–$115/month (doubled for HA)
How it works:
- We query CloudWatch CPUUtilization and DatabaseConnections metrics
- Databases with consistently near-zero CPU and connections are flagged
- Stopped databases are also detected
- High-availability databases cost 2x the standard bundle price
Example:
Database: my-idle-db
Bundle: medium_2_0
Engine: MySQL
HA: No
Avg CPU: 0.5% over 14 days
Connections: 0 for 14 days
Monthly Savings: $30.00
AWS Reference: Lightsail Pricing
Why HIGH confidence: Both CPU and connection metrics confirm zero usage.
Idle Elastic Beanstalk Environment
Confidence: HIGH
Detection: Unhealthy environment (Grey/Red health status)
Savings: Environment resource costs (EC2, ELB, etc.)
How it works:
- We check each Beanstalk environment's health status via the DataProvider
- Environments with Grey or Red health in Ready state are flagged as unhealthy
- Savings are estimated from the underlying instance type, count, and load balancer
Example:
Environment: my-app-env
Health: Grey (unhealthy)
Instance Type: t3.medium (1 instance)
Monthly Savings: $32.00
AWS Reference: Elastic Beanstalk Pricing
Why HIGH confidence: Grey/Red health is a definitive AWS signal that the environment is not functioning correctly.
Idle Beanstalk Environment (Zero Traffic)
Confidence: HIGH
Detection: Zero CloudWatch RequestCount for 14 consecutive days
Savings: Full environment resource costs (EC2, ALB, EBS)
How it works:
- We analyze CloudWatch RequestCount metrics for each Beanstalk environment
- Environments with zero requests for 14+ days are flagged as idle
- Savings include EC2 instances, ALB (if load-balanced), and EBS volumes
Example:
Environment: staging-api
Requests (14 days): 0
Instance Type: t3.small (2 instances, load-balanced)
Monthly Savings: $48.00
AWS Reference: Elastic Beanstalk Pricing
Why HIGH confidence: Zero requests over 14 days is a definitive signal of no traffic.
Unnecessary Load Balancer on Single-Instance Beanstalk
Confidence: HIGH
Detection: Load-balanced environment with auto-scaling min=max=1
Savings: ~$22/month (ALB fixed cost + minimum LCU)
How it works:
- We check environment type (LoadBalanced) and Auto Scaling group settings
- If min=max=1, the ALB is serving a single instance with no scaling benefit
- Switching to SingleInstance type removes the ALB overhead
Example:
Environment: dev-api
Type: LoadBalanced, Min=1, Max=1
ALB Cost: $16.43/month + LCU: $5.84/month
Monthly Savings: $22.27
AWS Reference: Elastic Beanstalk Environment Types
Why HIGH confidence: Configuration analysis with no ambiguity: min=max=1 means no scaling.
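The $22.27 figure in the example can be reproduced from hourly rates. The sketch below assumes ALB rates of $0.0225/hour and $0.008 per LCU-hour (values consistent with the $16.43 and $5.84 shown above, but treat them as assumptions and check the pricing page for your region) and a minimum of one LCU. The function and constant names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

HOURS_PER_MONTH = 730
# Assumed ALB rates for illustration; not taken from a live pricing lookup.
ALB_HOURLY_USD = Decimal("0.0225")
LCU_HOURLY_USD = Decimal("0.008")


def cents(amount):
    """Round a Decimal amount to whole cents, half-up."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def single_instance_alb_savings(env_type, asg_min, asg_max):
    """Monthly ALB cost removed by switching to SingleInstance, else 0."""
    if env_type != "LoadBalanced" or not (asg_min == asg_max == 1):
        return Decimal("0.00")
    alb = cents(ALB_HOURLY_USD * HOURS_PER_MONTH)
    lcu = cents(LCU_HOURLY_USD * HOURS_PER_MONTH)
    return alb + lcu


print(single_instance_alb_savings("LoadBalanced", 1, 1))
```

Decimal is used instead of float so that 0.0225 × 730 rounds to $16.43 predictably rather than depending on binary floating-point representation.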
Beanstalk Previous-Generation Instances
Confidence: HIGH
Detection: Environment using previous-generation instance families (t2, m4, c4, etc.)
Savings: Price delta between old and new generation × instance count × 730 hours
How it works:
- We check the instance type configured for each environment
- If the family is in our previous-gen mapping (t2→t3, m4→m5, c4→c5, etc.), we flag it
- Current-gen instances offer better performance at the same or lower price
Example:
Environment: legacy-worker
Instance Type: t2.medium → t3.medium
Monthly Savings: $3.50 per instance
AWS Reference: Previous Generation Instances
Why HIGH confidence: Instance family mapping is deterministic with known pricing.
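The savings formula (price delta × instance count × 730 hours) can be sketched as follows. The family mapping is the subset named above; the hourly rates for t2.medium and t3.medium are assumed values chosen for illustration, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

HOURS_PER_MONTH = 730
PREVIOUS_GEN_MAP = {"t2": "t3", "m4": "m5", "c4": "c5"}  # subset from the text
# Assumed On-Demand hourly rates, for illustration only.
HOURLY_USD = {"t2.medium": Decimal("0.0464"), "t3.medium": Decimal("0.0416")}


def previous_gen_savings(instance_type, instance_count=1):
    """Return (replacement_type, monthly_savings), or None if not previous-gen."""
    family, size = instance_type.split(".", 1)
    new_family = PREVIOUS_GEN_MAP.get(family)
    if new_family is None:
        return None
    replacement = f"{new_family}.{size}"
    delta = HOURLY_USD[instance_type] - HOURLY_USD[replacement]
    monthly = (delta * instance_count * HOURS_PER_MONTH).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP)
    return replacement, monthly


print(previous_gen_savings("t2.medium"))
```

With the assumed rates, one t2.medium maps to t3.medium with a delta of $0.0048/hour, which matches the $3.50 per instance shown in the example.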
Over-Provisioned Beanstalk Environment
Confidence: MEDIUM
Detection: Multi-instance environment with <25% avg CPU over 14 days
Savings: Excess instance costs (EC2 + EBS per removed instance)
How it works:
- We analyze CPUUtilization metrics for environments with 2+ instances
- If average CPU is below 25% for 14 days, the environment is over-provisioned
- We recommend a target count at 50% CPU headroom
Example:
Environment: api-production
Instances: 4 × t3.medium at 12% avg CPU
Recommended: 2 instances
Monthly Savings: $65.00
AWS Reference: Elastic Beanstalk Auto Scaling
Why MEDIUM confidence: CPU metrics are a strong indicator but don't capture memory or I/O bottlenecks.
Orphaned RDS from Beanstalk Environment
Confidence: HIGH
Detection: RDS with EB tags and zero DatabaseConnections for 14 days
Savings: RDS instance + storage costs
How it works:
- We scan RDS instances for Elastic Beanstalk-related tags
- We check CloudWatch DatabaseConnections; zero for 14 days means no app is using it
- This detector is online-only (requires cross-service API calls)
Example:
RDS Instance: eb-prod-api-db (db.t3.micro)
Connections (14 days): 0
Associated Environment: prod-api (terminated)
Monthly Savings: $14.00
AWS Reference: Elastic Beanstalk with RDS
Why HIGH confidence: Zero connections for 14 days with EB tags is a definitive orphan signal.
EKS Extended Support Cost
Confidence: HIGH
Detection: EKS cluster version in extended support window
Savings: Extended support surcharge ($0.60/cluster-hour year 1–2, $1.20 year 3)
How it works:
- We check each EKS cluster's Kubernetes version against the AWS lifecycle policy
- Versions past their standard support date incur an extended support surcharge
- Surcharge is per cluster-hour and can be significant for multi-cluster environments
Example:
Cluster: my-production-cluster
Version: 1.24 (extended support since 2024-01)
Nodes: 5
Monthly Surcharge: $438.00 ($0.60/hr × 730 hrs)
Action: Upgrade to supported version
AWS Reference: EKS Extended Support Pricing
Why HIGH confidence: Cluster version is deterministic metadata. Extended support dates are published by AWS.
Oversized WorkSpaces
Confidence: MEDIUM
Detection: CloudWatch UserConnected = 0 or CPU < 5% over 14 days
Savings: Downsize to smaller bundle (Standard → Value)
How it works:
- We analyze WorkSpaces usage via CloudWatch metrics
- WorkSpaces with very low utilization can be downsized to smaller bundles
- Switching from Performance to Standard or Value bundle reduces monthly cost
Example:
WorkSpace: ws-abc123def
Bundle: Performance (8 vCPU, 16 GB)
Avg CPU: 3% over 14 days
Recommended: Standard (2 vCPU, 4 GB)
Monthly Savings: $42.00
AWS Reference: WorkSpaces Pricing
Why MEDIUM confidence: Usage patterns may vary seasonally. Verify with the workspace owner before downsizing.
WorkSpaces AutoStop Opportunity
Confidence: HIGH
Detection: AlwaysOn WorkSpace with > 7 days since last user connection
Savings: Difference between AlwaysOn and AutoStop billing mode
How it works:
- We identify AlwaysOn WorkSpaces where the user hasn't connected in over 7 days
- AlwaysOn charges a fixed monthly rate regardless of usage
- AutoStop charges a lower base fee plus hourly usage, saving money for infrequent users
- Break-even analysis determines if AutoStop would be cheaper
Example:
WorkSpace: ws-abc123def
Bundle: Performance (AlwaysOn)
Last Connection: 12 days ago
AlwaysOn Cost: $60.00/month
Est. AutoStop Cost: $15.25/month (base + ~20 hours)
Monthly Savings: $44.75
AWS Reference: WorkSpaces Pricing
Why HIGH confidence: Connection status directly from AWS API. AlwaysOn vs AutoStop pricing is deterministic.
WorkSpaces Pool Overprovisioned Capacity
Confidence: HIGH
Detection: WorkSpaces Pool utilization < 75% with excess capacity ≥ 2 slots
Savings: Cost of excess idle pool slots
How it works:
- We check WorkSpaces Pool desired vs running session counts
- Pools with utilization below 75% and excess capacity above p95 demand are flagged
- Excess slots incur charges regardless of whether users are connected
Example:
Pool: engineering-pool
Configured Sessions: 20
Active Sessions: 8 (40% utilization)
Excess Slots: 12
Monthly Savings: $420.00
AWS Reference: WorkSpaces Pools Pricing
Why HIGH confidence: Pool capacity and session counts come directly from the WorkSpaces API.
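The two thresholds (utilization below 75% and at least 2 excess slots) combine as shown in this sketch. It is a simplification: a single demand figure stands in for the p95 demand mentioned above, and the $35 per-slot monthly cost is inferred from the example ($420 / 12 slots) rather than taken from a price list. All names are hypothetical.

```python
def pool_overprovisioning(configured_sessions, demand_sessions,
                          monthly_cost_per_slot,
                          max_utilization=0.75, min_excess_slots=2):
    """Return a finding dict, or None if the pool is not flagged.

    demand_sessions stands in for the demand estimate (the guide uses p95).
    """
    if configured_sessions <= 0:
        return None
    utilization = demand_sessions / configured_sessions
    excess = configured_sessions - demand_sessions
    # Both conditions must hold: low utilization AND enough spare slots.
    if utilization >= max_utilization or excess < min_excess_slots:
        return None
    return {
        "utilization": utilization,
        "excess_slots": excess,
        "monthly_savings": excess * monthly_cost_per_slot,
    }


# The example above: 20 configured sessions, 8 in use.
print(pool_overprovisioning(20, 8, monthly_cost_per_slot=35.0))
```

Requiring both conditions keeps very small pools from being flagged when a single spare slot pushes utilization under the threshold.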
WorkSpaces Windows License Optimization
Confidence: MEDIUM
Detection: ≥ 5 Windows license-included WorkSpaces (non-BYOL)
Savings: ~$4/month per desktop via BYOL or Linux alternatives
How it works:
- We count WorkSpaces running Windows with AWS-included licenses
- Organizations with Microsoft Volume Licensing or Software Assurance can bring their own licenses (BYOL)
- Alternatively, eligible users can migrate to Amazon Linux WorkSpaces to eliminate the license premium
Example:
Windows WorkSpaces (license-included): 25
License Premium: ~$4/desktop/month
Monthly Savings: $100.00
AWS Reference: WorkSpaces BYOL
Why MEDIUM confidence: BYOL eligibility depends on your Microsoft licensing agreement. This is an advisory finding; verify licensing terms before acting.
Containers Detectors
Idle ECS Service
Confidence: HIGH
Detection: ECS service with desiredCount > 0 but runningCount = 0
Savings: Full Fargate compute cost (vCPU + memory)
How it works:
- We identify ECS services where desired tasks are set but no tasks are actually running
- This indicates a deployment issue or abandoned service still incurring costs
- Fargate charges per-second for running tasks based on vCPU and memory
Example:
Service: my-api-service
Cluster: production
Desired: 2, Running: 0
Task CPU: 1 vCPU, Memory: 2 GB
Monthly Savings: $65.70
AWS Reference: ECS Pricing
Why HIGH confidence: A service actively trying to run tasks but failing is a clear operational issue with direct cost impact.
ECS Without Auto-Scaling
Confidence: MEDIUM
Detection: ECS service with ≥ 2 running tasks and no Application Auto Scaling target
Savings: ~30% of monthly compute cost from auto-scaling optimization
How it works:
- We check ECS services with multiple running tasks
- Services without Application Auto Scaling registered are flagged
- Auto-scaling allows services to scale down during low-traffic periods
Example:
Service: my-web-frontend
Cluster: production
Running Tasks: 4 (constant)
Auto Scaling: Not configured
Est. Monthly Savings: $98.00 (30% of compute)
AWS Reference: ECS Auto Scaling
Why MEDIUM confidence: Some services intentionally run at fixed capacity (e.g., worker pools). Verify workload patterns before enabling auto-scaling.
Container Insights Waste
Confidence: HIGH
Detection: Container Insights enabled on dev/test cluster or cluster with < 3 services
Savings: ~$3.50/month per small cluster (based on estimated metric volume)
How it works:
- We check if Container Insights is enabled on ECS clusters
- Dev/staging/sandbox clusters or clusters with few services don't benefit enough to justify the monitoring cost
- Each Container Insights metric is estimated at $0.07/month, and Container Insights generates ~50 metrics per service
Example:
Cluster: dev-cluster
Environment: development
Services: 1
Container Insights: Enabled
Est. Metrics: 70
Monthly Savings: $4.90
AWS Reference: CloudWatch Container Insights Pricing
Why HIGH confidence: Cluster environment and service count are deterministic. Container Insights cost is directly proportional to metric volume.
Oversized ECS Memory
Confidence: MEDIUM
Detection: CloudWatch MemoryUtilization max < 40% over 7 days
Savings: Proportional to memory reduction in Fargate task definition
How it works:
- We analyze ECS task memory utilization via CloudWatch metrics
- Tasks where peak memory usage stays below 40% of allocated memory are flagged
- Reducing memory allocation in the task definition directly reduces Fargate cost
Example:
Service: my-worker
Task Memory: 4 GB
Max Memory Used: 1.2 GB (30%)
Recommended: 2 GB
Monthly Savings: $23.40
AWS Reference: ECS Task Definition Parameters
Why MEDIUM confidence: Memory usage may spike under certain workloads not captured during the analysis window. Monitor after resizing.
Storage Detectors
Unattached EBS Volume
Confidence: HIGH
Detection: describe_volumes with status=available
Savings: size × price per GB (varies by volume type)
Pricing Table:
| Volume Type | Price per GB-month |
|---|---|
| gp3 | $0.08 |
| gp2 | $0.10 |
| io1/io2 | $0.125 + IOPS |
| st1 | $0.045 |
| sc1 | $0.015 |
Example:
Volume: vol-0abc123def456
Type: gp3
Size: 500GB
Status: available (not attached)
Monthly Savings: $40.00 (500 × $0.08)
AWS Reference: EBS Pricing
Why HIGH confidence: We get exact size and type from the API. Pricing is fixed per GB.
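The pricing table translates into a simple lookup. The sketch below works on dictionaries shaped like entries from the EC2 DescribeVolumes response (`Size`, `VolumeType`, `State`); it makes no AWS calls, and the function name is an assumption. Provisioned-IOPS charges for io1/io2 are left out.

```python
# Per-GB-month storage rates from the pricing table above.
EBS_PRICE_PER_GB_MONTH = {
    "gp3": 0.08, "gp2": 0.10, "io1": 0.125, "io2": 0.125,
    "st1": 0.045, "sc1": 0.015,
}


def unattached_volume_savings(volumes):
    """Sum monthly storage cost for volumes in the 'available' state.

    Each volume is a dict shaped like an entry from EC2 DescribeVolumes.
    Provisioned-IOPS charges for io1/io2 are not included in this sketch.
    """
    total = 0.0
    for vol in volumes:
        if vol["State"] == "available":
            total += vol["Size"] * EBS_PRICE_PER_GB_MONTH[vol["VolumeType"]]
    return round(total, 2)


volumes = [
    {"VolumeId": "vol-0abc123def456", "Size": 500,
     "VolumeType": "gp3", "State": "available"},
    {"VolumeId": "vol-0123456789abc", "Size": 100,
     "VolumeType": "gp2", "State": "in-use"},  # attached, so ignored
]
print(unattached_volume_savings(volumes))
```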
gp2 → gp3 Migration
Confidence: HIGH
Detection: describe_volumes with volume-type=gp2
Savings: size × $0.02/GB-month (20% savings)
How it works:
- gp2 costs $0.10/GB-month
- gp3 costs $0.08/GB-month (20% cheaper)
- gp3 includes 3,000 IOPS and 125 MB/s baseline (same or better performance)
Example:
Volume: vol-0abc123def456
Type: gp2
Size: 1000GB
Current Cost: $100/month
After Migration: $80/month
Monthly Savings: $20.00
AWS Reference: EBS Pricing
Why HIGH confidence: AWS explicitly states gp3 is 20% cheaper with equal or better baseline performance.
Old EBS Snapshots
Confidence: HIGH
Detection: Snapshots older than threshold (default: 90 days)
Savings: VolumeSize × $0.05/GB-month
Example:
Snapshot: snap-0abc123def456
Age: 180 days
Size: 100GB
Monthly Savings: $5.00
AWS Reference: EBS Pricing (Snapshots)
Why HIGH confidence: Snapshot size is exact; pricing is fixed at $0.05/GB-month.
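A minimal sketch of the age check and savings calculation, assuming snapshot dictionaries shaped like entries from EC2 DescribeSnapshots (`StartTime`, `VolumeSize`). The function name is hypothetical, and the reference time is passed in explicitly so the result is reproducible.

```python
from datetime import datetime, timedelta, timezone

SNAPSHOT_PRICE_PER_GB_MONTH = 0.05  # rate quoted above


def old_snapshot_savings(snapshot, now, max_age_days=90):
    """Return monthly savings if the snapshot exceeds the age threshold, else 0.0.

    snapshot is a dict shaped like an entry from EC2 DescribeSnapshots.
    """
    age = now - snapshot["StartTime"]
    if age <= timedelta(days=max_age_days):
        return 0.0
    return round(snapshot["VolumeSize"] * SNAPSHOT_PRICE_PER_GB_MONTH, 2)


# The example above: a 100 GB snapshot taken 180 days ago.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
snap = {"SnapshotId": "snap-0abc123def456", "VolumeSize": 100,
        "StartTime": now - timedelta(days=180)}
print(old_snapshot_savings(snap, now))
```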
Orphaned EBS Snapshots
| Property | Value |
|---|---|
| Waste Type | orphaned_ebs_snapshot |
| Category | Storage Optimization |
| Confidence | HIGH |
| Risk Level | LOW |
| Savings Estimate | volume_size_gb × $0.05/month (upper-bound; actual size may be smaller due to incremental storage) |
What it detects: EBS snapshots whose source volume has been deleted. When a volume is deleted, its snapshots remain and continue incurring charges indefinitely. These orphaned snapshots cannot be used for incremental recovery of the original workload; they only serve as standalone restore points.
Detection logic: Cross-references all snapshots against existing EBS volumes. If a snapshot's VolumeId refers to a volume that no longer exists, the snapshot is flagged as orphaned.
Recommended action: Delete orphaned snapshots that are no longer needed. Consider creating an AWS Data Lifecycle Manager policy to automate snapshot retention.
AWS CLI:
# List your snapshots with their source volume IDs; cross-check each VolumeId against existing volumes to find orphans
aws ec2 describe-snapshots --owner-ids self \
--query 'Snapshots[*].{ID:SnapshotId,Vol:VolumeId,Size:VolumeSize,Created:StartTime}' \
--output table
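The cross-reference described under "Detection logic" is a set-membership check. The sketch below operates on lists shaped like the `Snapshots` and `Volumes` arrays returned by EC2 DescribeSnapshots and DescribeVolumes; it makes no AWS calls, and the function name is an assumption. Snapshots produced by CopySnapshot carry an arbitrary volume ID, so a real detector would likely need to treat them separately.

```python
SNAPSHOT_PRICE_PER_GB_MONTH = 0.05  # upper-bound rate from the table above


def find_orphaned_snapshots(snapshots, volumes):
    """Return snapshots whose source VolumeId no longer exists.

    Each result is (snapshot_id, volume_id, upper_bound_monthly_cost).
    """
    existing = {v["VolumeId"] for v in volumes}
    orphaned = []
    for snap in snapshots:
        if snap.get("VolumeId") not in existing:
            cost = round(snap["VolumeSize"] * SNAPSHOT_PRICE_PER_GB_MONTH, 2)
            orphaned.append((snap["SnapshotId"], snap.get("VolumeId"), cost))
    return orphaned


snapshots = [
    {"SnapshotId": "snap-aaa", "VolumeId": "vol-111", "VolumeSize": 100},
    {"SnapshotId": "snap-bbb", "VolumeId": "vol-999", "VolumeSize": 40},
]
volumes = [{"VolumeId": "vol-111"}]
print(find_orphaned_snapshots(snapshots, volumes))
```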
AMI Orphaned Snapshots
| Property | Value |
|---|---|
| Waste Type | ami_orphaned_snapshot |
| Category | Storage Optimization |
| Confidence | HIGH |
| Risk Level | LOW |
| Savings Estimate | volume_size_gb × $0.05/month (upper-bound; actual size may be smaller due to incremental storage) |
What it detects: EBS snapshots that were created by CreateImage to back an AMI, but the AMI has since been deregistered. When an AMI is deregistered, its backing snapshots are NOT automatically deleted; they remain and continue incurring charges.
Detection logic: Matches snapshot descriptions against the pattern Created by CreateImage(i-xxx) for ami-xxx, then checks whether the referenced AMI is still registered. If the AMI no longer exists, the snapshot is flagged.
Recommended action: Delete the orphaned snapshot after confirming no other AMIs or launch templates reference it. Consider automating AMI cleanup to include backing snapshot deletion.
AWS CLI:
# List snapshots created by CreateImage; check each referenced AMI ID with describe-images to find deregistered ones
aws ec2 describe-snapshots --owner-ids self \
--query 'Snapshots[?starts_with(Description, `Created by CreateImage`)].{ID:SnapshotId,Desc:Description,Size:VolumeSize}' \
--output table
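The description match and AMI lookup can be sketched as a pure function. The regular expression follows the `Created by CreateImage(i-xxx) for ami-xxx` pattern quoted above; the function name and the idea of passing in a pre-fetched set of registered AMI IDs are assumptions for illustration.

```python
import re

# Matches the description pattern quoted in the detection logic above.
CREATE_IMAGE_RE = re.compile(
    r"Created by CreateImage\((i-[0-9a-f]+)\) for (ami-[0-9a-f]+)")


def find_ami_orphaned_snapshots(snapshots, registered_ami_ids):
    """Return (snapshot_id, ami_id) pairs whose AMI is no longer registered."""
    orphaned = []
    for snap in snapshots:
        match = CREATE_IMAGE_RE.search(snap.get("Description", ""))
        if match and match.group(2) not in registered_ami_ids:
            orphaned.append((snap["SnapshotId"], match.group(2)))
    return orphaned


snapshots = [
    {"SnapshotId": "snap-aaa",
     "Description": "Created by CreateImage(i-0abc123) for ami-0def456"},
    {"SnapshotId": "snap-bbb",
     "Description": "Created by CreateImage(i-0abc123) for ami-0123abc"},
    {"SnapshotId": "snap-ccc", "Description": "nightly backup"},
]
print(find_ami_orphaned_snapshots(snapshots, {"ami-0123abc"}))
```

Snapshots with unrelated descriptions are skipped entirely, so manual backups are never reported by this check.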
ECR Repository Without Lifecycle Policy
Confidence: MEDIUM
Detection: ECR repository with no lifecycle policy configured
Savings: Varies based on image count and size
How it works:
- We check each ECR repository for a lifecycle policy
- Repos without policies accumulate old images indefinitely
- This leads to unbounded storage costs over time
Example:
Repository: my-app-images
Lifecycle Policy: None
Total Images: 150 (120 untagged)
Estimated Size: 45GB
Monthly Savings: $4.50 (if cleaned up)
AWS Reference: ECR Pricing
Why MEDIUM confidence: Actual savings depend on cleanup behavior; lifecycle policies prevent future waste.
S3 Bucket Without Lifecycle Policy
Confidence: HIGH
Detection: S3 bucket with no lifecycle rules configured
Savings: Varies based on object count and size
How it works:
- We check each S3 bucket for lifecycle configuration
- Buckets without lifecycle rules accumulate objects indefinitely
- Lifecycle rules can transition objects to cheaper storage or delete old data
Example:
Bucket: my-application-logs
Lifecycle Rules: None
Objects: 2.5 million
Size: 500GB
Monthly Cost: $11.50 (S3 Standard)
Recommendation: Add lifecycle to transition to Glacier after 90 days
Potential Savings: $10.00/month (87%)
AWS Reference: S3 Pricing
Why HIGH confidence: Lifecycle configuration is binary (exists or not). Storage costs are exact.
Incomplete Multipart Uploads
Confidence: MEDIUM
Detection: S3 multipart uploads not completed
Savings: Size of incomplete parts Ć storage rate
How it works:
- We list incomplete multipart uploads in S3 buckets
- Failed or abandoned uploads leave orphaned parts that accumulate
- These parts are charged as regular S3 storage
Example:
Bucket: my-large-files
Incomplete Uploads: 15
Total Size: 25GB
Monthly Savings: $0.58
Recommendation: Add lifecycle rule to abort incomplete uploads after 7 days
AWS Reference: S3 Pricing
Why MEDIUM confidence: Uploads may be legitimately in progress. Short time threshold reduces false positives.
Idle EFS File System
Confidence: HIGH
Detection: Zero client connections for 14 days
Savings: Full storage cost
How it works:
- We query CloudWatch for EFS client connections
- File systems with zero connections for 14+ days are flagged
- EFS charges for stored data even when not accessed
Example:
File System: fs-0abc123
Size: 100GB
Connections (14 days): 0
Monthly Savings: $30.00 (Standard) or $1.60 (IA)
AWS Reference: EFS Pricing
Why HIGH confidence: Zero client connections is definitive. No one is accessing this file system.
Idle FSx File System
Confidence: HIGH
Detection: Zero network I/O for 14 days
Savings: Full file system cost
How it works:
- We query CloudWatch for FSx network metrics
- File systems with zero I/O for 14+ days are flagged
- FSx charges continuously whether data is accessed or not
Example:
File System: fs-0abc123def456
Type: FSx for Windows
Capacity: 300GB
Network I/O (14 days): 0
Monthly Savings: $68.40
AWS Reference: FSx Pricing
Why HIGH confidence: Zero network I/O confirms no file access. FSx charges by provisioned capacity.
Oversized FSx Filesystem
Confidence: MEDIUM
Detection: Storage utilization below 40% of provisioned capacity
Savings: Estimated from excess provisioned capacity
How it works:
- We compare provisioned storage against actual used capacity via CloudWatch FreeStorageCapacity
- Filesystems where used capacity stays below 40% of provisioned are flagged
- Type-aware for Windows, ONTAP, OpenZFS, and Lustre
Example:
File System: fs-0abc123def456
Type: FSx for Windows
Capacity: 2,000 GB provisioned
Used: 500 GB (25%)
Estimated Savings: $130.00/month
AWS Reference: FSx Pricing
Why MEDIUM confidence: Low utilization is a strong signal, but resizing requires operational caution. Some FSx types don't support in-place shrink.
FSx Throughput Overprovisioned
Confidence: MEDIUM
Detection: Throughput utilization below 30% of provisioned capacity
Savings: Estimated from stepping down one throughput tier
How it works:
- We measure throughput utilization against provisioned throughput capacity
- Filesystems (Windows/ONTAP/OpenZFS) where average utilization stays below 30% are flagged
- Lustre excluded because its performance model is inherently bursty
Example:
File System: fs-0abc123def456
Type: FSx for ONTAP
Provisioned Throughput: 512 MB/s
Average Utilization: 12%
Estimated Savings: $33.28/month
AWS Reference: FSx Pricing
Why MEDIUM confidence: Throughput is a major cost driver, but requires validation of latency baselines before changing.
Old FSx Backup
Confidence: MEDIUM
Detection: Manual FSx backup older than 90 days
Savings: Backup storage cost (varies by size)
How it works:
- We list all FSx backups and filter for manual/user-initiated type
- Backups older than 90 days are flagged
- Retention-tagged backups (compliance, retain, legal-hold) are excluded
- Automatic backups managed by FSx retention policies are excluded
Example:
Backup: backup-0abc123def456
Filesystem: fs-0abc123 (Windows)
Age: 180 days
Type: USER_INITIATED
Note: Backup deletion is irreversible. Verify backup is no longer needed for recovery.
AWS Reference: FSx Pricing
Why MEDIUM confidence: Manual backups past retention threshold are likely unnecessary, but deletion is irreversible.
Old ECR Images
Confidence: MEDIUM
Detection: Container images older than 90 days
Savings: Image size × $0.10/GB-month
How it works:
- We list all images in ECR repositories
- Images older than 90 days are flagged for cleanup
- Old images often represent deprecated versions
Example:
Repository: my-app-images
Images > 90 days: 45
Total Size: 12GB
Monthly Savings: $1.20
AWS Reference: ECR Pricing
Why MEDIUM confidence: Old images may be needed for rollbacks. Consider keeping last N versions.
Over-Provisioned IOPS
Confidence: HIGH
Detection: EBS volume with provisioned IOPS where peak usage < 50% over 14 days
Savings: Excess IOPS × pricing ($0.065/IOPS/month for io1/io2, $0.005/IOPS/month for gp3)
How it works:
- We identify EBS volumes with provisioned IOPS (io1, io2, gp3 with custom IOPS)
- CloudWatch VolumeReadOps and VolumeWriteOps are analyzed for 14 days
- If peak IOPS usage stays below 50% of provisioned, the volume is over-provisioned
Example:
Volume: vol-0abc123def456
Type: io2, Size: 500 GB
Provisioned IOPS: 10,000
Peak IOPS (14d): 3,200 (32%)
Recommended: 6,400 IOPS
Monthly Savings: $234.00
AWS Reference: EBS Pricing
Why HIGH confidence: CloudWatch IOPS metrics are precise. We use peak (not average) to avoid under-provisioning.
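The arithmetic behind the example can be sketched as follows. This is an illustration only: the "recommend twice the observed peak" rule is an assumption inferred from the example (3,200 peak, 6,400 recommended), `iops_recommendation` is a hypothetical name, and the sketch ignores per-type minimums such as the 3,000 IOPS baseline included with gp3.

```python
from decimal import Decimal

# Per-IOPS monthly rates quoted in the Savings line above.
IOPS_RATE = {
    "io1": Decimal("0.065"),
    "io2": Decimal("0.065"),
    "gp3": Decimal("0.005"),
}


def iops_recommendation(volume_type, provisioned, peak):
    """Return (recommended_iops, monthly_savings), or None if not over-provisioned."""
    if peak * 2 >= provisioned:  # peak is at or above 50% of provisioned
        return None
    recommended = peak * 2  # assumption: size so the observed peak sits at 50%
    savings = (provisioned - recommended) * IOPS_RATE[volume_type]
    return recommended, savings.quantize(Decimal("0.01"))


print(iops_recommendation("io2", 10_000, 3_200))
```

Sizing from the peak rather than the average keeps headroom for bursts, which matches the reasoning given for this detector.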
EFS Without Lifecycle Policy
Confidence: MEDIUM
Detection: EFS filesystem ≥ 1 GB without Infrequent Access lifecycle transition
Savings: Standard ($0.30/GB) → IA ($0.025/GB) on ~80% of infrequently accessed files
How it works:
- We check EFS filesystems for lifecycle configuration
- Filesystems without TransitionToIA rules keep all data in Standard storage class
- Most EFS data becomes infrequently accessed after 30 days, a significant savings opportunity
Example:
Filesystem: fs-0abc123def456
Size: 250 GB (Standard)
Lifecycle Policy: None
Est. IA-eligible: 200 GB (80%)
Monthly Savings: $55.00
AWS Reference: EFS Lifecycle Management
Why MEDIUM confidence: Actual access patterns determine IA eligibility. The 80% estimate is based on typical workloads.
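As a rough illustration of the estimate, the sketch below applies the two rates and the 80% share quoted for this detector. The function name `efs_ia_savings` is hypothetical, and the 80% share is the guide's typical-workload estimate rather than a measured value.

```python
from decimal import Decimal

STANDARD_RATE = Decimal("0.30")      # $/GB-month, from the Savings line above
IA_RATE = Decimal("0.025")           # $/GB-month, from the Savings line above
IA_ELIGIBLE_SHARE = Decimal("0.80")  # estimated share of data eligible for IA


def efs_ia_savings(size_gb: int) -> Decimal:
    """Estimated monthly savings from moving IA-eligible data out of Standard."""
    if size_gb < 1:  # the detector ignores filesystems under 1 GB
        return Decimal("0.00")
    eligible_gb = Decimal(size_gb) * IA_ELIGIBLE_SHARE
    return (eligible_gb * (STANDARD_RATE - IA_RATE)).quantize(Decimal("0.01"))


print(efs_ia_savings(250))  # the 250 GB filesystem from the example
```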
Untagged ECR Images
Confidence: HIGH
Detection: ECR repository with untagged images totaling > 0.1 GB
Savings: Untagged image size × $0.10/GB/month
How it works:
- We scan ECR repositories for images without tags
- Untagged images are typically intermediate build layers or superseded images
- These accumulate silently and can grow to significant storage costs
Example:
Repository: my-service
Untagged Images: 156
Total Size: 8.5 GB
Monthly Savings: $0.85
AWS Reference: ECR Pricing
Why HIGH confidence: Image tag status and size are exact metadata from the ECR API.
Old Backup Recovery Points
Confidence: MEDIUM
Detection: AWS Backup recovery point older than 180 days without compliance hold
Savings: Backup storage at $0.05/GB/month
How it works:
- We list all AWS Backup recovery points across vaults
- Points older than 180 days without compliance-lock tag are flagged
- Organizations often forget to set lifecycle expiration on backup plans
Example:
Resource: arn:aws:rds:us-east-1:123456789:db:mydb
Recovery Point Age: 245 days
Size: 50 GB
Monthly Savings: $2.50
AWS Reference: AWS Backup Pricing
Why MEDIUM confidence: Compliance requirements vary. Some industries require longer retention. Verify before deleting.
Redundant Backup Recovery Points
Confidence: MEDIUM
Detection: Multiple recovery points for same resource within 24-hour window
Savings: Redundant point storage at $0.05/GB/month
How it works:
- We group recovery points by resource ARN
- Multiple points taken within 24 hours of each other are likely redundant
- Keeping only one point per time window reduces storage costs
Example:
Resource: arn:aws:ec2:us-east-1:123456789:volume/vol-abc123
Points in 24hr window: 4
Redundant Size: 120 GB
Monthly Savings: $6.00
AWS Reference: AWS Backup Pricing
Why MEDIUM confidence: Some workloads intentionally create frequent backups for RPO requirements.
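One way to express the grouping described above is shown below. It is a sketch under stated assumptions: recovery points are passed in as plain tuples rather than AWS Backup API responses, `redundant_points` is a hypothetical name, and keeping the earliest point in each window (rather than the latest) is an arbitrary choice for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def redundant_points(points, window=timedelta(hours=24)):
    """points: iterable of (resource_arn, created_at, point_id) tuples.

    Keeps the earliest point in each window per resource and returns the
    IDs of the remaining points in that window.
    """
    by_resource = defaultdict(list)
    for arn, created_at, point_id in points:
        by_resource[arn].append((created_at, point_id))

    redundant = []
    for entries in by_resource.values():
        entries.sort()  # oldest first
        kept_at = None
        for created_at, point_id in entries:
            if kept_at is not None and created_at - kept_at < window:
                redundant.append(point_id)
            else:
                kept_at = created_at  # this point starts a new window
    return redundant
```

For example, four points taken six hours apart on the same volume yield three redundant IDs, while a point taken 26 hours after the first starts a new window and is kept.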
Backup Plan Without Lifecycle Tiering
Confidence: MEDIUM
Detection: Backup plan with retention > 90 days but no cold storage transition
Savings: ~$0.01/GB/month savings from warm-to-cold transition
How it works:
- We analyze backup plan lifecycle rules
- Plans retaining backups beyond 90 days without cold storage transitions are flagged
- Cold storage is significantly cheaper but has longer retrieval times
Example:
Backup Plan: daily-rds-backup
Retention: 365 days
Cold Transition: None
Total Backup Size: 500 GB
Monthly Savings: $5.00
AWS Reference: AWS Backup Lifecycle
Why MEDIUM confidence: Cold storage retrieval is slower. Verify RTO requirements before enabling tiering.
Stale Backup Plan Assignment
Confidence: MEDIUM
Detection: Backup plan selection targeting resources that no longer exist
Savings: Backup plan resource selection overhead
How it works:
- We examine backup plan resource assignments
- Assignments referencing deleted resources are flagged
- These create unnecessary backup plan evaluation overhead
Example:
Backup Plan: weekly-ebs
Selection: vol-abc123 (DELETED)
Status: Resource not found
Action: Remove stale assignment
AWS Reference: AWS Backup Resource Assignment
Why MEDIUM confidence: Resource may have been replaced with a different ID. Verify the backup plan still covers the intended resources.
Backup Cross-Region Copy Overreach
Confidence: LOW
Detection: Cross-region backup copies to regions with no operational presence
Savings: Cross-region transfer at $0.02/GB
How it works:
- We analyze backup plans with cross-region copy rules
- Copies to regions where no resources are deployed may be unnecessary
- Each cross-region copy incurs data transfer and storage charges
Example:
Backup Plan: critical-data
Source: us-east-1
Copy Destinations: eu-west-1, ap-southeast-1
Resources in eu-west-1: 0
Monthly Savings: $5.00 (transfer + storage)
AWS Reference: AWS Backup Cross-Region
Why LOW confidence: Cross-region copies serve DR purposes. Verify disaster recovery requirements before removing.
S3 Empty Bucket
Confidence: LOW
Detection: S3 bucket with 0 objects, created 30+ days ago
Savings: Zero direct savings (hygiene finding)
How it works:
- We identify empty S3 buckets that have existed for 30+ days
- Infrastructure buckets (CDK, CloudFormation, logging prefixes) are excluded
- Empty buckets consume no storage but create management overhead
Example:
Bucket: old-data-export-2024
Objects: 0, Size: 0 bytes
Created: 2024-06-15 (300+ days ago)
Monthly Savings: $0.00 (cleanup only)
AWS Reference: S3 Pricing
Why LOW confidence: Buckets may be pre-provisioned for future use or referenced by application configuration.
S3 Wrong Storage Class
Confidence: LOW
Detection: Bucket with > 90% Standard storage, > 50 GB, no Intelligent-Tiering
Savings: ~25% from Intelligent-Tiering automatic downtiering
How it works:
- We analyze S3 storage class distribution per bucket
- Buckets with predominantly Standard storage and no tiering transitions are flagged
- S3 Intelligent-Tiering automatically moves infrequently accessed objects to lower-cost tiers
Example:
Bucket: application-logs
Size: 250 GB (98% Standard)
Intelligent-Tiering: Not configured
Est. Monthly Savings: $1.44
AWS Reference: S3 Intelligent-Tiering
Why LOW confidence: Access patterns vary. Intelligent-Tiering adds a small monitoring fee ($0.0025/1000 objects).
S3 Rapid Growth
Confidence: LOW
Detection: Bucket growth > 100% in 30 days with > 10 GB absolute growth
Savings: Variable; depends on addressing the root cause of growth
How it works:
- We compare bucket size metrics over rolling 30-day windows
- Buckets doubling in size with significant absolute growth are flagged
- Often indicates missing lifecycle rules, runaway logging, or misconfigured pipelines
Example:
Bucket: data-pipeline-output
30 days ago: 45 GB
Now: 120 GB (+167%)
Growth Rate: 2.5 GB/day
Monthly Savings: Potential $1.73+ if addressed
AWS Reference: S3 Storage Lens
Why LOW confidence: Growth may be expected (new data pipeline, migration). Requires investigation into root cause.
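The two-part threshold (relative and absolute growth) can be sketched as a small check. The function name `rapid_growth` and the returned field names are illustrative, and sizes are assumed to be supplied by the caller rather than read from CloudWatch.

```python
def rapid_growth(size_then_gb, size_now_gb, days=30):
    """Return growth stats and whether both detection thresholds are exceeded."""
    if size_then_gb <= 0:
        raise ValueError("need a positive starting size to compute growth")
    delta_gb = size_now_gb - size_then_gb
    pct = delta_gb / size_then_gb * 100
    return {
        "growth_pct": round(pct),
        "gb_per_day": round(delta_gb / days, 2),
        # Flag only when the bucket more than doubled AND grew by over 10 GB.
        "flagged": pct > 100 and delta_gb > 10,
    }


print(rapid_growth(45, 120))  # the data-pipeline-output example
```

Requiring both conditions keeps small buckets (for example, 1 GB growing to 3 GB) from being flagged on percentage alone.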
S3 High Request & Transfer Cost
Confidence: LOW
Detection: Non-storage costs (requests + transfer) exceed storage costs
Savings: Variable; a CDN, VPC endpoints, or caching can reduce transfer costs
How it works:
- We compare S3 request and transfer costs against storage costs
- When requests/transfer cost more than storage, architectural optimization may help
- Common solutions: CloudFront for public access, VPC endpoints for private access
Example:
Bucket: api-assets
Storage Cost: $5.00/month
Request + Transfer Cost: $12.00/month (240% of storage)
Monthly Savings: ~$6.00 with CloudFront
AWS Reference: S3 Request Pricing
Why LOW confidence: Architectural changes require significant effort. Cost ratio is an indicator, not a definitive finding.
Database Detectors
Idle RDS Instance
Confidence: HIGH
Detection: CloudWatch DatabaseConnections = 0 for 14+ days
Savings: Full instance cost
How it works:
- We query CloudWatch for database connection metrics
- Zero connections for 14+ days = definitely unused
- Monthly cost based on instance class and Multi-AZ configuration
Example:
Database: my-dev-database
Class: db.t3.medium
Engine: MySQL
Connections (14 days): 0
Monthly Savings: $49.28
AWS Reference: RDS Pricing
Why HIGH confidence: Zero connections is an unambiguous signal. No one is using this database.
Old RDS Snapshots
Confidence: MEDIUM
Detection: Manual snapshots older than threshold (default: 90 days)
Savings: AllocatedStorage × $0.095/GB-month
Example:
Snapshot: my-database-snapshot-2024
Age: 120 days
Size: 200GB
Monthly Savings: $19.00
AWS Reference: RDS Pricing (Backup Storage)
Why MEDIUM confidence: Old snapshots may be needed for compliance. Review before deleting.
Aurora I/O Optimization Opportunity
Confidence: HIGH
Detection: Aurora Standard cluster where I/O charges exceed 25% of total spend
Savings: Difference between Standard (storage + I/O) and I/O-Optimized (storage only) costs
How it works:
- We identify Aurora clusters using Standard storage (not I/O-Optimized)
- Calculate the I/O-to-total-spend ratio from CloudWatch VolumeReadIOPs and VolumeWriteIOPs
- If I/O charges exceed 25% of total Aurora spend, switching to I/O-Optimized eliminates per-I/O charges
- I/O-Optimized storage costs ~30% more per GB but includes unlimited I/O
Example:
Cluster: my-aurora-cluster
Engine: aurora-postgresql
Storage Type: Aurora Standard
Monthly I/O Cost: $450.00
Monthly Storage Cost: $180.00
I/O Ratio: 71.4% (> 25% threshold)
Switch to I/O-Optimized Savings: $216.00/month
AWS Reference: Aurora Pricing (I/O-Optimized)
Why HIGH confidence: I/O costs are directly measurable from CloudWatch metrics. The 25% threshold is AWS's own recommended breakeven point.
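The ratio in the example can be reproduced with the sketch below. Note the assumption: the example's 71.4% matches I/O cost divided by I/O plus storage cost, so that is what the hypothetical `io_ratio` computes. If instance cost is also counted in total spend, the ratio would be lower.

```python
def io_ratio(io_cost, storage_cost):
    """Share of (I/O + storage) spend that goes to I/O charges."""
    total = io_cost + storage_cost
    return io_cost / total if total else 0.0


def should_flag(io_cost, storage_cost, threshold=0.25):
    """True when the I/O share exceeds the 25% threshold used by the detector."""
    return io_ratio(io_cost, storage_cost) > threshold


ratio = io_ratio(450.00, 180.00)  # figures from the example
print(f"{ratio:.1%}", should_flag(450.00, 180.00))
```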
Aurora Extended Support Cost
Confidence: HIGH
Detection: Aurora cluster running an end-of-life engine version under Extended Support
Savings: Per-vCPU hourly surcharge (Year 1-2: $0.100/vCPU-hr, Year 3+: $0.200/vCPU-hr)
How it works:
- We check the Aurora engine version against known EOL dates (PostgreSQL 12, MySQL 5.7)
- If the version has passed its standard support end date, AWS charges Extended Support fees
- Surcharges are per-vCPU per hour, escalating after year 2
- We also flag clusters within 90 days of EOL as a warning
Example:
Cluster: pg12-eol-cluster
Engine: aurora-postgresql 12.18
EOL Date: 2025-02-28
Extended Support: Year 1 rate
Instances: 2× db.r6g.xlarge (4 vCPUs each)
Surcharge: 8 vCPUs × $0.100/hr × 730 hrs = $584.00/month
AWS Reference: Aurora Extended Support Pricing
Why HIGH confidence: EOL dates and surcharge rates are published by AWS. vCPU counts are deterministic per instance class.
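The surcharge formula from the example, written out as a sketch. The rates and the 730-hour month come from this section; `extended_support_surcharge` is a hypothetical name, and the vCPU count per instance is assumed to be supplied by the caller.

```python
from decimal import Decimal

HOURS_PER_MONTH = 730
YEAR_1_2_RATE = Decimal("0.100")   # $/vCPU-hr, from the Savings line above
YEAR_3_PLUS_RATE = Decimal("0.200")


def extended_support_surcharge(vcpus_per_instance, instances, support_year):
    """Monthly Extended Support surcharge for a cluster."""
    if support_year < 1:
        raise ValueError("support_year starts at 1")
    rate = YEAR_1_2_RATE if support_year <= 2 else YEAR_3_PLUS_RATE
    total_vcpus = vcpus_per_instance * instances
    return (total_vcpus * rate * HOURS_PER_MONTH).quantize(Decimal("0.01"))


print(extended_support_surcharge(4, 2, support_year=1))  # 2x db.r6g.xlarge
```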
Aurora Serverless v2 Opportunity
Confidence: MEDIUM
Detection: Low-utilization provisioned Aurora cluster suitable for Serverless v2 migration
Savings: Difference between provisioned instance cost and estimated Serverless v2 cost at 2 ACU baseline
How it works:
- We check CloudWatch CPU metrics: average < 15% and max < 40% over 14 days
- Excludes clusters already running Serverless, Global databases, and multi-writer setups
- Estimates Serverless v2 cost at 2 ACU baseline × $0.12/ACU-hr
- Compares against current provisioned instance cost
Example:
Cluster: low-traffic-aurora
Engine: aurora-postgresql
Instance: db.r6g.xlarge
CPU Avg: 8%, CPU Max: 22%
Provisioned Cost: $526.00/month
Estimated Serverless Cost: $175.20/month
Monthly Savings: $350.80
AWS Reference: Aurora Serverless v2 Pricing
Why MEDIUM confidence: Serverless v2 costs depend on actual ACU consumption which varies. The 2-ACU baseline is a conservative estimate. Workload spikes may increase actual Serverless costs.
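A minimal sketch of the comparison described above, using the thresholds, ACU rate, and 2-ACU baseline stated for this detector. The function name `serverless_v2_savings` is illustrative, a 730-hour month is assumed, and the exclusions (Serverless, Global, multi-writer clusters) are not modeled.

```python
from decimal import Decimal

ACU_RATE = Decimal("0.12")  # $/ACU-hr, from the steps above
BASELINE_ACU = 2
HOURS_PER_MONTH = 730       # assumed month length


def serverless_v2_savings(provisioned_monthly: str, cpu_avg: float, cpu_max: float):
    """Return (estimated_serverless_cost, monthly_savings), or None if not eligible."""
    if not (cpu_avg < 15 and cpu_max < 40):
        return None
    serverless = (BASELINE_ACU * ACU_RATE * HOURS_PER_MONTH).quantize(Decimal("0.01"))
    return serverless, Decimal(provisioned_monthly) - serverless


print(serverless_v2_savings("526.00", cpu_avg=8, cpu_max=22))
```

Because the estimate is pinned to a fixed ACU baseline, real bills will differ whenever the workload scales above 2 ACUs.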
DynamoDB Without Auto-Scaling
Confidence: HIGH
Detection: Provisioned capacity DynamoDB table with no auto-scaling configured
Savings: Varies based on provisioned capacity and utilization
How it works:
- We identify DynamoDB tables using provisioned capacity mode
- Check if Application Auto Scaling is configured for the table
- Tables without auto-scaling may be over-provisioned or at risk of throttling
Example:
Table: my-provisioned-table
Mode: Provisioned (10 RCU, 10 WCU)
Auto-Scaling: Not configured
Monthly Cost: $9.40
Recommendation: Enable auto-scaling or switch to on-demand
AWS Reference: DynamoDB Pricing
Why HIGH confidence: Auto-scaling is a best practice for provisioned tables. Configuration is binary (enabled or not).
Idle DynamoDB Table
Confidence: HIGH
Detection: Zero consumed read/write capacity for 14 days
Savings: Full table cost (provisioned) or minimum charges (on-demand)
How it works:
- We query CloudWatch for ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits
- Tables with zero consumption for 14+ days are flagged
- Provisioned tables charge continuously; on-demand has storage costs
Example:
Table: my-old-table
Mode: Provisioned (5 RCU, 5 WCU)
Activity (14 days): 0 operations
Monthly Savings: $4.70
AWS Reference: DynamoDB Pricing
Why HIGH confidence: Zero consumed capacity means no reads or writes. Table is unused.
Over-Provisioned DynamoDB
Confidence: MEDIUM
Detection: Capacity utilization < 20% for 14 days
Savings: Based on right-sized capacity
How it works:
- We compare provisioned capacity to consumed capacity via CloudWatch
- Tables using less than 20% of provisioned capacity are over-provisioned
- Right-sizing or switching to on-demand can reduce costs
Example:
Table: my-app-data
Provisioned: 100 RCU, 100 WCU
Used (avg): 15 RCU, 8 WCU
Utilization: 15%, 8%
Monthly Savings: ~$60.00
AWS Reference: DynamoDB Pricing
Why MEDIUM confidence: Traffic patterns may spike periodically. Review before right-sizing.
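The utilization figures in the example follow from a simple ratio. The sketch below assumes that both read and write utilization must fall under the threshold before a table is flagged; the guide does not state this, so treat it as one possible reading. `over_provisioned` is a hypothetical name.

```python
def utilization_pct(provisioned, consumed_avg):
    """Average consumed capacity as a percentage of provisioned capacity."""
    return consumed_avg * 100 / provisioned


def over_provisioned(prov_rcu, used_rcu, prov_wcu, used_wcu, threshold_pct=20):
    # Assumption: reads AND writes must both be under the threshold.
    return (
        utilization_pct(prov_rcu, used_rcu) < threshold_pct
        and utilization_pct(prov_wcu, used_wcu) < threshold_pct
    )


print(utilization_pct(100, 15), utilization_pct(100, 8))
print(over_provisioned(100, 15, 100, 8))
```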
Idle ElastiCache Cluster
Confidence: HIGH
Detection: Zero connections for 7 days
Savings: Full cluster cost
How it works:
- We query CloudWatch for CurrConnections metric
- Clusters with zero connections for 7+ days are flagged
- ElastiCache charges continuously for node hours
Example:
Cluster: my-redis-cluster
Node Type: cache.t3.micro
Nodes: 2
Connections (7 days): 0
Monthly Savings: $24.82
AWS Reference: ElastiCache Pricing
Why HIGH confidence: Zero connections means no application is connected to this cache.
Idle Redshift Cluster
Confidence: HIGH
Detection: Zero queries for 14 days
Savings: Full cluster cost
How it works:
- We query CloudWatch for query-related metrics
- Clusters with no query activity for 14+ days are flagged
- Redshift charges continuously for cluster nodes
Example:
Cluster: my-data-warehouse
Node Type: dc2.large
Nodes: 2
Queries (14 days): 0
Monthly Savings: $360.00
AWS Reference: Redshift Pricing
Why HIGH confidence: Zero queries means no one is using this data warehouse.
Idle OpenSearch Domain
Confidence: HIGH
Detection: Zero requests for 14 days
Savings: Full domain cost
How it works:
- We query CloudWatch for OpenSearch request metrics
- Domains with zero requests for 14+ days are flagged
- OpenSearch charges continuously for instance hours
Example:
Domain: my-search-domain
Instance Type: t3.small.search
Instances: 2
Requests (14 days): 0
Monthly Savings: $51.10
AWS Reference: OpenSearch Pricing
Why HIGH confidence: Zero requests means no applications are using this search domain.
Oversized OpenSearch Domain
Confidence: MEDIUM
Detection: Average CPU < 20% and max CPU < 40% over 14 days with active search traffic
Savings: Estimated from stepping down instance type (or 30% conservative fallback)
How it works:
- We collect CloudWatch CPUUtilization (average and maximum) over 14 days
- Domains with sustained low CPU but non-zero search traffic are flagged
- A step-down instance type map estimates savings from rightsizing
- If the instance type isn't in the step-down map, a conservative 30% savings is estimated
Example:
Domain: oversized-search
Instance Type: r6g.xlarge.search
Instances: 3
Avg CPU: 12.0% Max CPU: 30.0%
Recommendation: Downsize to r6g.large.search
Monthly Savings: $368.00
AWS Reference: OpenSearch Pricing
Why MEDIUM confidence: Low CPU is a strong signal but workloads may have periodic spikes. Resize one dimension at a time and observe.
OpenSearch EBS Overprovisioned
Confidence: MEDIUM
Detection: Free storage > 60% with data growth < 0.1 GB/day (EBS-backed domains only)
Savings: Reclaimable GB × ~$0.10/GB/month (gp3 approximate rate)
How it works:
- We query CloudWatch FreeStorageSpace to compute the percentage of unused EBS storage
- We calculate the storage growth rate from a 14-day trend
- Domains with >60% free storage and flat growth (<0.1 GB/day) are flagged
- We keep 30% headroom and estimate reclaimable GB
Example:
Domain: logs-archive
EBS: gp3, 500 GB × 2 nodes (1000 GB total)
Free Storage: 75%
Growth: 0.02 GB/day
Reclaimable: ~450 GB
Monthly Savings: $45.00
AWS Reference: OpenSearch Pricing
Why MEDIUM confidence: Storage trends may change. Verify index lifecycle policies and retention needs before resizing.
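The reclaimable figure in the example can be reproduced if the 30% headroom is taken as a share of provisioned storage (75% free minus 30% headroom leaves 45% of 1,000 GB). That interpretation is an assumption drawn from the example, and the function names below are illustrative.

```python
from decimal import Decimal

GP3_RATE = Decimal("0.10")  # approximate $/GB-month, from the Savings line above


def reclaimable_gb(total_gb, free_pct, growth_gb_per_day, headroom_pct=30):
    """GB that could be released while keeping the headroom, or 0 if not flagged."""
    if free_pct <= 60 or growth_gb_per_day >= 0.1:
        return 0
    # Assumption: headroom is measured against provisioned storage.
    return total_gb * (free_pct - headroom_pct) // 100


def monthly_savings(gb):
    return (gb * GP3_RATE).quantize(Decimal("0.01"))


gb = reclaimable_gb(1000, free_pct=75, growth_gb_per_day=0.02)
print(gb, monthly_savings(gb))
```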
RI Opportunity: OpenSearch
Confidence: LOW
Detection: On-demand OpenSearch domain eligible for Reserved Instance purchase
Savings: Estimated annual savings from Cost Explorer RI recommendations
How it works:
- We query Cost Explorer for OpenSearch RI purchase recommendations
- Domains running on-demand that would benefit from a 1-year partial-upfront RI are flagged
- Savings estimates come directly from AWS Cost Explorer
Example:
Domain: production-search
Instance Type: r6g.2xlarge.search
Current: On-Demand
Recommendation: 1-year partial-upfront RI
Estimated Annual Savings: $4,800
AWS Reference: OpenSearch Reserved Instances
Why LOW confidence: RI commitments are non-refundable. Verify workload stability before purchasing.
Idle Neptune Cluster
Confidence: HIGH
Detection: Zero requests for 14 days
Savings: Full cluster cost
How it works:
- We query CloudWatch for Neptune request metrics
- Clusters with zero requests for 14+ days are flagged
- Neptune charges continuously for instance hours
Example:
Cluster: my-graph-database
Instance Type: db.t3.medium
Requests (14 days): 0
Monthly Savings: $59.86
AWS Reference: Neptune Pricing
Why HIGH confidence: Zero requests means no applications are using this graph database.
Neptune Serverless Migration Opportunity
Confidence: HIGH
Detection: Provisioned Neptune cluster with consistently low CPU utilization
Savings: 50–65% for low-utilization clusters
How it works:
- We identify provisioned Neptune clusters (not already Serverless)
- CloudWatch CPUUtilization is analyzed over 14 days
- Clusters with <20% avg CPU and <50% max CPU are flagged
- Savings are estimated by comparing provisioned cost vs. Serverless at 2 NCU average
- Only flags instances larger than db.t3.medium (Serverless floor $80.15/mo > t3.medium $59.86/mo)
Example:
Cluster: dev-graph-database
Instance Type: db.r6g.large
CPU Average (14 days): 8.3%
CPU Maximum (14 days): 15.2%
Provisioned Cost: $228.49/month
Estimated Serverless Cost: $160.30/month (2 NCU avg)
Monthly Savings: $68.19
Recommended Action: Create a new Neptune Serverless cluster, restore from snapshot, and update application endpoints. Verify your query language first: Serverless supports Gremlin and openCypher but NOT SPARQL.
AWS Reference: Neptune Serverless
Why HIGH confidence: CPU utilization and instance pricing are well-defined. The Serverless floor ($80.15/mo at 1 NCU idle) provides a clear comparison point. Only flags when provisioned cost significantly exceeds Serverless estimate.
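The comparison can be sketched with the monthly figures quoted in this section. This is an illustration, not the product's logic: `neptune_serverless_savings` is a hypothetical name, and using monthly cost as the test for "larger than db.t3.medium" is a simplifying assumption.

```python
from decimal import Decimal

NCU_MONTHLY = Decimal("80.15")        # Serverless floor at 1 NCU, from this section
T3_MEDIUM_MONTHLY = Decimal("59.86")  # db.t3.medium monthly cost, from this section


def neptune_serverless_savings(provisioned_monthly: str, cpu_avg, cpu_max, avg_ncu=2):
    """Return (estimated_serverless_cost, monthly_savings), or None if not flagged."""
    provisioned = Decimal(provisioned_monthly)
    if provisioned <= T3_MEDIUM_MONTHLY:
        return None  # db.t3.medium and below are cheaper than the Serverless floor
    if not (cpu_avg < 20 and cpu_max < 50):
        return None
    serverless = NCU_MONTHLY * avg_ncu
    savings = provisioned - serverless
    return (serverless, savings) if savings > 0 else None


print(neptune_serverless_savings("228.49", cpu_avg=8.3, cpu_max=15.2))
```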
Over-Provisioned Neptune Instance
Confidence: MEDIUM
Detection: Neptune instance with consistently low CPU utilization
Savings: 30–50% per right-sizing step
How it works:
- We analyze CloudWatch CPUUtilization for each provisioned Neptune cluster
- Clusters with <20% avg CPU and <40% max CPU are flagged
- Each over-provisioned instance is flagged with a target smaller instance class
- Savings calculated from published per-hour pricing difference
Example:
Instance: graph-writer-1 (db.r6g.xlarge)
Cluster: prod-graph
CPU Average (14 days): 12.1%
CPU Maximum (14 days): 22.5%
Current Cost: $456.98/month
Target: db.r6g.large ($228.49/month)
Monthly Savings: $228.49
Recommended Action: Modify the instance class via ModifyDBInstance. Brief downtime during change (5–15 minutes). Start with reader instances to minimize impact.
Click Fix This to downsize the instance with automatic backup snapshot and one-click rollback.
AWS Reference: Neptune Pricing
Why MEDIUM confidence: CPU is a good proxy but doesn't capture memory-bound graph traversals. Some workloads burst memory without high CPU. Validate with monitoring after downsizing.
Old Neptune Cluster Snapshot
Confidence: HIGH
Detection: Manual Neptune cluster snapshot older than 90 days
Savings: $0.021/GB-month per snapshot
How it works:
- We query Neptune for manual cluster snapshots (DescribeDBClusterSnapshots)
- Snapshots older than 90 days are flagged
- Only manual snapshots are considered; automated backups are managed by retention policy
Example:
Snapshot: pre-upgrade-snap-2025-06
Cluster: my-graph-db
Age: 280 days
Storage: 100 GB
Monthly Cost: $2.10
Recommended Action: Delete the snapshot if the data is no longer needed. Verify it's not a critical disaster recovery backup.
Click Fix This to delete the old snapshot. Note: snapshot deletion is irreversible.
AWS Reference: Neptune Pricing (Backup Storage)
Why HIGH confidence: Snapshot age is deterministic. Manual snapshots are created for specific events and rarely needed after 90 days.
Idle DocumentDB Cluster
Confidence: HIGH
Detection: Zero connections for 14 days
Savings: Full cluster cost
How it works:
- We query CloudWatch for DocumentDB connection metrics
- Clusters with zero connections for 14+ days are flagged
- DocumentDB charges continuously for instance hours
Example:
Cluster: my-docdb-cluster
Instance Type: db.t3.medium
Connections (14 days): 0
Monthly Savings: $59.86
AWS Reference: DocumentDB Pricing
Why HIGH confidence: Zero connections means no applications are connected to this database.
Old DocumentDB Snapshot
Confidence: MEDIUM
Detection: Manual snapshot older than 90 days (configurable)
Savings: ~$0.02/GB-month backup storage
How it works:
- Lists all manual DocumentDB cluster snapshots
- Groups snapshots by source cluster
- Protects the newest manual snapshot per cluster (disaster-recovery safety net)
- Skips snapshots tagged with retention, compliance, legal, or do-not-delete
- Flags remaining snapshots that exceed the age threshold (default 90 days)
Example:
Snapshot: my-docdb-cluster-backup-2024-01
Cluster: my-docdb-cluster
Age: 180 days (threshold: 90)
Storage: 50 GB
Monthly Savings: $1.00
AWS Reference: DocumentDB Pricing
Why MEDIUM confidence: Old snapshots probably aren't needed, but deletion is irreversible. Some may be kept for compliance.
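The filtering steps listed for this detector can be sketched as follows. The `Snapshot` record and `stale_snapshots` function are hypothetical stand-ins for API responses, and matching protected tags by tag key is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

PROTECTED_TAGS = {"retention", "compliance", "legal", "do-not-delete"}


@dataclass
class Snapshot:
    snapshot_id: str
    cluster_id: str
    created_at: datetime
    tags: set = field(default_factory=set)  # assumed to hold tag keys


def stale_snapshots(snapshots, now, max_age_days=90):
    """Return IDs of manual snapshots that are safe to review for deletion."""
    # Find the newest snapshot per cluster; it is always protected.
    newest = {}
    for snap in snapshots:
        current = newest.get(snap.cluster_id)
        if current is None or snap.created_at > current.created_at:
            newest[snap.cluster_id] = snap

    cutoff = now - timedelta(days=max_age_days)
    return [
        snap.snapshot_id
        for snap in snapshots
        if snap is not newest[snap.cluster_id]
        and not (snap.tags & PROTECTED_TAGS)
        and snap.created_at < cutoff
    ]
```

Protecting the newest snapshot before applying the age filter means a cluster with a single 400-day-old snapshot is never flagged, which matches the safety-net behavior described above.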
Overprovisioned DocumentDB Cluster
Confidence: MEDIUM
Detection: Avg CPU < 20%, Max CPU < 40%, connections < 50, IOPS < 100 over 14 days
Savings: ~30% of cluster cost (instance downsizing)
How it works:
- Fetches CloudWatch metrics for each DocumentDB cluster over 14 days
- Checks four signals: average CPU, peak CPU, connection count, and combined read+write IOPS
- If all metrics are below thresholds, the cluster is a rightsizing candidate
- Advisory-only in v1: no automated remediation actions are generated
Example:
Cluster: my-docdb-cluster
Instance Type: db.r5.2xlarge (2 instances)
Avg CPU: 8.2%
Max CPU: 15.0%
Connections: 12
IOPS: 30 (read: 20, write: 10)
Monthly Savings: $219.00 (30% of cluster cost)
AWS Reference: DocumentDB Pricing
Why MEDIUM confidence: Multi-signal analysis shows consistent under-utilization, but workload patterns may be bursty. Snapshot the cluster before resizing.
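The four-signal check can be written as a single predicate. This is a sketch using the thresholds from the Detection line; the function and key names are illustrative, and metric values are assumed to be pre-aggregated over the 14-day window.

```python
# Upper bounds from the Detection line; a cluster must be under all of them.
THRESHOLDS = {"avg_cpu": 20, "max_cpu": 40, "connections": 50, "iops": 100}


def is_rightsizing_candidate(avg_cpu, max_cpu, connections, read_iops, write_iops):
    """True only when every signal is below its threshold."""
    observed = {
        "avg_cpu": avg_cpu,
        "max_cpu": max_cpu,
        "connections": connections,
        "iops": read_iops + write_iops,  # combined read + write IOPS
    }
    return all(observed[name] < limit for name, limit in THRESHOLDS.items())


print(is_rightsizing_candidate(8.2, 15.0, 12, read_iops=20, write_iops=10))
```

Requiring all four signals to agree reduces false positives from clusters that are idle on CPU but busy on connections or I/O.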
Idle Timestream Database
Confidence: HIGH
Detection: Zero writes for 30 days
Savings: Storage costs only (writes are pay-per-use)
How it works:
- We query CloudWatch for Timestream write metrics
- Databases with no writes for 30+ days are flagged
- Timestream charges for memory store and magnetic store retention
Example:
Database: my-timeseries-db
Tables: 3
Writes (30 days): 0
Memory Store: 5GB
Monthly Savings: $6.25 (storage)
AWS Reference: Timestream Pricing
Why HIGH confidence: Zero writes means no new data is being ingested. Consider archiving or deleting.
Idle QLDB Ledger
Confidence: HIGH
Detection: Zero requests for 30 days
Savings: Storage and I/O costs
How it works:
- We query CloudWatch for QLDB request metrics
- Ledgers with zero requests for 30+ days are flagged
- QLDB charges for storage and I/O operations
Example:
Ledger: my-ledger
Requests (30 days): 0
Storage: 10GB
Monthly Savings: $2.50 (storage)
AWS Reference: QLDB Pricing
Why HIGH confidence: Zero requests means no applications are using this ledger.
Aurora to RDS Downgrade Opportunity
Confidence: MEDIUM
Detection: Aurora cluster with single instance, low I/O, and no Aurora-specific features
Savings: 20–40% by switching to RDS for the same engine
How it works:
- We identify Aurora clusters with a single db instance
- Clusters with low I/O throughput and no read replicas don't benefit from Aurora's architecture
- Standard RDS is cheaper for simple, single-instance workloads
Example:
Cluster: my-app-aurora
Engine: aurora-mysql
Instances: 1
Avg I/O: 50 IOPS
Monthly Savings: $85.00 (switch to RDS MySQL)
AWS Reference: Aurora Pricing vs RDS
Why MEDIUM confidence: Aurora provides higher availability and faster failover. Verify that the application doesn't rely on Aurora-specific features.
Oversized ElastiCache Cluster
Confidence: MEDIUM
Detection: ElastiCache cluster CPU utilization < 10% over 7 days with active connections
Savings: 40–50% from downsizing to smaller node type
How it works:
- We analyze ElastiCache CPU utilization and connection metrics via CloudWatch
- Clusters with consistently low CPU (< 10%) despite having connections are oversized
- Downsizing to a smaller node type reduces cost proportionally
Example:
Cluster: session-cache
Node Type: cache.r6g.xlarge
Nodes: 2
Avg CPU: 6% over 7 days
Recommended: cache.r6g.large
Monthly Savings: $198.00
AWS Reference: ElastiCache Pricing
Why MEDIUM confidence: Cache performance is sensitive to memory capacity. Verify hit rates after downsizing.
Oversized Redshift Cluster
Confidence: MEDIUM
Detection: Redshift cluster CPU utilization < 40% average over 7–14 days
Savings: 30–50% from resizing to fewer or smaller nodes
How it works:
- We analyze Redshift CPU utilization via CloudWatch over 7–14 days
- Clusters consistently below 40% CPU are likely over-provisioned
- Elastic resize or classic resize can reduce the node count
Example:
Cluster: analytics-dw
Node Type: ra3.xlplus
Nodes: 4
Avg CPU: 22% over 14 days
Recommended: 2 nodes
Monthly Savings: $1,460.00
AWS Reference: Redshift Pricing
Why MEDIUM confidence: Query workloads can be bursty. Analyze query queuing times before resizing.
Underutilized Redshift Cluster
Confidence: MEDIUM
Detection: Redshift cluster CPU < 40% with low query throughput over 14 days
Savings: 30–50% from downsizing or pausing during idle periods
How it works:
- We analyze both CPU utilization and query execution metrics
- Low CPU combined with low query throughput indicates genuine underutilization
- Consider pausing during off-hours or resizing to a smaller cluster
Example:
Cluster: reporting-dw
Node Type: ra3.xlplus
Nodes: 3
Avg CPU: 18%, Queries/day: 45
Monthly Savings: $1,095.00
AWS Reference: Redshift Pricing
Why MEDIUM confidence: Distinguishes from oversized_redshift by incorporating query volume. Still requires workload analysis.
Redshift Without Pause Schedule
Confidence: MEDIUM
Detection: Multi-node Redshift cluster with no scheduled pause action
Savings: 30–50% from pausing during non-business hours
How it works:
- We check for scheduled actions on Redshift clusters with > 2 nodes
- Clusters without pause/resume schedules run 24/7 even when unused at night
- Pausing during non-business hours (e.g., 8pm–8am) can halve the cost
Example:
Cluster: analytics-dw
Nodes: 4
Scheduled Actions: None
Usage Pattern: Business hours only
Monthly Savings: $1,460.00 (12hr/day pause)
AWS Reference: Redshift Pause and Resume
Why MEDIUM confidence: Some ETL jobs run overnight. Verify usage patterns before enabling pause schedules.
Redshift Legacy DC2 Nodes
Confidence: LOW
Detection: Redshift cluster running on DC2 node types
Savings: ~15–20% from migration to RA3 with managed storage
How it works:
- We check the node type of Redshift clusters
- DC2 nodes use local SSD storage and are a legacy architecture
- RA3 nodes offer managed storage (pay separately) and are typically cheaper overall
Example:
Cluster: legacy-dw
Node Type: dc2.large
Nodes: 4
Monthly Savings: $292.00 (migration to ra3.xlplus)
AWS Reference: Redshift Node Types
Why LOW confidence: Migration requires cluster resize and potential query plan changes. Test thoroughly.
Redshift Concurrency Scaling Waste
Confidence: MEDIUM
Detection: Concurrency scaling queues with underutilized scaling capacity
Savings: Reduce unused concurrency scaling cluster costs
How it works:
- We analyze Redshift concurrency scaling metrics
- Persistent scaling cluster activity indicates possible queue misconfiguration
- Optimizing WLM queues can reduce scaling cluster usage
Example:
Cluster: analytics-dw
Concurrency Scaling: Enabled
Avg Scaling Usage: 15% of capacity
Monthly Savings: $120.00
AWS Reference: Redshift Concurrency Scaling
Why MEDIUM confidence: Concurrency scaling is billed per-second. Savings depend on workload patterns.
Redshift Spectrum Heavy Usage
Confidence: LOW
Detection: High Redshift Spectrum (external table) query volume relative to cluster cost
Savings: Optimization via partitioning, compression, or moving data to native tables
How it works:
- We monitor Spectrum query costs relative to cluster costs
- High Spectrum usage may indicate data that should be loaded into native Redshift tables
- Spectrum charges per TB scanned, so partitioning and compression reduce scan volume
Example:
Cluster: analytics-dw
Spectrum Scans: 8 TB/month
Spectrum Cost: $40.00/month
Recommendation: Partition and compress external data
Monthly Savings: ~$20.00
AWS Reference: Redshift Spectrum Pricing
Why LOW confidence: External table architecture may be intentional. Requires understanding of data pipeline design.
Redshift WLM Over-Provisioned
Confidence: MEDIUM
Detection: WLM queues with > 3 slots and < 50% average slot utilization
Savings: ~30% compute savings from reducing slot allocation
How it works:
- We analyze Workload Management queue configuration and utilization
- Queues with excessive slot allocation relative to usage waste cluster resources
- Reducing from 4–5 slots to 2–3 improves per-query performance and reduces overhead
Example:
Cluster: analytics-dw
Queue: etl_queue
Slots Allocated: 5
Avg Utilization: 35%
Recommended: 2 slots
Monthly Savings: $438.00
AWS Reference: Redshift WLM Configuration
Why MEDIUM confidence: WLM tuning affects query queuing behavior. Test with representative workloads.
RI Opportunity: ElastiCache
Confidence: HIGH
Detection: AWS Cost Explorer recommendation for ElastiCache Reserved Instance
Savings: 40–55% discount vs On-Demand (exact AWS calculation)
How it works:
- AWS Cost Explorer analyzes 30 days of ElastiCache usage
- Identifies consistent node usage suitable for RI commitment
- Returns exact savings calculation for 1-year or 3-year terms
Example:
Node Type: cache.r6g.large
Current: On-Demand ($0.252/hr)
RI (1yr, No Upfront): $0.155/hr
Monthly Savings: $70.81
AWS Reference: ElastiCache Reserved Nodes
Why HIGH confidence: Uses AWS's own Cost Explorer analysis with exact pricing.
RI Opportunity: Redshift
Confidence: HIGH
Detection: AWS Cost Explorer recommendation for Redshift Reserved Instance
Savings: 25–40% discount vs On-Demand (exact AWS calculation)
How it works:
- AWS Cost Explorer analyzes 30 days of Redshift cluster usage
- Identifies consistent node usage suitable for RI commitment
- Returns exact savings calculation for 1-year or 3-year terms
Example:
Node Type: ra3.xlplus
Nodes: 4
Current: On-Demand ($1.086/hr)
RI (1yr): $0.750/hr
Monthly Savings: $980.00
AWS Reference: Redshift Reserved Nodes
Why HIGH confidence: Uses AWS's own Cost Explorer analysis with exact pricing.
RDS Extended Support Cost
Confidence: HIGH
Detection: RDS engine version in or approaching extended support window
Savings: vCPU-based surcharge ($0.10–$0.20 per vCPU-hour × 730 hours)
How it works:
- We check RDS engine versions against AWS lifecycle policy
- Versions past standard support incur per-vCPU extended support fees
- Warning is raised 90 days before a version enters extended support
Example:
Instance: mydb-production
Engine: mysql 5.7
Status: Extended support (since 2024-10)
vCPUs: 4
Monthly Surcharge: $292.00
Action: Upgrade to MySQL 8.0
AWS Reference: RDS Extended Support Pricing
Why HIGH confidence: Engine version and vCPU count are exact metadata. Surcharge rates are published by AWS.
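The lifecycle check and the surcharge arithmetic described above can be sketched as follows. This is a minimal illustration, not CloudWise's implementation: the function names, the status labels, and the 2024-10-01 start date are assumptions chosen to mirror the example.

```python
from datetime import date, timedelta

WARNING_WINDOW = timedelta(days=90)  # warn 90 days before extended support starts
HOURS_PER_MONTH = 730


def support_status(extended_support_start: date, today: date) -> str:
    """Classify an engine version relative to its extended support start date."""
    if today >= extended_support_start:
        return "extended_support"
    if extended_support_start - today <= WARNING_WINDOW:
        return "approaching"
    return "standard"


def monthly_surcharge(vcpus: int, rate_per_vcpu_hour: float) -> float:
    """Surcharge = vCPUs x per-vCPU-hour rate x hours in a month."""
    return round(vcpus * rate_per_vcpu_hour * HOURS_PER_MONTH, 2)


# Hypothetical start date matching "since 2024-10" in the example.
start = date(2024, 10, 1)
print(support_status(start, date(2024, 11, 1)))  # extended_support
print(monthly_surcharge(4, 0.10))                # 292.0
```

The per-vCPU rate is passed in rather than hard-coded because the published rate differs by engine and by how long the version has been in extended support.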
DocumentDB Extended Support Cost
Confidence: MEDIUM
Detection: DocumentDB cluster version in extended support
Savings: ~10% surcharge on base cluster cost
How it works:
- We check DocumentDB cluster versions against the AWS lifecycle policy
- Versions past standard support incur extended support surcharges
- Upgrading to a supported version eliminates the surcharge
Example:
Cluster: my-docdb
Engine: docdb 4.0
Status: Extended support
Monthly Surcharge: $45.00 (10% of base cost)
Action: Upgrade to DocumentDB 5.0
AWS Reference: DocumentDB Pricing
Why MEDIUM confidence: Extended support surcharge rate varies by engine version and instance type.
ElastiCache Extended Support Cost
Confidence: MEDIUM
Detection: ElastiCache engine version in extended support
Savings: ~10% surcharge on node hourly cost
How it works:
- We check ElastiCache engine (Redis/Valkey) versions against the lifecycle policy
- End-of-life versions incur extended support surcharges per node
- Upgrading to a current version eliminates the surcharge
Example:
Cluster: session-cache
Engine: Redis 6.0 (extended support)
Nodes: 3
Monthly Surcharge: $32.85 (10% of $328.50)
Action: Upgrade to Redis 7.x
AWS Reference: ElastiCache Pricing
Why MEDIUM confidence: Surcharge rates depend on version and node type. AWS publishes specific rates.
ElastiCache Replication Waste (Non-Production)
Confidence: HIGH
Detection: Non-production cluster with unnecessary replicas
Savings: Full replica node cost (doubles or triples cluster price)
How it works:
- We check ElastiCache replication groups for replica configuration
- Clusters tagged with non-production environment names (dev, staging, test, sandbox) are flagged if they have replicas
- Each replica is a full node at the same hourly rate as the primary
Example:
Replication Group: dev-session-cache
Environment: dev (tag)
Node Type: cache.r6g.large
Shards: 1, Replicas per Shard: 2
Replica Cost: 2 × $0.164/hr × 730 hrs = $239.44/month
Monthly Savings: $239.44
AWS Reference: ElastiCache Pricing
Why HIGH confidence: Non-production environments don't need high-availability replicas. The environment tag clearly identifies the cluster as non-production.
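A rough sketch of the tag check and replica cost calculation follows. The dictionary shape, the `Environment` tag key, and the function name are hypothetical; only the environment names, the hourly rate, and the 730-hour month come from the text above.

```python
NON_PROD_ENVS = {"dev", "staging", "test", "sandbox"}
HOURS_PER_MONTH = 730


def replica_waste(group: dict, hourly_rate: float) -> float:
    """Monthly replica cost for a non-production group, or 0.0 if not flagged.

    `group` is a hypothetical, simplified record with keys
    'tags', 'shards', and 'replicas_per_shard'.
    """
    env = group.get("tags", {}).get("Environment", "").lower()
    replicas = group["shards"] * group["replicas_per_shard"]
    if env not in NON_PROD_ENVS or replicas == 0:
        return 0.0
    # Each replica is billed at the same hourly rate as the primary.
    return round(replicas * hourly_rate * HOURS_PER_MONTH, 2)


dev_group = {"tags": {"Environment": "dev"}, "shards": 1, "replicas_per_shard": 2}
prod_group = {"tags": {"Environment": "prod"}, "shards": 1, "replicas_per_shard": 2}

print(replica_waste(dev_group, 0.164))   # 239.44
print(replica_waste(prod_group, 0.164))  # 0.0
```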
ElastiCache Valkey Migration Savings
Confidence: HIGH
Detection: Redis OSS or Memcached cluster eligible for Valkey migration
Savings: 20% of current cluster cost (node-based) or 33% (Serverless)
How it works:
- We check the engine type of each ElastiCache cluster
- Redis OSS and Memcached clusters are flagged as migration candidates
- Valkey is permanently 20% cheaper for node-based and 33% cheaper for Serverless, with Redis 7.x API compatibility
Example:
Cluster: session-cache
Engine: Redis OSS 7.0.7
Node Type: cache.r7g.xlarge × 2 nodes
Current Cost: $0.437/hr × 2 × 730 = $638.02/month
Valkey Cost: $0.3496/hr × 2 × 730 = $510.42/month
Monthly Savings: $127.60 (20%)
AWS Reference: ElastiCache Pricing
Why HIGH confidence: The 20% discount is a permanent pricing structure. Valkey is API-compatible with Redis 7.x. Existing Redis OSS reservations apply to Valkey.
ElastiCache Serverless Optimization
Confidence: MEDIUM
Detection: Node-based cluster with spiky traffic pattern
Savings: Varies (eliminates over-provisioned capacity during low-traffic periods)
How it works:
- We analyze CPU utilization and connection metrics via CloudWatch over 30 days
- Clusters with high traffic variance (low average, high peaks) are flagged
- Serverless eliminates paying for idle capacity and auto-scales for peaks
Example:
Cluster: api-cache
Node Type: cache.r6g.xlarge × 2 nodes
Avg CPU: 8.5% (over-provisioned)
Peak CPU: 72% (needs capacity for bursts)
Connection CV: 3.2 (highly variable)
Current Cost: $478.00/month
Estimated Serverless: $180.00/month
Monthly Savings: ~$298.00
AWS Reference: ElastiCache Serverless Pricing
Why MEDIUM confidence: Serverless cost depends heavily on actual ECPU consumption, which varies by workload. Estimate is conservative. Not available for Memcached.
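The "Connection CV" in the example is a coefficient of variation (standard deviation divided by mean). The sketch below shows one way to compute it and combine it with the CPU signals. The thresholds in `is_spiky` are illustrative assumptions, not CloudWise's actual values.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples: list) -> float:
    """Population standard deviation divided by the mean (0.0 for a zero mean)."""
    avg = mean(samples)
    if avg == 0:
        return 0.0
    return pstdev(samples) / avg


def is_spiky(cpu_avg: float, cpu_peak: float, conn_cv: float,
             max_avg: float = 20.0, min_peak: float = 50.0,
             min_cv: float = 1.0) -> bool:
    """Low average CPU, high peak CPU, and highly variable connections.

    All three thresholds are hypothetical defaults for illustration.
    """
    return cpu_avg < max_avg and cpu_peak > min_peak and conn_cv > min_cv


# Values from the api-cache example above.
print(is_spiky(cpu_avg=8.5, cpu_peak=72.0, conn_cv=3.2))  # True
# A perfectly flat connection series has no variation.
print(coefficient_of_variation([10, 10, 10]))             # 0.0
```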
ElastiCache Data Tiering Opportunity
Confidence: MEDIUM
Detection: Memory-only cluster eligible for R6gd data tiering
Savings: Up to 52% for large datasets (fewer nodes needed)
How it works:
- We identify clusters using memory-optimized R5/R6g/R7g nodes at xlarge size or larger
- R6gd nodes combine memory + NVMe SSD with ~5× total storage capacity
- Multi-node clusters can often consolidate to fewer R6gd nodes
Example:
Cluster: large-dataset-cache
Current: 4 × cache.r6g.16xlarge ($5.254/hr each)
Current Cost: $15,342/month
With Data Tiering: 1 × cache.r6gd.16xlarge ($9.98/hr)
Data Tiering Cost: $7,285/month
Monthly Savings: $8,057 (52%)
AWS Reference: ElastiCache Data Tiering
Why MEDIUM confidence: Savings depend on data access patterns. Best for workloads where < 20% of data is accessed frequently. SSD-resident data has slightly higher first-access latency. Not available with Serverless.
OpenSearch Extended Support Cost
Confidence: MEDIUM
Detection: OpenSearch/Elasticsearch version in extended support
Savings: ~10% surcharge on domain hourly cost
How it works:
- We check OpenSearch domain engine versions against the lifecycle policy
- Legacy Elasticsearch versions (5.x, 6.x) and older OpenSearch versions may incur surcharges
- Upgrading to a current OpenSearch version eliminates the surcharge
Example:
Domain: search-logs
Engine: Elasticsearch 7.10 (extended support)
Instances: 3 × r6g.large.search
Monthly Surcharge: $56.00
Action: Upgrade to OpenSearch 2.x
AWS Reference: OpenSearch Pricing
Why MEDIUM confidence: Extended support for OpenSearch is versioned and may vary by instance type.
Network Detectors
Unattached Elastic IP
Confidence: HIGH
Detection: describe_addresses where no InstanceId or NetworkInterfaceId
Savings: $3.65/month ($0.005/hour × 730 hours)
How it works:
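- Since February 2024, AWS bills every public IPv4 address, including Elastic IPs, at $0.005/hour whether or not it is attached
- An unattached EIP incurs that charge without serving any resource
- This is a fixed AWS price, no estimation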
Example:
Elastic IP: 52.123.45.67
Attached: No
Monthly Savings: $3.65
AWS Reference: VPC Pricing (Public IPv4)
Why HIGH confidence: Binary detection (attached or not) with fixed AWS pricing.
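The binary check can be sketched as a filter over an EC2 `describe_addresses` response. The function name and the sample data are hypothetical; the field names (`Addresses`, `PublicIp`, `InstanceId`, `NetworkInterfaceId`) are those returned by the EC2 API.

```python
def find_unattached_eips(response: dict) -> list:
    """Return public IPs that have neither an InstanceId nor a NetworkInterfaceId."""
    unattached = []
    for address in response.get("Addresses", []):
        if not address.get("InstanceId") and not address.get("NetworkInterfaceId"):
            unattached.append(address.get("PublicIp"))
    return unattached


# Hypothetical sample shaped like a describe_addresses response.
sample_response = {
    "Addresses": [
        {"PublicIp": "52.123.45.67", "AllocationId": "eipalloc-0aaa"},
        {"PublicIp": "52.123.45.68", "AllocationId": "eipalloc-0bbb",
         "InstanceId": "i-0123456789abcdef0",
         "NetworkInterfaceId": "eni-0123456789abcdef0"},
    ]
}

print(find_unattached_eips(sample_response))  # ['52.123.45.67']
```

In a live account the response would come from something like `boto3.client("ec2").describe_addresses()`; the filter itself needs no AWS access, which is why it also works on exported data.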
Idle NAT Gateway
Confidence: HIGH
Detection: CloudWatch BytesOutToDestination = 0 for 7 days
Savings: $32.40/month (base charge only, $0.045/hour × 720 hours)
How it works:
- NAT Gateways have a fixed hourly charge of $0.045/hour
- We check CloudWatch for any traffic over 7 days
- Zero traffic = wasting the base charge
Example:
NAT Gateway: nat-0abc123def456
Traffic (7 days): 0 bytes
Monthly Savings: $32.40
AWS Reference: VPC Pricing (NAT Gateway)
Why HIGH confidence: CloudWatch confirms zero usage. Fixed AWS pricing.
Idle Load Balancer
Confidence: HIGH
Detection: No healthy targets via describe_target_health
Savings: ~$16.20/month (ALB/NLB base charge)
How it works:
- ALB/NLB have a base charge of ~$0.0225/hour
- If no healthy targets, the load balancer isn't serving traffic
- You're paying for infrastructure that does nothing
Example:
Load Balancer: my-unused-alb
Healthy Targets: 0
Monthly Savings: $16.20
AWS Reference: Elastic Load Balancing Pricing
Why HIGH confidence: No healthy targets = definitely not serving requests.
Low-Traffic Load Balancer
Confidence: MEDIUM
Detection: <100 requests in 14 days with healthy targets
Savings: ~$16.20/month (ALB/NLB base charge)
How it works:
- Checks ALBs/NLBs that have registered healthy targets
- Queries CloudWatch RequestCount over 14 days
- Flags if total requests < 100 (approximately <7/day)
- Excludes LBs with zero targets (caught by idle_load_balancer)
Example:
Load Balancer: staging-api-alb (application)
Healthy Targets: 3
Requests (14 days): 42
Monthly Savings: $16.20
AWS Reference: Elastic Load Balancing Pricing
Why MEDIUM confidence: Low traffic may be intentional (internal health checks, canary endpoints). Review before deleting.
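The split between the idle and low-traffic detectors can be expressed as a small decision function. This is a sketch under stated assumptions: `idle_load_balancer` is named in the text, while the `low_traffic_load_balancer` label and the function itself are hypothetical.

```python
def classify_load_balancer(healthy_targets: int, requests_14d: int,
                           threshold: int = 100) -> str:
    """Decide which detector, if any, applies to a load balancer."""
    if healthy_targets == 0:
        # No healthy targets: handled by the idle detector, not this one.
        return "idle_load_balancer"
    if requests_14d < threshold:
        return "low_traffic_load_balancer"
    return "ok"


print(classify_load_balancer(healthy_targets=3, requests_14d=42))   # low_traffic_load_balancer
print(classify_load_balancer(healthy_targets=0, requests_14d=0))    # idle_load_balancer
print(classify_load_balancer(healthy_targets=3, requests_14d=900))  # ok
```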
High LCU Cost ALB
Confidence: LOW
Detection: Average ConsumedLCUs cost exceeds 2× ALB base fee
Savings: ~30% of estimated LCU overage (conservative)
How it works:
- Queries CloudWatch ConsumedLCUs metric over 7 days
- Calculates estimated monthly LCU cost: avg_lcus × $0.008 × 730 hours
- Flags if LCU cost > 2× base cost ($32.40)
- Suggests reviewing NLB migration or architecture changes
Example:
Load Balancer: prod-websocket-alb (application)
Average LCUs: 12.4
Estimated LCU Cost: $72.41/month
Base Fee: $16.43/month
Total: $88.84/month
Savings Potential: ~$21.72/month (30% of LCU cost)
AWS Reference: ALB Pricing (LCU Details)
Why LOW confidence: High LCU usage may be legitimate for the workload. This is an advisory detector: the team should review whether the architecture is optimal.
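The LCU arithmetic in this section can be reproduced with a short sketch. The function name and return shape are hypothetical, and the default base fee uses the $16.43 figure from the example, which is an assumption about which base value the detector applies.

```python
LCU_HOURLY_PRICE = 0.008   # $ per LCU-hour, from the formula above
HOURS_PER_MONTH = 730


def assess_lcu_cost(avg_lcus: float, base_monthly: float = 16.43,
                    savings_fraction: float = 0.30) -> dict:
    """Estimate monthly LCU cost and flag it when it exceeds 2x the base fee."""
    lcu_cost = avg_lcus * LCU_HOURLY_PRICE * HOURS_PER_MONTH
    flagged = lcu_cost > 2 * base_monthly
    savings = lcu_cost * savings_fraction if flagged else 0.0
    return {
        "lcu_cost": lcu_cost,
        "total": lcu_cost + base_monthly,
        "flagged": flagged,
        "savings": round(savings, 2),
    }


result = assess_lcu_cost(12.4)
print(result["flagged"])  # True
print(result["savings"])  # 21.72
```

The unrounded LCU cost for 12.4 LCUs is about 72.416, so the $72.41 shown in the example corresponds to truncating rather than rounding to the cent.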
Classic Load Balancer Migration
Confidence: LOW
Detection: Load balancer type is classic
Savings: ~$1.82/month base + consolidation savings
How it works:
- Detects any Classic Load Balancer (CLB)
- Recommends migration to ALB (HTTP/HTTPS) or NLB (TCP/TLS)
- CLBs cost $0.025/hr ($18.25/mo) vs ALBs at $0.0225/hr ($16.43/mo)
- Consolidating multiple CLBs into one ALB with routing rules saves significantly more
Example:
Load Balancer: legacy-web-clb (classic)
Scheme: internet-facing
Migration Target: ALB (for HTTP/HTTPS workloads)
Monthly Savings: $1.82 (base) + consolidation potential
AWS Reference: Classic Load Balancer Migration Guide
Why LOW confidence: Migration requires testing. This is a housekeeping recommendation, similar to deprecated runtime detection.
Unused CloudFront Distribution
Confidence: HIGH
Detection: Zero requests for 30 days
Savings: Minimal base cost, prevents future data transfer charges
How it works:
- We query CloudWatch for CloudFront request metrics
- Distributions with zero requests for 30+ days are flagged
- CloudFront has minimal costs when idle, but active distributions incur data transfer fees
Example:
Distribution: E1ABCDEFGH2IJK
Origin: my-old-bucket.s3.amazonaws.com
Requests (30 days): 0
Monthly Savings: ~$0.01 (housekeeping)
AWS Reference: CloudFront Pricing
Why HIGH confidence: Zero requests means no users are accessing content through this CDN.
Unused Route53 Hosted Zone
Confidence: HIGH
Detection: Hosted zone with only NS and SOA records
Savings: $0.50/month per zone
How it works:
- We list all Route53 hosted zones
- Zones with only default NS and SOA records (no custom DNS records) are flagged
- Empty zones still incur the $0.50/month hosting fee
Example:
Hosted Zone: old-domain.example.com
Records: 2 (NS, SOA only)
Monthly Savings: $0.50
AWS Reference: Route53 Pricing
Why HIGH confidence: Binary detection. Zone either has custom records or it doesn't.
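A sketch of the empty-zone test over a Route 53 `list_resource_record_sets` result follows. The function name and sample data are hypothetical. The sketch also requires the NS and SOA records to sit at the zone apex, which is an added assumption so that NS delegations for subdomains count as custom records.

```python
DEFAULT_RECORD_TYPES = {"NS", "SOA"}


def is_empty_zone(zone_name: str, record_sets: list) -> bool:
    """True when the zone holds only its default apex NS and SOA records."""
    apex = zone_name.rstrip(".").lower()
    for record in record_sets:
        name = record["Name"].rstrip(".").lower()
        if record["Type"] not in DEFAULT_RECORD_TYPES or name != apex:
            return False
    return True


# Hypothetical samples shaped like ResourceRecordSets entries.
empty = [
    {"Name": "old-domain.example.com.", "Type": "NS"},
    {"Name": "old-domain.example.com.", "Type": "SOA"},
]
in_use = empty + [{"Name": "www.old-domain.example.com.", "Type": "A"}]

print(is_empty_zone("old-domain.example.com.", empty))   # True
print(is_empty_zone("old-domain.example.com.", in_use))  # False
```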
Unused Global Accelerator
Confidence: HIGH
Detection: Zero traffic for 30 days
Savings: ~$18.00/month (base hourly charge)
How it works:
- We query CloudWatch for accelerator traffic metrics
- Accelerators with zero traffic for 30+ days are flagged
- Global Accelerator charges $0.025/hour regardless of traffic
Example:
Accelerator: arn:aws:globalaccelerator::123456789:accelerator/abc123
Traffic (30 days): 0
Monthly Savings: $18.25
AWS Reference: Global Accelerator Pricing
Why HIGH confidence: Zero traffic means no applications are using this accelerator.
Idle Global Accelerator
Confidence: HIGH
Detection: Deployed accelerator with endpoints but zero processed bytes for 30 days
Savings: ~$25.55/month (accelerator + 2 IPv4 addresses)
How it works:
- We enumerate all Global Accelerators and their listeners/endpoint groups
- For accelerators with endpoints, we check CloudWatch ProcessedBytesIn and ProcessedBytesOut
- Accelerators with zero traffic for 30+ days despite having endpoints are flagged
Example:
Accelerator: arn:aws:globalaccelerator::123456789:accelerator/idle-abc
Endpoints: 3 (across 2 listeners)
ProcessedBytesIn (30d): 0
ProcessedBytesOut (30d): 0
Monthly Savings: $25.55
AWS Reference: Global Accelerator Pricing
Why HIGH confidence: Zero traffic with active endpoints indicates the accelerator is not used in any request path.
Disabled Global Accelerator
Confidence: HIGH
Detection: Accelerator with Enabled=false still incurring fixed hourly charges
Savings: ~$25.55/month (accelerator + 2 IPv4 addresses)
How it works:
- We check the Enabled flag on each deployed Global Accelerator
- Disabled accelerators continue to reserve static anycast IPs and incur hourly charges
- Accelerators that are disabled but not deleted are flagged
Example:
Accelerator: arn:aws:globalaccelerator::123456789:accelerator/disabled-xyz
Status: DEPLOYED
Enabled: false
IP Addresses: [75.2.x.x, 99.83.x.x]
Monthly Savings: $25.55
AWS Reference: Global Accelerator Pricing
Why HIGH confidence: A disabled accelerator cannot route traffic and serves no purpose while still incurring charges.
Orphaned DNS Record
Confidence: MEDIUM
Detection: Route 53 DNS record pointing to a resource that no longer exists
Savings: ~$0.50/month per stale hosted zone record (cleanup finding)
How it works:
- We scan Route 53 hosted zones for alias and CNAME records
- Target resources (EC2, ELB, CloudFront) are verified against current inventory
- Records pointing to deregistered or terminated resources are flagged
Example:
Hosted Zone: example.com
Record: api.example.com (CNAME)
Target: old-lb-123.us-east-1.elb.amazonaws.com (DELETED)
Action: Remove or update DNS record
AWS Reference: Route 53 Pricing
Why MEDIUM confidence: Orphaned DNS records may be waiting for a replacement resource. Verify with the application team. Also a security concern (dangling DNS can be hijacked).
Unused VPC Endpoint
Confidence: HIGH
Detection: VPC endpoint with zero data processed in 30 days
Savings: Interface endpoints cost $0.01/GB + $0.01/hour/AZ (~$7.30/month/AZ)
How it works:
- We check VPC endpoint data processing metrics over 30 days
- Interface endpoints with zero bytes processed are unused
- Gateway endpoints (S3, DynamoDB) are free; only Interface endpoints incur charges
Example:
Endpoint: vpce-0abc123def456
Service: com.amazonaws.us-east-1.sqs
Type: Interface
AZs: 3
Data Processed (30d): 0 bytes
Monthly Savings: $21.90 ($7.30/AZ × 3)
AWS Reference: VPC Endpoint Pricing
Why HIGH confidence: Zero data processed definitively indicates no traffic through this endpoint.
Serverless Detectors
Unused Lambda Functions
Confidence: HIGH (detection), but LOW savings
Detection: CloudWatch Invocations = 0 for 30 days
Savings: Minimal (~$0.01-$0.08/month for storage)
Important Note: Lambda is pay-per-use. Unused functions cost essentially $0 for execution. We flag these for cleanup purposes, not cost savings.
Example:
Function: my-old-function
Invocations (30 days): 0
Monthly Savings: $0.01 (storage only)
Purpose: Housekeeping
AWS Reference: Lambda Pricing
Why flagged: Unused functions clutter your account and may contain outdated code or security issues.
Deprecated Lambda Runtime
Confidence: LOW
Detection: Function running on a deprecated or end-of-life runtime
Savings: Minimal ($0.01, housekeeping flag)
Important Note: This is a compliance and security detector, not a cost-savings detector. Deprecated runtimes may lose security patches and eventually become unsupported.
Deprecated Runtimes Detected:
- python3.8: EOL October 2024
- python3.7: EOL November 2023
- python2.7: EOL July 2022
- nodejs16.x: EOL June 2024
- nodejs14.x: EOL November 2023
- nodejs12.x: EOL March 2023
- dotnet6: EOL February 2025
- ruby2.7: EOL December 2023
- java8: EOL December 2023
Example:
Function: my-legacy-handler
Runtime: python3.7
Status: EOL since November 2023
Monthly Savings: $0.01 (housekeeping)
Recommendation: Upgrade to python3.12 or later
AWS Reference: Lambda Runtime Support Policy
Why LOW confidence: Runtime upgrades may require code changes. Manual testing is essential before upgrading.
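The runtime check amounts to a lookup against the list above. In this sketch the mapping copies the EOL dates from that list verbatim; the function name and the shape of the returned finding are hypothetical.

```python
# EOL dates copied from the "Deprecated Runtimes Detected" list above.
DEPRECATED_RUNTIMES = {
    "python3.8": "October 2024",
    "python3.7": "November 2023",
    "python2.7": "July 2022",
    "nodejs16.x": "June 2024",
    "nodejs14.x": "November 2023",
    "nodejs12.x": "March 2023",
    "dotnet6": "February 2025",
    "ruby2.7": "December 2023",
    "java8": "December 2023",
}


def check_runtime(function_name: str, runtime: str):
    """Return a finding dict for a deprecated runtime, or None if not listed."""
    eol = DEPRECATED_RUNTIMES.get(runtime)
    if eol is None:
        return None
    return {
        "function": function_name,
        "runtime": runtime,
        "status": f"EOL since {eol}",
        "confidence": "LOW",
    }


print(check_runtime("my-legacy-handler", "python3.7")["status"])  # EOL since November 2023
print(check_runtime("my-new-handler", "python3.12"))              # None
```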
Unused API Gateway
Confidence: HIGH (detection), but LOW savings
Detection: CloudWatch Count = 0 for 30 days
Savings: Minimal ($0.01) unless caching is enabled
Important Note: API Gateway is pay-per-request. Unused APIs cost $0 for requests. Only caching adds fixed costs (~$14/month for 0.5GB cache).
AWS Reference: API Gateway Pricing
Unused AppSync API
Confidence: HIGH (detection), but LOW savings
Detection: Zero GraphQL queries for 30 days
Savings: Minimal (AppSync is pay-per-request)
How it works:
- We query CloudWatch for AppSync request metrics
- APIs with zero queries for 30+ days are flagged
- AppSync charges per query/mutation, so unused APIs cost $0
Example:
API: my-graphql-api
Queries (30 days): 0
Monthly Savings: $0.00
Purpose: Housekeeping
AWS Reference: AppSync Pricing
Why flagged: Unused APIs clutter your account and may have unnecessary permissions.
Idle AppSync Cache
Confidence: MEDIUM
Detection: Cache with < 100 hits in 14 days
Savings: $32–$4,878/month depending on cache size
How it works:
- We enumerate all AppSync APIs and check for attached caches via GetApiCache
- For each cache, we query CloudWatch CacheHitCount and CacheMissCount over 14 days
- Caches with fewer than 100 total hits are flagged as idle
- Monthly savings are calculated from the cache instance type's hourly rate
Cache pricing (hourly):
| Type | $/hr |
|---|---|
| SMALL (t2.small) | $0.044 |
| MEDIUM (r4.large) | $0.182 |
| LARGE (r4.xlarge) | $0.365 |
| XLARGE (r4.2xlarge) | $0.730 |
| LARGE_2X (r4.4xlarge) | $1.461 |
| LARGE_4X (r4.8xlarge) | $2.921 |
| LARGE_8X | $4.339 |
| LARGE_12X | $6.775 |
Example:
API: my-graphql-api
Cache Type: MEDIUM (r4.large)
Cache Hits (14 days): 12
Monthly Savings: $131.04
AWS Reference: AppSync Caching
Why flagged: AppSync caches incur hourly charges regardless of usage. An idle cache provides no performance benefit while adding cost.
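The savings figure can be derived from the pricing table with a short sketch. The hourly rates are copied from the table above; the function name is hypothetical, and the 720-hour month is an assumption inferred from the $131.04 example (0.182 × 720).

```python
from decimal import Decimal

HOURS_PER_MONTH = 720  # assumption inferred from the $131.04 example

# Hourly rates copied from the cache pricing table above.
CACHE_HOURLY_RATE = {
    "SMALL": Decimal("0.044"),
    "MEDIUM": Decimal("0.182"),
    "LARGE": Decimal("0.365"),
    "XLARGE": Decimal("0.730"),
    "LARGE_2X": Decimal("1.461"),
    "LARGE_4X": Decimal("2.921"),
    "LARGE_8X": Decimal("4.339"),
    "LARGE_12X": Decimal("6.775"),
}


def idle_cache_savings(cache_type: str, hits_14d: int, min_hits: int = 100):
    """Return the monthly cost of an idle cache, or None if it is in use."""
    if hits_14d >= min_hits:
        return None
    monthly = CACHE_HOURLY_RATE[cache_type] * HOURS_PER_MONTH
    return monthly.quantize(Decimal("0.01"))


print(idle_cache_savings("MEDIUM", hits_14d=12))    # 131.04
print(idle_cache_savings("MEDIUM", hits_14d=5000))  # None
```

`Decimal` is used so the result matches the published cents exactly instead of depending on floating-point rounding.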
Idle AppSync Subscriptions
Confidence: MEDIUM
Detection: Active WebSocket connections with < 100 GraphQL requests in 14 days
Savings: Estimated from connection-minutes ($0.08 per million connection-minutes)
How it works:
- We check CloudWatch ConnectSuccess metric to identify APIs with active WebSocket connections
- For those APIs, we check Latency (SampleCount) to see if actual GraphQL operations are occurring
- APIs with connections but fewer than 100 requests are flagged
- Monthly cost is estimated from ActiveConnections average × connection-minutes
Example:
API: my-realtime-api
Active Connections (avg): 150
GraphQL Requests (14 days): 23
Estimated Monthly Cost: $5.26
AWS Reference: AppSync Real-time Subscriptions
Why flagged: WebSocket connections that rarely receive data suggest abandoned or misconfigured subscription clients, wasting connection-minute charges.
Idle Step Functions State Machine
Confidence: HIGH (detection), but LOW savings
Detection: Zero executions for 30 days
Savings: Minimal (Step Functions is pay-per-transition)
How it works:
- We query CloudWatch for Step Functions execution metrics
- State machines with zero executions for 30+ days are flagged
- Step Functions charges per state transition, so idle machines cost $0
Example:
State Machine: my-workflow
Executions (30 days): 0
Monthly Savings: $0.00
Purpose: Housekeeping
AWS Reference: Step Functions Pricing
Why flagged: Unused workflows may be obsolete or indicate broken integrations.
Step Functions Retry Storm
Confidence: HIGH
Detection: Retry ratio > 25% AND failure rate > 20% over 14 days
Savings: Estimated from avoidable transition volume
How it works:
- We collect CloudWatch metrics: ExecutionsFailed, ExecutionsTimedOut, StateTransition
- We estimate the proportion of transitions wasted on retry/failure paths
- State machines exceeding both thresholds are flagged
Example:
State Machine: order-processor
Failure Rate: 35%
Retry Ratio: 28% of transitions
Estimated Wasted Transitions: 14,000/month
Estimated Savings: $0.35/month
AWS Reference: Step Functions Pricing
Why flagged: Retry loops multiply transitions and downstream compute invocations without business value.
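The dual-threshold check and the savings estimate can be sketched as below. The function names are hypothetical, and the per-transition price assumes Standard Workflows at $0.025 per 1,000 state transitions, which is consistent with the $0.35 figure in the example.

```python
PRICE_PER_TRANSITION = 0.000025  # assumes Standard Workflows: $0.025 per 1,000 transitions


def is_retry_storm(retry_ratio: float, failure_rate: float) -> bool:
    """Flag only when BOTH thresholds are exceeded (ratios given as fractions)."""
    return retry_ratio > 0.25 and failure_rate > 0.20


def wasted_transition_cost(wasted_transitions: int) -> float:
    """Monthly cost of transitions spent on retry and failure paths."""
    return round(wasted_transitions * PRICE_PER_TRANSITION, 2)


# Values from the order-processor example above.
print(is_retry_storm(retry_ratio=0.28, failure_rate=0.35))  # True
print(wasted_transition_cost(14_000))                        # 0.35
```

The direct transition savings are small; as the "Why flagged" note says, the larger cost is usually the downstream compute that each retry invokes.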
High Transition Density Workflow
Confidence: MEDIUM
Detection: Average transitions per successful execution > 50 (configurable)
Savings: Estimated from reducible transitions
How it works:
- We compute average transitions per successful execution from CloudWatch
- Workflows with excessive state granularity are flagged
- Conservative 30% reduction estimate applied for savings
Example:
State Machine: data-pipeline
Avg Transitions per Success: 85
Monthly Executions: 2,000
Estimated Savings: $1.28/month
AWS Reference: Step Functions Pricing
Why flagged: Excessive state granularity inflates per-execution cost. Collapsing pass states and simplifying branching reduces transitions.
Express Workflow Duration Waste
Confidence: MEDIUM
Detection: Express workflow with p95 duration > 30s and high execution volume
Savings: Estimated from reducible duration component
How it works:
- We identify Express workflows via describe-state-machine
- We collect ExecutionTime CloudWatch metrics (Average, Maximum)
- Workflows with persistently high duration and sufficient volume are flagged
Example:
State Machine: realtime-processor (EXPRESS)
p95 Duration: 45,000ms
Monthly Executions: 15,000
Estimated Savings: $0.50/month
AWS Reference: Step Functions Pricing
Why flagged: Express billing includes per-request + duration charges. Unnecessary waits and payload overhead inflate duration costs.
Idle Transfer Family Server
Confidence: HIGH
Detection: Zero file transfers for 30 days
Savings: $0.30/hour (~$216/month) base charge
How it works:
- We query CloudWatch for Transfer Family file operation metrics
- Servers with zero transfers for 30+ days are flagged
- Transfer Family charges $0.30/hour regardless of activity
Example:
Server: s-0abc123def456
Protocol: SFTP
File Transfers (30 days): 0
Monthly Savings: $216.00
AWS Reference: Transfer Family Pricing
Why HIGH confidence: Zero file transfers means no clients are using this SFTP/FTPS server.
Idle Transfer Family Server: Zero Activity
Confidence: HIGH
Detection: Server has configured users but zero file transfers for 30 days
Savings: $216/month per protocol ($0.30/hr × 720 hrs)
How it works:
- We query Transfer Family for servers with configured users
- We check CloudWatch FilesIn/FilesOut metrics for the last 30 days
- Servers with users but zero transfers are flagged; this is the most common Transfer Family waste pattern
Example:
Server: s-0abc123def456
Users: 3
Protocols: SFTP
File Transfers (30 days): 0
Monthly Savings: $216.00
AWS Reference: Transfer Family Pricing
Why HIGH confidence: Configured users with zero transfers for 30+ days strongly indicates the server is abandoned.
Unused Protocol on Transfer Family Server
Confidence: MEDIUM
Detection: Multi-protocol server with one or more protocols showing zero transfers for 30 days
Savings: $216/month per unused protocol
How it works:
- We identify servers with multiple protocols enabled (SFTP, FTPS, FTP, AS2)
- We check CloudWatch metrics per protocol for the last 30 days
- Protocols with zero transfers are flagged; each unused protocol costs $216/month
Example:
Server: s-0abc123def456
Protocols: SFTP, FTPS
Active: SFTP (142 transfers)
Unused: FTPS (0 transfers)
Monthly Savings: $216.00
AWS Reference: Transfer Family Pricing
Why MEDIUM confidence: Removing a protocol is reversible but could break clients configured for that protocol.
Idle Transfer Family Web App
Confidence: HIGH
Detection: Web App with zero active sessions for 30 days
Savings: $360/month per provisioned unit ($0.50/hr × 720 hrs)
How it works:
- We query Transfer Family Web Apps and check provisioned units
- We check CloudWatch ActiveSessions metric for the last 30 days
- Web Apps with zero sessions are flagged; they run continuously once created
Example:
Web App: webapp-0abc123def
Provisioned Units: 2
Active Sessions (30 days): 0
Monthly Savings: $720.00
AWS Reference: Transfer Family Pricing
Why HIGH confidence: Zero sessions for 30+ days means no users are accessing the Web App.
Analytics Detectors
Old Glue Job
Confidence: MEDIUM
Detection: AWS Glue job not run in 90+ days or never run (if created 30+ days ago)
Savings: Minimal (Glue jobs are pay-per-use)
How it works:
- We list all Glue jobs and check their run history
- Jobs that have never run (and were created 30+ days ago) are flagged
- Jobs that haven't run in 90+ days are also flagged
Example:
Job: etl-pipeline-v1
Last Run: 95 days ago (or never)
Monthly Savings: $0.00 (housekeeping)
Purpose: Clean up unused ETL jobs
AWS Reference: Glue Pricing
Why MEDIUM confidence: Jobs may be scheduled for future use or disaster recovery.
Idle Glue Crawler
Confidence: MEDIUM
Detection: AWS Glue crawler not run in 90+ days or never run (if created 30+ days ago)
Savings: Minimal (Glue crawlers are pay-per-use)
How it works:
- We list all Glue crawlers and check their last run time
- Crawlers that haven't run in 90+ days are flagged
- Crawlers that have never run (and were created 30+ days ago) are also flagged
Example:
Crawler: data-catalog-crawler
Last Run: 120 days ago
Monthly Savings: $0.00 (housekeeping)
Purpose: Clean up unused data catalog crawlers
AWS Reference: Glue Pricing
Why MEDIUM confidence: Crawlers may be scheduled periodically or kept for future use.
Idle EMR Cluster
Confidence: HIGH
Detection: Running cluster with zero steps for extended period
Savings: Full cluster cost
How it works:
- We identify EMR clusters in RUNNING or WAITING state
- Clusters with no active or pending steps are flagged
- EMR charges for all EC2 instances in the cluster
Example:
Cluster: j-0ABC123DEF456
State: WAITING
Active Steps: 0
Instance Count: 5 (1 master, 4 core)
Monthly Savings: $730.00
AWS Reference: EMR Pricing
Why HIGH confidence: Clusters in WAITING state with no steps are consuming resources without processing data.
Idle Kinesis Stream
Confidence: HIGH
Detection: Zero records for 14 days
Savings: Shard hours ($0.015/shard-hour)
How it works:
- We query CloudWatch for IncomingRecords metrics
- Streams with zero records for 14+ days are flagged
- Kinesis charges per shard-hour regardless of activity
Example:
Stream: my-data-stream
Shards: 4
Incoming Records (14 days): 0
Monthly Savings: $43.80
AWS Reference: Kinesis Data Streams Pricing
Why HIGH confidence: Zero incoming records means no data is being streamed.
Idle MSK Cluster
Confidence: HIGH
Detection: Zero messages for 7 days
Savings: Full cluster cost (instance-type-aware)
How it works:
- We query CloudWatch for Kafka message metrics (MessagesInPerSec, BytesInPerSec)
- Clusters with zero messages and zero bytes for 7+ days are flagged
- MSK charges for broker instance hours and storage
Example:
Cluster: my-kafka-cluster
Brokers: 3 (kafka.m5.large)
Messages (7 days): 0
Monthly Savings: $459.90
AWS Reference: MSK Pricing
Why HIGH confidence: Zero messages means no producers or consumers are using this Kafka cluster.
Oversized MSK Cluster
Confidence: MEDIUM
Detection: Broker CPU < 20% AND network throughput < 50% of instance capacity for 7 days (cluster is not idle)
Savings: Recommendation only (depends on target instance type)
How it works:
- We query CloudWatch for CpuUser, BytesInPerSec, and BytesOutPerSec metrics across all brokers
- Network utilization is calculated as (BytesIn + BytesOut) / instance network capacity; each instance type has a known throughput ceiling
- If both average CPU is below 20% AND network throughput is below 50% of instance capacity over 7 days, the cluster may be over-provisioned
- We flag the cluster for manual review. MSK is network-bound, so we check both signals to avoid false positives
Example:
Cluster: my-kafka-cluster
Brokers: 3 (kafka.m5.2xlarge)
Average CPU: 8.2%
Network Utilization: 4.3%
Messages/sec: 150
Current Cost: $1,839/month
Action: Consider downsizing to kafka.m5.large ($460/month)
AWS Reference: MSK Instance Types
Why MEDIUM confidence: MSK is network-bound, not CPU-bound. Low CPU alone doesn't indicate over-provisioning: a cluster saturating its network at 10% CPU is properly sized. CloudWise requires both low CPU and low network utilization before flagging. Review consumer lag, partition count, and storage throughput before downsizing.
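The two-signal rule can be sketched as a small predicate. This is an illustration of the thresholds described above, not CloudWise's implementation; the function name is hypothetical, and the 1 GB/s capacity in the usage lines is a placeholder rather than a published limit for any broker type.

```python
def msk_is_oversized(avg_cpu_pct, bytes_in_per_sec, bytes_out_per_sec,
                     capacity_bytes_per_sec,
                     cpu_threshold_pct=20.0, network_threshold_pct=50.0):
    """Return (flagged, network_utilization_pct) for one cluster.

    A cluster is flagged only when BOTH average CPU and network
    utilization fall below their thresholds.
    """
    network_pct = (bytes_in_per_sec + bytes_out_per_sec) / capacity_bytes_per_sec * 100
    flagged = avg_cpu_pct < cpu_threshold_pct and network_pct < network_threshold_pct
    return flagged, network_pct


# Placeholder capacity of 1 GB/s, for illustration only.
CAPACITY = 1_000_000_000

# Low CPU and low network: flagged for review.
print(msk_is_oversized(8.2, 20_000_000, 23_000_000, CAPACITY)[0])    # True
# Low CPU but a nearly saturated network: not flagged.
print(msk_is_oversized(10.0, 500_000_000, 450_000_000, CAPACITY)[0])  # False
```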
Idle Glue Dev Endpoint
Confidence: HIGH
Detection: Dev endpoint in READY state with no recent activity
Savings: DPU-hour charges (~$0.44/DPU-hour)
How it works:
- We identify Glue development endpoints in READY state
- Endpoints running without recent notebook activity are flagged
- Dev endpoints charge continuously while in READY state
Example:
Endpoint: my-dev-endpoint
Status: READY
DPUs: 5
Last Activity: 7+ days ago
Monthly Savings: $1,584.00 (5 DPUs × $0.44 × 720 hours)
AWS Reference: Glue Pricing
Why HIGH confidence: Dev endpoints in READY state charge continuously. They should be terminated when not in use.
Oversized Glue Job
Confidence: MEDIUM
Detection: JVM heap utilization < 30% average over 14 days OR short execution with high DPU allocation
Savings: Proportional to excess DPUs × average run duration × frequency
How it works:
- We examine each Glue ETL job's DPU allocation (MaxCapacity or NumberOfWorkers)
- We query CloudWatch for glue.ALL.jvm.heap.usage to measure worker memory utilization
- If average heap usage is below 30%, the job has more workers than it needs
- We also apply a heuristic: jobs with ≥ 10 DPUs completing in under 30 minutes are likely oversized
- Savings are calculated based on the excess DPUs, average run duration, and estimated run frequency
Example:
Job: daily-customer-etl
Allocated: 20 DPUs (G.1X workers)
Avg JVM Heap: 15%
Avg Duration: 12 minutes
Recommended: 6 DPUs
Estimated Runs: 30/month
Monthly Excess Cost: $36.96 (14 excess DPUs × 0.2 hours × 30 runs × $0.44)
Recommended Action: Test with reduced DPUs in a development environment. If the job completes successfully within acceptable time, update production.
AWS Reference: Glue Job Properties | Monitoring with CloudWatch
Why MEDIUM confidence: JVM heap usage is a reliable signal, but some jobs have legitimate peak memory needs that don't show in averages. The heuristic fallback (short duration + high DPUs) is less reliable.
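The detection rule combines a metric-based signal with a fallback heuristic. A minimal sketch of that decision, assuming the thresholds stated above; the function name and the returned labels are hypothetical:

```python
def glue_job_oversize_signal(avg_heap_pct, dpus, avg_duration_min):
    """Return which signal fired: 'heap', 'heuristic', or None.

    avg_heap_pct may be None when no heap metric is available,
    in which case only the heuristic is evaluated.
    """
    if avg_heap_pct is not None and avg_heap_pct < 30.0:
        return "heap"  # primary signal: low average JVM heap usage
    if dpus >= 10 and avg_duration_min < 30:
        return "heuristic"  # fallback: many DPUs, short run
    return None


print(glue_job_oversize_signal(15.0, 20, 12))  # heap
print(glue_job_oversize_signal(None, 12, 10))  # heuristic
print(glue_job_oversize_signal(65.0, 4, 90))   # None
```

Returning the signal name rather than a boolean keeps the less reliable heuristic findings distinguishable during review.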
Glue Job: Excessive Timeout Risk
Confidence: MEDIUM
Detection: Configured timeout ≥ 10× average execution duration
Savings: Risk mitigation (prevents catastrophic billing from stuck jobs)
How it works:
- We compare the job's configured Timeout against its average execution duration from run history
- Jobs where the timeout exceeds 10× the average duration are flagged
- Only jobs with recent successful runs and timeout ≥ 60 minutes are evaluated
- We recommend setting timeout to 3× average duration (industry best practice)
Example:
Job: nightly-data-sync
Configured Timeout: 2,880 minutes (48 hours, the default)
Average Duration: 8 minutes
Timeout Ratio: 360×
Recommended Timeout: 24 minutes (3× average)
Risk: If this job hangs, it burns $211 before timing out
(10 DPUs × 48 hours × $0.44)
Recommended Action: Set the timeout to 3× your average duration. This protects against stuck jobs while allowing for normal execution variance.
Click Fix This to update the timeout configuration automatically, with full rollback capability.
AWS Reference: Glue Job Timeout
Why MEDIUM confidence: The timeout-to-duration ratio is a strong signal, but some jobs have legitimate long-tail executions on certain data partitions.
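The ratio, the recommended timeout, and the worst-case cost in the example can be reproduced with a few lines. This is a sketch under the thresholds described above; the function name is hypothetical, and the "recent successful runs" precondition is not modeled here.

```python
import math


def glue_timeout_risk(timeout_min, avg_duration_min, dpus, rate_per_dpu_hour=0.44):
    """Summarize timeout risk for one job."""
    ratio = timeout_min / avg_duration_min
    return {
        "ratio": ratio,
        # Flag only when the timeout is at least 60 minutes and 10x the average.
        "flagged": timeout_min >= 60 and ratio >= 10,
        "recommended_timeout_min": math.ceil(3 * avg_duration_min),
        # Cost if the job hangs until the configured timeout fires.
        "worst_case_cost": round(dpus * (timeout_min / 60) * rate_per_dpu_hour, 2),
    }


result = glue_timeout_risk(timeout_min=2880, avg_duration_min=8, dpus=10)
print(result["ratio"], result["recommended_timeout_min"], result["worst_case_cost"])
# 360.0 24 211.2
```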
Failing Glue Job with Retries
Confidence: HIGH
Detection: ≥ 50% failure rate across recent runs with retries configured
Savings: Direct (eliminates DPU cost from repeated failed runs)
How it works:
- We examine the last 10 job runs for each Glue job
- Jobs with ≥ 50% failure rate and MaxRetries > 0 are flagged
- We calculate the total DPU-hours wasted on failed runs
- Monthly waste is projected from the observed failure frequency
- The most recent error message is captured for root cause analysis
Example:
Job: inventory-etl
Failure Rate: 80% (8 of 10 runs FAILED)
MaxRetries: 3
DPUs: 10
Avg Failed Run Duration: 15 minutes
Monthly Wasted DPU-hours: 90
Monthly Waste: $39.60
Last Error: "S3 bucket 'inventory-raw' does not exist"
Recommended Action: Disable retries immediately to stop the cost bleeding, then investigate and fix the root cause error.
Click Fix This to disable retries automatically, with full rollback capability.
AWS Reference: Glue Job Retries
Why HIGH confidence: A sustained 50%+ failure rate across the last 10 runs is a definitive pattern. Combined with retry configuration, this is guaranteed waste.
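A rough sketch of the flagging rule and the waste projection follows. The function name is hypothetical, and the figure of 36 failed runs per month is an assumption chosen so the projection matches the 90 DPU-hours in the example; the actual projection method is not specified in this guide.

```python
def failing_job_waste(run_states, max_retries, dpus, avg_failed_run_min,
                      failed_runs_per_month, rate_per_dpu_hour=0.44):
    """Return (flagged, failure_rate, wasted_dpu_hours, monthly_waste)."""
    failure_rate = run_states.count("FAILED") / len(run_states)
    flagged = failure_rate >= 0.5 and max_retries > 0
    wasted_dpu_hours = failed_runs_per_month * (avg_failed_run_min / 60) * dpus
    return flagged, failure_rate, wasted_dpu_hours, round(wasted_dpu_hours * rate_per_dpu_hour, 2)


last_ten = ["FAILED"] * 8 + ["SUCCEEDED"] * 2
print(failing_job_waste(last_ten, max_retries=3, dpus=10,
                        avg_failed_run_min=15, failed_runs_per_month=36))
# (True, 0.8, 90.0, 39.6)
```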
Migrate Dev Endpoint to Interactive Sessions
Confidence: LOW (advisory)
Detection: Any active Glue development endpoint
Savings: Difference between 24/7 billing and actual active-use billing
Important Note: AWS has deprecated development endpoints in favor of Interactive Sessions (Glue Studio Job Notebooks). Dev endpoints bill continuously at $0.44/DPU-hour; they cannot be paused or stopped. Interactive Sessions use the same $0.44/DPU-hour rate but automatically stop after an idle timeout (default: 60 minutes).
How it works:
- We flag all active Glue development endpoints
- We calculate the continuous monthly cost: DPUs × $0.44 × 730 hours
- We estimate savings assuming the developer actively uses the endpoint ~4 hours per workday
- The difference between 24/7 billing and workday-only billing is the savings opportunity
Example:
Endpoint: dev-spark-env
Status: READY
DPUs: 5
Created: 180 days ago
Continuous Monthly Cost: $1,606
Estimated Active Use: 4 hours/day × 22 workdays = 88 hours/month
Active Monthly Cost: $193.60
Monthly Savings: $1,412.40 (88% reduction)
Recommended Action: Create a new Interactive Session in Glue Studio, migrate your development scripts, then delete the dev endpoint.
AWS Reference: Interactive Sessions | Dev Endpoint Deprecation
Why LOW confidence: All dev endpoints are flagged, and the savings estimate depends on actual usage patterns. Some teams may use endpoints near-continuously, reducing the savings delta.
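The estimate can be reproduced as follows. This is a sketch using the assumptions stated above (4 active hours per workday, 22 workdays, 730 hours per month); the function name is hypothetical, and the active-hours input is the parameter to adjust for teams with heavier usage.

```python
def dev_endpoint_savings(dpus, active_hours_per_day=4, workdays_per_month=22,
                         rate_per_dpu_hour=0.44, hours_per_month=730):
    """Return (continuous_cost, active_cost, savings, savings_pct)."""
    continuous = dpus * rate_per_dpu_hour * hours_per_month
    active = dpus * rate_per_dpu_hour * active_hours_per_day * workdays_per_month
    savings = continuous - active
    return (round(continuous, 2), round(active, 2),
            round(savings, 2), round(savings / continuous * 100))


print(dev_endpoint_savings(5))  # (1606.0, 193.6, 1412.4, 88)
```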
Glue Data Catalog Bloat
Confidence: LOW
Detection: Data Catalog object count exceeds 1M free tier, or table version growth approaching threshold
Savings: $1.00 per 100,000 objects above the free tier
How it works:
- We count total objects in the Data Catalog (databases, tables, table versions, partitions)
- Catalogs exceeding 1,000,000 objects are flagged with the calculated overage cost
- Catalogs with 500,000+ table versions get an early warning even if under the total threshold
- The primary bloat source is table versions: every UpdateTable call creates a new version
Example:
Data Catalog Objects:
Databases: 12
Tables: 450
Table Versions: 1,200,000 ← The problem
Partitions: 350,000
Total: 1,550,462
Overage: 550,462 objects above free tier
Monthly Cost: $5.50
Recommended Action: Delete old table versions using aws glue batch-delete-table-version. Keep only the latest 5-10 versions per table.
AWS Reference: Data Catalog Pricing | Table Versioning
Why LOW confidence: The cost is accurate but relatively low. The detector's value is catching uncontrolled growth before it becomes expensive.
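The overage arithmetic and the early-warning rule can be sketched together. The function name and the shape of the returned dictionary are illustrative; the thresholds and the $1.00 per 100,000 objects rate are taken from the description above.

```python
FREE_TIER_OBJECTS = 1_000_000
VERSION_WARNING_THRESHOLD = 500_000


def catalog_bloat(databases, tables, table_versions, partitions, rate_per_100k=1.00):
    """Count catalog objects and price anything above the free tier."""
    total = databases + tables + table_versions + partitions
    overage = max(0, total - FREE_TIER_OBJECTS)
    return {
        "total": total,
        "overage": overage,
        "monthly_cost": round(overage / 100_000 * rate_per_100k, 2),
        "version_warning": table_versions >= VERSION_WARNING_THRESHOLD,
    }


print(catalog_bloat(12, 450, 1_200_000, 350_000))
# {'total': 1550462, 'overage': 550462, 'monthly_cost': 5.5, 'version_warning': True}
```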
Idle MQ Broker
Confidence: HIGH
Detection: Zero connections for 14 days
Savings: Full broker cost
How it works:
- We query CloudWatch for Amazon MQ connection metrics
- Brokers with zero connections for 14+ days are flagged
- MQ charges for broker instance hours
- Savings account for instance type pricing and HA deployment multipliers
Example:
Broker: my-activemq-broker
Engine: ActiveMQ
Instance Type: mq.t3.micro
Connections (14 days): 0
Monthly Savings: $24.82
AWS Reference: Amazon MQ Pricing
Why HIGH confidence: Zero connections means no applications are connected to this message broker.
Over-Provisioned MQ Broker
Confidence: MEDIUM
Detection: MQ broker with consistently low CPU utilization
Savings: $50-$1,400/month per right-sizing step
How it works:
- We query Amazon MQ for all running brokers
- CloudWatch CpuUtilization is analyzed over 14 days
- Brokers with <15% avg CPU on mq.m5.large or larger are flagged
- We identify the next smaller instance type as the downsizing target
- Savings multiply by the number of instances (HA = 2×, RabbitMQ cluster = 3×)
Example:
Broker: dev-event-bus
Engine: ActiveMQ
Instance Type: mq.m5.large
Deployment Mode: ACTIVE_STANDBY_MULTI_AZ
CPU Average (14 days): 4.2%
Current Cost: $402.96/month (2 instances × $201.48)
Target: mq.t3.micro ($49.64/month for 2 instances)
Monthly Savings: $353.32
Recommended Action: Create a new broker with the smaller instance type, migrate configuration and queue definitions, update application connection strings, verify message flow, then delete the old broker.
Amazon MQ does not support in-place instance type changes. Downsizing requires creating a new broker and migrating. Plan for brief message delivery interruption during cutover.
AWS Reference: Amazon MQ Pricing
Why MEDIUM confidence: CPU is a good proxy for message throughput but doesn't capture memory-bound scenarios (large message batches). Validate with monitoring after downsizing.
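The deployment multiplier is what turns a per-instance price difference into the broker-level saving. A minimal sketch, assuming the multipliers listed above; the function name is hypothetical, and the per-instance monthly prices are the ones used in the example rather than values looked up from a pricing API.

```python
# Instances billed per broker, keyed by Amazon MQ deployment mode.
INSTANCES_PER_BROKER = {
    "SINGLE_INSTANCE": 1,
    "ACTIVE_STANDBY_MULTI_AZ": 2,  # HA pair
    "CLUSTER_MULTI_AZ": 3,         # RabbitMQ cluster
}


def mq_downsize_savings(current_monthly_per_instance, target_monthly_per_instance,
                        deployment_mode):
    """Monthly saving from moving every instance of a broker to the target type."""
    count = INSTANCES_PER_BROKER[deployment_mode]
    return round((current_monthly_per_instance - target_monthly_per_instance) * count, 2)


print(mq_downsize_savings(201.48, 24.82, "ACTIVE_STANDBY_MULTI_AZ"))  # 353.32
```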
Long-Running EMR Cluster
Confidence: MEDIUM
Detection: EMR cluster in RUNNING state > 7 days without auto-termination
Savings: ~30% from using Spot instances or transient clusters
How it works:
- We identify EMR clusters that have been running continuously for 7+ days
- Clusters without auto-termination enabled run indefinitely after jobs complete
- Long-running clusters may be better served by transient (job-scoped) clusters
Example:
Cluster: j-ABC123DEF456
Name: analytics-etl
Running: 14 days
Auto-Terminate: Disabled
Instance Type: m5.xlarge × 5
Monthly Savings: $450.00 (Spot + transient)
AWS Reference: EMR Pricing
Why MEDIUM confidence: Some EMR clusters intentionally run continuously for interactive workloads (Spark Thrift Server, Presto). Verify workload pattern.
Over-Provisioned EMR Cluster
Confidence: MEDIUM
Detection: YARN memory available > 50% averaged over 7 days
Savings: Proportional to instance count reduction
How it works:
- We pull CloudWatch YARNMemoryAvailablePercentage for each WAITING/RUNNING cluster
- If average available memory exceeds 50%, the CORE group is over-provisioned
- Recommended new count = current × (1 - available_pct / 100), minimum 1
Example:
Cluster: j-OVERPROV001
Name: data-pipeline
YARN Memory Available: 72%
CORE Nodes: 8 → 4 recommended
Monthly Savings: $700.00
AWS Reference: EMR Instance Groups
Why MEDIUM confidence: YARN memory is a good proxy for utilisation but workloads may have burst patterns. Monitor after resize.
EMR Missing Auto-Termination
Confidence: MEDIUM
Detection: Keep-alive cluster with no AutoTerminationPolicy
Savings: Prevents indefinite idle costs
How it works:
- We check clusters with KeepJobFlowAliveWhenNoSteps=True
- If no AutoTerminationPolicy is set, the cluster will never terminate on its own
- Recommendation: set a 1-hour idle timeout
Example:
Cluster: j-NOAUTOTERM01
Name: adhoc-queries
Keep Alive: Yes
Auto-Termination: Not configured
Instances: 5
AWS Reference: EMR Auto-Termination
Why MEDIUM confidence: Some clusters are intentionally long-lived (e.g., serving Presto/Hive endpoints). Verify before applying.
EMR Previous-Generation Instances
Confidence: MEDIUM
Detection: Cluster using previous-generation instance families
Savings: ~15-20% cost reduction with current-generation instances
How it works:
- We check instance types in all instance groups
- If any instance family is in the previous-gen map (m3→m5, m4→m5, c3→c5, c4→c5, r3→r5, r4→r5, i2→i3, d2→d3, p2→p3, g2→g4dn), we flag the cluster
- Current-gen instances offer better performance at lower cost
Example:
Cluster: j-PREVGEN001
Name: reports-cluster
Current Type: m4.xlarge ($0.240/hr)
Recommended: m5.xlarge ($0.192/hr)
Savings per Instance: $35.04/month ($0.048/hr × 730 hours)
AWS Reference: EC2 Instance Types
Why MEDIUM confidence: Current-gen instances are generally drop-in replacements but EMR release compatibility should be verified.
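The family lookup can be sketched as a dictionary keyed by instance family. The map mirrors the one listed above; the function name is hypothetical, and keeping the same size suffix is a simplifying assumption, since not every size exists in every target family.

```python
PREVIOUS_GEN_UPGRADES = {
    "m3": "m5", "m4": "m5", "c3": "c5", "c4": "c5", "r3": "r5",
    "r4": "r5", "i2": "i3", "d2": "d3", "p2": "p3", "g2": "g4dn",
}


def recommend_current_gen(instance_type):
    """Return the suggested replacement type, or None if already current-gen."""
    family, _, size = instance_type.partition(".")
    target_family = PREVIOUS_GEN_UPGRADES.get(family)
    if target_family is None:
        return None
    # Assumption: the same size is offered in the target family.
    return f"{target_family}.{size}"


print(recommend_current_gen("m4.xlarge"))  # m5.xlarge
print(recommend_current_gen("m5.xlarge"))  # None
```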
EMR Spot Opportunity
Confidence: MEDIUM
Detection: TASK instance group using On-Demand pricing
Savings: ~60-70% cost reduction on task nodes
How it works:
- We identify TASK instance groups with Market=ON_DEMAND
- Task nodes are fault-tolerant: YARN reschedules work on Spot interruption
- Spot pricing is typically 60-70% cheaper than On-Demand
Example:
Cluster: j-SPOTOPPTY01
Task Nodes: 4 × r5.2xlarge (On-Demand)
On-Demand Cost: $0.504/hr per instance
Spot Estimate: ~$0.17/hr per instance
Monthly Savings: $964.00
AWS Reference: EMR Spot Instances
Why MEDIUM confidence: Spot instances can be interrupted. Verify workload is fault-tolerant before switching.
Over-Provisioned Kinesis Stream
Confidence: MEDIUM
Detection: Kinesis Data Stream with per-shard utilization < 20% over 14 days
Savings: Excess shards × $11/shard/month
How it works:
- We analyze Kinesis stream IncomingBytes relative to shard capacity
- Each shard can handle 1 MB/s input and 2 MB/s output
- Streams with < 20% utilization per shard have excess capacity
Example:
Stream: event-ingestion
Shards: 10
Avg Throughput: 1.2 MB/s (12% per-shard utilization)
Recommended: 4 shards
Monthly Savings: $66.00 ($11/shard × 6 excess shards)
AWS Reference: Kinesis Data Streams Pricing
Why MEDIUM confidence: Stream throughput can be bursty. Analyze peak utilization and consider On-Demand mode for variable workloads.
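Per-shard utilization is the stream's average write throughput divided by its total write capacity. A short sketch of that calculation and of the savings arithmetic, assuming 1 MB/s of write capacity per shard and the rounded $11/shard/month figure used above; the function names are illustrative, and the rule that picks the recommended shard count is not modeled.

```python
def shard_utilization_pct(avg_write_mb_per_sec, shards, shard_write_mb_per_sec=1.0):
    """Average write utilization across all shards, as a percentage."""
    return avg_write_mb_per_sec / (shards * shard_write_mb_per_sec) * 100


def excess_shard_savings(current_shards, recommended_shards, monthly_cost_per_shard=11.0):
    """Monthly saving from removing the excess shards."""
    return (current_shards - recommended_shards) * monthly_cost_per_shard


print(round(shard_utilization_pct(1.2, 10), 1))  # 12.0
print(excess_shard_savings(10, 4))               # 66.0
```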
Kinesis On-Demand Downgrade
Confidence: MEDIUM
Detection: On-Demand stream with stable throughput (coefficient of variation < 0.3) for 14+ days, age ≥ 30 days
Savings: On-Demand vs Provisioned cost delta
How it works:
- Identifies On-Demand streams with predictable traffic patterns
- Calculates coefficient of variation (CV) of daily IncomingBytes over 14 days
- CV < 0.3 indicates stable throughput suitable for Provisioned mode
- Computes recommended shard count with 30% headroom
Example:
Stream: log-pipeline
Mode: ON_DEMAND
CV: 0.12 (stable)
On-Demand Cost: $58.20/month
Provisioned (2 shards): $21.90/month
Monthly Savings: $36.30
AWS Reference: Kinesis Data Streams Pricing
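The coefficient of variation is the standard deviation of the daily totals divided by their mean. The sketch below shows one way to compute it and to size a provisioned stream with 30% headroom. Two details are assumptions, since the guide does not specify them: the use of the population standard deviation, and sizing against peak write throughput at 1 MB/s per shard. The function names are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(daily_incoming_bytes):
    """Population standard deviation divided by the mean (assumed definition)."""
    mean = statistics.mean(daily_incoming_bytes)
    if mean == 0:
        return float("inf")  # an idle stream has no meaningful CV
    return statistics.pstdev(daily_incoming_bytes) / mean


def recommended_shards(peak_write_mb_per_sec, headroom=0.30, shard_write_mb_per_sec=1.0):
    """Shards needed to carry the peak plus headroom, never fewer than one."""
    return max(1, math.ceil(peak_write_mb_per_sec * (1 + headroom) / shard_write_mb_per_sec))


steady = [100, 110, 90, 105, 95, 100, 100]
print(coefficient_of_variation(steady) < 0.3)  # True
print(recommended_shards(1.5))                 # 2
```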
Kinesis Extended Retention Waste
Confidence: HIGH
Detection: Extended retention (>24h) with zero GetRecords in 14 days
Savings: Shards × $0.020/shard-hour × 730 hours/month
How it works:
- Identifies streams with retention period > 24 hours (default)
- Checks if any consumer is reading data (GetRecords metric)
- If zero reads for 14 days, extended retention is waste
Example:
Stream: clickstream-archive
Retention: 168 hours (7 days)
Shards: 4
GetRecords (14d): 0
Monthly Savings: $58.40 ($0.020 × 4 shards × 730 hours)
Kinesis Enhanced Fan-Out Waste
Confidence: HIGH
Detection: Enhanced fan-out consumer with zero reads for 14 days
Savings: Shards × $0.015/consumer-shard-hour × 730 hours/month
How it works:
- Lists enhanced fan-out consumers registered on each stream
- Checks per-consumer GetRecords metrics
- Consumers with zero reads for 14 days are candidates for deregistration
Example:
Stream: events/analytics-consumer
Shards: 4
Consumer: analytics-consumer (ACTIVE)
GetRecords (14d): 0
Monthly Savings: $43.80 ($0.015 × 4 shards × 730 hours)
Idle Kinesis Firehose
Confidence: HIGH
Detection: Firehose delivery stream with zero IncomingRecords for 14 days
Savings: $0 direct (pay-per-use), but eliminates forgotten infrastructure
How it works:
- Lists all Firehose delivery streams
- Checks IncomingRecords and IncomingBytes metrics over 14 days
- Idle streams are flagged as forgotten infrastructure
- Notes if Lambda transforms are configured (may have associated costs)
Example:
Delivery Stream: logs-to-s3-firehose
Status: ACTIVE
Records (14d): 0
Lambda Transform: Yes (may incur invocation costs)
ML/AI Detectors
Idle SageMaker Notebook Instance
Confidence: HIGH
Detection: Notebook instance InService with no kernel activity
Savings: Full instance cost
How it works:
- We identify SageMaker notebook instances in InService status
- Check for kernel activity or user connections
- Idle notebooks continue to charge full instance cost
Example:
Notebook: my-ml-notebook
Instance Type: ml.t3.medium
Status: InService
Activity (7 days): None
Monthly Savings: $37.00
AWS Reference: SageMaker Pricing
Why HIGH confidence: SageMaker notebooks charge continuously when InService, regardless of activity.
Idle SageMaker Endpoint
Confidence: HIGH
Detection: Zero invocations for 7 days
Savings: Full endpoint cost
How it works:
- We query CloudWatch for endpoint invocation metrics
- Endpoints with zero invocations for 7+ days are flagged
- Idle endpoints charge full compute cost continuously
Example:
Endpoint: my-model-endpoint
Instance Type: ml.m5.large
Invocations (7 days): 0
Monthly Savings: $96.00
AWS Reference: SageMaker Pricing
Why HIGH confidence: Zero invocations is unambiguous. Endpoint is not being used.
Oversized SageMaker Endpoint
Confidence: MEDIUM
Detection: Low CPU/memory utilization over 7 days
Savings: Based on right-sized instance
How it works:
- We analyze CloudWatch metrics for endpoint CPU and memory utilization
- Endpoints with consistently low utilization are flagged
- Right-sizing to smaller instance type reduces costs
Example:
Endpoint: my-inference-endpoint
Instance Type: ml.c5.xlarge
Avg CPU: 15%
Avg Memory: 20%
Monthly Savings: ~$70.00 (if downsized to ml.c5.large)
AWS Reference: SageMaker Pricing
Why MEDIUM confidence: Inference workloads may have periodic spikes. Review before right-sizing.
Stopped SageMaker Notebook Storage
Confidence: HIGH
Detection: Stopped notebook instance with attached EBS volume
Savings: $0.58-$58/month per notebook (based on EBS volume size)
How it works:
- We identify SageMaker notebook instances in Stopped status
- Stopped notebooks still incur EBS storage charges ($0.116/GB-month for gp2/gp3)
- Notebooks stopped for >7 days are flagged
- Deleting the notebook removes the attached EBS volume
Example:
Notebook: dev-ml-notebook
Status: Stopped (30 days)
Volume Size: 50 GB
Monthly Storage Cost: $5.80
Action: Delete notebook instance to remove EBS volume
AWS Reference: SageMaker Pricing
Why HIGH confidence: Stopped status is binary and volume size is exact. No ambiguity.
Previous-Generation SageMaker Instance
Confidence: HIGH
Detection: Notebook or endpoint using previous-gen instance type (ml.m4, ml.c4, ml.t2, ml.r4, ml.p2)
Savings: 10-30% of current cost
How it works:
- We check instance types against a known previous-generation prefix list
- Each flagged resource includes a specific upgrade recommendation
- Covers both notebook instances and endpoint instances
- Upgrade map: ml.m4→ml.m5, ml.c4→ml.c5, ml.t2→ml.t3, ml.r4→ml.r5, ml.p2→ml.p3
Example:
Notebook: legacy-training-nb
Instance Type: ml.m4.xlarge ($0.28/hr)
Recommended: ml.m5.xlarge ($0.23/hr)
Monthly Savings: $36.50 (18% reduction)
AWS Reference: SageMaker Pricing
Why HIGH confidence: Instance type is deterministic metadata. No ambiguity in detection.
SageMaker Savings Plan Opportunity
Confidence: HIGH
Detection: AWS Cost Explorer recommendation for SageMaker Savings Plan
Savings: Exact AWS calculation (typically 20-40% vs On-Demand)
How it works:
- AWS Cost Explorer analyzes 30 days of SageMaker usage (training, inference, notebooks)
- Identifies consistent usage patterns suitable for a SageMaker Savings Plan commitment
- Returns exact savings calculation for 1-year or 3-year terms
Example:
Current SageMaker Spend: $850.00/month (On-Demand)
Recommended SP: $550.00/month commitment
Monthly Savings: $300.00 (35%)
Term: 1-year, No Upfront
AWS Reference: SageMaker Savings Plans
Why HIGH confidence: Uses AWS's own Cost Explorer analysis with exact pricing data.
Security Detectors
Unused Secrets Manager Secret
Confidence: HIGH
Detection: Secret not accessed for 90 days
Savings: $0.40/month per secret
How it works:
- We check the LastAccessedDate for each secret
- Secrets not accessed for 90+ days are flagged
- Secrets Manager charges $0.40/month per secret
Example:
Secret: old-api-key
Last Accessed: 120 days ago
Monthly Savings: $0.40
AWS Reference: Secrets Manager Pricing
Why HIGH confidence: Access date is tracked by AWS. 90+ days without access indicates unused secret.
Unused KMS Key
Confidence: HIGH
Detection: Customer-managed key not used for 90 days
Savings: $1.00/month per key
How it works:
- We check CloudTrail for KMS key usage events
- Customer-managed keys not used for 90+ days are flagged
- KMS charges $1.00/month per customer-managed key
Example:
Key: alias/old-encryption-key
Key ID: arn:aws:kms:us-east-1:123456789:key/abc123
Last Used: 95 days ago
Monthly Savings: $1.00
AWS Reference: KMS Pricing
Why HIGH confidence: Key usage is tracked via CloudTrail. 90+ days without use indicates unused key.
Unencrypted EBS Volume
Confidence: HIGH
Detection: EBS volume with encrypted = false in in-use state
Savings: Zero (security/compliance finding)
How it works:
- We check the encrypted attribute on all EBS volumes
- Volumes in in-use state without encryption at rest are flagged
- EBS encryption has no performance penalty and uses AWS-managed or customer-managed KMS keys
Example:
Volume: vol-0abc123def456
State: in-use
Encrypted: No
Size: 500 GB
Action: Enable default EBS encryption, recreate volume from snapshot
AWS Reference: EBS Encryption
Why HIGH confidence: Encryption status is deterministic metadata from the EBS API.
Unencrypted RDS Instance
Confidence: HIGH
Detection: RDS instance with StorageEncrypted = false
Savings: Zero (security/compliance finding)
How it works:
- We check the StorageEncrypted attribute on all RDS instances
- Unencrypted instances cannot be encrypted in-place; this requires snapshot + restore
- This is a compliance requirement for most security frameworks (SOC 2, HIPAA, PCI)
Example:
Instance: mydb-production
Engine: mysql 8.0
Encrypted: No
Action: Create encrypted snapshot, restore encrypted instance
AWS Reference: RDS Encryption
Why HIGH confidence: Encryption status is exact API metadata. No estimation needed.
Unencrypted EFS Filesystem
Confidence: HIGH
Detection: EFS filesystem with encrypted = false
Savings: Zero (security/compliance finding)
How it works:
- We check the Encrypted attribute on all EFS filesystems
- EFS encryption must be enabled at creation; it cannot be added to existing filesystems
- Migration requires creating a new encrypted filesystem and copying data
Example:
Filesystem: fs-0abc123def456
Encrypted: No
Size: 150 GB
Action: Create encrypted EFS, migrate data with DataSync
AWS Reference: EFS Encryption
Why HIGH confidence: Encryption status is deterministic. No false positives possible.
Unencrypted DocumentDB Cluster
Confidence: HIGH
Detection: DocumentDB cluster with StorageEncrypted = false
Savings: Zero (security/compliance finding)
How it works:
- We check the StorageEncrypted attribute on DocumentDB clusters
- Unencrypted clusters cannot be encrypted in-place
- Migration requires creating an encrypted cluster and restoring from snapshot
Example:
Cluster: my-docdb-cluster
Encrypted: No
Action: Create encrypted cluster, restore from snapshot
AWS Reference: DocumentDB Encryption
Why HIGH confidence: Encryption status is exact metadata from the DocumentDB API.
S3 Without Default Encryption
Confidence: MEDIUM
Detection: S3 bucket without default encryption configuration
Savings: Zero (compliance finding)
How it works:
- We check for ServerSideEncryptionConfiguration on each S3 bucket
- Buckets without default encryption rely on individual object-level encryption
- Since January 2023, AWS enables SSE-S3 by default, so this detector catches legacy buckets
Example:
Bucket: legacy-data-bucket
Default Encryption: None
Action: Enable SSE-S3 or SSE-KMS default encryption
AWS Reference: S3 Default Encryption
Why MEDIUM confidence: Legacy buckets may have object-level encryption already applied. Bucket-level policy provides a safety net.
OpenSearch Encryption Not Enabled
Confidence: HIGH
Detection: OpenSearch domain with EncryptionAtRestOptions.Enabled = false
Savings: Zero (security/compliance finding)
How it works:
- We check the EncryptionAtRestOptions configuration on OpenSearch domains
- Domains without encryption at rest are flagged as a compliance issue
- Encryption can be enabled on existing domains (in-place, no downtime)
Example:
Domain: search-logs
Encryption at Rest: Disabled
Action: Enable encryption at rest in domain settings
AWS Reference: OpenSearch Encryption
Why HIGH confidence: Encryption configuration is exact API metadata.
RDS Without Deletion Protection
Confidence: HIGH
Detection: RDS instance with DeletionProtection = false
Savings: Zero (reliability/governance finding)
How it works:
- We check the DeletionProtection attribute on all RDS instances
- Production databases without this safeguard risk accidental deletion
- Enabling deletion protection is a one-click change with no downtime
Example:
Instance: production-db
Deletion Protection: Disabled
Action: Enable in RDS console → Modify → Deletion Protection
AWS Reference: RDS Deletion Protection
Why HIGH confidence: Configuration attribute is exact metadata. All production databases should have this enabled.
DynamoDB Without Deletion Protection
Confidence: HIGH
Detection: DynamoDB table with DeletionProtectionEnabled = false
Savings: Zero (reliability/governance finding)
How it works:
- We check the DeletionProtectionEnabled attribute on DynamoDB tables
- Tables without protection can be accidentally deleted via API or console
- Enabling deletion protection prevents accidental table deletion
Example:
Table: user-sessions
Deletion Protection: Disabled
Action: Enable via UpdateTable API or console
AWS Reference: DynamoDB Deletion Protection
Why HIGH confidence: Attribute is exact boolean metadata from the DynamoDB API.
RDS Publicly Accessible
Confidence: HIGH
Detection: RDS instance with PubliclyAccessible = true
Savings: Zero (security finding)
How it works:
- We check the PubliclyAccessible attribute on RDS instances
- Publicly accessible databases are reachable from the internet (if security groups allow)
- Best practice is to place databases in private subnets with VPC-only access
Example:
Instance: mydb-production
Publicly Accessible: Yes
Action: Modify instance → set Publicly Accessible to No
AWS Reference: RDS Security Best Practices
Why HIGH confidence: Boolean attribute from the RDS API. Combined with open security groups, this is a critical security risk.
Resource Without Backup Coverage
Confidence: MEDIUM
Detection: RDS/DynamoDB resource with no AWS Backup recovery points
Savings: Backup cost (~$0.50-$5/month per resource)
How it works:
- We cross-reference RDS instances and DynamoDB tables against AWS Backup inventory
- Resources with zero recovery points across all vaults are flagged
- This is a governance/resilience finding rather than a cost finding
Example:
Resource: arn:aws:rds:us-east-1:123456789:db:mydb
Recovery Points: 0
Action: Create backup plan covering this resource
AWS Reference: AWS Backup
Why MEDIUM confidence: Some resources use native snapshots (RDS automated backups) rather than AWS Backup. Verify backup strategy.
AWS Compute Optimizer Integration
CloudWise integrates with AWS Compute Optimizer to provide ML-backed rightsizing recommendations. This is a FREE AWS service that analyzes 14 days of CloudWatch metrics to recommend optimal resource configurations.
Oversized EC2 (Compute Optimizer)
Confidence: HIGH
Detection: AWS Compute Optimizer ML analysis
Savings: Exact calculation based on recommended instance type
How it works:
- AWS Compute Optimizer analyzes 14 days of CPU, memory, network, and disk metrics
- Returns specific recommendations (e.g., "t3.large → t3.medium")
- CloudWise surfaces these with exact savings calculations
Example:
Instance: i-0abc123def456
Current: m5.xlarge ($140.16/month)
Recommended: m5.large ($70.08/month)
Monthly Savings: $70.08 (50%)
Confidence: HIGH (AWS Compute Optimizer)
AWS Reference: Compute Optimizer
Why HIGH confidence: AWS ML model analyzes real utilization. Specific target instance provided.
Oversized EBS (Compute Optimizer)
Confidence: HIGH
Detection: AWS Compute Optimizer IOPS/throughput analysis
Savings: Based on recommended volume configuration
How it works:
- Compute Optimizer analyzes EBS IOPS and throughput patterns
- Recommends optimal volume type and size
- Exact savings calculated from current vs. recommended cost
AWS Reference: Compute Optimizer - EBS
Over-Provisioned Lambda (Compute Optimizer)
Confidence: HIGH
Detection: AWS Compute Optimizer memory analysis
Savings: Based on recommended memory configuration
How it works:
- Compute Optimizer analyzes Lambda memory utilization
- Recommends right-sized memory allocation
- Lower memory = lower cost per invocation
AWS Reference: Compute Optimizer - Lambda
Oversized RDS (Compute Optimizer)
Confidence: HIGH
Detection: AWS Compute Optimizer ML analysis of RDS instance
Savings: Exact calculation based on recommended instance type
How it works:
- AWS Compute Optimizer analyzes RDS instance CPU, memory, and I/O metrics
- Returns specific downsizing recommendations (e.g., "db.r6g.xlarge → db.r6g.large")
- CloudWise surfaces these with exact savings calculations
Example:
Instance: mydb-production
Current: db.r6g.xlarge ($0.48/hr)
Recommended: db.r6g.large ($0.24/hr)
Monthly Savings: $175.20
AWS Reference: Compute Optimizer - RDS
Why HIGH confidence: AWS Compute Optimizer uses 14+ days of ML-backed analysis. Recommendations include exact pricing.
Reserved Instance & Savings Plans Recommendations
CloudWise provides official AWS recommendations for commitment-based pricing. These are based on 30 days of actual usage and provide exact savings calculations.
API Cost Note: These recommendations are refreshed weekly to minimize Cost Explorer API costs ($0.01/request). Data is cached for 7 days.
EC2 Reserved Instance Opportunity
Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Exact AWS calculation (typically 30-40%)
How it works:
- AWS Cost Explorer analyzes 30 days of EC2 usage patterns
- Identifies consistent usage that would benefit from RI commitment
- Returns specific purchase recommendations with exact savings
Example:
Recommendation: Purchase 2x t3.large 1-Year No Upfront RI
Region: us-east-1
Monthly Savings: $42.40 (37% savings)
Break-even: 7 months
AWS Reference: EC2 Reserved Instances
Why HIGH confidence: AWS official recommendation based on actual usage patterns.
RDS Reserved Instance Opportunity
Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Exact AWS calculation
How it works:
- Analyzes RDS instance usage patterns
- Recommends RI purchases for consistent database workloads
- Includes engine-specific recommendations (MySQL, PostgreSQL, etc.)
AWS Reference: RDS Reserved Instances
Compute Savings Plan Opportunity
Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Exact AWS calculation (typically 20-30%)
How it works:
- Analyzes all compute usage (EC2, Fargate, Lambda)
- Recommends hourly commitment level
- Compute SP covers all regions and instance families (most flexible)
Example:
Recommendation: $10/hour Compute Savings Plan (1-Year)
Monthly Commitment: $7,300/month
Current On-Demand: $9,500/month
Monthly Savings: $2,200 (23%)
Coverage: EC2, Fargate, Lambda (all regions)
AWS Reference: Savings Plans
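The figures in the example relate to each other as shown below. This is a simplified sketch that treats the monthly commitment as replacing the entire On-Demand bill, which is how the example is framed; real recommendations from Cost Explorer may leave some usage uncovered. The function name is hypothetical.

```python
def savings_plan_summary(hourly_commitment, on_demand_monthly, hours_per_month=730):
    """Return (monthly_commitment, monthly_savings, savings_pct)."""
    monthly_commitment = hourly_commitment * hours_per_month
    monthly_savings = on_demand_monthly - monthly_commitment
    return monthly_commitment, monthly_savings, round(monthly_savings / on_demand_monthly * 100)


print(savings_plan_summary(10, 9500))  # (7300, 2200, 23)
```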
EC2 Instance Savings Plan Opportunity
Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Higher than Compute SP (locked to instance family/region)
Note: Only recommended when savings exceed 25% due to reduced flexibility.
AWS Reference: Savings Plans
Unused Reserved Instance
Confidence: HIGH
Detection: EC2 RI utilization < 20% over 30 days (via Cost Explorer Reservation Utilization API)
Savings: Committed RI cost × (1 - utilization%)
How it works:
- CloudWise reads your Cost Explorer Reservation Utilization report weekly
- Any EC2 RI group with < 20% utilization over the past 30 days is flagged
- Monthly waste is calculated as the fraction of committed cost not offsetting on-demand
Example:
- Instance type: m4.xlarge
- Reserved Instances: 3 (No Upfront 1-year, ~$140/month each)
- RI Utilization: 5% (36 hours matched out of 24 × 30 = 720 hours)
- Monthly waste: $140 Ć 3 Ć 0.95 = $399/month
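The waste formula can be sketched in a few lines. This is an illustrative calculation only, not CloudWise's implementation: the function names are hypothetical, and the utilization percentage is assumed to come from the Cost Explorer Reservation Utilization report.

```python
def is_unused_ri(utilization_pct: float, threshold_pct: float = 20.0) -> bool:
    """Flag an RI group whose 30-day utilization is below the threshold."""
    return utilization_pct < threshold_pct


def unused_ri_waste(monthly_cost_per_ri: float, ri_count: int,
                    utilization_pct: float) -> float:
    """Committed RI cost x (1 - utilization), rounded to cents."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization_pct must be between 0 and 100")
    committed = monthly_cost_per_ri * ri_count
    return round(committed * (1 - utilization_pct / 100), 2)


# Figures from the example: 3 RIs at ~$140/month each, 5% utilization.
waste = unused_ri_waste(140.0, 3, 5.0)
```

With the example's inputs, the unused 95% of a $420/month commitment is what gets reported as waste.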
Why HIGH confidence: Uses AWS Cost Explorer's own utilization calculation, which tracks exact RI-to-instance matching.
What to do: Determine if the low utilization is permanent (migrate fleet, sell RI) or temporary (workload scaling back up). Standard RIs can be sold on the AWS Reserved Instance Marketplace. Convertible RIs can be exchanged for a different type.
AWS Reference: Reserved Instances in Amazon EC2
Expiring Reserved Instance
Confidence: HIGH
Detection: Active EC2 RI with end date within 90 days
Savings: Monthly cost delta between current RI rate and on-demand equivalent (cost avoidance)
How it works:
- CloudWise reads your active EC2 Reserved Instances weekly
- Any RI expiring within 90 days is flagged with an urgency level (URGENT ≤30d / WARNING ≤60d / NOTICE ≤90d)
- The estimated impact shows what the covered instance(s) would cost at on-demand rates post-expiry
Example:
- Instance type: r5.4xlarge
- Reserved Instances: 2 (All Upfront 1-year)
- Expiry: 22 days from now → URGENT
- Current RI rate: ~$320/month each
- On-demand equivalent: ~$960/month each
- Risk: +$1,280/month if not renewed
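The urgency bands and the risk figure can be expressed as a small sketch. The function names are hypothetical, and the code assumes the RI end timestamp has already been parsed into a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_urgency(end: datetime, now: datetime) -> Optional[str]:
    """Map days-until-expiry to URGENT (<=30), WARNING (<=60), NOTICE (<=90)."""
    days_left = (end - now).days
    if days_left < 0 or days_left > 90:
        return None  # already expired, or too far out to flag
    if days_left <= 30:
        return "URGENT"
    if days_left <= 60:
        return "WARNING"
    return "NOTICE"


def on_demand_risk(ri_monthly: float, on_demand_monthly: float, count: int) -> float:
    """Extra monthly cost if the covered instances fall back to on-demand rates."""
    return round((on_demand_monthly - ri_monthly) * count, 2)


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
urgency = expiry_urgency(now + timedelta(days=22), now)
risk = on_demand_risk(320.0, 960.0, 2)
```

The same banding would apply to any commitment with a fixed end date.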
Why HIGH confidence: Expiry date is deterministic: describe_reserved_instances returns exact ISO 8601 end timestamps.
What to do: (a) Purchase a replacement RI before expiry, (b) switch to a Compute Savings Plan for more instance-type flexibility, or (c) if the workload is ending, allow expiry and plan to terminate the instance promptly.
Convertible RI Exchange Opportunity
Confidence: MEDIUM
Detection: Convertible EC2 RI on a previous-generation instance family where a current-gen exchange is available
Savings: On-demand price differential between old and new instance type × quantity
How it works:
- CloudWise identifies active Convertible RIs tied to previous-gen instance families (m4, r4, c5, etc.)
- Maps each to the recommended current-gen equivalent (m7i, r7i, c7i, etc.)
- If the target type has a lower on-demand reference price, the exchange is flagged
- The exchange itself carries no fee per AWS policy (AWS charges a true-up only when the target RI has a higher value than the one being exchanged)
Example:
- RI type: m4.xlarge Convertible (3-year, active, 18 months remaining)
- Recommended exchange: m7i.xlarge
- m4.xlarge on-demand: $0.192/hr → $140.16/month
- m7i.xlarge on-demand: $0.1785/hr → $130.31/month
- Savings on reference pricing: $9.85/month per RI
- Exchange cost: $0
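The arithmetic in this example can be reproduced with the sketch below. It assumes 730 hours per month, which is inferred from the example figures ($0.192 × 730 = $140.16), and uses `Decimal` so that cent rounding matches the numbers shown. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

HOURS_PER_MONTH = 730  # assumption inferred from the example figures
CENT = Decimal("0.01")


def monthly_cost(hourly_rate: str) -> Decimal:
    """Convert an hourly on-demand rate to a monthly cost, rounded to cents."""
    return (Decimal(hourly_rate) * HOURS_PER_MONTH).quantize(CENT, rounding=ROUND_HALF_UP)


def exchange_savings(old_hourly: str, new_hourly: str, quantity: int = 1) -> Decimal:
    """Reference-price savings per month; zero if the target is not cheaper."""
    diff = monthly_cost(old_hourly) - monthly_cost(new_hourly)
    return max(diff, Decimal("0")) * quantity


savings = exchange_savings("0.192", "0.1785")
```

Rates are passed as strings so that `Decimal` sees the exact published price rather than a binary floating-point approximation.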
Why MEDIUM confidence: Pricing differential relies on public on-demand pricing; actual RI discount rates are calculated by AWS at exchange time and may differ slightly.
What to do: Log in to EC2 Console → Reserved Instances → select the RI → Actions → Exchange Reserved Instance. Select the target instance type. Review the exchange ratio (AWS may require more than one old RI to match one new RI in some cases). Confirm the exchange; it takes effect immediately.
Note: Graviton-family targets (t4g, m7g, c7g) are not recommended automatically as ARM64 compatibility requires code-level validation.
Expiring Savings Plan
Confidence: HIGH
Detection: Savings Plan with end date within 90 days
Savings: Monthly commitment amount at risk after expiry
How it works:
- We check all active Savings Plans for upcoming expiration dates
- Plans expiring within 30 days are flagged as URGENT, within 60 days as WARNING
- Early notification allows time to renew or adjust commitment strategy
Example:
Savings Plan: sp-abc123def456
Type: Compute Savings Plan
Commitment: $500.00/month
Expires: 2026-05-15 (37 days away)
Urgency: WARNING
Action: Renew or purchase replacement SP
AWS Reference: Savings Plans
Why HIGH confidence: Expiration dates are exact metadata from the Savings Plans API. No estimation involved.
Unused Savings Plan
Confidence: HIGH
Detection: Savings Plan utilization < 20% over 30 days
Savings: Unused daily commitment × 30
How it works:
- We analyze Savings Plan utilization via Cost Explorer over the past 30 days
- Plans with utilization below 20% are flagged because the commitment is being paid but is not offsetting On-Demand usage
- This often indicates workload changes, instance type shifts, or over-commitment
Example:
Savings Plan: sp-abc123def456
Type: EC2 Instance Savings Plan
Commitment: $300.00/month
Utilization: 12%
Unused: $264.00/month
Action: Review commitment strategy
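As a rough sketch of the "unused daily commitment × 30" calculation: the function below aggregates daily utilization records. The field names `TotalCommitment` and `UsedCommitment` (string amounts) are assumed to mirror the Cost Explorer Savings Plans utilization response, and the function name is hypothetical.

```python
from typing import Dict, List


def unused_savings_plan(daily: List[Dict[str, str]], threshold_pct: float = 20.0) -> dict:
    """Summarize utilization over the window and project unused spend to 30 days."""
    total = sum(float(d["TotalCommitment"]) for d in daily)
    used = sum(float(d["UsedCommitment"]) for d in daily)
    if not daily or total == 0:
        return {"utilization_pct": 0.0, "unused_monthly": 0.0, "flagged": False}
    utilization_pct = used / total * 100
    unused_daily = (total - used) / len(daily)
    return {
        "utilization_pct": round(utilization_pct, 1),
        "unused_monthly": round(unused_daily * 30, 2),
        "flagged": utilization_pct < threshold_pct,
    }


# 30 days at a $10.00/day commitment ($300/month) with $1.20/day used (12%).
days = [{"TotalCommitment": "10.00", "UsedCommitment": "1.20"}] * 30
summary = unused_savings_plan(days)
```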
AWS Reference: Savings Plans Utilization
Why HIGH confidence: Utilization data comes directly from AWS Cost Explorer with exact dollar amounts.
Savings Plan Coverage Gap
Confidence: MEDIUM
Detection: Significant On-Demand spend not covered by existing Savings Plans
Savings: ~25% of uncovered On-Demand spend (conservative estimate)
How it works:
- We analyze Cost Explorer data for compute spend covered vs. uncovered by Savings Plans
- Accounts with < 50% SP coverage and > $100/month uncovered spend are flagged
- Purchasing additional SP commitment could reduce the uncovered portion by 25-40%
Example:
Total Compute Spend: $2,000/month
SP Coverage: 40% ($800 covered)
Uncovered On-Demand: $1,200/month
Potential SP Savings: $300.00/month (25%)
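The flagging rule and the conservative 25% estimate can be sketched as below. The function and parameter names are hypothetical; the thresholds are the ones stated in this section.

```python
def coverage_gap(total_compute_spend: float, covered_spend: float,
                 savings_rate: float = 0.25, max_coverage: float = 0.50,
                 min_uncovered_usd: float = 100.0) -> dict:
    """Flag accounts with < 50% SP coverage and > $100/month uncovered spend."""
    uncovered = total_compute_spend - covered_spend
    coverage = covered_spend / total_compute_spend if total_compute_spend > 0 else 1.0
    flagged = coverage < max_coverage and uncovered > min_uncovered_usd
    return {
        "coverage_pct": round(coverage * 100, 1),
        "uncovered": round(uncovered, 2),
        "flagged": flagged,
        "potential_savings": round(uncovered * savings_rate, 2) if flagged else 0.0,
    }


gap = coverage_gap(2000.0, 800.0)
```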
AWS Reference: Savings Plans Coverage
Why MEDIUM confidence: Savings estimate depends on the SP type/term chosen and actual usage stability. The 25% is conservative.
CUR Unused Reservation
Confidence: HIGH
Detection: CUR-based analysis showing Reserved Instance amortization with low utilization
Savings: Unused RI fee from CUR cost breakdown
How it works:
- We analyze CUR (Cost and Usage Report) line items for RI amortization
- RI fees without corresponding On-Demand equivalent usage indicate waste
- This provides more granular insight than the Cost Explorer summary
Example:
RI: ri-abc123def456
Type: m5.xlarge
Amortized Cost: $250.00/month
Utilized: 35%
Waste: $162.50/month
Action: Sell on RI Marketplace or modify
AWS Reference: Understanding CUR Reserved Instance Data
Why HIGH confidence: CUR data provides exact line-item cost allocation. No estimation needed.
CUR Savings Plan Waste
Confidence: MEDIUM
Detection: CUR-based analysis of Savings Plan commitment not fully utilized
Savings: Unused SP commitment from CUR breakdown
How it works:
- We analyze CUR line items for Savings Plan amortization and coverage
- SP commitment cost without matching covered usage indicates waste
- Provides daily granularity on SP utilization trends
Example:
Savings Plan: sp-abc123def456
Type: Compute SP
Monthly Commitment: $400.00
Utilized: 65%
Waste: $140.00/month
Action: Adjust workloads or reduce commitment at renewal
AWS Reference: Understanding CUR Savings Plans
Why MEDIUM confidence: CUR data is exact but utilization patterns may fluctuate month to month.
IPv4 Address Optimization
As of February 2024, AWS charges $0.005/hour ($3.65/month) for ALL public IPv4 addresses.
EIP on Stopped Instance
Confidence: HIGH
Detection: Elastic IP attached to stopped EC2 instance
Savings: $3.65/month per EIP
How it works:
- Identifies Elastic IPs attached to stopped instances
- Stopped instances don't need public IPs
- Release or detach to save $3.65/month per IP
Example:
Elastic IP: 52.123.45.67
Attached to: i-0abc123 (stopped)
Monthly Savings: $3.65
Action: Release EIP or terminate instance
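A minimal sketch of this check follows. It assumes address records shaped like the EC2 DescribeAddresses output (`PublicIp`, `InstanceId`) and a separately built map of instance ID to state name; the function name and the map are hypothetical.

```python
from typing import Dict, List

EIP_MONTHLY_COST = 3.65  # $0.005/hour x 730 hours


def eips_on_stopped_instances(addresses: List[dict],
                              instance_states: Dict[str, str]) -> List[dict]:
    """Return one finding per Elastic IP attached to a stopped instance."""
    findings = []
    for addr in addresses:
        instance_id = addr.get("InstanceId")
        if instance_id and instance_states.get(instance_id) == "stopped":
            findings.append({
                "public_ip": addr["PublicIp"],
                "instance_id": instance_id,
                "monthly_savings": EIP_MONTHLY_COST,
            })
    return findings


addresses = [
    {"PublicIp": "52.123.45.67", "InstanceId": "i-0abc123"},
    {"PublicIp": "52.123.45.68", "InstanceId": "i-0def456"},
    {"PublicIp": "52.123.45.69"},  # unattached: handled by a different detector
]
states = {"i-0abc123": "stopped", "i-0def456": "running"}
found = eips_on_stopped_instances(addresses, states)
```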
AWS Reference: VPC Pricing - Public IPv4
Multiple EIPs per Instance
Confidence: HIGH
Detection: Instance with more than one Elastic IP attached
Savings: $3.65/month per extra EIP
How it works:
- Identifies instances with multiple EIPs attached
- Multiple public IPs on a single instance are usually an anti-pattern
- Consolidate to single EIP to reduce costs
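One way to sketch the per-instance count, assuming address records that carry an `InstanceId` when attached (as in the EC2 DescribeAddresses output). The function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def extra_eip_savings(addresses: List[dict], monthly_cost: float = 3.65) -> Dict[str, float]:
    """Monthly savings per instance from consolidating to a single Elastic IP."""
    per_instance = Counter(a["InstanceId"] for a in addresses if a.get("InstanceId"))
    return {
        instance_id: round((count - 1) * monthly_cost, 2)
        for instance_id, count in per_instance.items()
        if count > 1
    }


sample = [
    {"PublicIp": "198.51.100.1", "InstanceId": "i-1"},
    {"PublicIp": "198.51.100.2", "InstanceId": "i-1"},
    {"PublicIp": "198.51.100.3", "InstanceId": "i-1"},
    {"PublicIp": "198.51.100.4", "InstanceId": "i-2"},
    {"PublicIp": "198.51.100.5"},
]
extra = extra_eip_savings(sample)
```

Only the addresses beyond the first on each instance count toward savings, matching the "per extra EIP" rate above.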
AWS Reference: VPC Pricing
Management Detectors
CloudWatch Log Group Without Retention
Confidence: MEDIUM (small groups) / HIGH (large groups)
Detection: CloudWatch log groups with no retention policy (logs stored forever)
Savings: Varies based on log volume ($0.03/GB-month for storage)
How it works:
- We identify all CloudWatch log groups without a retention policy
- Log groups without retention store logs indefinitely (growing storage costs)
- Larger log groups (>1GB) are flagged with higher confidence
Example:
Log Group: /aws/lambda/my-function
Size: 5.2GB
Retention: Never expires (no policy)
Monthly Savings: $0.16/month (storage) + preventing future growth
Recommendation: Set 30-90 day retention policy
AWS Reference: CloudWatch Pricing
Why flagged: Without retention, logs grow indefinitely. Most logs are only useful for 30-90 days.
Excessive Log Group Retention
Confidence: MEDIUM / HIGH (HIGH when >10 GB stored)
Detection: Log group with retention ≥ 365 days and ≥ 100 MB stored data
Savings: Proportional to stored data × retention reduction factor
How it works:
- We check each log group's retention policy via the CloudWatch Logs API
- Log groups with 365+ day retention and at least 100 MB of stored data are flagged
- Savings are calculated based on reducing retention to 30 days:
savings = storage_cost × (1 − 30/current_retention_days)
Example:
Log Group: /aws/lambda/api-handler
Retention: 365 days
Stored Data: 50 GB
Current Storage Cost: $1.50/month
Recommended Retention: 30 days
Estimated Savings: $1.38/month (92% reduction)
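The savings formula can be checked with a short sketch. The function name is hypothetical; the $0.03/GB-month storage price and the 30-day target are the values used in this section.

```python
def retention_savings(stored_gb: float, current_retention_days: int,
                      target_retention_days: int = 30,
                      price_per_gb_month: float = 0.03) -> float:
    """savings = storage_cost x (1 - target/current), rounded to cents."""
    if current_retention_days <= target_retention_days:
        return 0.0  # already at or below the target retention
    storage_cost = stored_gb * price_per_gb_month
    return round(storage_cost * (1 - target_retention_days / current_retention_days), 2)


# Example from above: 50 GB stored with 365-day retention.
estimate = retention_savings(50, 365)
```

The formula assumes stored data is spread evenly across the retention window, so the estimate is proportional rather than exact.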
Recommended Action: Reduce retention to 30 days. For compliance, consider S3 export + Glacier instead.
aws logs put-retention-policy \
--log-group-name '/aws/lambda/api-handler' \
--retention-in-days 30
Click Fix This to reduce retention automatically, with full rollback capability.
AWS Reference: CloudWatch Logs Pricing | PutRetentionPolicy API
Why flagged: Most log searches target recent data, typically the last few days. Storing 365+ days of logs in CloudWatch Logs is rarely cost-effective compared to S3 + Glacier for long-term archival.
Empty Log Group
Confidence: LOW
Detection: Log group with 0 bytes stored and no ingestion for 30+ days
Savings: $0.00 (hygiene and quota reclamation)
How it works:
- We check each log group's stored bytes from the CloudWatch Logs API
- Log groups with exactly 0 bytes are evaluated for staleness
- If the log group was created 30+ days ago (or last received logs 30+ days ago), it's flagged
- No savings: empty log groups are free. This is a housekeeping detector.
Example:
Log Group: /aws/lambda/deleted-processor
Stored Data: 0 bytes
Created: 180 days ago
Last Event: Never received logs
Status: Orphaned (Lambda function was deleted)
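The staleness check can be sketched as follows. The `storedBytes` and `creationTime` fields (milliseconds since epoch) follow the shape of the CloudWatch Logs DescribeLogGroups response; the function name and the optional `last_ingestion_ms` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER = timedelta(days=30)


def is_empty_and_stale(log_group: dict, now: datetime,
                       last_ingestion_ms: Optional[int] = None) -> bool:
    """True when the group stores 0 bytes and has been quiet for 30+ days."""
    if log_group.get("storedBytes", 0) != 0:
        return False
    # Fall back to the creation time when the group never received logs.
    reference_ms = last_ingestion_ms if last_ingestion_ms is not None else log_group["creationTime"]
    reference = datetime.fromtimestamp(reference_ms / 1000, tz=timezone.utc)
    return now - reference >= STALE_AFTER


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
created_ms = int((now - timedelta(days=180)).timestamp() * 1000)
group = {"logGroupName": "/aws/lambda/deleted-processor",
         "storedBytes": 0, "creationTime": created_ms}
stale = is_empty_and_stale(group, now)
```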
Recommended Action: Delete the empty log group to reduce clutter and free up log group quota (see the CloudWatch Logs quotas reference for the current per-region limit).
aws logs delete-log-group \
--log-group-name '/aws/lambda/deleted-processor'
Click Fix This to delete the empty log group automatically. CloudWatch will recreate the log group automatically if the associated service resumes logging.
AWS Reference: CloudWatch Logs Quotas
Why flagged: While free, orphaned log groups count against the per-region log group quota. Accounts with many microservices, Lambda functions, or ECS tasks can accumulate hundreds of empty log groups over time.
Old Log Group
Confidence: MEDIUM
Detection: CloudWatch log group with > 0.5 GB and no new log events for 90+ days
Savings: Storage cost at $0.03/GB/month
How it works:
- We check the last ingestion timestamp for each log group
- Groups with no new events for 90+ days and stored volume > 0.5 GB are flagged
- Stale data should be archived to S3 or deleted to reduce storage costs
Example:
Log Group: /aws/lambda/old-data-processor
Size: 2.3 GB
Last Event: 120 days ago
Monthly Savings: $0.07
Action: Export to S3 and set retention, or delete
AWS Reference: CloudWatch Logs Pricing
Why MEDIUM confidence: Some log groups are retained for audit/compliance. Verify retention requirements before deleting.
Unused CloudWatch Dashboard
Confidence: MEDIUM
Detection: CloudWatch dashboard beyond the first 3 free dashboards, not modified in 90+ days
Savings: $3.00/month per unused dashboard
How it works:
- AWS provides 3 free CloudWatch dashboards; additional dashboards cost $3.00/month each
- We identify paid-tier dashboards that haven't been modified in 90+ days
- Dashboards that haven't been updated may no longer reflect the current infrastructure
Example:
Dashboard: legacy-monitoring
Last Modified: 145 days ago
Tier: Paid (dashboard #5 of 8)
Monthly Savings: $3.00
Action: Delete or consolidate into active dashboards
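A rough sketch of the dashboard check is shown below. It assumes records with `DashboardName` and a timezone-aware `LastModified`, as in the CloudWatch ListDashboards output. Because the section does not say which dashboards count as the free three, the sketch makes its own assumption: deleting stale dashboards only saves money up to the number of dashboards beyond the free tier. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


def unused_dashboard_savings(dashboards: List[dict], now: datetime,
                             free_dashboards: int = 3, price: float = 3.00,
                             stale_days: int = 90) -> Tuple[List[str], float]:
    """Return stale dashboard names and the monthly savings from deleting them."""
    stale = [d["DashboardName"] for d in dashboards
             if now - d["LastModified"] >= timedelta(days=stale_days)]
    paid_count = max(len(dashboards) - free_dashboards, 0)
    # Assumption: savings are capped by the number of paid-tier dashboards.
    return stale, round(min(len(stale), paid_count) * price, 2)


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
boards = [{"DashboardName": f"team-{i}", "LastModified": now - timedelta(days=10)}
          for i in range(6)]
boards.append({"DashboardName": "legacy-monitoring", "LastModified": now - timedelta(days=145)})
boards.append({"DashboardName": "old-migration", "LastModified": now - timedelta(days=200)})
stale_names, monthly_savings = unused_dashboard_savings(boards, now)
```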
AWS Reference: CloudWatch Dashboards Pricing
Why MEDIUM confidence: Dashboards may still be viewed even if not modified. The "not modified" heuristic may not capture viewing activity.
CloudTrail Optimization
Duplicate CloudTrail Trails
Confidence: HIGH
Detection: Multiple trails logging same events in same region
Savings: $2.00 per 100,000 events for duplicate trails
How it works:
- Analyzes CloudTrail configurations across the account
- Identifies trails with overlapping event selectors
- Recommends consolidation to single trail
Example:
Trail 1: my-trail (logging all management events)
Trail 2: security-trail (logging all management events)
Issue: Duplicate logging of same events
Savings: Depends on event volume
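To make "depends on event volume" concrete, here is a small sketch. The trail records use a hypothetical flattened shape (`Name`, `HomeRegion`, `IncludeManagementEvents`) that would have to be assembled from the trail list and each trail's event selectors; the function names are also hypothetical. The price is the $2.00 per 100,000 events quoted above.

```python
from collections import defaultdict
from typing import Dict, List


def duplicate_management_trails(trails: List[dict]) -> Dict[str, List[str]]:
    """Per region, list every trail after the first that logs management events."""
    by_region = defaultdict(list)
    for trail in trails:
        if trail.get("IncludeManagementEvents"):
            by_region[trail["HomeRegion"]].append(trail["Name"])
    return {region: names[1:] for region, names in by_region.items() if len(names) > 1}


def duplicate_trail_cost(events_per_month: int, duplicate_trails: int,
                         price_per_100k: float = 2.00) -> float:
    """Monthly cost of the duplicate copies of management events."""
    return round(events_per_month / 100_000 * price_per_100k * duplicate_trails, 2)


trails = [
    {"Name": "my-trail", "HomeRegion": "us-east-1", "IncludeManagementEvents": True},
    {"Name": "security-trail", "HomeRegion": "us-east-1", "IncludeManagementEvents": True},
    {"Name": "data-trail", "HomeRegion": "us-east-1", "IncludeManagementEvents": False},
]
duplicates = duplicate_management_trails(trails)
cost = duplicate_trail_cost(1_500_000, 1)
```

The sketch treats any two trails that log management events in the same region as overlapping; real event selectors can be narrower, so a finer comparison may be needed.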
AWS Reference: CloudTrail Pricing
CloudTrail S3 Without Lifecycle
Confidence: MEDIUM
Detection: CloudTrail S3 bucket without lifecycle policy
Savings: Depends on log volume and retention
How it works:
- Identifies S3 buckets receiving CloudTrail logs
- Checks for lifecycle policy to transition/expire old logs
- Recommends lifecycle policy to reduce storage costs
AWS Reference: S3 Lifecycle
AI Copilot Integration
The AI Copilot uses waste detection data to provide intelligent recommendations:
User: "What can I do to reduce my AWS costs?"
AI Copilot: "I found 3 high-confidence waste items that could save you $82.40/month:
1. **Unattached EBS Volume** (vol-abc123) - 500GB gp3 not attached to any instance
Savings: $40/month
Action: `aws ec2 delete-volume --volume-id vol-abc123`
2. **Idle NAT Gateway** (nat-xyz789) - No traffic for 7 days
Savings: $32.40/month
Action: Delete via VPC console if not needed
3. **2 Old EBS Snapshots** - Over 90 days old
Savings: $10/month
Action: Review and delete if no longer needed for recovery
Would you like me to explain any of these in more detail?"
The AI Copilot:
- Prioritizes HIGH confidence items first
- Explains the detection methodology when asked
- Provides ready-to-run AWS CLI commands
- Warns about potential impacts before destructive actions
AWS Pricing References
Threshold Configuration
CloudWise uses a two-tier threshold system to give you complete control over waste detection and notifications:
Understanding the Two Threshold Systems
┌──────────────────────────────────────────────────────────────────────┐
│ TIER 1: DETECTION PHASE                                              │
│ Per-Account Setting (aws-accounts table)                             │
│                                                                      │
│ AWS Account Scan → Find $0.20 EBS gp2→gp3 migration opportunity      │
│ Check: Is $0.20 >= min_waste_threshold_usd ($0.01)?                  │
│ Result: YES → Store in database ✓                                    │
└──────────────────────────────────────────────────────────────────────┘
                                    ↓
┌──────────────────────────────────────────────────────────────────────┐
│ TIER 2: NOTIFICATION PHASE                                           │
│ Per-User Setting (users table → notification_preferences)            │
│                                                                      │
│ Check user's total waste findings → Total: $45.00/month              │
│ Check: Is $45.00 >= min_savings_threshold ($50.00)?                  │
│ Result: NO → Don't send email notification                           │
└──────────────────────────────────────────────────────────────────────┘
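The two tiers in the diagram can be sketched as two small functions. This is an illustration of the flow, not CloudWise's code: the function names and the finding record shape are hypothetical, while the setting names and the `>=` comparisons come from the diagram.

```python
from typing import List


def detect(findings: List[dict], min_waste_threshold_usd: float = 0.01) -> List[dict]:
    """Tier 1: keep only findings at or above the per-account detection threshold."""
    return [f for f in findings if f["monthly_savings"] >= min_waste_threshold_usd]


def should_notify(stored: List[dict], min_savings_threshold: float = 10.00) -> bool:
    """Tier 2: notify only when total stored waste reaches the per-user threshold."""
    return sum(f["monthly_savings"] for f in stored) >= min_savings_threshold


findings = [
    {"name": "gp2-to-gp3", "monthly_savings": 0.20},
    {"name": "idle-ec2", "monthly_savings": 44.80},
    {"name": "rounding-noise", "monthly_savings": 0.001},
]
stored = detect(findings)                  # the $0.001 item is dropped at Tier 1
notify = should_notify(stored, 50.00)      # ~$45.00 total is below a $50.00 threshold
```

Keeping the tiers separate means a finding can be stored and visible in the findings view even when it is too small to trigger a notification.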
Tier 1: Account-Level Detection Threshold
These settings control what waste items are discovered and stored in the database during scans.
| Setting | Default | Range | Description |
|---|---|---|---|
| cloudwatch_enabled | true | on/off | Enable CloudWatch metrics for accurate utilization analysis |
| min_waste_threshold_usd | $0.01 | $0.01 - $100 | Minimum monthly savings to include a waste item |
Where to configure:
- Go to AWS Accounts page (Settings → AWS Accounts)
- Click the Edit (pencil) icon on any connected account
- Scroll to Waste Detection Settings section
- Adjust the settings and click Update Account
Impact: Items below min_waste_threshold_usd are never stored in the database. With the default of $0.01, virtually all waste is captured.
Tier 2: User-Level Notification Threshold
These settings control when email and Slack notifications are sent about waste findings.
| Setting | Default | Range | Description |
|---|---|---|---|
| min_savings_threshold | $10.00 | $1 - $1000 | Minimum total monthly savings to trigger notification |
| frequency | daily | daily/weekly/monthly | How often to send waste summary notifications |
| high_priority_immediate | true | on/off | Send immediately for high-priority findings |
| high_priority_threshold | $50.00 | $10 - $500 | Monthly savings threshold for high priority |
Slack notifications: Waste detection can send summaries to a dedicated Slack channel. The message includes total findings, potential monthly savings, top 5 resources, and a direct link to the findings view in your Workspace.
To enable Slack for waste detection:
- Go to Settings → Notifications
- Find the Waste Detection card
- Enable Slack notifications and paste a webhook URL
- Click Test Connection to verify
See the Slack Integration Guide for full multi-channel setup.
Where to configure:
- Go to Settings → Notifications
- Find the Waste Findings section
- Adjust your notification preferences
- Click Save Changes
Impact: Emails are only sent if total waste across all accounts exceeds min_savings_threshold.
Recommended Settings by Use Case
| Use Case | Detection Threshold | Notification Threshold | Notes |
|---|---|---|---|
| Catch Everything (default) | $0.01 | $10.00 | See all waste, get notified for meaningful amounts |
| High-Volume Accounts | $1.00 | $50.00 | Reduce noise from micro-optimizations |
| Enterprise | $5.00 | $100.00 | Focus only on significant waste |
| Cost-Conscious Startup | $0.01 | $1.00 | Every penny counts |
Configuration Examples
Example 1: Default (Catch Everything)
Account Detection: $0.01 → Stores $0.05 EBS snapshot, $2.30 idle EC2
User Notification: $10.00 → Email sent if total waste > $10/month
Example 2: Enterprise (Reduce Noise)
Account Detection: $5.00 → Ignores $0.05 EBS snapshot and $2.30 idle EC2 (both below $5.00)
User Notification: $100.00 → Email only for significant waste
CloudWatch Metrics Toggle
The CloudWatch Metrics toggle affects detection accuracy:
| Setting | Detection Method | Confidence |
|---|---|---|
| Enabled (recommended) | Uses actual CPU, memory, network metrics | HIGH |
| Disabled | Uses configuration patterns only | MEDIUM |
When to disable CloudWatch:
- Accounts with strict IAM policies preventing CloudWatch access
- Reducing CloudWatch API costs (minimal impact, ~$0.01/scan)
Additional Configuration
Waste detection includes additional thresholds configurable by CloudWise support:
| Setting | Default | Description |
|---|---|---|
| ec2_idle_cpu_threshold | 5% | CPU below this = idle |
| ec2_idle_days | 14 | Days to analyze |
| rds_idle_days | 14 | Days to analyze connections |
| snapshot_age_days | 90 | Snapshots older than this are flagged |
Contact support to adjust these for your organization.
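To show how these thresholds might be applied, here is a hedged sketch. The `WasteThresholds` class and both functions are hypothetical, and the idle rule is an assumption: the table does not say whether "CPU below this" refers to an average or a peak, so the sketch requires every daily average in the window to stay under the threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WasteThresholds:
    ec2_idle_cpu_threshold: float = 5.0   # percent
    ec2_idle_days: int = 14
    rds_idle_days: int = 14
    snapshot_age_days: int = 90


DEFAULTS = WasteThresholds()


def is_ec2_idle(daily_avg_cpu: List[float], t: WasteThresholds = DEFAULTS) -> bool:
    """Idle when every daily average CPU value in the window is under the threshold."""
    if len(daily_avg_cpu) < t.ec2_idle_days:
        return False  # not enough data to decide
    window = daily_avg_cpu[-t.ec2_idle_days:]
    return max(window) < t.ec2_idle_cpu_threshold


def is_snapshot_old(age_days: int, t: WasteThresholds = DEFAULTS) -> bool:
    """Snapshots older than snapshot_age_days are flagged."""
    return age_days > t.snapshot_age_days


idle = is_ec2_idle([2.0] * 14)
```

A stricter organization could pass a custom instance, for example `WasteThresholds(ec2_idle_cpu_threshold=2.0, ec2_idle_days=30)`.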
Best Practices
- Start with HIGH confidence items - These are guaranteed savings
- Review MEDIUM confidence items - Check for exceptions before acting
- Use LOW confidence items for cleanup - Not cost savings, but good hygiene
- Enable CloudWatch metrics - Required for idle detection
- Tag your resources - Helps identify ownership for waste items
- Schedule regular reviews - Run waste detection monthly
Last updated: January 2026