Waste Detection Guide

CloudWise Waste Detection automatically identifies AWS resources that are costing you money without providing value. This guide explains how each detector works, the confidence levels behind our recommendations, and the AWS pricing references that back up our calculations.

Try It Without Connecting Your AWS Account!

You can run Waste Detection in Air-Gapped Mode without granting CloudWise any access to your AWS account. Simply run our export script in AWS CloudShell, upload the results, and get instant analysis of your unused resources.

Learn about Air-Gapped Mode →

🎯 Our Philosophy: Accuracy Over Everything

We believe in transparent, verifiable savings. Every waste item we flag includes:

Confidence Level: How certain we are about the finding
Exact Savings Calculation: Based on actual AWS pricing, not estimates
AWS Pricing Reference: Link to official AWS documentation
Action Command: Ready-to-run AWS CLI command

We intentionally do not include detectors that make arbitrary assumptions (like "50% savings from downsizing") because misleading recommendations destroy trust.

📊 Confidence Levels Explained

CloudWise uses three confidence levels to help you prioritize actions:

HIGH Confidence ✅

These findings are 100% accurate with no assumptions:

Uses actual resource sizes/counts from AWS APIs
Uses fixed AWS pricing (published rates)
Clear binary detection (e.g., "attached or not")
You can act on these immediately

Examples: Unattached EBS volumes, Unattached Elastic IPs, Old snapshots

MEDIUM Confidence ⚠️

These findings are accurate with caveats:

Uses real CloudWatch metrics
Thresholds are configurable (e.g., "idle" means < 5% CPU)
May have legitimate exceptions (scheduled jobs, DR resources)
Review before taking action

Examples: Idle EC2 instances, Idle RDS databases, Log groups without retention

LOW Confidence ⚡

These are informational only or for cleanup purposes:

Minimal actual cost savings
Detection is for housekeeping, not cost reduction
Consider during maintenance windows

Examples: Unused Lambda functions (pay-per-use = $0 when unused)
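
As a rough illustration of how these levels can drive prioritization, the sketch below sorts findings by confidence and then by monthly savings. The `Finding` class, the `prioritize` helper, and the dollar amounts are hypothetical and not part of CloudWise; only the three level names come from this guide.

```python
from dataclasses import dataclass

# Lower rank means "act sooner"; the order follows the three levels above.
CONFIDENCE_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass
class Finding:
    waste_type: str
    confidence: str         # "HIGH", "MEDIUM", or "LOW"
    monthly_savings: float  # USD, illustrative values only


def prioritize(findings):
    """Order findings by confidence level, then by largest savings first."""
    return sorted(
        findings,
        key=lambda f: (CONFIDENCE_RANK[f.confidence], -f.monthly_savings),
    )


findings = [
    Finding("lambda_arm64_migration", "LOW", 1.20),
    Finding("old_rds_snapshot", "MEDIUM", 14.00),
    Finding("unattached_eip", "HIGH", 3.65),
    Finding("unattached_ebs", "HIGH", 8.00),
]

for f in prioritize(findings):
    print(f"{f.confidence:<6} {f.waste_type:<24} ${f.monthly_savings:.2f}")
```

Sorting on a (rank, negative savings) tuple keeps HIGH findings at the top while still surfacing the largest savings within each level.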

📊 Complete Detector Inventory

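CloudWise includes 191 waste detectors spanning 42 AWS services, organized into 10 service categories. Each detector identifies a specific cost optimization opportunity with a verified savings calculation. Some detectors are listed under more than one category (for example, `unused_lambda` appears under both Compute and Serverless), so the per-category counts below add up to more than 191.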

Detector Overview by Category

Category	Detectors	Description
Compute Optimization	34	EC2, Lambda, ECS, EKS (including Extended Support), WorkSpaces (5 detectors), Lightsail (6 detectors), Beanstalk (6 detectors)
Storage Optimization	26	EBS, S3 (lifecycle, growth, tiering, empty buckets), EFS, FSx, ECR, Snapshots, AWS Backup
Database Optimization	40	RDS (including Extended Support), Aurora (4 detectors), DynamoDB, ElastiCache (7 detectors including Extended Support), Redshift, OpenSearch (including Extended Support), Neptune (4 detectors), DocumentDB (including Extended Support), Timestream, QLDB
Network Optimization	19	EIP, NAT Gateway, Load Balancers (ALB/NLB/CLB), CloudFront, Route53, Global Accelerator (3 detectors), Transfer Family, VPC Endpoints
Serverless Optimization	10	Lambda, API Gateway, AppSync (3 detectors), Step Functions (4 detectors)
Analytics Optimization	24	EMR (6 detectors), Kinesis (6 detectors), MSK (idle + oversized), Glue (8 detectors), MQ (2 detectors), Firehose
ML/AI Optimization	6	SageMaker Notebooks, Endpoints, Oversized Endpoints, Stopped Notebook Storage, Previous-Gen Instances, SP Recommendations
Management & Operations	7	CloudWatch Logs (4 detectors), CloudTrail, CloudWatch Dashboards
Security & Compliance	12	Secrets Manager, KMS, Encryption-at-rest (EBS, RDS, EFS, S3, OpenSearch, DocumentDB), Deletion Protection (RDS, DynamoDB), Public Access (RDS), Backup Coverage
AWS Compute Optimizer	4	ML-backed rightsizing for EC2, Lambda, EBS, RDS
Reserved Instance/Savings Plans	16	RI & SP purchase recommendations, commitment utilization, expiry tracking, coverage analysis
IPv4 Optimization	2	EIP on stopped instance, Multiple EIPs per instance

Note: 171 detectors are available in Air-Gapped Mode with full export. 20 detectors require Online Mode (Compute Optimizer, RI/SP recommendations, Commitment Risk Intelligence, Beanstalk orphaned RDS correlation, and orphaned DNS record cross-referencing). See the Coverage Matrix for details.

📴 Air-Gapped Mode vs Online Mode

Feature	Air-Gapped Mode	Online Mode
Setup Time	~5 minutes	~10 minutes
AWS Connection	None required	IAM role required
Detectors Available	171 of 191	All 191
Real-time Monitoring	❌	✅
Compute Optimizer	❌	✅
RI/SP Recommendations	❌	✅
Ideal For	Evaluation, compliance	Production monitoring

Air-Gapped Mode is perfect for:

Companies with strict security/compliance requirements
Evaluating CloudWise before full integration
One-time cost audits
Teams that need approval before granting external access

Get started with Air-Gapped Mode →

Complete Detector Reference Table

The tables below list all implemented waste detectors with their confidence levels and detection criteria.

Compute Optimization (34 Detectors)

Waste Type	Confidence	Detection
`idle_ec2`	HIGH	CPU < 5% for 14 days
`stopped_ec2_with_ebs`	MEDIUM	Stopped instance with EBS volumes
`unused_lambda`	HIGH	Zero invocations for 30 days
`over_provisioned_lambda`	MEDIUM	Memory utilization < 50%
`lambda_provisioned_concurrency_idle`	HIGH	Provisioned Concurrency utilization < 10% for 14 days
`lambda_excessive_timeout`	MEDIUM	Timeout ≥ 10× average duration
`lambda_arm64_migration`	LOW	x86_64 function with ARM64-compatible runtime
`oversized_ecs_task`	MEDIUM	CPU/memory utilization analysis
`idle_ecs_service`	HIGH	Zero running tasks for 7 days
`ecs_no_autoscaling`	MEDIUM	Fargate service without Application Auto Scaling
`ecs_container_insights_waste`	LOW	Container Insights on dev/small/idle cluster
`oversized_ecs_memory`	MEDIUM	Memory over-provisioned relative to usage
`idle_sagemaker_notebook`	HIGH	InService with no activity
`idle_sagemaker_endpoint`	HIGH	Zero invocations for 7 days
`stopped_sagemaker_notebook_storage`	HIGH	Stopped notebook with EBS volume
`previous_gen_sagemaker_instance`	HIGH	Previous-gen instance type (ml.m4/c4/t2/r4/p2)
`idle_workspace`	HIGH	Zero connections for 30 days
`oversized_workspace`	MEDIUM	WorkSpace bundle over-provisioned relative to usage
`workspaces_autostop_opportunity`	HIGH	AlwaysOn WorkSpace with low usage — switch to AutoStop billing
`workspaces_pool_overprovisioned_capacity`	HIGH	WorkSpaces Pool with excess provisioned capacity above p95 demand
`workspaces_windows_license_optimization`	MEDIUM	Windows license-included desktops eligible for BYOL or Linux alternatives
`idle_lightsail`	HIGH	CPU < 5% for 14 days
`lightsail_unattached_static_ip`	LOW	Allocated but not attached to any instance
`lightsail_unattached_disk`	MEDIUM	Block storage disk not attached to any instance
`lightsail_old_snapshot`	MEDIUM	Manual snapshot older than 90 days
`lightsail_idle_load_balancer`	MEDIUM	Load balancer with zero healthy instances
`lightsail_idle_database`	HIGH	Database with <1% CPU and zero connections for 14 days
`idle_beanstalk`	HIGH	Unhealthy environment (Grey/Red health)
`beanstalk_idle_traffic`	HIGH	Zero requests for 14 days
`beanstalk_unnecessary_alb`	HIGH	Load-balanced with single instance (min=max=1)
`beanstalk_previous_gen_instances`	HIGH	Using previous-generation instance types (t2, m4, c4, etc.)
`beanstalk_over_provisioned`	MEDIUM	CPU utilization <25% for 14 days
`beanstalk_orphaned_rds`	HIGH	Orphaned or unused RDS from Beanstalk environment
`eks_extended_support_cost`	HIGH	EKS cluster version in warning window or active Extended Support surcharge state
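
Most rows in this table reduce to a threshold comparison on a metric or a configuration value. As a minimal sketch, the `lambda_excessive_timeout` rule (timeout ≥ 10× average duration) could be written as below. The function name, parameter names, and units are assumptions for illustration, not CloudWise's implementation.

```python
def is_excessive_timeout(timeout_seconds, avg_duration_ms, factor=10.0):
    """Return True when the configured timeout is at least `factor` times
    the observed average duration.

    timeout_seconds: the function's configured timeout, in seconds.
    avg_duration_ms: average invocation duration, in milliseconds.
    """
    if avg_duration_ms <= 0:
        # No invocation data: nothing to compare against, so do not flag.
        return False
    return timeout_seconds * 1000 >= factor * avg_duration_ms


# A 15-minute timeout on a function that averages 1.2 seconds is flagged.
print(is_excessive_timeout(timeout_seconds=900, avg_duration_ms=1200))
# A 3-second timeout on the same function is not.
print(is_excessive_timeout(timeout_seconds=3, avg_duration_ms=1200))
```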

Storage Optimization (26 Detectors)

Waste Type	Confidence	Detection
`unattached_ebs`	HIGH	Volume status = available
`old_ebs_snapshot`	HIGH	Snapshot age > 90 days
`orphaned_ebs_snapshot`	HIGH	Source volume deleted — snapshot serves no recovery purpose
`ami_orphaned_snapshot`	HIGH	AMI deregistered — backing snapshot no longer needed
`gp2_migration`	HIGH	Volume type = gp2
`over_provisioned_iops`	MEDIUM	Provisioned IOPS underutilized
`no_lifecycle_policy`	HIGH	S3 bucket without lifecycle rules
`incomplete_multipart`	MEDIUM	Failed multipart uploads
`s3_rapid_growth`	MEDIUM	Bucket growth >100% in 30 days, size >1 GB, absolute growth >10 GB, no expiration rules
`s3_wrong_storage_class`	MEDIUM	S3 Standard >50 GB, >90% of total storage, no Intelligent-Tiering or lifecycle transitions
`s3_empty_bucket`	HIGH	Empty bucket (0 objects, 0 bytes) older than 30 days, excluding infra buckets
`s3_high_request_and_transfer_cost`	LOW	Non-storage costs (transfer + API requests) exceed storage costs (CUR-based)
`idle_efs`	HIGH	Zero client connections for 14 days
`no_lifecycle_efs`	HIGH	EFS filesystem without lifecycle policy
`idle_fsx`	HIGH	Zero network I/O for 14 days
`oversized_fsx`	MEDIUM	Storage utilization below 40% of provisioned capacity
`fsx_throughput_overprovisioned`	MEDIUM	Throughput utilization below 30% of provisioned (Windows/ONTAP/OpenZFS)
`old_fsx_backup`	MEDIUM	Manual FSx backup older than 90 days
`ecr_no_lifecycle_policy`	MEDIUM	ECR repo without lifecycle policy
`old_ecr_images`	MEDIUM	Images > 90 days old
`untagged_ecr_images`	HIGH	ECR images without required tags
`old_backup`	MEDIUM	Recovery point past retention window (> 90 days)
`redundant_backup`	MEDIUM	Duplicate recovery points for same resource in same vault
`backup_no_lifecycle_tiering`	LOW	Recovery point not transitioned to cold storage
`stale_backup_plan_assignment`	LOW	Backup plan with selection rules matching zero resources
`backup_copy_policy_overreach`	LOW	Cross-region copy rules duplicating backups unnecessarily
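
Some storage rules combine several conditions. The sketch below shows one way the `s3_rapid_growth` criteria (growth >100% in 30 days, size >1 GB, absolute growth >10 GB, no expiration rules) might be combined. It assumes that "size" refers to the current bucket size; the function and parameter names are hypothetical.

```python
def is_rapid_growth(size_now_gb, size_30d_ago_gb, has_expiration_rule):
    """Apply all four s3_rapid_growth conditions from the table above."""
    if has_expiration_rule or size_30d_ago_gb <= 0:
        # Expiration rules already bound growth; a zero baseline makes the
        # percentage undefined, so the bucket is not flagged in this sketch.
        return False
    absolute_growth_gb = size_now_gb - size_30d_ago_gb
    growth_pct = absolute_growth_gb / size_30d_ago_gb * 100
    return growth_pct > 100 and size_now_gb > 1 and absolute_growth_gb > 10


print(is_rapid_growth(50, 20, has_expiration_rule=False))  # 150% and +30 GB
print(is_rapid_growth(12, 5, has_expiration_rule=False))   # 140% but only +7 GB
```

Requiring both a relative and an absolute threshold keeps small buckets that double from a tiny baseline out of the results.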

Database Optimization (40 Detectors)

Waste Type	Confidence	Detection
`idle_rds`	HIGH	Zero connections for 14 days
`old_rds_snapshot`	MEDIUM	Manual snapshot > 90 days
`aurora_io_optimization_opportunity`	HIGH	Aurora Standard cluster where I/O charges >25% of total spend
`aurora_extended_support_cost`	HIGH	Aurora cluster running EOL engine version under Extended Support
`aurora_serverless_opportunity`	MEDIUM	Low-utilization provisioned Aurora cluster suitable for Serverless v2
`aurora_to_rds_downgrade_opportunity`	MEDIUM	Underutilized Aurora cluster where single-node RDS would be more cost-effective
`rds_extended_support_cost`	HIGH	Non-Aurora RDS MySQL/PostgreSQL instance in warning window or active Extended Support surcharge state
`idle_dynamodb`	HIGH	Zero consumed capacity for 14 days
`dynamodb_no_autoscaling`	HIGH	Provisioned mode without auto-scaling
`over_provisioned_dynamodb`	MEDIUM	Capacity utilization < 20%
`idle_elasticache`	HIGH	Zero connections for 7 days
`oversized_elasticache`	MEDIUM	ElastiCache node over-provisioned relative to usage
`elasticache_extended_support_cost`	HIGH	ElastiCache Redis/Memcached cluster in warning window or active Extended Support surcharge state
`elasticache_replication_waste`	HIGH	Non-production cluster with unnecessary replicas
`elasticache_engine_migration`	HIGH	Redis OSS / Memcached cluster eligible for 20% cheaper Valkey
`elasticache_serverless_optimization`	MEDIUM	Node-based cluster with spiky traffic better suited for Serverless
`elasticache_data_tiering_opportunity`	MEDIUM	Memory-only R6g/R7g cluster eligible for R6gd data tiering (up to 52% savings)
`idle_redshift`	HIGH	Zero connections for 14 days
`underutilized_redshift`	MEDIUM	CPU < 10% for 14 days, non-zero connections
`oversized_redshift`	MEDIUM	Redshift cluster over-provisioned relative to query load
`redshift_no_pause`	MEDIUM	No pause schedule, >40% zero-connection hours
`redshift_spectrum_heavy`	LOW	Spectrum cost >50% of compute cost
`redshift_legacy_dc2`	LOW	DC2 node type, recommend RA3 migration
`redshift_wlm_over_provisioned`	MEDIUM	WLM queue near-empty, <50% concurrency slots used
`redshift_concurrency_scaling_waste`	LOW	Concurrency scaling exceeds free 1h/day credit
`idle_opensearch`	HIGH	Zero requests for 14 days
`opensearch_extended_support_cost`	HIGH	OpenSearch/legacy Elasticsearch domain in warning window or active legacy support surcharge state
`oversized_opensearch`	MEDIUM	Avg CPU < 20%, max CPU < 40% with active search traffic
`opensearch_ebs_overprovisioned`	MEDIUM	Free storage > 60% with flat growth (< 0.1 GB/day)
`ri_opportunity_opensearch`	LOW	On-demand domain eligible for Reserved Instance savings
`idle_neptune`	HIGH	Zero requests for 14 days
`neptune_serverless_opportunity`	HIGH	Provisioned cluster with avg CPU < 20%, max < 50%, cost > Serverless floor
`oversized_neptune`	MEDIUM	Per-instance avg CPU < 20%, max < 40% with downsize path available
`neptune_old_snapshot`	HIGH	Manual snapshot > 90 days
`idle_documentdb`	HIGH	Zero connections for 14 days
`old_documentdb_snapshot`	MEDIUM	Manual snapshot > 90 days
`overprovisioned_documentdb`	MEDIUM	CPU/connection utilization < 20% for 14 days
`documentdb_extended_support_cost`	HIGH	DocumentDB cluster in warning window or active Extended Support surcharge state
`idle_timestream`	HIGH	Zero writes for 30 days
`idle_qldb`	HIGH	Zero requests for 30 days
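
Several database rules depend on a share of time rather than a single average. As an illustrative sketch, the `redshift_no_pause` rule (no pause schedule, >40% zero-connection hours) could be evaluated from hourly connection counts as follows. The input shape and helper names are assumptions.

```python
def zero_connection_share(hourly_connections):
    """Fraction of hours in which the connection count was zero."""
    if not hourly_connections:
        return 0.0
    idle_hours = sum(1 for count in hourly_connections if count == 0)
    return idle_hours / len(hourly_connections)


def suggest_pause_schedule(hourly_connections, has_pause_schedule, threshold=0.40):
    """Flag clusters with no pause schedule that sit unused too often."""
    if has_pause_schedule:
        return False
    return zero_connection_share(hourly_connections) > threshold


# One day of samples: 14 idle hours followed by 10 busy hours.
day = [0] * 14 + [5] * 10
print(round(zero_connection_share(day), 2))
print(suggest_pause_schedule(day, has_pause_schedule=False))
```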

Network Optimization (19 Detectors)

Waste Type	Confidence	Detection
`unattached_eip`	HIGH	No instance or ENI attached
`eip_on_stopped_instance`	HIGH	EIP attached to stopped instance
`multiple_eips_per_instance`	HIGH	Instance with > 1 EIP
`idle_nat_gateway`	HIGH	Zero bytes transferred for 7 days
`idle_load_balancer`	HIGH	Zero healthy targets or zero requests with healthy targets
`low_traffic_alb`	MEDIUM	ALB/NLB with <100 requests in 14 days but healthy targets
`high_lcu_cost_alb`	LOW	ALB where LCU cost exceeds 2× base fee
`classic_lb_migration`	LOW	Classic Load Balancer — migrate to ALB/NLB
`unused_distribution`	HIGH	CloudFront with zero requests for 30 days
`unused_hosted_zone`	HIGH	Route53 zone with only NS/SOA records
`unused_accelerator`	HIGH	Global Accelerator with zero traffic
`idle_global_accelerator`	HIGH	Global Accelerator deployed with endpoints but zero processed bytes for 30 days
`disabled_global_accelerator`	MEDIUM	Global Accelerator disabled but still incurring fixed hourly charges
`idle_transfer_server`	HIGH	Transfer Family server with zero file transfers for 30 days
`idle_transfer_no_activity`	HIGH	Transfer Family server with zero file operations for 14 days
`idle_transfer_web_app`	HIGH	Transfer Family web app with no activity
`unused_transfer_protocol`	MEDIUM	Transfer server with unused enabled protocols
`unused_vpc_endpoint`	HIGH	VPC endpoint with zero traffic for 14 days
`orphaned_dns_record`	MEDIUM	DNS record pointing to non-existent resource
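
The two load balancer rules above (`idle_load_balancer` and `low_traffic_alb`) are related, and the sketch below shows how they could be told apart. It assumes both rules are evaluated over the same 14-day window, which the table states only for `low_traffic_alb`; the function name is hypothetical.

```python
def classify_load_balancer(healthy_targets, requests_14d):
    """Return the matching waste type, or None when neither rule applies."""
    # idle_load_balancer: zero healthy targets, or healthy targets with zero requests.
    if healthy_targets == 0 or requests_14d == 0:
        return "idle_load_balancer"
    # low_traffic_alb: healthy targets but fewer than 100 requests.
    if requests_14d < 100:
        return "low_traffic_alb"
    return None


print(classify_load_balancer(healthy_targets=0, requests_14d=5000))
print(classify_load_balancer(healthy_targets=2, requests_14d=42))
print(classify_load_balancer(healthy_targets=2, requests_14d=5000))
```

Checking the idle rule first means a load balancer is never reported under both waste types.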

Serverless Optimization (10 Detectors)

Waste Type	Confidence	Detection
`unused_lambda`	HIGH	Zero invocations for 30 days
`lambda_old_runtime`	LOW	Function on deprecated/EOL runtime
`unused_api_gateway`	HIGH	Zero requests for 30 days
`unused_appsync`	HIGH	Zero queries for 30 days
`appsync_idle_cache`	MEDIUM	Cache with < 100 hits in 14 days (hourly cost: $0.044–$6.78)
`appsync_idle_subscriptions`	MEDIUM	Active WebSocket connections with < 100 requests in 14 days
`idle_state_machine`	HIGH	Zero executions for 30 days
`step_functions_retry_storm`	HIGH	Retry ratio > 25% AND failure rate > 20% over 14 days
`step_functions_high_transition_density`	MEDIUM	Avg transitions per success > 50 (configurable)
`step_functions_express_duration_waste`	MEDIUM	Express p95 duration > 30s with high execution volume
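
The `step_functions_retry_storm` rule requires two conditions to hold at once. The sketch below assumes that the retry ratio is retries divided by executions started and that the failure rate is failed executions divided by executions started. This guide does not define either term, so treat both definitions and all names here as assumptions.

```python
def is_retry_storm(started, failed, retries,
                   max_retry_ratio=0.25, max_failure_rate=0.20):
    """Flag a state machine when retries AND failures are both elevated."""
    if started <= 0:
        return False
    retry_ratio = retries / started
    failure_rate = failed / started
    return retry_ratio > max_retry_ratio and failure_rate > max_failure_rate


# 14-day totals for two hypothetical state machines.
print(is_retry_storm(started=1000, failed=300, retries=400))  # both thresholds exceeded
print(is_retry_storm(started=1000, failed=100, retries=400))  # retries recover most runs
```

Requiring both conditions avoids flagging workflows whose retries succeed, since those retries are doing useful work.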

Analytics Optimization (24 Detectors)

Waste Type	Confidence	Detection
`idle_emr_cluster`	HIGH	Running cluster with zero steps
`long_running_emr`	MEDIUM	EMR cluster running continuously without recent steps
`emr_over_provisioned`	MEDIUM	Instance group CPU/memory utilization < 20% for 14 days
`emr_missing_auto_termination`	HIGH	Running cluster without auto-termination policy configured
`emr_previous_gen_instances`	HIGH	Cluster using previous-generation instance types (m3→m5, c3→c5, etc.)
`emr_spot_opportunity`	MEDIUM	Task node instance groups running on On-Demand instead of Spot
`idle_kinesis_stream`	HIGH	Zero records for 14 days
`over_provisioned_kinesis`	MEDIUM	Kinesis stream with shard utilization < 20%
`kinesis_on_demand_downgrade`	MEDIUM	On-Demand stream with stable throughput (CV < 0.3) — switch to Provisioned
`kinesis_extended_retention_waste`	HIGH	Extended retention (>24h) with zero GetRecords in 14 days
`kinesis_enhanced_fan_out_waste`	HIGH	Enhanced fan-out consumer with zero reads for 14 days
`kinesis_firehose_idle`	HIGH	Firehose delivery stream with zero records for 14 days
`idle_msk_cluster`	HIGH	Zero messages for 7 days
`oversized_msk_cluster`	MEDIUM	CPU < 20% AND network < 50% for 7 days
`idle_glue_dev_endpoint`	HIGH	Dev endpoint in READY state
`old_glue_job`	MEDIUM	Job not run for 90 days
`idle_glue_crawler`	MEDIUM	Crawler not run for 90 days
`oversized_glue_job`	MEDIUM	JVM heap < 30% avg OR short duration with large DPU allocation
`glue_job_missing_timeout`	MEDIUM	Timeout ≥ 10× average execution duration
`failed_glue_job_retry`	HIGH	≥ 50% failure rate across recent runs with retries configured
`glue_dev_endpoint_migration`	LOW	Any active dev endpoint (AWS recommends Interactive Sessions)
`glue_catalog_bloat`	LOW	Data Catalog objects exceed 1M free tier threshold
`idle_mq_broker`	HIGH	Zero connections for 14 days
`oversized_mq_broker`	MEDIUM	Over-provisioned MQ broker — downsize to smaller instance type
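
The `kinesis_on_demand_downgrade` rule uses the coefficient of variation (CV), which is the standard deviation divided by the mean. A minimal sketch of that check follows. Whether CloudWise uses the population or sample standard deviation is not stated here; this example assumes the population form, and the helper names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean."""
    avg = mean(samples)
    if avg == 0:
        # No throughput at all: treat variability as unbounded.
        return float("inf")
    return pstdev(samples) / avg


def has_stable_throughput(samples, max_cv=0.3):
    """True when throughput is steady enough to consider Provisioned mode."""
    return len(samples) >= 2 and coefficient_of_variation(samples) < max_cv


steady = [95, 100, 105, 100]  # records per second, illustrative
spiky = [10, 200, 10, 200]
print(has_stable_throughput(steady))
print(has_stable_throughput(spiky))
```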

ML/AI Optimization (6 Detectors)

Waste Type	Confidence	Detection
`idle_sagemaker_notebook`	HIGH	InService with no activity
`idle_sagemaker_endpoint`	HIGH	Zero invocations for 7 days
`oversized_sagemaker_endpoint`	MEDIUM	Low CPU/memory utilization
`stopped_sagemaker_notebook_storage`	HIGH	Stopped notebook with EBS volume
`previous_gen_sagemaker_instance`	HIGH	Previous-gen instance type (ml.m4/c4/t2/r4/p2)
`sp_opportunity_sagemaker`	HIGH	AWS Cost Explorer SageMaker SP recommendation

Management & Operations (7 Detectors)

Waste Type	Confidence	Detection
`no_retention_log_group`	MEDIUM/HIGH	Log group without retention policy
`old_log_group`	MEDIUM	Log group with no recent logs
`excessive_retention_log_group`	MEDIUM	Log group with retention exceeding recommended baseline
`empty_log_group`	HIGH	Log group with zero log streams
`duplicate_cloudtrail`	HIGH	Multiple trails logging same events
`cloudtrail_s3_no_lifecycle`	MEDIUM	CloudTrail bucket without lifecycle
`unused_dashboard`	MEDIUM	CloudWatch dashboard with no recent views

Security & Compliance (12 Detectors)

Waste Type	Confidence	Detection
`unused_secret`	HIGH	Secret not accessed for 90 days
`unused_kms_key`	HIGH	KMS key not used for 90 days
`unencrypted_ebs_volume`	HIGH	EBS volume without encryption enabled
`unencrypted_rds_instance`	HIGH	RDS instance without encryption at rest
`unencrypted_efs_filesystem`	HIGH	EFS filesystem without encryption at rest
`s3_no_default_encryption`	MEDIUM	S3 bucket without default encryption configuration
`opensearch_no_encryption_at_rest`	HIGH	OpenSearch domain without encryption at rest
`unencrypted_documentdb_cluster`	HIGH	DocumentDB cluster without encryption at rest
`rds_no_deletion_protection`	HIGH	RDS instance without deletion protection enabled
`dynamodb_no_deletion_protection`	HIGH	DynamoDB table without deletion protection enabled
`rds_publicly_accessible`	HIGH	RDS instance with public accessibility enabled
`resource_without_backup_coverage`	MEDIUM	Critical resource not covered by any AWS Backup plan

AWS Compute Optimizer Integration (4 Detectors)

Note: These detectors require Online Mode with live AWS API access to Compute Optimizer.

Waste Type	Confidence	Detection
`oversized_ec2_optimizer`	HIGH	AWS ML-based rightsizing recommendation
`oversized_ebs_optimizer`	HIGH	AWS ML-based IOPS/throughput analysis
`oversized_lambda_optimizer`	HIGH	AWS ML-based memory optimization
`oversized_rds_optimizer`	HIGH	AWS ML-based RDS rightsizing recommendation

Reserved Instance, Savings Plans & Commitment Risk Intelligence (16 Detectors)

Note: Purchase recommendation and commitment risk detectors require Online Mode with live AWS API access to Cost Explorer, EC2, or Savings Plans APIs. CUR-based commitment detectors (`cur_unused_reservation`, `cur_savings_plan_waste`) are available in all modes including Air-Gapped.

Waste Type	Confidence	Detection
`ri_opportunity_ec2`	HIGH	AWS Cost Explorer EC2 RI recommendation
`ri_opportunity_rds`	HIGH	AWS Cost Explorer RDS RI recommendation
`ri_opportunity_elasticache`	HIGH	AWS Cost Explorer ElastiCache Reserved Node recommendation
`ri_opportunity_opensearch`	HIGH	AWS Cost Explorer OpenSearch RI recommendation
`ri_opportunity_redshift`	HIGH	AWS Cost Explorer Redshift Reserved Node recommendation
`sp_opportunity_compute`	HIGH	AWS Cost Explorer Compute SP recommendation
`sp_opportunity_ec2`	HIGH	AWS Cost Explorer EC2 Instance SP recommendation
`sp_opportunity_sagemaker`	HIGH	AWS Cost Explorer SageMaker SP recommendation
`unused_reserved_instance`	HIGH	RI utilization < 20% over 30 days via Cost Explorer + EC2 API
`unused_savings_plan`	HIGH	SP utilization < 20% over 30 days via Cost Explorer
`expiring_reserved_instance`	HIGH	Active RI expiring within 90 days (URGENT/WARNING/NOTICE tiers)
`expiring_savings_plan`	HIGH	Active SP expiring within 90 days (URGENT/WARNING/NOTICE tiers)
`convertible_ri_exchange_opportunity`	MEDIUM	Convertible RI on previous-gen instance type eligible for free exchange
`savings_plan_coverage_gap`	MEDIUM	SP coverage < 50% with > $100/mo uncovered on-demand spend
`cur_unused_reservation`	MEDIUM	Unused RI hours detected in CUR billing data (all modes)
`cur_savings_plan_waste`	MEDIUM	Unused SP commitment detected in CUR billing data (all modes)
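
The `savings_plan_coverage_gap` rule combines a percentage threshold with a dollar floor. As a sketch, assuming coverage is the covered spend divided by total eligible spend (covered plus on-demand), it could look like this. The names and the coverage definition are assumptions.

```python
def has_coverage_gap(covered_spend, on_demand_spend,
                     max_coverage=0.50, min_uncovered=100.0):
    """Flag when SP coverage is below 50% and uncovered spend exceeds $100/mo."""
    total = covered_spend + on_demand_spend
    if total <= 0:
        return False
    coverage = covered_spend / total
    return coverage < max_coverage and on_demand_spend > min_uncovered


print(has_coverage_gap(covered_spend=300.0, on_demand_spend=500.0))  # 37.5% covered
print(has_coverage_gap(covered_spend=40.0, on_demand_spend=60.0))    # too small to matter
```

The dollar floor keeps accounts with low coverage but negligible on-demand spend out of the results.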

Complete Waste Type Coverage Matrix (191 Types)

This section provides the complete coverage matrix for all 191 unique waste types, showing availability across different detection modes. This is particularly useful for understanding what waste detection capabilities are available in Air-Gapped Mode vs Online Mode.

Mode Definitions

Mode	Description	Requirements
CUR Only	Upload AWS Cost and Usage Report CSV	CUR file from S3
+ Describe Export	CUR + resource configuration data	CUR + export script (basic)
+ CloudWatch Export	CUR + describe + utilization metrics	CUR + export script (full)
Online Mode	Full connected mode with live AWS access	IAM role connection
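
One way to read the coverage matrix is as a "minimum mode" per waste type, assuming each mode is a superset of the one before it, as the definitions above suggest. The sketch below encodes four rows taken from the matrix; the mode keys and helper name are hypothetical.

```python
# Modes in increasing order of data available to the detectors.
MODES = ["cur", "describe", "cloudwatch", "online"]

# Minimum mode needed for a few waste types, taken from the matrix rows.
MINIMUM_MODE = {
    "cost_anomaly": "cur",
    "unattached_ebs": "describe",
    "idle_ec2": "cloudwatch",
    "oversized_ec2_optimizer": "online",
}


def is_available(waste_type, mode):
    """True when `mode` provides at least the data the detector needs."""
    return MODES.index(mode) >= MODES.index(MINIMUM_MODE[waste_type])


print(is_available("idle_ec2", "describe"))    # needs CloudWatch metrics
print(is_available("idle_ec2", "cloudwatch"))
```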

Coverage Summary

Mode	Waste Types Available	Coverage
Online Mode	191 (all types)	100%
Offline + Full Export (CloudWatch Export)	171 waste types	89%
Offline + Describe Only (Describe Export)	64 waste types	50%
CUR Only	4 insights	3%

Complete Coverage Matrix

Legend:

✅ Available
❌ Not available
🔌 Online Mode only (requires live AWS API)

Compute Optimization (29 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
1	`idle_ec2`	❌	❌	✅	✅
2	`stopped_ec2_with_ebs`	❌	✅	✅	✅
3	`previous_gen_ec2`	❌	✅	✅	✅
4	`unused_lambda`	❌	❌	✅	✅
5	`over_provisioned_lambda`	❌	❌	✅	✅
5a	`lambda_provisioned_concurrency_idle`	❌	✅	✅	✅
5b	`lambda_excessive_timeout`	❌	✅	✅	✅
5c	`lambda_arm64_migration`	❌	✅	✅	✅
6	`oversized_ecs_task`	❌	❌	✅	✅
7	`idle_sagemaker_notebook`	❌	✅	✅	✅
8	`idle_sagemaker_endpoint`	❌	❌	✅	✅
9	`idle_workspace`	❌	❌	✅	✅
10	`idle_lightsail`	❌	❌	✅	✅
10a	`lightsail_unattached_static_ip`	❌	✅	✅	✅
10b	`lightsail_unattached_disk`	❌	✅	✅	✅
10c	`lightsail_old_snapshot`	❌	✅	✅	✅
10d	`lightsail_idle_load_balancer`	❌	✅	✅	✅
10e	`lightsail_idle_database`	❌	❌	✅	✅
11	`idle_beanstalk`	❌	✅	✅	✅
11a	`beanstalk_idle_traffic`	❌	❌	✅	✅
11b	`beanstalk_unnecessary_alb`	❌	✅	✅	✅
11c	`beanstalk_previous_gen_instances`	❌	✅	✅	✅
11d	`beanstalk_over_provisioned`	❌	❌	✅	✅
11e	`beanstalk_orphaned_rds`	❌	❌	❌	✅
12	`oversized_workspace`	❌	❌	✅	✅
12a	`workspaces_autostop_opportunity`	❌	✅	✅	✅
12b	`workspaces_pool_overprovisioned_capacity`	❌	✅	✅	✅
12c	`workspaces_windows_license_optimization`	❌	✅	✅	✅
12d	`eks_extended_support_cost`	❌	✅	✅	✅

Storage Optimization (26 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
12	`unattached_ebs`	❌	✅	✅	✅
13	`old_ebs_snapshot`	❌	✅	✅	✅
14	`orphaned_ebs_snapshot`	❌	✅	✅	✅
15	`ami_orphaned_snapshot`	❌	✅	✅	✅
16	`gp2_migration`	❌	✅	✅	✅
17	`no_lifecycle_policy`	❌	✅	✅	✅
18	`incomplete_multipart`	❌	✅	✅	✅
19	`s3_rapid_growth`	❌	❌	✅	✅
20	`s3_wrong_storage_class`	❌	❌	✅	✅
21	`s3_empty_bucket`	❌	✅	✅	✅
22	`s3_high_request_and_transfer_cost`	❌	❌	✅	✅
23	`idle_efs`	❌	❌	✅	✅
24	`idle_fsx`	❌	❌	✅	✅
24a	`oversized_fsx`	❌	❌	✅	✅
24b	`fsx_throughput_overprovisioned`	❌	❌	✅	✅
24c	`old_fsx_backup`	❌	✅	✅	✅
25	`ecr_no_lifecycle_policy`	❌	✅	✅	✅
26	`old_ecr_images`	❌	✅	✅	✅
26a	`no_lifecycle_efs`	❌	✅	✅	✅
26b	`over_provisioned_iops`	❌	✅	✅	✅
26d	`old_backup`	❌	✅	✅	✅
26e	`redundant_backup`	❌	✅	✅	✅
26f	`untagged_ecr_images`	❌	✅	✅	✅
26g	`backup_no_lifecycle_tiering`	❌	✅	✅	✅
26h	`stale_backup_plan_assignment`	❌	✅	✅	✅
26i	`backup_copy_policy_overreach`	❌	✅	✅	✅

Database Optimization (39 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
26	`idle_rds`	❌	❌	✅	✅
27	`old_rds_snapshot`	❌	✅	✅	✅
27a	`aurora_to_rds_downgrade_opportunity`	❌	❌	✅	✅
28	`idle_dynamodb`	❌	❌	✅	✅
29	`dynamodb_no_autoscaling`	❌	✅	✅	✅
30	`over_provisioned_dynamodb`	❌	❌	✅	✅
31	`idle_elasticache`	❌	❌	✅	✅
32	`idle_redshift`	❌	❌	✅	✅
33	`underutilized_redshift`	❌	❌	✅	✅
34	`redshift_no_pause`	❌	❌	✅	✅
35	`redshift_spectrum_heavy`	❌	✅	✅	✅
36	`redshift_legacy_dc2`	❌	✅	✅	✅
37	`redshift_wlm_over_provisioned`	❌	❌	✅	✅
38	`redshift_concurrency_scaling_waste`	❌	❌	✅	✅
39	`idle_opensearch`	❌	❌	✅	✅
40	`idle_neptune`	❌	❌	✅	✅
40a	`neptune_serverless_opportunity`	❌	❌	✅	✅
40b	`oversized_neptune`	❌	❌	✅	✅
40c	`neptune_old_snapshot`	❌	✅	✅	✅
41	`idle_documentdb`	❌	❌	✅	✅
41a	`old_documentdb_snapshot`	❌	✅	✅	✅
41b	`overprovisioned_documentdb`	❌	❌	✅	✅
42	`idle_timestream`	❌	❌	✅	✅
43	`idle_qldb`	❌	❌	✅	✅
43c	`oversized_elasticache`	❌	❌	✅	✅
43d	`oversized_redshift`	❌	❌	✅	✅
43e	`oversized_opensearch`	❌	❌	✅	✅
43f	`rds_extended_support_cost`	❌	✅	✅	✅
43g	`elasticache_extended_support_cost`	❌	✅	✅	✅
43h	`elasticache_replication_waste`	❌	❌	✅	✅
43i	`elasticache_engine_migration`	❌	❌	✅	✅
43j	`elasticache_serverless_optimization`	❌	❌	✅	✅
43k	`elasticache_data_tiering_opportunity`	❌	❌	✅	✅
43l	`opensearch_extended_support_cost`	❌	✅	✅	✅
43m	`documentdb_extended_support_cost`	❌	✅	✅	✅
43n	`aurora_io_optimization_opportunity`	❌	❌	✅	✅
43o	`aurora_extended_support_cost`	❌	✅	✅	✅
43p	`aurora_serverless_opportunity`	❌	❌	✅	✅
43q	`opensearch_ebs_overprovisioned`	❌	❌	✅	✅

Network Optimization (17 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
38	`unattached_eip`	❌	✅	✅	✅
39	`eip_on_stopped_instance`	❌	✅	✅	✅
40	`multiple_eips_per_instance`	❌	✅	✅	✅
41	`idle_nat_gateway`	❌	❌	✅	✅
42	`idle_load_balancer`	❌	✅	✅	✅
43	`low_traffic_alb`	❌	✅	✅	✅
44	`high_lcu_cost_alb`	❌	❌	✅	✅
45	`classic_lb_migration`	❌	✅	✅	✅
46	`unused_distribution`	❌	❌	✅	✅
47	`unused_hosted_zone`	❌	✅	✅	✅
48	`unused_accelerator`	❌	❌	✅	✅
48e	`idle_global_accelerator`	❌	❌	✅	✅
48f	`disabled_global_accelerator`	❌	❌	✅	✅
48a	`idle_transfer_no_activity`	❌	❌	✅	✅
48b	`idle_transfer_web_app`	❌	✅	✅	✅
48c	`unused_transfer_protocol`	❌	✅	✅	✅
48d	`unused_vpc_endpoint`	❌	✅	✅	✅

Serverless Optimization (11 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
49	`unused_lambda`	❌	❌	✅	✅
49a	`lambda_old_runtime`	❌	✅	✅	✅
50	`unused_api_gateway`	❌	❌	✅	✅
51	`unused_appsync`	❌	❌	✅	✅
51a	`appsync_idle_cache`	❌	✅	✅	✅
51b	`appsync_idle_subscriptions`	❌	❌	✅	✅
52	`idle_state_machine`	❌	❌	✅	✅
52a	`step_functions_retry_storm`	❌	❌	✅	✅
52b	`step_functions_high_transition_density`	❌	❌	✅	✅
52c	`step_functions_express_duration_waste`	❌	❌	✅	✅
53	`idle_transfer_server`	❌	❌	✅	✅

Analytics Optimization (24 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
54	`idle_emr_cluster`	❌	✅	✅	✅
54a	`emr_over_provisioned`	❌	✅	✅	✅
54b	`emr_missing_auto_termination`	❌	✅	✅	✅
54c	`emr_previous_gen_instances`	❌	✅	✅	✅
54d	`emr_spot_opportunity`	❌	✅	✅	✅
55	`idle_kinesis_stream`	❌	❌	✅	✅
56	`idle_msk_cluster`	❌	❌	✅	✅
56a	`oversized_msk_cluster`	❌	❌	✅	✅
57	`idle_glue_dev_endpoint`	❌	✅	✅	✅
58	`old_glue_job`	❌	✅	✅	✅
59	`idle_glue_crawler`	❌	✅	✅	✅
59a	`oversized_glue_job`	❌	✅	✅	✅
59b	`glue_job_missing_timeout`	❌	✅	✅	✅
59c	`failed_glue_job_retry`	❌	✅	✅	✅
59d	`glue_dev_endpoint_migration`	❌	✅	✅	✅
59e	`glue_catalog_bloat`	❌	✅	✅	✅
60	`idle_mq_broker`	❌	❌	✅	✅
60a	`oversized_mq_broker`	❌	❌	✅	✅
60b	`over_provisioned_kinesis`	❌	❌	✅	✅
60c	`kinesis_on_demand_downgrade`	❌	❌	✅	✅
60d	`kinesis_extended_retention_waste`	❌	❌	✅	✅
60e	`kinesis_enhanced_fan_out_waste`	❌	❌	✅	✅
60f	`kinesis_firehose_idle`	❌	❌	✅	✅
60g	`long_running_emr`	❌	✅	✅	✅

ML/AI Optimization (5 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
61	`idle_sagemaker_notebook`	❌	✅	✅	✅
62	`idle_sagemaker_endpoint`	❌	❌	✅	✅
63	`oversized_sagemaker_endpoint`	❌	❌	✅	✅
64	`stopped_sagemaker_notebook_storage`	❌	✅	✅	✅
65	`previous_gen_sagemaker_instance`	❌	✅	✅	✅

Management & Operations (7 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
66	`no_retention_log_group`	❌	✅	✅	✅
67	`old_log_group`	❌	✅	✅	✅
68	`excessive_retention_log_group`	❌	✅	✅	✅
69	`empty_log_group`	❌	✅	✅	✅
70	`duplicate_cloudtrail`	❌	✅	✅	✅
71	`cloudtrail_s3_no_lifecycle`	❌	✅	✅	✅
71a	`unused_dashboard`	❌	✅	✅	✅

Security & Compliance (12 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
72	`unused_secret`	❌	✅	✅	✅
73	`unused_kms_key`	❌	✅	✅	✅
73a	`unencrypted_ebs_volume`	❌	✅	✅	✅
73b	`unencrypted_rds_instance`	❌	✅	✅	✅
73c	`unencrypted_efs_filesystem`	❌	✅	✅	✅
73d	`s3_no_default_encryption`	❌	✅	✅	✅
73e	`opensearch_no_encryption_at_rest`	❌	✅	✅	✅
73f	`unencrypted_documentdb_cluster`	❌	✅	✅	✅
73g	`rds_no_deletion_protection`	❌	✅	✅	✅
73h	`dynamodb_no_deletion_protection`	❌	✅	✅	✅
73i	`rds_publicly_accessible`	❌	✅	✅	✅
73j	`resource_without_backup_coverage`	❌	✅	✅	✅

IPv4 Address Optimization (3 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
74	`unattached_eip`	❌	✅	✅	✅
75	`eip_on_stopped_instance`	❌	✅	✅	✅
76	`multiple_eips_per_instance`	❌	✅	✅	✅

Additional Compute Detectors (8 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
77	`over_provisioned_ec2`	❌	❌	✅	✅
78	`unoptimized_ebs_iops`	❌	❌	✅	✅
79	`idle_ecs_service`	❌	❌	✅	✅
79a	`ecs_no_autoscaling`	❌	✅	✅	✅
79b	`ecs_container_insights_waste`	❌	✅	✅	✅
79c	`oversized_ecs_memory`	❌	❌	✅	✅
80	`idle_eks_nodegroup`	❌	❌	✅	✅
81	`unattached_eni`	❌	✅	✅	✅

Additional Storage Detectors (4 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
82	`s3_intelligent_tiering_candidate`	❌	✅	✅	✅
83	`ebs_snapshot_public`	❌	✅	✅	✅
84	`old_ami`	❌	✅	✅	✅
85	`unattached_ebs_iops`	❌	✅	✅	✅

Additional Network Detectors (3 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
86	`idle_vpn_connection`	❌	❌	✅	✅
87	`idle_direct_connect`	❌	❌	✅	✅
88	`orphaned_dns_record`	❌	❌	❌	✅

CUR-Derived Insights (4 Types)

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
89	`previous_gen_indicator`	✅	✅	✅	✅
90	`on_demand_candidate`	✅	✅	✅	✅
91	`untagged_resources`	✅	✅	✅	✅
92	`cost_anomaly`	✅	✅	✅	✅

AWS Compute Optimizer Integration (4 Types) 🔌

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
93	`oversized_ec2_optimizer`	❌	❌	❌	✅
94	`oversized_ebs_optimizer`	❌	❌	❌	✅
95	`oversized_lambda_optimizer`	❌	❌	❌	✅
95a	`oversized_rds_optimizer`	❌	❌	❌	✅

Reserved Instance & Savings Plans (16 Types) 🔌

#	Waste Type	CUR	+ Describe	+ CloudWatch	Online
96	`ri_opportunity_ec2`	❌	❌	❌	✅
97	`ri_opportunity_rds`	❌	❌	❌	✅
98	`sp_opportunity_compute`	❌	❌	❌	✅
99	`sp_opportunity_ec2`	❌	❌	❌	✅
99a	`ri_opportunity_elasticache`	❌	❌	❌	✅
99b	`ri_opportunity_opensearch`	❌	❌	❌	✅
99c	`ri_opportunity_redshift`	❌	❌	❌	✅
99d	`sp_opportunity_sagemaker`	❌	❌	❌	✅
99e	`unused_reserved_instance`	❌	❌	❌	✅
99f	`unused_savings_plan`	❌	❌	❌	✅
99g	`expiring_reserved_instance`	❌	❌	❌	✅
99h	`expiring_savings_plan`	❌	❌	❌	✅
99i	`convertible_ri_exchange_opportunity`	❌	❌	❌	✅
99j	`savings_plan_coverage_gap`	❌	❌	❌	✅
99k	`cur_unused_reservation`	✅	✅	✅	✅
99l	`cur_savings_plan_waste`	✅	✅	✅	✅

Online-Only Waste Types (20)

The following waste types require live AWS API access and are not available in Air-Gapped Mode:

Waste Type	Reason	AWS API Required
`oversized_ec2_optimizer`	AWS ML analysis	Compute Optimizer API
`oversized_ebs_optimizer`	AWS ML analysis	Compute Optimizer API
`oversized_lambda_optimizer`	AWS ML analysis	Compute Optimizer API
`oversized_rds_optimizer`	AWS ML analysis	Compute Optimizer API
`ri_opportunity_ec2`	Purchase recommendations	Cost Explorer API
`ri_opportunity_rds`	Purchase recommendations	Cost Explorer API
`ri_opportunity_elasticache`	Purchase recommendations	Cost Explorer API
`ri_opportunity_opensearch`	Purchase recommendations	Cost Explorer API
`ri_opportunity_redshift`	Purchase recommendations	Cost Explorer API
`sp_opportunity_compute`	Purchase recommendations	Cost Explorer API
`sp_opportunity_ec2`	Purchase recommendations	Cost Explorer API
`sp_opportunity_sagemaker`	Purchase recommendations	Cost Explorer API
`unused_reserved_instance`	Purchase commitment analysis	Cost Explorer + EC2 API
`unused_savings_plan`	Purchase commitment analysis	Cost Explorer API
`expiring_reserved_instance`	Real-time RI expiry tracking	EC2 API
`expiring_savings_plan`	Real-time SP expiry tracking	Savings Plans API
`convertible_ri_exchange_opportunity`	RI inventory + pricing	EC2 API + Pricing API
`savings_plan_coverage_gap`	Coverage analysis	Cost Explorer API
`beanstalk_orphaned_rds`	Cross-service EB/RDS correlation	RDS + CloudWatch API
`orphaned_dns_record`	DNS cross-reference requires live Route53 API	Route53 API

Tip: For maximum Air-Gapped Mode coverage (89%), run the full export script including CloudWatch metrics. See our Air-Gapped Mode Guide for setup instructions.

Detector Reference

Compute Detectors

Idle EC2 Instance

Confidence: HIGH
Detection: CloudWatch CPUUtilization average < 5% over 14 days
Savings: Full instance cost (from AWS Pricing API)

How it works:

We query CloudWatch for 14 days of CPU metrics
If average CPU is below the threshold, the instance is flagged
Monthly cost is calculated using the AWS Pricing API for the exact instance type

Example:

Instance: i-0abc123def456
Type: t3.large
Region: us-east-1
Avg CPU: 2.3% over 14 days
Monthly Savings: $60.74 (100% of instance cost)

AWS Reference: EC2 On-Demand Pricing

Why HIGH confidence: We use actual CloudWatch metrics and exact AWS pricing. No estimation.
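
The check above can be sketched in a few lines. This is a minimal illustration rather than CloudWise's actual implementation: the function names are hypothetical, the 730-hour month is an assumed convention, and the $0.0832/hour figure is an assumed us-east-1 Linux On-Demand rate for t3.large that would normally come from the AWS Pricing API.

```python
from statistics import mean

HOURS_PER_MONTH = 730  # assumed month-length convention for this sketch


def is_idle(cpu_datapoints, threshold_pct=5.0):
    """Return True when the average CPU over the window is below the threshold."""
    if not cpu_datapoints:
        return False  # assumption: an instance with no metrics is not flagged
    return mean(cpu_datapoints) < threshold_pct


def monthly_cost(hourly_rate_usd):
    """Full monthly On-Demand cost for one instance."""
    return round(hourly_rate_usd * HOURS_PER_MONTH, 2)


# Hypothetical daily CPU averages for a 14-day window
daily_cpu = [2.1, 2.4, 2.2, 2.5, 2.3, 2.0, 2.6, 2.3, 2.4, 2.2, 2.3, 2.5, 2.1, 2.3]
if is_idle(daily_cpu):
    print(f"Idle instance, monthly savings: ${monthly_cost(0.0832):.2f}")
```

The threshold is a parameter because what counts as "idle" is a policy choice; the cost side is plain multiplication of the hourly rate by the hours in a month.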

Stopped EC2 with EBS Storage

Confidence: MEDIUM
Detection: Instance state = stopped with attached EBS volumes
Savings: EBS storage costs for attached volumes

How it works:

We identify stopped instances with BlockDeviceMappings
EBS volumes continue to charge even when the instance is stopped
We flag this for review with estimated storage costs

Example:

Instance: i-0abc123def456
State: stopped
EBS Volumes: 2 (100GB gp3 + 500GB gp2)
Monthly Savings: $58.00 ($8 + $50)

AWS Reference: EBS Pricing

Why MEDIUM confidence: Volume sizes require an additional API call; we estimate based on typical configurations.

Over-Provisioned Lambda

Confidence: MEDIUM
Detection: CloudWatch MemoryUtilization < 50% over 7 days
Savings: Proportional to memory reduction

How it works:

We analyze Lambda memory utilization via CloudWatch metrics
Functions using less than 50% of allocated memory are flagged
Reducing memory allocation reduces cost per invocation

Example:

Function: my-api-handler
Memory Allocated: 1024MB
Memory Used (avg): 256MB (25%)
Monthly Savings: ~$15.00 (if reduced to 512MB)

AWS Reference: Lambda Pricing

Why MEDIUM confidence: Memory reduction may affect cold start times and CPU allocation.

Idle Provisioned Concurrency

Confidence: HIGH
Detection: Provisioned Concurrency utilization < 10% over 14 days
Savings: Based on wasted fraction of allocated Provisioned Concurrency

How it works:

We list all Provisioned Concurrency configurations for each Lambda function
CloudWatch ProvisionedConcurrencyUtilization is analyzed over 14 days
Configurations with average utilization below 10% are flagged as idle
Savings are calculated from the wasted fraction of allocated concurrency

Example:

Function: my-api-handler:prod
Allocated PC: 50
Avg Utilization: 3%
Wasted Fraction: 97%
Monthly Savings: $48.60

AWS Reference: Lambda Pricing — Provisioned Concurrency

Why HIGH confidence: Provisioned Concurrency has a fixed cost regardless of usage. Low utilization directly translates to waste.

Excessive Lambda Timeout

Confidence: MEDIUM
Detection: Timeout ≥ 10× average duration, with ≥ 100 invocations
Savings: Recommendation only (no direct cost savings from timeout change)

How it works:

We compare the configured timeout against actual average execution duration
Functions where the timeout is 10× or more of the average duration are flagged
This helps avoid accidental runaway costs and improves error handling
Recommended timeout is set to max(10, avg_duration × 3) seconds

Example:

Function: my-batch-processor
Timeout: 900s (15 min)
Avg Duration: 2s
Ratio: 450×
Recommended: 10s
Monthly Savings: N/A (recommendation only)

AWS Reference: Lambda Configuration

Why MEDIUM confidence: Timeout changes do not directly reduce cost, but prevent runaway invocations and improve operational hygiene.
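
The rule above can be expressed as a short function. This is a sketch under stated assumptions: the function name, its return shape, and the invocation count in the usage line are hypothetical; the thresholds (10× ratio, 100 invocations, `max(10, avg_duration × 3)`) are the ones given in this section.

```python
def check_timeout(timeout_s, avg_duration_s, invocations,
                  ratio_threshold=10, min_invocations=100):
    """Flag functions whose timeout is at least 10x their average duration."""
    if invocations < min_invocations or avg_duration_s <= 0:
        return None  # not enough data to judge
    ratio = timeout_s / avg_duration_s
    if ratio < ratio_threshold:
        return None
    return {
        "ratio": ratio,
        "recommended_timeout_s": max(10, avg_duration_s * 3),
    }


# Values from the example above; the invocation count is hypothetical
finding = check_timeout(timeout_s=900, avg_duration_s=2, invocations=5000)
print(finding)
```

The 10-second floor keeps very fast functions from receiving a recommended timeout that is too tight to absorb a cold start or a slow downstream call.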

ARM64 Migration Opportunity

Confidence: LOW
Detection: x86_64 function with ARM64-compatible runtime and monthly cost ≥ $5
Savings: ~20% cost reduction by migrating to Graviton2 (ARM64)

How it works:

We check the function's architecture (x86_64) and runtime
Functions on ARM64-compatible runtimes (Python, Node.js, Java, .NET, Ruby) are flagged
Only functions with monthly cost ≥ $5 are included (to avoid noise)
AWS offers a 20% price discount for Lambda on Graviton2

Example:

Function: my-data-pipeline
Runtime: python3.12
Architecture: x86_64
Monthly Cost: $45.00
Est. Savings: $9.00/month (20%)

AWS Reference: Lambda Graviton2

Why LOW confidence: Migration requires testing; some libraries may not support ARM64. Manual verification recommended.
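
A minimal sketch of the eligibility check and savings estimate follows. The function name and the runtime-prefix list are assumptions made for illustration (the prefixes mirror the runtime families listed above), and the minimum-cost cutoff is passed in by the caller rather than hard-coded.

```python
# Assumed prefixes of Lambda runtime identifiers for the families listed above
ARM64_RUNTIME_PREFIXES = ("python", "nodejs", "java", "dotnet", "ruby")


def arm64_savings(runtime, architecture, monthly_cost, min_monthly_cost,
                  discount=0.20):
    """Return estimated monthly savings, or None if the function is not a candidate."""
    if architecture != "x86_64":
        return None
    if not runtime.startswith(ARM64_RUNTIME_PREFIXES):
        return None
    if monthly_cost < min_monthly_cost:
        return None  # too small to be worth a migration effort
    return round(monthly_cost * discount, 2)


print(arm64_savings("python3.12", "x86_64", 45.00, min_monthly_cost=5.0))
```

The estimate says nothing about whether native dependencies build on ARM64, which is why the finding stays advisory.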

Oversized ECS Task

Confidence: MEDIUM
Detection: CloudWatch CPU/Memory utilization analysis
Savings: Based on right-sized task definition

How it works:

We analyze ECS task CPU and memory utilization metrics
Tasks consistently using less than 50% of allocated resources are flagged
Right-sizing reduces Fargate costs or EC2 capacity requirements

Example:

Service: my-web-service
Task CPU: 1024 units
Task Memory: 2048MB
Avg Utilization: 25% CPU, 30% Memory
Monthly Savings: ~$40.00

AWS Reference: ECS Pricing

Why MEDIUM confidence: Workload patterns may have periodic spikes not captured in averages.

Idle SageMaker Notebook

Confidence: HIGH
Detection: Notebook instance InService with no kernel activity
Savings: Full instance cost

How it works:

We identify SageMaker notebook instances in InService status
We check for kernel activity or user connections
Idle notebooks continue to charge full instance cost

Example:

Notebook: my-ml-notebook
Instance Type: ml.t3.medium
Status: InService
Activity (7 days): None
Monthly Savings: $37.00

AWS Reference: SageMaker Pricing

Why HIGH confidence: SageMaker notebooks charge continuously when InService, regardless of activity.

Idle SageMaker Endpoint

Confidence: HIGH
Detection: Zero invocations for 7 days
Savings: Full endpoint cost

How it works:

We query CloudWatch for endpoint invocation metrics
Endpoints with zero invocations for 7+ days are flagged
Idle endpoints charge full compute cost continuously

Example:

Endpoint: my-model-endpoint
Instance Type: ml.m5.large
Invocations (7 days): 0
Monthly Savings: $96.00

AWS Reference: SageMaker Pricing

Why HIGH confidence: Zero invocations is unambiguous. The endpoint is not being used.

Idle WorkSpaces

Confidence: HIGH
Detection: Zero connections for 30 days
Savings: Full WorkSpace cost

How it works:

We check WorkSpaces connection status via CloudWatch
WorkSpaces with no user connections for 30+ days are flagged
Monthly running mode charges continuously whether used or not

Example:

WorkSpace: ws-abc123def
Bundle: Standard (Windows)
Connections (30 days): 0
Monthly Savings: $35.00

AWS Reference: WorkSpaces Pricing

Why HIGH confidence: No user connections means no one is using this virtual desktop.

Idle Lightsail Instance

Confidence: HIGH
Detection: CloudWatch CPUUtilization < 5% for 14 days
Savings: Full instance cost

How it works:

We query CloudWatch for 14 days of CPU metrics
Instances with consistently low CPU are flagged
Lightsail charges a fixed monthly rate regardless of usage

Example:

Instance: my-lightsail-instance
Plan: $5/month (1GB RAM, 1 vCPU)
Avg CPU: 2.1% over 14 days
Monthly Savings: $5.00

AWS Reference: Lightsail Pricing

Why HIGH confidence: Actual CloudWatch metrics confirm low utilization.

Unattached Lightsail Static IP

Confidence: LOW
Detection: Static IP allocated but not attached to any instance
Savings: $3.65/month per unattached IP

How it works:

We list all Lightsail static IPs
IPs where isAttached is false are flagged
AWS charges $3.65/month for unattached static IPs

Example:

Static IP: my-static-ip
IP Address: 52.1.2.3
Attached: No
Monthly Savings: $3.65

AWS Reference: Lightsail Pricing

Why LOW confidence: IP may be reserved for future use.

Unattached Lightsail Disk

Confidence: MEDIUM
Detection: Block storage disk not attached to any instance
Savings: $0.10/GB/month

How it works:

We list all Lightsail block storage disks
Disks where isAttached is false are flagged
Storage charges apply regardless of attachment

Example:

Disk: my-data-disk
Size: 80 GB
Attached: No
Monthly Savings: $8.00

AWS Reference: Lightsail Pricing

Why MEDIUM confidence: Disk may contain important data. Remediation creates a snapshot before deletion.

Old Lightsail Snapshot

Confidence: MEDIUM
Detection: Manual snapshot older than 90 days
Savings: $0.05/GB/month

How it works:

We list all Lightsail instance snapshots
Manual snapshots older than 90 days are flagged
Auto-snapshots are excluded (managed by Lightsail)

Example:

Snapshot: my-old-snapshot
Size: 20 GB
Age: 120 days
Type: Manual
Monthly Savings: $1.00

AWS Reference: Lightsail Pricing

Why MEDIUM confidence: Old snapshots may still be needed for compliance.

Idle Lightsail Load Balancer

Confidence: MEDIUM
Detection: Load balancer with zero healthy target instances
Savings: $18.00/month flat rate

How it works:

We list all Lightsail load balancers
LBs with zero healthy instances are flagged
Lightsail charges a flat monthly rate regardless of traffic

Example:

Load Balancer: my-lb
Healthy Instances: 0
Monthly Savings: $18.00

AWS Reference: Lightsail Pricing

Why MEDIUM confidence: LB may be temporarily without instances during maintenance.

Idle Lightsail Database

Confidence: HIGH
Detection: CPU < 1% and zero database connections for 14 days
Savings: $15–$115/month (doubled for HA)

How it works:

We query CloudWatch CPUUtilization and DatabaseConnections metrics
Databases with consistently near-zero CPU and connections are flagged
Stopped databases are also detected
High-availability databases cost 2x the standard bundle price

Example:

Database: my-idle-db
Bundle: small_2_0
Engine: MySQL
HA: No
Avg CPU: 0.5% over 14 days
Connections: 0 for 14 days
Monthly Savings: $30.00

AWS Reference: Lightsail Pricing

Why HIGH confidence: Both CPU and connection metrics confirm zero usage.

Idle Elastic Beanstalk Environment

Confidence: HIGH
Detection: Unhealthy environment (Grey/Red health status)
Savings: Environment resource costs (EC2, ELB, etc.)

How it works:

We check each Beanstalk environment's health status via the DataProvider
Environments with Grey or Red health in Ready state are flagged as unhealthy
Savings are estimated from the underlying instance type, count, and load balancer

Example:

Environment: my-app-env
Health: Grey (unhealthy)
Instance Type: t3.medium (1 instance)
Monthly Savings: $32.00

AWS Reference: Elastic Beanstalk Pricing

Why HIGH confidence: Grey/Red health is a definitive AWS signal that the environment is not functioning correctly.

Idle Beanstalk Environment (Zero Traffic)

Confidence: HIGH
Detection: Zero CloudWatch RequestCount for 14 consecutive days
Savings: Full environment resource costs (EC2, ALB, EBS)

How it works:

We analyze CloudWatch RequestCount metrics for each Beanstalk environment
Environments with zero requests for 14+ days are flagged as idle
Savings include EC2 instances, ALB (if load-balanced), and EBS volumes

Example:

Environment: staging-api
Requests (14 days): 0
Instance Type: t3.small (2 instances, load-balanced)
Monthly Savings: $48.00

AWS Reference: Elastic Beanstalk Pricing

Why HIGH confidence: Zero requests over 14 days is a definitive signal of no traffic.

Unnecessary Load Balancer on Single-Instance Beanstalk

Confidence: HIGH
Detection: Load-balanced environment with auto-scaling min=max=1
Savings: ~$22/month (ALB fixed cost + minimum LCU)

How it works:

We check environment type (LoadBalanced) and Auto Scaling group settings
If min=max=1, the ALB is serving a single instance with no scaling benefit
Switching to SingleInstance type removes the ALB overhead

Example:

Environment: dev-api
Type: LoadBalanced, Min=1, Max=1
ALB Cost: $16.43/month + LCU: $5.84/month
Monthly Savings: $22.27

AWS Reference: Elastic Beanstalk Environment Types

Why HIGH confidence: Configuration analysis with no ambiguity — min=max=1 means no scaling.
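
The arithmetic behind the example can be reproduced as follows. This is an illustrative sketch: the function names are hypothetical, and the hourly rates ($0.0225 per ALB-hour, $0.008 per LCU-hour) are assumed us-east-1 prices that should be checked against current AWS pricing. `Decimal` is used so that half-cent values round the way the example shows.

```python
from decimal import Decimal, ROUND_HALF_UP

HOURS_PER_MONTH = 730
ALB_HOURLY = Decimal("0.0225")  # assumed us-east-1 ALB hourly rate
LCU_HOURLY = Decimal("0.008")   # assumed us-east-1 rate for one LCU-hour


def to_cents(amount):
    """Round a Decimal dollar amount to two places, half up."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def alb_is_unnecessary(env_type, min_size, max_size):
    """A load balancer in front of exactly one fixed instance adds no scaling benefit."""
    return env_type == "LoadBalanced" and min_size == 1 and max_size == 1


def alb_monthly_cost():
    """Return (fixed, lcu, total) monthly cost, assuming a minimum of one LCU."""
    fixed = ALB_HOURLY * HOURS_PER_MONTH
    lcu = LCU_HOURLY * HOURS_PER_MONTH
    return to_cents(fixed), to_cents(lcu), to_cents(fixed + lcu)


if alb_is_unnecessary("LoadBalanced", 1, 1):
    fixed, lcu, total = alb_monthly_cost()
    print(f"ALB ${fixed} + LCU ${lcu} = ${total}/month")
```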

Beanstalk Previous-Generation Instances

Confidence: HIGH
Detection: Environment using previous-generation instance families (t2, m4, c4, etc.)
Savings: Price delta between old and new generation × instance count × 730 hours

How it works:

We check the instance type configured for each environment
If the family is in our previous-gen mapping (t2→t3, m4→m5, c4→c5, etc.), we flag it
Current-gen instances offer better performance at the same or lower price

Example:

Environment: legacy-worker
Instance Type: t2.medium → t3.medium
Monthly Savings: $3.50 per instance

AWS Reference: Previous Generation Instances

Why HIGH confidence: Instance family mapping is deterministic with known pricing.
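
The savings formula (price delta × instance count × 730 hours) can be sketched as below. The mapping is only the subset of families named in this section, the function names are hypothetical, and the two hourly rates are assumed us-east-1 Linux On-Demand prices included purely to reproduce the example.

```python
HOURS_PER_MONTH = 730

# Subset of the previous-gen mapping named in this section
PREVIOUS_GEN = {"t2": "t3", "m4": "m5", "c4": "c5"}

# Assumed us-east-1 Linux On-Demand hourly rates, for illustration only
HOURLY_RATE = {"t2.medium": 0.0464, "t3.medium": 0.0416}


def upgrade_target(instance_type):
    """Map a previous-gen type to its current-gen equivalent, or None."""
    family, size = instance_type.split(".", 1)
    new_family = PREVIOUS_GEN.get(family)
    return f"{new_family}.{size}" if new_family else None


def monthly_savings(instance_type, count=1):
    """Price delta x instance count x 730 hours."""
    target = upgrade_target(instance_type)
    if target is None:
        return None
    delta = HOURLY_RATE[instance_type] - HOURLY_RATE[target]
    return round(delta * count * HOURS_PER_MONTH, 2)


print(upgrade_target("t2.medium"), monthly_savings("t2.medium"))
```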

Over-Provisioned Beanstalk Environment

Confidence: MEDIUM
Detection: Multi-instance environment with <25% avg CPU over 14 days
Savings: Excess instance costs (EC2 + EBS per removed instance)

How it works:

We analyze CPUUtilization metrics for environments with 2+ instances
If average CPU is below 25% for 14 days, the environment is over-provisioned
We recommend a target count at 50% CPU headroom

Example:

Environment: api-production
Instances: 4 × t3.medium at 12% avg CPU
Recommended: 2 instances
Monthly Savings: $65.00

AWS Reference: Elastic Beanstalk Auto Scaling

Why MEDIUM confidence: CPU metrics are a strong indicator but don't capture memory or I/O bottlenecks.

Orphaned RDS from Beanstalk Environment

Confidence: HIGH
Detection: RDS with EB tags and zero DatabaseConnections for 14 days
Savings: RDS instance + storage costs

How it works:

We scan RDS instances for Elastic Beanstalk-related tags
We check CloudWatch DatabaseConnections — zero for 14 days means no app is using it
This detector is online-only (requires cross-service API calls)

Example:

RDS Instance: eb-prod-api-db (db.t3.micro)
Connections (14 days): 0
Associated Environment: prod-api (terminated)
Monthly Savings: $14.00

AWS Reference: Elastic Beanstalk with RDS

Why HIGH confidence: Zero connections for 14 days with EB tags is a definitive orphan signal.

EKS Extended Support Cost

Confidence: HIGH
Detection: EKS cluster version in extended support window
Savings: Extended support surcharge ($0.60/cluster-hour year 1–2, $1.20 year 3)

How it works:

We check each EKS cluster's Kubernetes version against the AWS lifecycle policy
Versions past their standard support date incur an extended support surcharge
Surcharge is per cluster-hour and can be significant for multi-cluster environments

Example:

Cluster: my-production-cluster
Version: 1.24 (extended support since 2024-01)
Nodes: 5
Monthly Surcharge: $438.00 ($0.60/hr × 730 hrs)
Action: Upgrade to supported version

AWS Reference: EKS Extended Support Pricing

Why HIGH confidence: Cluster version is deterministic metadata. Extended support dates are published by AWS.
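
As a rough sketch, the surcharge calculation reduces to a version lookup and one multiplication. The function name, the second cluster, and the set of versions treated as being in extended support are hypothetical; the $0.60 per cluster-hour rate and 730-hour month are taken from the example above.

```python
HOURS_PER_MONTH = 730
EXTENDED_SUPPORT_HOURLY = 0.60  # per cluster-hour, the rate used in the example


def extended_support_surcharges(clusters, extended_versions):
    """Return {cluster_name: monthly_surcharge} for clusters in extended support."""
    monthly = round(EXTENDED_SUPPORT_HOURLY * HOURS_PER_MONTH, 2)
    return {
        name: monthly
        for name, version in clusters.items()
        if version in extended_versions
    }


# Hypothetical inventory and a hypothetical snapshot of the lifecycle calendar
clusters = {"my-production-cluster": "1.24", "my-staging-cluster": "1.31"}
in_extended_support = {"1.24"}
print(extended_support_surcharges(clusters, in_extended_support))
```

Because the surcharge is per cluster and independent of node count, the total scales linearly with the number of out-of-date clusters.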

Oversized WorkSpaces

Confidence: MEDIUM
Detection: CloudWatch UserConnected = 0 or CPU < 5% over 14 days
Savings: Downsize to smaller bundle (Standard → Value)

How it works:

We analyze WorkSpaces usage via CloudWatch metrics
WorkSpaces with very low utilization can be downsized to smaller bundles
Switching from Performance to Standard or Value bundle reduces monthly cost

Example:

WorkSpace: ws-abc123def
Bundle: Performance (2 vCPU, 8 GB)
Avg CPU: 3% over 14 days
Recommended: Standard (2 vCPU, 4 GB)
Monthly Savings: $42.00

AWS Reference: WorkSpaces Pricing

Why MEDIUM confidence: Usage patterns may vary seasonally. Verify with the workspace owner before downsizing.

WorkSpaces AutoStop Opportunity

Confidence: HIGH
Detection: AlwaysOn WorkSpace with > 7 days since last user connection
Savings: Difference between AlwaysOn and AutoStop billing mode

How it works:

We identify AlwaysOn WorkSpaces where the user hasn't connected in over 7 days
AlwaysOn charges a fixed monthly rate regardless of usage
AutoStop charges a lower base fee plus hourly usage, saving money for infrequent users
Break-even analysis determines if AutoStop would be cheaper

Example:

WorkSpace: ws-abc123def
Bundle: Performance (AlwaysOn)
Last Connection: 12 days ago
AlwaysOn Cost: $60.00/month
Est. AutoStop Cost: $15.25/month (base + ~20 hours)
Monthly Savings: $44.75

AWS Reference: WorkSpaces Pricing

Why HIGH confidence: Connection status directly from AWS API. AlwaysOn vs AutoStop pricing is deterministic.
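
The break-even analysis mentioned above can be sketched like this. The function names are hypothetical, and the AutoStop base fee and hourly rate are placeholder values chosen only so that 20 hours of use reproduces the $15.25 estimate in the example; real rates depend on bundle, region, and volume sizes.

```python
def autostop_cost(base_fee, hourly_rate, hours_used):
    """Monthly AutoStop cost: fixed base fee plus metered hours."""
    return round(base_fee + hourly_rate * hours_used, 2)


def break_even_hours(alwayson_monthly, base_fee, hourly_rate):
    """Hours per month above which AlwaysOn becomes the cheaper mode."""
    return (alwayson_monthly - base_fee) / hourly_rate


ALWAYSON = 60.00   # AlwaysOn cost from the example above
BASE_FEE = 9.75    # hypothetical AutoStop base fee
HOURLY = 0.275     # hypothetical AutoStop hourly rate

est = autostop_cost(BASE_FEE, HOURLY, hours_used=20)
print(f"AutoStop estimate: ${est:.2f}, savings: ${ALWAYSON - est:.2f}")
print(f"Break-even: {break_even_hours(ALWAYSON, BASE_FEE, HOURLY):.1f} hours/month")
```

A WorkSpace used for fewer hours than the break-even point is cheaper on AutoStop; one used more than that should stay on AlwaysOn.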

WorkSpaces Pool Overprovisioned Capacity

Confidence: HIGH
Detection: WorkSpaces Pool utilization < 75% with excess capacity ≥ 2 slots
Savings: Cost of excess idle pool slots

How it works:

We check WorkSpaces Pool desired vs running session counts
Pools with utilization below 75% and excess capacity above p95 demand are flagged
Excess slots incur charges regardless of whether users are connected

Example:

Pool: engineering-pool
Configured Sessions: 20
Active Sessions: 8 (40% utilization)
Excess Slots: 12
Monthly Savings: $420.00

AWS Reference: WorkSpaces Pools Pricing

Why HIGH confidence: Pool capacity and session counts come directly from the WorkSpaces API.
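
A simplified sketch of the pool check is shown below. It is an assumption-laden illustration: the function name is hypothetical, the active session count stands in for the p95 demand figure, and the $35 per-slot monthly cost is inferred from the example ($420 / 12 slots) rather than taken from published pricing.

```python
def pool_finding(configured, active, cost_per_slot,
                 max_utilization=0.75, min_excess=2):
    """Flag pools under 75% utilization with at least two excess slots."""
    utilization = active / configured
    excess = configured - active
    if utilization >= max_utilization or excess < min_excess:
        return None
    return {
        "utilization": utilization,
        "excess_slots": excess,
        "monthly_savings": excess * cost_per_slot,
    }


print(pool_finding(configured=20, active=8, cost_per_slot=35.0))
```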

WorkSpaces Windows License Optimization

Confidence: MEDIUM
Detection: ≥ 5 Windows license-included WorkSpaces (non-BYOL)
Savings: ~$4/month per desktop via BYOL or Linux alternatives

How it works:

We count WorkSpaces running Windows with AWS-included licenses
Organizations with Microsoft Volume Licensing or Software Assurance can bring their own licenses (BYOL)
Alternatively, eligible users can migrate to Amazon Linux WorkSpaces to eliminate the license premium

Example:

Windows WorkSpaces (license-included): 25
License Premium: ~$4/desktop/month
Monthly Savings: $100.00

AWS Reference: WorkSpaces BYOL

Why MEDIUM confidence: BYOL eligibility depends on your Microsoft licensing agreement. This is an advisory finding — verify licensing terms before acting.

Containers Detectors

Idle ECS Service

Confidence: HIGH
Detection: ECS service with desiredCount > 0 but runningCount = 0
Savings: Full Fargate compute cost (vCPU + memory)

How it works:

We identify ECS services where desired tasks are set but no tasks are actually running
This indicates a deployment issue or abandoned service still incurring costs
Fargate charges per-second for running tasks based on vCPU and memory

Example:

Service: my-api-service
Cluster: production
Desired: 2, Running: 0
Task CPU: 1 vCPU, Memory: 2 GB
Monthly Savings: $65.70

AWS Reference: ECS Pricing

Why HIGH confidence: A service actively trying to run tasks but failing is a clear operational issue with direct cost impact.

ECS Without Auto-Scaling

Confidence: MEDIUM
Detection: ECS service with ≥ 2 running tasks and no Application Auto Scaling target
Savings: ~30% of monthly compute cost from auto-scaling optimization

How it works:

We check ECS services with multiple running tasks
Services without Application Auto Scaling registered are flagged
Auto-scaling allows services to scale down during low-traffic periods

Example:

Service: my-web-frontend
Cluster: production
Running Tasks: 4 (constant)
Auto Scaling: Not configured
Est. Monthly Savings: $98.00 (30% of compute)

AWS Reference: ECS Auto Scaling

Why MEDIUM confidence: Some services intentionally run at fixed capacity (e.g., worker pools). Verify workload patterns before enabling auto-scaling.

Container Insights Waste

Confidence: HIGH
Detection: Container Insights enabled on dev/test cluster or cluster with < 3 services
Savings: ~$3.50/month per small cluster (based on estimated metric volume)

How it works:

We check if Container Insights is enabled on ECS clusters
Dev/staging/sandbox clusters or clusters with few services don't benefit enough to justify the monitoring cost
Each custom metric costs $0.30/month and Container Insights generates ~50 metrics per service

Example:

Cluster: dev-cluster
Environment: development
Services: 1
Container Insights: Enabled
Est. Metrics: 70
Monthly Savings: $4.90

AWS Reference: CloudWatch Container Insights Pricing

Why HIGH confidence: Cluster environment and service count are deterministic. Container Insights cost is directly proportional to metric volume.

Oversized ECS Memory

Confidence: MEDIUM
Detection: CloudWatch MemoryUtilization max < 40% over 7 days
Savings: Proportional to memory reduction in Fargate task definition

How it works:

We analyze ECS task memory utilization via CloudWatch metrics
Tasks where peak memory usage stays below 40% of allocated memory are flagged
Reducing memory allocation in the task definition directly reduces Fargate cost

Example:

Service: my-worker
Task Memory: 4 GB
Max Memory Used: 1.2 GB (30%)
Recommended: 2 GB
Monthly Savings: $23.40

AWS Reference: ECS Task Definition Parameters

Why MEDIUM confidence: Memory usage may spike under certain workloads not captured during the analysis window. Monitor after resizing.

Storage Detectors

Unattached EBS Volume

Confidence: HIGH
Detection: describe_volumes with status=available
Savings: size × price per GB (varies by volume type)

Pricing Table:

Volume Type	Price per GB-month
gp3	$0.08
gp2	$0.10
io1/io2	$0.125 + IOPS
st1	$0.045
sc1	$0.015

Example:

Volume: vol-0abc123def456
Type: gp3
Size: 500GB
Status: available (not attached)
Monthly Savings: $40.00 (500 × $0.08)

AWS Reference: EBS Pricing

Why HIGH confidence: We get exact size and type from the API. Pricing is fixed per GB.
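
The calculation can be sketched directly from the pricing table. The function name is hypothetical; the input is assumed to be a list shaped like the `Volumes` entries returned by `describe_volumes` (keys `VolumeId`, `VolumeType`, `Size`, `State`). Provisioned-IOPS charges for io1/io2 are left out of this sketch.

```python
# Per-GB-month storage prices from the table above
PRICE_PER_GB_MONTH = {
    "gp3": 0.08,
    "gp2": 0.10,
    "io1": 0.125,
    "io2": 0.125,
    "st1": 0.045,
    "sc1": 0.015,
}


def unattached_volume_savings(volumes):
    """Return (volume_id, monthly_savings) for every volume in the 'available' state."""
    findings = []
    for vol in volumes:
        if vol["State"] != "available":
            continue  # attached volumes are in the 'in-use' state
        rate = PRICE_PER_GB_MONTH.get(vol["VolumeType"])
        if rate is None:
            continue  # unknown type: skip rather than guess a price
        findings.append((vol["VolumeId"], round(vol["Size"] * rate, 2)))
    return findings


# Hypothetical inventory
volumes = [
    {"VolumeId": "vol-0abc123def456", "VolumeType": "gp3", "Size": 500, "State": "available"},
    {"VolumeId": "vol-0fff000aaa111", "VolumeType": "gp2", "Size": 100, "State": "in-use"},
]
print(unattached_volume_savings(volumes))
```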

gp2 → gp3 Migration

Confidence: HIGH
Detection: describe_volumes with volume-type=gp2
Savings: size × $0.02/GB-month (20% savings)

How it works:

gp2 costs $0.10/GB-month
gp3 costs $0.08/GB-month (20% cheaper)
gp3 includes 3,000 IOPS and 125 MB/s baseline (same or better performance)

Example:

Volume: vol-0abc123def456
Type: gp2
Size: 1000GB
Current Cost: $100/month
After Migration: $80/month
Monthly Savings: $20.00

AWS Reference: EBS Pricing

Why HIGH confidence: AWS explicitly states gp3 is 20% cheaper with equal or better baseline performance.

Old EBS Snapshots

Confidence: HIGH
Detection: Snapshots older than threshold (default: 90 days)
Savings: VolumeSize × $0.05/GB-month

Example:

Snapshot: snap-0abc123def456
Age: 180 days
Size: 100GB
Monthly Savings: $5.00

AWS Reference: EBS Pricing (Snapshots)

Why HIGH confidence: Snapshot size is exact; pricing is fixed at $0.05/GB-month.

Orphaned EBS Snapshots

Property	Value
Waste Type	`orphaned_ebs_snapshot`
Category	Storage Optimization
Confidence	HIGH
Risk Level	LOW
Savings Estimate	`volume_size_gb × $0.05/month` (upper-bound; actual size may be smaller due to incremental storage)

What it detects: EBS snapshots whose source volume has been deleted. When a volume is deleted, its snapshots remain and continue incurring charges indefinitely. These orphaned snapshots cannot be used for incremental recovery of the original workload; they only serve as standalone restore points.

Detection logic: Cross-references all snapshots against existing EBS volumes. If a snapshot's VolumeId refers to a volume that no longer exists, the snapshot is flagged as orphaned.

Recommended action: Delete orphaned snapshots that are no longer needed. Consider creating an AWS Data Lifecycle Manager policy to automate snapshot retention.

AWS CLI:

# List your snapshots with their source volume IDs; a snapshot is orphaned when its Vol no longer appears in describe-volumes
aws ec2 describe-snapshots --owner-ids self \
  --query 'Snapshots[*].{ID:SnapshotId,Vol:VolumeId,Size:VolumeSize,Created:StartTime}' \
  --output table
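
The cross-reference step described under "Detection logic" can be sketched as follows. This is an illustration, not the detector's actual code: the function name is hypothetical, and the inputs are assumed to be the `Snapshots` list from `describe_snapshots` and a set of volume IDs collected from `describe_volumes`.

```python
SNAPSHOT_PRICE_PER_GB_MONTH = 0.05


def find_orphaned_snapshots(snapshots, existing_volume_ids):
    """Return snapshots whose source volume is no longer present."""
    orphans = []
    for snap in snapshots:
        if snap.get("VolumeId") in existing_volume_ids:
            continue
        orphans.append({
            "SnapshotId": snap["SnapshotId"],
            "VolumeId": snap.get("VolumeId"),
            # Upper bound: billed size may be smaller because snapshots are incremental
            "MaxMonthlySavings": round(snap["VolumeSize"] * SNAPSHOT_PRICE_PER_GB_MONTH, 2),
        })
    return orphans


# Hypothetical data
snapshots = [
    {"SnapshotId": "snap-0aaa", "VolumeId": "vol-0111", "VolumeSize": 100},
    {"SnapshotId": "snap-0bbb", "VolumeId": "vol-0222", "VolumeSize": 50},
]
existing = {"vol-0222"}
print(find_orphaned_snapshots(snapshots, existing))
```

Building the set of existing volume IDs once keeps the lookup for each snapshot constant-time, which matters in accounts with many thousands of snapshots.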

AMI Orphaned Snapshots

Property	Value
Waste Type	`ami_orphaned_snapshot`
Category	Storage Optimization
Confidence	HIGH
Risk Level	LOW
Savings Estimate	`volume_size_gb × $0.05/month` (upper-bound; actual size may be smaller due to incremental storage)

What it detects: EBS snapshots that were created by CreateImage to back an AMI, but the AMI has since been deregistered. When an AMI is deregistered, its backing snapshots are NOT automatically deleted — they remain and continue incurring charges.

Detection logic: Matches snapshot descriptions against the pattern Created by CreateImage(i-xxx) for ami-xxx, then checks whether the referenced AMI is still registered. If the AMI no longer exists, the snapshot is flagged.

Recommended action: Delete the orphaned snapshot after confirming no other AMIs or launch templates reference it. Consider automating AMI cleanup to include backing snapshot deletion.

AWS CLI:

# List snapshots created by CreateImage; check each referenced AMI with describe-images to find deregistered ones
aws ec2 describe-snapshots --owner-ids self \
  --query 'Snapshots[?starts_with(Description, `Created by CreateImage`)].{ID:SnapshotId,Desc:Description,Size:VolumeSize}' \
  --output table
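
The description-matching step can be sketched with a regular expression. The function name and sample data are hypothetical, and the pattern assumes the description follows the `Created by CreateImage(i-xxx) for ami-xxx` form quoted above; the set of registered AMI IDs would come from `describe_images` for your own account.

```python
import re

# Captures the instance ID and AMI ID from a CreateImage snapshot description
CREATE_IMAGE = re.compile(r"Created by CreateImage\((i-[0-9a-f]+)\) for (ami-[0-9a-f]+)")


def ami_orphaned_snapshots(snapshots, registered_ami_ids):
    """Return (snapshot_id, ami_id) pairs whose AMI is no longer registered."""
    orphans = []
    for snap in snapshots:
        match = CREATE_IMAGE.search(snap.get("Description", ""))
        if match and match.group(2) not in registered_ami_ids:
            orphans.append((snap["SnapshotId"], match.group(2)))
    return orphans


# Hypothetical data
snapshots = [
    {"SnapshotId": "snap-0aaa",
     "Description": "Created by CreateImage(i-0abc123) for ami-0dead01"},
    {"SnapshotId": "snap-0bbb",
     "Description": "Created by CreateImage(i-0abc456) for ami-0beef02"},
    {"SnapshotId": "snap-0ccc", "Description": "nightly backup"},
]
registered = {"ami-0beef02"}
print(ami_orphaned_snapshots(snapshots, registered))
```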

ECR Repository Without Lifecycle Policy

Confidence: MEDIUM
Detection: ECR repository with no lifecycle policy configured
Savings: Varies based on image count and size

How it works:

We check each ECR repository for a lifecycle policy
Repos without policies accumulate old images indefinitely
This leads to unbounded storage costs over time

Example:

Repository: my-app-images
Lifecycle Policy: None
Total Images: 150 (120 untagged)
Estimated Size: 45GB
Monthly Savings: $4.50 (if cleaned up)

AWS Reference: ECR Pricing

Why MEDIUM confidence: Actual savings depend on cleanup behavior; lifecycle policies prevent future waste.

S3 Bucket Without Lifecycle Policy

Confidence: HIGH
Detection: S3 bucket with no lifecycle rules configured
Savings: Varies based on object count and size

How it works:

We check each S3 bucket for lifecycle configuration
Buckets without lifecycle rules accumulate objects indefinitely
Lifecycle rules can transition objects to cheaper storage or delete old data

Example:

Bucket: my-application-logs
Lifecycle Rules: None
Objects: 2.5 million
Size: 500GB
Monthly Cost: $11.50 (S3 Standard)
Recommendation: Add lifecycle to transition to Glacier after 90 days
Potential Savings: $10.00/month (87%)

AWS Reference: S3 Pricing

Why HIGH confidence: Lifecycle configuration is binary (exists or not). Storage costs are exact.

Incomplete Multipart Uploads

Confidence: MEDIUM
Detection: S3 multipart uploads not completed
Savings: Size of incomplete parts × storage rate

How it works:

We list incomplete multipart uploads in S3 buckets
Failed or abandoned uploads leave orphaned parts that accumulate
These parts are charged as regular S3 storage

Example:

Bucket: my-large-files
Incomplete Uploads: 15
Total Size: 25GB
Monthly Savings: $0.58
Recommendation: Add lifecycle rule to abort incomplete uploads after 7 days

AWS Reference: S3 Pricing

Why MEDIUM confidence: Uploads may be legitimately in progress. Short time threshold reduces false positives.

Idle EFS File System

Confidence: HIGH
Detection: Zero client connections for 14 days
Savings: Full storage cost

How it works:

We query CloudWatch for EFS client connections
File systems with zero connections for 14+ days are flagged
EFS charges for stored data even when not accessed

Example:

File System: fs-0abc123
Size: 100GB
Connections (14 days): 0
Monthly Savings: $30.00 (Standard) or $1.60 (IA)

AWS Reference: EFS Pricing

Why HIGH confidence: Zero client connections is definitive. No one is accessing this file system.

Idle FSx File System

Confidence: HIGH
Detection: Zero network I/O for 14 days
Savings: Full file system cost

How it works:

We query CloudWatch for FSx network metrics
File systems with zero I/O for 14+ days are flagged
FSx charges continuously whether data is accessed or not

Example:

File System: fs-0abc123def456
Type: FSx for Windows
Capacity: 300GB
Network I/O (14 days): 0
Monthly Savings: $68.40

AWS Reference: FSx Pricing

Why HIGH confidence: Zero network I/O confirms no file access. FSx charges by provisioned capacity.

Oversized FSx Filesystem

Confidence: MEDIUM
Detection: Storage utilization below 40% of provisioned capacity
Savings: Estimated from excess provisioned capacity

How it works:

We compare provisioned storage against actual used capacity via CloudWatch FreeStorageCapacity
Filesystems where used capacity stays below 40% of provisioned are flagged
Type-aware for Windows, ONTAP, OpenZFS, and Lustre

Example:

File System: fs-0abc123def456
Type: FSx for Windows
Capacity: 2,000 GB provisioned
Used: 500 GB (25%)
Estimated Savings: $130.00/month

AWS Reference: FSx Pricing

Why MEDIUM confidence: Low utilization is a strong signal, but resizing requires operational caution. Some FSx types don't support in-place shrink.

FSx Throughput Overprovisioned

Confidence: MEDIUM
Detection: Throughput utilization below 30% of provisioned capacity
Savings: Estimated from stepping down one throughput tier

How it works:

We measure throughput utilization against provisioned throughput capacity
Filesystems (Windows/ONTAP/OpenZFS) where average utilization stays below 30% are flagged
Lustre is excluded because its performance model is inherently bursty

Example:

File System: fs-0abc123def456
Type: FSx for ONTAP
Provisioned Throughput: 512 MB/s
Average Utilization: 12%
Estimated Savings: $33.28/month

AWS Reference: FSx Pricing

Why MEDIUM confidence: Throughput is a major cost driver, but requires validation of latency baselines before changing.

Old FSx Backup

Confidence: MEDIUM
Detection: Manual FSx backup older than 90 days
Savings: Backup storage cost (varies by size)

How it works:

We list all FSx backups and filter for manual/user-initiated type
Backups older than 90 days are flagged
Retention-tagged backups (compliance, retain, legal-hold) are excluded
Automatic backups managed by FSx retention policies are excluded

Example:

Backup: backup-0abc123def456
Filesystem: fs-0abc123 (Windows)
Age: 180 days
Type: USER_INITIATED

Note: Backup deletion is irreversible. Verify backup is no longer needed for recovery.

AWS Reference: FSx Pricing

Why MEDIUM confidence: Manual backups past retention threshold are likely unnecessary, but deletion is irreversible.

Old ECR Images

Confidence: MEDIUM
Detection: Container images older than 90 days
Savings: Image size × $0.10/GB-month

How it works:

We list all images in ECR repositories
Images older than 90 days are flagged for cleanup
Old images often represent deprecated versions

Example:

Repository: my-app-images
Images > 90 days: 45
Total Size: 12GB
Monthly Savings: $1.20

AWS Reference: ECR Pricing

Why MEDIUM confidence: Old images may be needed for rollbacks. Consider keeping last N versions.

Over-Provisioned IOPS

Confidence: HIGH
Detection: EBS volume with provisioned IOPS where peak usage < 50% over 14 days
Savings: Excess IOPS × pricing ($0.065/IOPS/month for io1/io2, $0.005/IOPS/month for gp3)

How it works:

We identify EBS volumes with provisioned IOPS (io1, io2, gp3 with custom IOPS)
CloudWatch VolumeReadOps and VolumeWriteOps are analyzed for 14 days
If peak IOPS usage stays below 50% of provisioned, the volume is over-provisioned

Example:

Volume: vol-0abc123def456
Type: io2, Size: 500 GB
Provisioned IOPS: 10,000
Peak IOPS (14d): 3,200 (32%)
Recommended: 6,400 IOPS
Monthly Savings: $234.00

AWS Reference: EBS Pricing

Why HIGH confidence: CloudWatch IOPS metrics are precise. We use peak (not average) to avoid under-provisioning.
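
The example's numbers can be reproduced with the sketch below. Two things here are assumptions rather than documented behavior: the recommended value is taken as 2× peak (inferred from 3,200 → 6,400 in the example), and gp3 recommendations are floored at the 3,000 IOPS baseline. The function name is hypothetical.

```python
# Per-IOPS monthly prices from the Savings line above
IOPS_PRICE_PER_MONTH = {"io1": 0.065, "io2": 0.065, "gp3": 0.005}
GP3_BASELINE_IOPS = 3000  # assumed floor for gp3 recommendations


def iops_finding(volume_type, provisioned_iops, peak_iops, headroom=2.0):
    """Flag volumes whose peak IOPS stays below 50% of provisioned IOPS."""
    if peak_iops >= 0.5 * provisioned_iops:
        return None
    recommended = int(peak_iops * headroom)  # assumption: keep 2x peak as headroom
    if volume_type == "gp3":
        recommended = max(recommended, GP3_BASELINE_IOPS)
    excess = provisioned_iops - recommended
    if excess <= 0:
        return None
    savings = round(excess * IOPS_PRICE_PER_MONTH[volume_type], 2)
    return {"recommended_iops": recommended, "monthly_savings": savings}


print(iops_finding("io2", provisioned_iops=10_000, peak_iops=3_200))
```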

EFS Without Lifecycle Policy

Confidence: MEDIUM
Detection: EFS filesystem ≥ 1 GB without Infrequent Access lifecycle transition
Savings: Moving ~80% of data (assumed infrequently accessed) from Standard ($0.30/GB-month) to IA ($0.025/GB-month)

How it works:

We check EFS filesystems for lifecycle configuration
Filesystems without TransitionToIA rules keep all data in Standard storage class
In typical workloads, most EFS data is not accessed again after 30 days, so moving it to IA can yield significant savings

Example:

Filesystem: fs-0abc123def456
Size: 250 GB (Standard)
Lifecycle Policy: None
Est. IA-eligible: 200 GB (80%)
Monthly Savings: $55.00

AWS Reference: EFS Lifecycle Management

Why MEDIUM confidence: Actual access patterns determine IA eligibility. The 80% estimate is based on typical workloads.
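
As a rough sketch of the estimate above, the savings are the IA-eligible gigabytes multiplied by the price difference between the two storage classes. The function name is hypothetical, and the 80% IA-eligible fraction is the guide's stated assumption rather than a measured value.

```python
def efs_ia_savings(size_gb, ia_fraction=0.80, standard_rate=0.30, ia_rate=0.025):
    """Estimate monthly savings (USD) from enabling a TransitionToIA rule."""
    # Filesystems under 1 GB are not flagged by this detector.
    if size_gb < 1:
        return 0.0
    ia_gb = size_gb * ia_fraction
    return round(ia_gb * (standard_rate - ia_rate), 2)


print(efs_ia_savings(250))
```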

Untagged ECR Images

Confidence: HIGH
Detection: ECR repository with untagged images totaling > 0.1 GB
Savings: Untagged image size × $0.10/GB/month

How it works:

We scan ECR repositories for images without tags
Untagged images are typically intermediate build layers or superseded images
These accumulate silently and can grow to significant storage costs

Example:

Repository: my-service
Untagged Images: 156
Total Size: 8.5 GB
Monthly Savings: $0.85

AWS Reference: ECR Pricing

Why HIGH confidence: Image tag status and size are exact metadata from the ECR API.

Old Backup Recovery Points

Confidence: MEDIUM
Detection: AWS Backup recovery point older than 180 days without compliance hold
Savings: Backup storage at $0.05/GB/month

How it works:

We list all AWS Backup recovery points across vaults
Points older than 180 days without compliance-lock tag are flagged
Organizations often forget to set lifecycle expiration on backup plans

Example:

Resource: arn:aws:rds:us-east-1:123456789012:db:mydb
Recovery Point Age: 245 days
Size: 50 GB
Monthly Savings: $2.50

AWS Reference: AWS Backup Pricing

Why MEDIUM confidence: Compliance requirements vary. Some industries require longer retention. Verify before deleting.

Redundant Backup Recovery Points

Confidence: MEDIUM
Detection: Multiple recovery points for same resource within 24-hour window
Savings: Redundant point storage at $0.05/GB/month

How it works:

We group recovery points by resource ARN
Multiple points taken within 24 hours of each other are likely redundant
Keeping only one point per time window reduces storage costs

Example:

Resource: arn:aws:ec2:us-east-1:123456789012:volume/vol-abc123
Points in 24hr window: 4
Redundant Size: 120 GB
Monthly Savings: $6.00

AWS Reference: AWS Backup Pricing

Why MEDIUM confidence: Some workloads intentionally create frequent backups for RPO requirements.
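
The grouping step described above could look roughly like the sketch below. It assumes recovery points are available as (resource ARN, creation time, size in GB) tuples and that the earliest point in each 24-hour window is the one to keep. Both are illustrative assumptions, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def redundant_points(points, window=timedelta(hours=24), rate=0.05):
    """Map each resource ARN to the monthly cost (USD) of its redundant points.

    points: iterable of (resource_arn, created_at, size_gb) tuples.
    """
    by_resource = defaultdict(list)
    for arn, created, size_gb in points:
        by_resource[arn].append((created, size_gb))

    findings = {}
    for arn, items in by_resource.items():
        items.sort()
        kept = items[0][0]  # assumption: keep the earliest point per window
        redundant_gb = 0.0
        for created, size_gb in items[1:]:
            if created - kept < window:
                redundant_gb += size_gb  # falls inside the kept point's window
            else:
                kept = created  # starts a new window
        if redundant_gb:
            findings[arn] = round(redundant_gb * rate, 2)
    return findings
```

With four 40 GB points taken six hours apart, three are counted as redundant (120 GB), which matches the example above.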

Backup Plan Without Lifecycle Tiering

Confidence: MEDIUM
Detection: Backup plan with retention > 90 days but no cold storage transition
Savings: ~$0.01/GB/month savings from warm-to-cold transition

How it works:

We analyze backup plan lifecycle rules
Plans retaining backups beyond 90 days without cold storage transitions are flagged
Cold storage is significantly cheaper but has longer retrieval times

Example:

Backup Plan: daily-rds-backup
Retention: 365 days
Cold Transition: None
Total Backup Size: 500 GB
Monthly Savings: $5.00

AWS Reference: AWS Backup Lifecycle

Why MEDIUM confidence: Cold storage retrieval is slower. Verify RTO requirements before enabling tiering.

Stale Backup Plan Assignment

Confidence: MEDIUM
Detection: Backup plan selection targeting resources that no longer exist
Savings: Backup plan resource selection overhead

How it works:

We examine backup plan resource assignments
Assignments referencing deleted resources are flagged
These create unnecessary backup plan evaluation overhead

Example:

Backup Plan: weekly-ebs
Selection: vol-abc123 (DELETED)
Status: Resource not found
Action: Remove stale assignment

AWS Reference: AWS Backup Resource Assignment

Why MEDIUM confidence: Resource may have been replaced with a different ID. Verify the backup plan still covers the intended resources.

Backup Cross-Region Copy Overreach

Confidence: LOW
Detection: Cross-region backup copies to regions with no operational presence
Savings: Cross-region transfer at $0.02/GB

How it works:

We analyze backup plans with cross-region copy rules
Copies to regions where no resources are deployed may be unnecessary
Each cross-region copy incurs data transfer and storage charges

Example:

Backup Plan: critical-data
Source: us-east-1
Copy Destinations: eu-west-1, ap-southeast-1
Resources in eu-west-1: 0
Monthly Savings: $5.00 (transfer + storage)

AWS Reference: AWS Backup Cross-Region

Why LOW confidence: Cross-region copies serve DR purposes. Verify disaster recovery requirements before removing.

S3 Empty Bucket

Confidence: LOW
Detection: S3 bucket with 0 objects, created 30+ days ago
Savings: Zero direct savings (hygiene finding)

How it works:

We identify empty S3 buckets that have existed for 30+ days
Infrastructure buckets (CDK, CloudFormation, logging prefixes) are excluded
Empty buckets consume no storage but create management overhead

Example:

Bucket: old-data-export-2024
Objects: 0, Size: 0 bytes
Created: 2024-06-15 (300+ days ago)
Monthly Savings: $0.00 (cleanup only)

AWS Reference: S3 Pricing

Why LOW confidence: Buckets may be pre-provisioned for future use or referenced by application configuration.

S3 Wrong Storage Class

Confidence: LOW
Detection: Bucket with > 90% Standard storage, > 50 GB, no Intelligent-Tiering
Savings: ~25% from Intelligent-Tiering automatic downtiering

How it works:

We analyze S3 storage class distribution per bucket
Buckets with predominantly Standard storage and no tiering transitions are flagged
S3 Intelligent-Tiering automatically moves infrequently accessed objects to lower-cost tiers

Example:

Bucket: application-logs
Size: 250 GB (98% Standard)
Intelligent-Tiering: Not configured
Est. Monthly Savings: $1.44

AWS Reference: S3 Intelligent-Tiering

Why LOW confidence: Access patterns vary. Intelligent-Tiering adds a small monitoring fee ($0.0025/1000 objects).

S3 Rapid Growth

Confidence: LOW
Detection: Bucket growth > 100% in 30 days with > 10 GB absolute growth
Savings: Variable — depends on addressing root cause of growth

How it works:

We compare bucket size metrics over rolling 30-day windows
Buckets doubling in size with significant absolute growth are flagged
Often indicates missing lifecycle rules, runaway logging, or misconfigured pipelines

Example:

Bucket: data-pipeline-output
30 days ago: 45 GB
Now: 120 GB (+167%)
Growth Rate: 2.5 GB/day
Monthly Savings: Potential $1.73+ if addressed

AWS Reference: S3 Storage Lens

Why LOW confidence: Growth may be expected (new data pipeline, migration). Requires investigation into root cause.
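
The two thresholds above (relative and absolute growth) can be combined as in this minimal sketch. The function and key names are hypothetical. The inputs are two bucket-size readings taken 30 days apart.

```python
def rapid_growth(size_then_gb, size_now_gb, days=30, min_pct=100.0, min_abs_gb=10.0):
    """Summarize bucket growth and whether it crosses both thresholds."""
    if size_then_gb <= 0:
        return None  # percentage growth is undefined for an empty baseline
    growth_gb = size_now_gb - size_then_gb
    growth_pct = growth_gb / size_then_gb * 100
    return {
        "flagged": growth_pct > min_pct and growth_gb > min_abs_gb,
        "growth_pct": round(growth_pct),
        "gb_per_day": round(growth_gb / days, 2),
    }


print(rapid_growth(45, 120))
```

Requiring both conditions keeps small buckets (for example, 1 GB growing to 3 GB) from being flagged on percentage alone.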

S3 High Request & Transfer Cost

Confidence: LOW
Detection: Non-storage costs (requests + transfer) exceed storage costs
Savings: Variable — CDN, VPC endpoints, or caching can reduce transfer costs

How it works:

We compare S3 request and transfer costs against storage costs
When requests/transfer cost more than storage, architectural optimization may help
Common solutions: CloudFront for public access, VPC endpoints for private access

Example:

Bucket: api-assets
Storage Cost: $5.00/month
Request + Transfer Cost: $12.00/month (240% of storage)
Monthly Savings: ~$6.00 with CloudFront

AWS Reference: S3 Request Pricing

Why LOW confidence: Architectural changes require significant effort. Cost ratio is an indicator, not a definitive finding.

Database Detectors

Idle RDS Instance

Confidence: HIGH
Detection: CloudWatch DatabaseConnections = 0 for 14+ days
Savings: Full instance cost

How it works:

We query CloudWatch for database connection metrics
Zero connections for 14+ days indicates the instance is unused
Monthly cost based on instance class and Multi-AZ configuration

Example:

Database: my-dev-database
Class: db.t3.medium
Engine: MySQL
Connections (14 days): 0
Monthly Savings: $49.28

AWS Reference: RDS Pricing

Why HIGH confidence: Zero connections is an unambiguous signal. No one is using this database.
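
A check like this can be reproduced with the CloudWatch GetMetricStatistics API. The sketch below is one possible approach, not necessarily how CloudWise queries the metric: it asks for the daily maximum of DatabaseConnections and treats the instance as idle only if every datapoint is zero. The function name is hypothetical, and the CloudWatch client is passed in so the logic can be exercised with a stub.

```python
from datetime import datetime, timedelta, timezone


def is_idle_rds(cloudwatch, db_instance_id, days=14, now=None):
    """Return True if DatabaseConnections peaked at 0 on every day in the window."""
    now = now or datetime.now(timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="DatabaseConnections",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        StartTime=now - timedelta(days=days),
        EndTime=now,
        Period=86400,  # one datapoint per day
        Statistics=["Maximum"],
    )
    points = resp["Datapoints"]
    # No datapoints means no evidence either way, so do not flag.
    return bool(points) and all(p["Maximum"] == 0 for p in points)
```

With boto3 installed and credentials configured, this would be called as is_idle_rds(boto3.client("cloudwatch"), "my-dev-database"). Using the Maximum statistic rather than Average avoids missing short-lived connections.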

Old RDS Snapshots

Confidence: MEDIUM
Detection: Manual snapshots older than threshold (default: 90 days)
Savings: AllocatedStorage × $0.095/GB-month

Example:

Snapshot: my-database-snapshot-2024
Age: 120 days
Size: 200GB
Monthly Savings: $19.00

AWS Reference: RDS Pricing (Backup Storage)

Why MEDIUM confidence: Old snapshots may be needed for compliance. Review before deleting.

Aurora I/O Optimization Opportunity

Confidence: HIGH
Detection: Aurora Standard cluster where I/O charges exceed 25% of total spend
Savings: Difference between Standard (storage + I/O) and I/O-Optimized (storage only) costs

How it works:

We identify Aurora clusters using Standard storage (not I/O-Optimized)
Calculate the I/O-to-total-spend ratio from CloudWatch VolumeReadIOPs and VolumeWriteIOPs
If I/O charges exceed 25% of total Aurora spend, switching to I/O-Optimized eliminates per-I/O charges
I/O-Optimized storage costs ~30% more per GB but includes unlimited I/O

Example:

Cluster: my-aurora-cluster
Engine: aurora-postgresql
Storage Type: Aurora Standard
Monthly I/O Cost: $450.00
Monthly Storage Cost: $180.00
I/O Ratio: 71.4% (> 25% threshold)
Switch to I/O-Optimized Savings: $216.00/month

AWS Reference: Aurora Pricing — I/O-Optimized

Why HIGH confidence: I/O costs are directly measurable from CloudWatch metrics. The 25% threshold is AWS's own recommended breakeven point.

Aurora Extended Support Cost

Confidence: HIGH
Detection: Aurora cluster running an end-of-life engine version under Extended Support
Savings: Per-vCPU hourly surcharge (Year 1-2: $0.100/vCPU-hr, Year 3+: $0.200/vCPU-hr)

How it works:

We check the Aurora engine version against known EOL dates (PostgreSQL 12, MySQL 5.7)
If the version has passed its standard support end date, AWS charges Extended Support fees
Surcharges are per-vCPU per hour, escalating after year 2
We also flag clusters within 90 days of EOL as a warning

Example:

Cluster: pg12-eol-cluster
Engine: aurora-postgresql 12.18
EOL Date: 2025-02-28
Extended Support: Year 1 rate
Instances: 2× db.r6g.xlarge (4 vCPUs each)
Surcharge: 8 vCPUs × $0.100/hr × 730 hrs = $584.00/month

AWS Reference: Aurora Extended Support Pricing

Why HIGH confidence: EOL dates and surcharge rates are published by AWS. vCPU counts are deterministic per instance class.
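
The surcharge in the example follows directly from the per-vCPU rates listed above. A minimal sketch, with a hypothetical function name and the 730-hour month used elsewhere in this guide:

```python
HOURS_PER_MONTH = 730


def extended_support_surcharge(vcpus_per_instance, instance_count, support_year):
    """Monthly Extended Support surcharge (USD) for a cluster.

    support_year: 1 for the first year after standard support ends, 2, 3, ...
    """
    rate = 0.100 if support_year <= 2 else 0.200  # USD per vCPU-hour
    total_vcpus = vcpus_per_instance * instance_count
    return round(total_vcpus * rate * HOURS_PER_MONTH, 2)


print(extended_support_surcharge(4, 2, support_year=1))
```

The same cluster in year 3 would cost twice as much, which is why upgrading before the rate escalates is worth prioritizing.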

Aurora Serverless v2 Opportunity

Confidence: MEDIUM
Detection: Low-utilization provisioned Aurora cluster suitable for Serverless v2 migration
Savings: Difference between provisioned instance cost and estimated Serverless v2 cost at 2 ACU baseline

How it works:

We check CloudWatch CPU metrics: average < 15% and max < 40% over 14 days
Excludes clusters already running Serverless, Global databases, and multi-writer setups
Estimates Serverless v2 cost at 2 ACU baseline × $0.12/ACU-hr
Compares against current provisioned instance cost

Example:

Cluster: low-traffic-aurora
Engine: aurora-postgresql
Instance: db.r6g.xlarge
CPU Avg: 8%, CPU Max: 22%
Provisioned Cost: $526.00/month
Estimated Serverless Cost: $175.20/month
Monthly Savings: $350.80

AWS Reference: Aurora Serverless v2 Pricing

Why MEDIUM confidence: Serverless v2 costs depend on actual ACU consumption which varies. The 2-ACU baseline is a conservative estimate. Workload spikes may increase actual Serverless costs.
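
The comparison above can be sketched as follows. The function name is hypothetical, and the 2-ACU baseline is the guide's stated assumption, so real Serverless v2 bills will differ with actual ACU consumption.

```python
def serverless_v2_estimate(provisioned_monthly, cpu_avg, cpu_max,
                           baseline_acu=2, acu_rate=0.12, hours=730):
    """Return (estimated_serverless_cost, monthly_savings) or None if not eligible."""
    # Eligibility: average CPU < 15% and max CPU < 40% over the window.
    if cpu_avg >= 15 or cpu_max >= 40:
        return None
    serverless = round(baseline_acu * acu_rate * hours, 2)
    return serverless, round(provisioned_monthly - serverless, 2)


print(serverless_v2_estimate(526.00, cpu_avg=8, cpu_max=22))
```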

DynamoDB Without Auto-Scaling

Confidence: HIGH
Detection: Provisioned capacity DynamoDB table with no auto-scaling configured
Savings: Varies based on provisioned capacity and utilization

How it works:

We identify DynamoDB tables using provisioned capacity mode
Check if Application Auto Scaling is configured for the table
Tables without auto-scaling may be over-provisioned or at risk of throttling

Example:

Table: my-provisioned-table
Mode: Provisioned (10 RCU, 10 WCU)
Auto-Scaling: Not configured
Monthly Cost: $9.40
Recommendation: Enable auto-scaling or switch to on-demand

AWS Reference: DynamoDB Pricing

Why HIGH confidence: Auto-scaling is a best practice for provisioned tables. Configuration is binary (enabled or not).

Idle DynamoDB Table

Confidence: HIGH
Detection: Zero consumed read/write capacity for 14 days
Savings: Full table cost (provisioned) or minimum charges (on-demand)

How it works:

We query CloudWatch for ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits
Tables with zero consumption for 14+ days are flagged
Provisioned tables are billed continuously for their capacity; on-demand tables still incur storage costs

Example:

Table: my-old-table
Mode: Provisioned (5 RCU, 5 WCU)
Activity (14 days): 0 operations
Monthly Savings: $4.70

AWS Reference: DynamoDB Pricing

Why HIGH confidence: Zero consumed capacity means no reads or writes. Table is unused.

Over-Provisioned DynamoDB

Confidence: MEDIUM
Detection: Capacity utilization < 20% for 14 days
Savings: Based on right-sized capacity

How it works:

We compare provisioned capacity to consumed capacity via CloudWatch
Tables using less than 20% of provisioned capacity are over-provisioned
Right-sizing or switching to on-demand can reduce costs

Example:

Table: my-app-data
Provisioned: 100 RCU, 100 WCU
Used (avg): 15 RCU, 8 WCU
Utilization: 15%, 8%
Monthly Savings: ~$60.00

AWS Reference: DynamoDB Pricing

Why MEDIUM confidence: Traffic patterns may spike periodically. Review before right-sizing.
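
A minimal sketch of the utilization check, assuming the 20% threshold must hold for both reads and writes (the guide does not say whether one dimension alone is enough). The function name is hypothetical.

```python
def dynamodb_over_provisioned(prov_rcu, used_rcu, prov_wcu, used_wcu, threshold_pct=20.0):
    """Return (read_pct, write_pct, flagged) from average consumed capacity."""
    read_pct = round(used_rcu / prov_rcu * 100, 1)
    write_pct = round(used_wcu / prov_wcu * 100, 1)
    # Assumption: flag only when BOTH dimensions are below the threshold.
    flagged = read_pct < threshold_pct and write_pct < threshold_pct
    return read_pct, write_pct, flagged


print(dynamodb_over_provisioned(100, 15, 100, 8))
```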

Idle ElastiCache Cluster

Confidence: HIGH
Detection: Zero connections for 7 days
Savings: Full cluster cost

How it works:

We query CloudWatch for CurrConnections metric
Clusters with zero connections for 7+ days are flagged
ElastiCache charges continuously for node hours

Example:

Cluster: my-redis-cluster
Node Type: cache.t3.micro
Nodes: 2
Connections (7 days): 0
Monthly Savings: $24.82

AWS Reference: ElastiCache Pricing

Why HIGH confidence: Zero connections means no application is using this cache.

Idle Redshift Cluster

Confidence: HIGH
Detection: Zero queries for 14 days
Savings: Full cluster cost

How it works:

We query CloudWatch for query-related metrics
Clusters with no query activity for 14+ days are flagged
Redshift charges continuously for cluster nodes

Example:

Cluster: my-data-warehouse
Node Type: dc2.large
Nodes: 2
Queries (14 days): 0
Monthly Savings: $360.00

AWS Reference: Redshift Pricing

Why HIGH confidence: Zero queries means no one is using this data warehouse.

Idle OpenSearch Domain

Confidence: HIGH
Detection: Zero requests for 14 days
Savings: Full domain cost

How it works:

We query CloudWatch for OpenSearch request metrics
Domains with zero requests for 14+ days are flagged
OpenSearch charges continuously for instance hours

Example:

Domain: my-search-domain
Instance Type: t3.small.search
Instances: 2
Requests (14 days): 0
Monthly Savings: $51.10

AWS Reference: OpenSearch Pricing

Why HIGH confidence: Zero requests means no applications are using this search domain.

Oversized OpenSearch Domain

Confidence: MEDIUM
Detection: Average CPU < 20% and max CPU < 40% over 14 days with active search traffic
Savings: Estimated from stepping down instance type (or 30% conservative fallback)

How it works:

We collect CloudWatch CPUUtilization (average and maximum) over 14 days
Domains with sustained low CPU but non-zero search traffic are flagged
A step-down instance type map estimates savings from rightsizing
If the instance type isn't in the step-down map, a conservative 30% savings is estimated

Example:

Domain: oversized-search
Instance Type: r6g.xlarge.search
Instances: 3
Avg CPU: 12.0%  Max CPU: 30.0%
Recommendation: Downsize to r6g.large.search
Monthly Savings: $368.00

AWS Reference: OpenSearch Pricing

Why MEDIUM confidence: Low CPU is a strong signal but workloads may have periodic spikes. Resize one dimension at a time and observe.

OpenSearch EBS Overprovisioned

Confidence: MEDIUM
Detection: Free storage > 60% with data growth < 0.1 GB/day (EBS-backed domains only)
Savings: Reclaimable GB × ~$0.10/GB/month (gp3 approximate rate)

How it works:

We query CloudWatch FreeStorageSpace to compute the percentage of unused EBS storage
We calculate the storage growth rate from a 14-day trend
Domains with >60% free storage and flat growth (<0.1 GB/day) are flagged
We keep 30% headroom and estimate reclaimable GB

Example:

Domain: logs-archive
EBS: gp3, 500 GB × 2 nodes (1000 GB total)
Free Storage: 75%
Growth: 0.02 GB/day
Reclaimable: ~450 GB
Monthly Savings: $45.00

AWS Reference: OpenSearch Pricing

Why MEDIUM confidence: Storage trends may change. Verify index lifecycle policies and retention needs before resizing.
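
The reclaimable-storage estimate can be sketched as below. The example numbers (75% free, about 450 GB reclaimable out of 1000 GB) suggest the 30% headroom is taken as a share of total provisioned storage, and that reading is an assumption here. The function name is hypothetical.

```python
def reclaimable_ebs(total_gb, free_pct, growth_gb_per_day, headroom_pct=30, rate=0.10):
    """Return (reclaimable_gb, monthly_savings) or None if the domain is not flagged."""
    # Flag only domains with > 60% free storage and < 0.1 GB/day growth.
    if free_pct <= 60 or growth_gb_per_day >= 0.1:
        return None
    # Assumption: headroom is a percentage of total provisioned storage.
    reclaimable_gb = total_gb * (free_pct - headroom_pct) / 100
    return reclaimable_gb, round(reclaimable_gb * rate, 2)


print(reclaimable_ebs(1000, free_pct=75, growth_gb_per_day=0.02))
```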

RI Opportunity: OpenSearch

Confidence: LOW
Detection: On-demand OpenSearch domain eligible for Reserved Instance purchase
Savings: Estimated annual savings from Cost Explorer RI recommendations

How it works:

We query Cost Explorer for OpenSearch RI purchase recommendations
Domains running on-demand that would benefit from a 1-year partial-upfront RI are flagged
Savings estimates come directly from AWS Cost Explorer

Example:

Domain: production-search
Instance Type: r6g.2xlarge.search
Current: On-Demand
Recommendation: 1-year partial-upfront RI
Estimated Annual Savings: $4,800

AWS Reference: OpenSearch Reserved Instances

Why LOW confidence: RI commitments are non-refundable. Verify workload stability before purchasing.

Idle Neptune Cluster

Confidence: HIGH
Detection: Zero requests for 14 days
Savings: Full cluster cost

How it works:

We query CloudWatch for Neptune request metrics
Clusters with zero requests for 14+ days are flagged
Neptune charges continuously for instance hours

Example:

Cluster: my-graph-database
Instance Type: db.t3.medium
Requests (14 days): 0
Monthly Savings: $59.86

AWS Reference: Neptune Pricing

Why HIGH confidence: Zero requests means no applications are using this graph database.

Neptune Serverless Migration Opportunity

Confidence: HIGH
Detection: Provisioned Neptune cluster with consistently low CPU utilization
Savings: 50–65% for low-utilization clusters

How it works:

We identify provisioned Neptune clusters (not already Serverless)
CloudWatch CPUUtilization is analyzed over 14 days
Clusters with <20% avg CPU and <50% max CPU are flagged
Savings are estimated by comparing provisioned cost vs. Serverless at 2 NCU average
We only flag instances larger than db.t3.medium, because the Serverless floor ($80.15/month) already exceeds the db.t3.medium cost ($59.86/month)

Example:

Cluster: dev-graph-database
Instance Type: db.r6g.large
CPU Average (14 days): 8.3%
CPU Maximum (14 days): 15.2%
Provisioned Cost: $228.49/month
Estimated Serverless Cost: $160.30/month (2 NCU avg)
Monthly Savings: $68.19

Recommended Action: Create a new Neptune Serverless cluster, restore from snapshot, and update application endpoints. Verify your query language — Serverless supports Gremlin and openCypher but NOT SPARQL.

AWS Reference: Neptune Serverless

Why HIGH confidence: CPU utilization and instance pricing are well-defined. The Serverless floor ($80.15/mo at 1 NCU idle) provides a clear comparison point. Only flags when provisioned cost significantly exceeds Serverless estimate.
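
The eligibility and savings logic above can be approximated with the sketch below. It uses the $80.15 monthly cost per NCU quoted in this section and the 2-NCU average the guide assumes. The function name is hypothetical, and real Serverless costs depend on actual NCU consumption.

```python
NCU_MONTHLY = 80.15  # USD per NCU per month, as quoted in this section


def neptune_serverless_estimate(provisioned_monthly, cpu_avg, cpu_max, avg_ncu=2):
    """Return (estimated_serverless_cost, monthly_savings) or None if not flagged."""
    # Flag only clusters with < 20% average and < 50% maximum CPU.
    if cpu_avg >= 20 or cpu_max >= 50:
        return None
    serverless = round(avg_ncu * NCU_MONTHLY, 2)
    # Skip instances that are already cheaper than the Serverless estimate.
    if provisioned_monthly <= serverless:
        return None
    return serverless, round(provisioned_monthly - serverless, 2)


print(neptune_serverless_estimate(228.49, cpu_avg=8.3, cpu_max=15.2))
```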

Over-Provisioned Neptune Instance

Confidence: MEDIUM
Detection: Neptune instance with consistently low CPU utilization
Savings: 30–50% per right-sizing step

How it works:

We analyze CloudWatch CPUUtilization for each provisioned Neptune cluster
Clusters with <20% avg CPU and <40% max CPU are flagged
Each over-provisioned instance is flagged with a target smaller instance class
Savings calculated from published per-hour pricing difference

Example:

Instance: graph-writer-1 (db.r6g.xlarge)
Cluster: prod-graph
CPU Average (14 days): 12.1%
CPU Maximum (14 days): 22.5%
Current Cost: $456.98/month
Target: db.r6g.large ($228.49/month)
Monthly Savings: $228.49

Recommended Action: Modify the instance class via ModifyDBInstance. Brief downtime during change (5–15 minutes). Start with reader instances to minimize impact.

Agentic Tier — Automated Fix Available

Click Fix This to downsize the instance with automatic backup snapshot and one-click rollback.

AWS Reference: Neptune Pricing

Why MEDIUM confidence: CPU is a good proxy but doesn't capture memory-bound graph traversals. Some workloads burst memory without high CPU. Validate with monitoring after downsizing.

Old Neptune Cluster Snapshot

Confidence: HIGH
Detection: Manual Neptune cluster snapshot older than 90 days
Savings: $0.021/GB-month per snapshot

How it works:

We query Neptune for manual cluster snapshots (DescribeDBClusterSnapshots)
Snapshots older than 90 days are flagged
Only manual snapshots are considered; automated backups are managed by the retention policy

Example:

Snapshot: pre-upgrade-snap-2025-06
Cluster: my-graph-db
Age: 280 days
Storage: 100 GB
Monthly Cost: $2.10

Recommended Action: Delete the snapshot if the data is no longer needed. Verify it's not a critical disaster recovery backup.

Agentic Tier — Automated Fix Available

Click Fix This to delete the old snapshot. Note: snapshot deletion is irreversible.

AWS Reference: Neptune Pricing — Backup Storage

Why HIGH confidence: Snapshot age is deterministic. Manual snapshots are created for specific events and rarely needed after 90 days.

Idle DocumentDB Cluster

Confidence: HIGH
Detection: Zero connections for 14 days
Savings: Full cluster cost

How it works:

We query CloudWatch for DocumentDB connection metrics
Clusters with zero connections for 14+ days are flagged
DocumentDB charges continuously for instance hours

Example:

Cluster: my-docdb-cluster
Instance Type: db.t3.medium
Connections (14 days): 0
Monthly Savings: $59.86

AWS Reference: DocumentDB Pricing

Why HIGH confidence: Zero connections means no applications are connected to this database.

Old DocumentDB Snapshot

Confidence: MEDIUM
Detection: Manual snapshot older than 90 days (configurable)
Savings: ~$0.02/GB-month backup storage

How it works:

Lists all manual DocumentDB cluster snapshots
Groups snapshots by source cluster
Protects the newest manual snapshot per cluster (disaster-recovery safety net)
Skips snapshots tagged with retention, compliance, legal, or do-not-delete
Flags remaining snapshots that exceed the age threshold (default 90 days)

Example:

Snapshot: my-docdb-cluster-backup-2024-01
Cluster: my-docdb-cluster
Age: 180 days (threshold: 90)
Storage: 50 GB
Monthly Savings: $1.00

AWS Reference: DocumentDB Pricing

Why MEDIUM confidence: Old snapshots probably aren't needed, but deletion is irreversible. Some may be kept for compliance.
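
The filtering steps listed above could be sketched like this. The snapshot dictionaries and their keys are hypothetical, and matching protected tags on tag keys, case-insensitively, is an assumption. The guide only says that snapshots tagged with those terms are skipped.

```python
from datetime import datetime, timedelta, timezone

PROTECTED_TAGS = {"retention", "compliance", "legal", "do-not-delete"}


def old_snapshots(snapshots, now, max_age_days=90, rate=0.02):
    """Return [(snapshot_id, monthly_savings)] for flagged manual snapshots.

    snapshots: list of dicts with keys id, cluster, created, size_gb, tags.
    """
    # Find the newest manual snapshot per cluster; it is always protected.
    newest = {}
    for snap in snapshots:
        current = newest.get(snap["cluster"])
        if current is None or snap["created"] > current["created"]:
            newest[snap["cluster"]] = snap

    cutoff = now - timedelta(days=max_age_days)
    flagged = []
    for snap in snapshots:
        if newest[snap["cluster"]] is snap:
            continue  # disaster-recovery safety net
        if PROTECTED_TAGS & {tag.lower() for tag in snap["tags"]}:
            continue  # retention or compliance tag present
        if snap["created"] < cutoff:
            flagged.append((snap["id"], round(snap["size_gb"] * rate, 2)))
    return flagged
```

Protecting the newest snapshot before applying the age filter means a cluster never loses its only manual snapshot, even if that snapshot is older than the threshold.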

Overprovisioned DocumentDB Cluster

Confidence: MEDIUM
Detection: Avg CPU < 20%, Max CPU < 40%, connections < 50, IOPS < 100 over 14 days
Savings: ~30% of cluster cost (instance downsizing)

How it works:

Fetches CloudWatch metrics for each DocumentDB cluster over 14 days
Checks four signals: average CPU, peak CPU, connection count, and combined read+write IOPS
If all metrics are below thresholds, the cluster is a rightsizing candidate
Advisory-only in v1 — no automated remediation actions are generated

Example:

Cluster: my-docdb-cluster
Instance Type: db.r5.2xlarge (2 instances)
Avg CPU: 8.2%
Max CPU: 15.0%
Connections: 12
IOPS: 30 (read: 20, write: 10)
Monthly Savings: $219.00 (30% of cluster cost)

AWS Reference: DocumentDB Pricing

Why MEDIUM confidence: Multi-signal analysis shows consistent under-utilization, but workload patterns may be bursty. Snapshot the cluster before resizing.

Idle Timestream Database

Confidence: HIGH
Detection: Zero writes for 30 days
Savings: Storage costs only (writes are pay-per-use)

How it works:

We query CloudWatch for Timestream write metrics
Databases with no writes for 30+ days are flagged
Timestream charges for memory store and magnetic store retention

Example:

Database: my-timeseries-db
Tables: 3
Writes (30 days): 0
Memory Store: 5GB
Monthly Savings: $6.25 (storage)

AWS Reference: Timestream Pricing

Why HIGH confidence: Zero writes means no new data is being ingested. Consider archiving or deleting.

Idle QLDB Ledger

Confidence: HIGH
Detection: Zero requests for 30 days
Savings: Storage and I/O costs

How it works:

We query CloudWatch for QLDB request metrics
Ledgers with zero requests for 30+ days are flagged
QLDB charges for storage and I/O operations

Example:

Ledger: my-ledger
Requests (30 days): 0
Storage: 10GB
Monthly Savings: $2.50 (storage)

AWS Reference: QLDB Pricing

Why HIGH confidence: Zero requests means no applications are using this ledger.

Aurora to RDS Downgrade Opportunity

Confidence: MEDIUM
Detection: Aurora cluster with single instance, low I/O, and no Aurora-specific features
Savings: 20–40% by switching to RDS for the same engine

How it works:

We identify Aurora clusters with a single db instance
Clusters with low I/O throughput and no read replicas don't benefit from Aurora's architecture
Standard RDS is cheaper for simple, single-instance workloads

Example:

Cluster: my-app-aurora
Engine: aurora-mysql
Instances: 1
Avg I/O: 50 IOPS
Monthly Savings: $85.00 (switch to RDS MySQL)

AWS Reference: Aurora Pricing vs RDS

Why MEDIUM confidence: Aurora provides higher availability and faster failover. Verify that the application doesn't rely on Aurora-specific features.

Oversized ElastiCache Cluster

Confidence: MEDIUM
Detection: ElastiCache cluster CPU utilization < 10% over 7 days with active connections
Savings: 40–50% from downsizing to smaller node type

How it works:

We analyze ElastiCache CPU utilization and connection metrics via CloudWatch
Clusters with consistently low CPU (< 10%) despite having connections are oversized
Downsizing to a smaller node type reduces cost proportionally

Example:

Cluster: session-cache
Node Type: cache.r6g.xlarge
Nodes: 2
Avg CPU: 6% over 7 days
Recommended: cache.r6g.large
Monthly Savings: $198.00

AWS Reference: ElastiCache Pricing

Why MEDIUM confidence: Cache performance is sensitive to memory capacity. Verify hit rates after downsizing.

Oversized Redshift Cluster

Confidence: MEDIUM
Detection: Redshift cluster CPU utilization < 40% average over 7–14 days
Savings: 30–50% from resizing to fewer or smaller nodes

How it works:

We analyze Redshift CPU utilization via CloudWatch over 7–14 days
Clusters consistently below 40% CPU are likely over-provisioned
Elastic resize or classic resize can reduce the node count

Example:

Cluster: analytics-dw
Node Type: ra3.xlplus
Nodes: 4
Avg CPU: 22% over 14 days
Recommended: 2 nodes
Monthly Savings: $1,460.00

AWS Reference: Redshift Pricing

Why MEDIUM confidence: Query workloads can be bursty. Analyze query queuing times before resizing.

Underutilized Redshift Cluster

Confidence: MEDIUM
Detection: Redshift cluster CPU < 40% with low query throughput over 14 days
Savings: 30–50% from downsizing or pausing during idle periods

How it works:

We analyze both CPU utilization and query execution metrics
Low CPU combined with low query throughput indicates genuine underutilization
Consider pausing during off-hours or resizing to a smaller cluster

Example:

Cluster: reporting-dw
Node Type: ra3.xlplus
Nodes: 3
Avg CPU: 18%, Queries/day: 45
Monthly Savings: $1,095.00

AWS Reference: Redshift Pricing

Why MEDIUM confidence: Unlike the Oversized Redshift Cluster detector, this one also incorporates query volume. It still requires workload analysis.

Redshift Without Pause Schedule

Confidence: MEDIUM
Detection: Multi-node Redshift cluster with no scheduled pause action
Savings: 30–50% from pausing during non-business hours

How it works:

We check for scheduled actions on Redshift clusters with > 2 nodes
Clusters without pause/resume schedules run 24/7 even when unused at night
Pausing during non-business hours (e.g., 8pm–8am) can halve the cost

Example:

Cluster: analytics-dw
Nodes: 4
Scheduled Actions: None
Usage Pattern: Business hours only
Monthly Savings: $1,460.00 (12hr/day pause)

AWS Reference: Redshift Pause and Resume

Why MEDIUM confidence: Some ETL jobs run overnight. Verify usage patterns before enabling pause schedules.

Redshift Legacy DC2 Nodes

Confidence: LOW
Detection: Redshift cluster running on DC2 node types
Savings: ~15–20% from migration to RA3 with managed storage

How it works:

We check the node type of Redshift clusters
DC2 nodes use local SSD storage and are a legacy architecture
RA3 nodes offer managed storage (pay separately) and are typically cheaper overall

Example:

Cluster: legacy-dw
Node Type: dc2.large
Nodes: 4
Monthly Savings: $292.00 (migration to ra3.xlplus)

AWS Reference: Redshift Node Types

Why LOW confidence: Migration requires cluster resize and potential query plan changes. Test thoroughly.

Redshift Concurrency Scaling Waste

Confidence: MEDIUM
Detection: Concurrency scaling queues with underutilized scaling capacity
Savings: Reduce unused concurrency scaling cluster costs

How it works:

We analyze Redshift concurrency scaling metrics
Persistent scaling cluster activity indicates possible queue misconfiguration
Optimizing WLM queues can reduce scaling cluster usage

Example:

Cluster: analytics-dw
Concurrency Scaling: Enabled
Avg Scaling Usage: 15% of capacity
Monthly Savings: $120.00

AWS Reference: Redshift Concurrency Scaling

Why MEDIUM confidence: Concurrency scaling is billed per-second. Savings depend on workload patterns.

Redshift Spectrum Heavy Usage

Confidence: LOW
Detection: High Redshift Spectrum (external table) query volume relative to cluster cost
Savings: Optimization via partitioning, compression, or moving data to native tables

How it works:

We monitor Spectrum query costs relative to cluster costs
High Spectrum usage may indicate data that should be loaded into native Redshift tables
Spectrum charges per TB scanned — partitioning and compression reduce scan volume

Example:

Cluster: analytics-dw
Spectrum Scans: 8 TB/month
Spectrum Cost: $40.00/month
Recommendation: Partition and compress external data
Monthly Savings: ~$20.00

AWS Reference: Redshift Spectrum Pricing

Why LOW confidence: External table architecture may be intentional. Requires understanding of data pipeline design.

Redshift WLM Over-Provisioned

Confidence: MEDIUM
Detection: WLM queues with > 3 slots and < 50% average slot utilization
Savings: ~30% compute savings from reducing slot allocation

How it works:

We analyze Workload Management queue configuration and utilization
Queues with excessive slot allocation relative to usage waste cluster resources
Reducing from 4–5 slots to 2–3 improves per-query performance and reduces overhead

Example:

Cluster: analytics-dw
Queue: etl_queue
Slots Allocated: 5
Avg Utilization: 35%
Recommended: 2 slots
Monthly Savings: $438.00

AWS Reference: Redshift WLM Configuration

Why MEDIUM confidence: WLM tuning affects query queuing behavior. Test with representative workloads.

RI Opportunity: ElastiCache

Confidence: HIGH
Detection: AWS Cost Explorer recommendation for ElastiCache Reserved Instance
Savings: 40–55% discount vs On-Demand (exact AWS calculation)

How it works:

AWS Cost Explorer analyzes 30 days of ElastiCache usage
Identifies consistent node usage suitable for RI commitment
Returns exact savings calculation for 1-year or 3-year terms

Example:

Node Type: cache.r6g.large
Current: On-Demand ($0.252/hr)
RI (1yr, No Upfront): $0.155/hr
Monthly Savings: $70.81

AWS Reference: ElastiCache Reserved Nodes

Why HIGH confidence: Uses AWS's own Cost Explorer analysis with exact pricing.
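
Cost Explorer returns the savings figure directly, but the arithmetic behind the example is easy to verify. A minimal sketch with a hypothetical function name, using the hourly rates from the example and a 730-hour month:

```python
def ri_monthly_savings(on_demand_hourly, ri_hourly, nodes=1, hours=730):
    """Monthly savings (USD) from moving nodes from On-Demand to a reserved rate."""
    return round((on_demand_hourly - ri_hourly) * hours * nodes, 2)


print(ri_monthly_savings(0.252, 0.155))
```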

RI Opportunity: Redshift

Confidence: HIGH
Detection: AWS Cost Explorer recommendation for Redshift Reserved Instance
Savings: 25–40% discount vs On-Demand (exact AWS calculation)

How it works:

AWS Cost Explorer analyzes 30 days of Redshift cluster usage
Identifies consistent node usage suitable for RI commitment
Returns exact savings calculation for 1-year or 3-year terms

Example:

Node Type: ra3.xlplus
Nodes: 4
Current: On-Demand ($1.086/hr)
RI (1yr): $0.750/hr
Monthly Savings: $981.12 (4 nodes × $0.336/hr × 730 hrs)

AWS Reference: Redshift Reserved Nodes

Why HIGH confidence: Uses AWS's own Cost Explorer analysis with exact pricing.

RDS Extended Support Cost

Confidence: HIGH
Detection: RDS engine version in or approaching extended support window
Savings: vCPU-based surcharge ($0.10–$0.20 per vCPU-hour × 730 hours)

How it works:

We check RDS engine versions against AWS lifecycle policy
Versions past standard support incur per-vCPU extended support fees
Warning is raised 90 days before a version enters extended support

Example:

Instance: mydb-production
Engine: mysql 5.7
Status: Extended support (since 2024-10)
vCPUs: 4
Monthly Surcharge: $292.00
Action: Upgrade to MySQL 8.0

AWS Reference: RDS Extended Support Pricing

Why HIGH confidence: Engine version and vCPU count are exact metadata. Surcharge rates are published by AWS.

DocumentDB Extended Support Cost

Confidence: MEDIUM
Detection: DocumentDB cluster version in extended support
Savings: ~10% surcharge on base cluster cost

How it works:

We check DocumentDB cluster versions against the AWS lifecycle policy
Versions past standard support incur extended support surcharges
Upgrading to a supported version eliminates the surcharge

Example:

Cluster: my-docdb
Engine: docdb 4.0
Status: Extended support
Monthly Surcharge: $45.00 (10% of base cost)
Action: Upgrade to DocumentDB 5.0

AWS Reference: DocumentDB Pricing

Why MEDIUM confidence: Extended support surcharge rate varies by engine version and instance type.

ElastiCache Extended Support Cost

Confidence: MEDIUM
Detection: ElastiCache engine version in extended support
Savings: ~10% surcharge on node hourly cost

How it works:

We check ElastiCache engine (Redis/Valkey) versions against the lifecycle policy
End-of-life versions incur extended support surcharges per node
Upgrading to a current version eliminates the surcharge

Example:

Cluster: session-cache
Engine: Redis 6.0 (extended support)
Nodes: 3
Monthly Surcharge: $32.85 (10% of $328.50)
Action: Upgrade to Redis 7.x

AWS Reference: ElastiCache Pricing

Why MEDIUM confidence: Surcharge rates depend on version and node type. AWS publishes specific rates.

ElastiCache Replication Waste (Non-Production)

Confidence: HIGH
Detection: Non-production cluster with unnecessary replicas
Savings: Full replica node cost (doubles or triples cluster price)

How it works:

We check ElastiCache replication groups for replica configuration
Clusters tagged with non-production environment names (dev, staging, test, sandbox) are flagged if they have replicas
Each replica is a full node at the same hourly rate as the primary

Example:

Replication Group: dev-session-cache
Environment: dev (tag)
Node Type: cache.r6g.large
Shards: 1, Replicas per Shard: 2
Replica Cost: 2 × $0.164/hr × 730 hrs = $239.44/month
Monthly Savings: $239.44

AWS Reference: ElastiCache Pricing

Why HIGH confidence: Non-production environments don't need high availability replicas. The environment tag clearly identifies the cluster as non-production.
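A minimal sketch of the tag check and replica cost described above. The dict shape and the `Environment` tag key are assumptions made for illustration; the environment names and the 730-hour month come from this section.

```python
NON_PROD_ENVS = {"dev", "staging", "test", "sandbox"}  # names listed above
HOURS_PER_MONTH = 730


def replica_waste(group, node_hourly_rate):
    """Return the monthly replica cost for a non-production group, else 0.0.

    `group` is a hypothetical dict:
    {"tags": {...}, "shards": int, "replicas_per_shard": int}
    """
    env = group.get("tags", {}).get("Environment", "").lower()
    if env not in NON_PROD_ENVS:
        return 0.0
    replicas = group["shards"] * group["replicas_per_shard"]
    # Each replica is billed as a full node at the primary's hourly rate
    return replicas * node_hourly_rate * HOURS_PER_MONTH


dev_group = {"tags": {"Environment": "dev"}, "shards": 1, "replicas_per_shard": 2}
print(f"${replica_waste(dev_group, 0.164):.2f}/month")
```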

ElastiCache Valkey Migration Savings

Confidence: HIGH
Detection: Redis OSS or Memcached cluster eligible for Valkey migration
Savings: 20% of current cluster cost (node-based) or 33% (Serverless)

How it works:

We check the engine type of each ElastiCache cluster
Redis OSS and Memcached clusters are flagged as migration candidates
Valkey is permanently 20% cheaper for node-based and 33% cheaper for Serverless, with Redis 7.x API compatibility

Example:

Cluster: session-cache
Engine: Redis OSS 7.0.7
Node Type: cache.r7g.xlarge × 2 nodes
Current Cost: $0.437/hr × 2 × 730 = $638.02/month
Valkey Cost: $0.3496/hr × 2 × 730 = $510.42/month
Monthly Savings: $127.60 (20%)

AWS Reference: ElastiCache Pricing

Why HIGH confidence: The 20% discount is a permanent pricing structure. Valkey is API-compatible with Redis 7.x. Existing Redis OSS reservations apply to Valkey.
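The example above reduces to a fixed percentage of the current bill. A small sketch, assuming the 20% and 33% discounts stated in this section (the function name is illustrative):

```python
HOURS_PER_MONTH = 730
VALKEY_NODE_DISCOUNT = 0.20        # node-based discount stated above
VALKEY_SERVERLESS_DISCOUNT = 0.33  # Serverless discount stated above


def valkey_savings(current_monthly_cost, serverless=False):
    """Return (valkey_monthly_cost, monthly_savings) for a migration candidate."""
    discount = VALKEY_SERVERLESS_DISCOUNT if serverless else VALKEY_NODE_DISCOUNT
    savings = current_monthly_cost * discount
    return current_monthly_cost - savings, savings


# 2 x cache.r7g.xlarge at the hourly rate used in the example above
current = 0.437 * 2 * HOURS_PER_MONTH
valkey_cost, savings = valkey_savings(current)
print(f"current ${current:.2f}, valkey ${valkey_cost:.2f}, savings ${savings:.2f}")
```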

ElastiCache Serverless Optimization

Confidence: MEDIUM
Detection: Node-based cluster with spiky traffic pattern
Savings: Varies (eliminates over-provisioned capacity during low-traffic periods)

How it works:

We analyze CPU utilization and connection metrics via CloudWatch over 30 days
Clusters with high traffic variance (low average, high peaks) are flagged
Serverless eliminates paying for idle capacity and auto-scales for peaks

Example:

Cluster: api-cache
Node Type: cache.r6g.xlarge × 2 nodes
Avg CPU: 8.5% (over-provisioned)
Peak CPU: 72% (needs capacity for bursts)
Connection CV: 3.2 (highly variable)
Current Cost: $478.00/month
Estimated Serverless: $180.00/month
Monthly Savings: ~$298.00

AWS Reference: ElastiCache Serverless Pricing

Why MEDIUM confidence: Serverless cost depends heavily on actual ECPU consumption, which varies by workload. Estimate is conservative. Not available for Memcached.
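The "Connection CV" in the example is a coefficient of variation (standard deviation divided by mean). The sketch below shows one way to compute it and combine it with average CPU; the thresholds `max_avg_cpu` and `min_cv` are assumptions for illustration, not CloudWise's actual settings.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean (0.0 for an all-zero series)."""
    avg = mean(samples)
    return pstdev(samples) / avg if avg else 0.0


def is_serverless_candidate(cpu_samples, conn_samples, max_avg_cpu=20.0, min_cv=1.0):
    """Flag clusters with low average CPU but highly variable connection counts."""
    low_average = mean(cpu_samples) < max_avg_cpu
    spiky = coefficient_of_variation(conn_samples) >= min_cv
    return low_average and spiky


# Made-up samples: mostly idle with one burst
cpu = [5, 6, 70, 4, 5]
connections = [0, 0, 0, 0, 10]
print(coefficient_of_variation(connections), is_serverless_candidate(cpu, connections))
```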

ElastiCache Data Tiering Opportunity

Confidence: MEDIUM
Detection: Memory-only cluster eligible for R6gd data tiering
Savings: Up to 52% for large datasets (fewer nodes needed)

How it works:

We identify clusters using memory-optimized R5/R6g/R7g nodes at xlarge size or larger
R6gd nodes combine memory + NVMe SSD with ~5× total storage capacity
Multi-node clusters can often consolidate to fewer R6gd nodes

Example:

Cluster: large-dataset-cache
Current: 4 × cache.r6g.16xlarge ($5.254/hr each)
Current Cost: $15,342/month
With Data Tiering: 1 × cache.r6gd.16xlarge ($9.98/hr)
Data Tiering Cost: $7,285/month
Monthly Savings: $8,057 (52%)

AWS Reference: ElastiCache Data Tiering

Why MEDIUM confidence: Savings depend on data access patterns. Best for workloads where < 20% of data is accessed frequently. SSD-resident data has slightly higher first-access latency. Not available with Serverless.

OpenSearch Extended Support Cost

Confidence: MEDIUM
Detection: OpenSearch/Elasticsearch version in extended support
Savings: ~10% surcharge on domain hourly cost

How it works:

We check OpenSearch domain engine versions against the lifecycle policy
Legacy Elasticsearch versions (5.x, 6.x) and older OpenSearch versions may incur surcharges
Upgrading to a current OpenSearch version eliminates the surcharge

Example:

Domain: search-logs
Engine: Elasticsearch 7.10 (extended support)
Instances: 3 × r6g.large.search
Monthly Surcharge: $56.00
Action: Upgrade to OpenSearch 2.x

AWS Reference: OpenSearch Pricing

Why MEDIUM confidence: Extended support for OpenSearch is versioned and may vary by instance type.

Network Detectors

Unattached Elastic IP

Confidence: HIGH
Detection: describe_addresses where no InstanceId or NetworkInterfaceId
Savings: $3.65/month ($0.005/hour × 730 hours)

How it works:

Since February 2024, AWS bills every public IPv4 address at $0.005/hour, whether or not it is attached
An unattached EIP incurs that charge while serving no workload
This is a fixed AWS price, no estimation

Example:

Elastic IP: 52.123.45.67
Attached: No
Monthly Savings: $3.65

AWS Reference: VPC Pricing (Public IPv4)

Why HIGH confidence: Binary detection (attached or not) with fixed AWS pricing.
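The binary check can be sketched as a pure function over a response shaped like EC2 `describe_addresses` output. The sample data below is made up, and the function name is illustrative.

```python
EIP_HOURLY = 0.005
HOURS_PER_MONTH = 730


def unattached_eips(describe_addresses_response):
    """Return addresses that have neither an InstanceId nor a NetworkInterfaceId."""
    return [
        addr
        for addr in describe_addresses_response.get("Addresses", [])
        if not addr.get("InstanceId") and not addr.get("NetworkInterfaceId")
    ]


# Sample shaped like the DescribeAddresses response (values are made up)
sample = {
    "Addresses": [
        {"PublicIp": "52.123.45.67", "AllocationId": "eipalloc-aaa"},
        {"PublicIp": "52.123.45.68", "AllocationId": "eipalloc-bbb", "InstanceId": "i-0123"},
    ]
}
idle = unattached_eips(sample)
monthly = len(idle) * EIP_HOURLY * HOURS_PER_MONTH
print([a["PublicIp"] for a in idle], f"${monthly:.2f}/month")
```

In a live account the same function could be fed the result of a boto3 EC2 client's `describe_addresses()` call.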

Idle NAT Gateway

Confidence: HIGH
Detection: CloudWatch BytesOutToDestination = 0 for 7 days
Savings: $32.40/month (base charge only, $0.045/hour × 720 hours)

How it works:

NAT Gateways have a fixed hourly charge of $0.045/hour
We check CloudWatch for any traffic over 7 days
Zero traffic = wasting the base charge

Example:

NAT Gateway: nat-0abc123def456
Traffic (7 days): 0 bytes
Monthly Savings: $32.40

AWS Reference: VPC Pricing (NAT Gateway)

Why HIGH confidence: CloudWatch confirms zero usage. Fixed AWS pricing.

Idle Load Balancer

Confidence: HIGH
Detection: No healthy targets via describe_target_health
Savings: ~$16.20/month (ALB/NLB base charge)

How it works:

ALB/NLB have a base charge of ~$0.0225/hour
If there are no healthy targets, the load balancer isn't serving traffic
You're paying for infrastructure that does nothing

Example:

Load Balancer: my-unused-alb
Healthy Targets: 0
Monthly Savings: $16.20

AWS Reference: Elastic Load Balancing Pricing

Why HIGH confidence: No healthy targets = definitely not serving requests.

Low-Traffic Load Balancer

Confidence: MEDIUM
Detection: <100 requests in 14 days with healthy targets
Savings: ~$16.20/month (ALB/NLB base charge)

How it works:

Checks ALBs/NLBs that have registered healthy targets
Queries CloudWatch RequestCount over 14 days
Flags if total requests < 100 (approximately <7/day)
Excludes LBs with zero targets (caught by idle_load_balancer)

Example:

Load Balancer: staging-api-alb (application)
Healthy Targets: 3
Requests (14 days): 42
Monthly Savings: $16.20

AWS Reference: Elastic Load Balancing Pricing

Why MEDIUM confidence: Low traffic may be intentional (internal health checks, canary endpoints). Review before deleting.
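The split between this detector and the idle load balancer detector can be sketched as a single classification step. `idle_load_balancer` is the identifier used above; `low_traffic_load_balancer` is a hypothetical name for this detector.

```python
def classify_load_balancer(healthy_targets, requests_14d, threshold=100):
    """Return (detector_id, confidence), or (None, None) if nothing is flagged."""
    if healthy_targets == 0:
        # No healthy targets: handled by the idle detector, not this one
        return "idle_load_balancer", "HIGH"
    if requests_14d < threshold:
        return "low_traffic_load_balancer", "MEDIUM"
    return None, None


print(classify_load_balancer(healthy_targets=3, requests_14d=42))
```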

High LCU Cost ALB

Confidence: LOW
Detection: Average ConsumedLCUs cost exceeds 2× ALB base fee
Savings: ~30% of estimated monthly LCU cost (conservative)

How it works:

Queries CloudWatch ConsumedLCUs metric over 7 days
Calculates estimated monthly LCU cost: avg_lcus × $0.008 × 730 hours
Flags if LCU cost > 2× base cost ($32.86, i.e. 2 × $16.43)
Suggests reviewing NLB migration or architecture changes

Example:

Load Balancer: prod-websocket-alb (application)
Average LCUs: 12.4
Estimated LCU Cost: $72.41/month
Base Fee: $16.43/month
Total: $88.84/month
Savings Potential: ~$21.72/month (30% of LCU cost)

AWS Reference: ALB Pricing — LCU Details

Why LOW confidence: High LCU usage may be legitimate for the workload. This is an advisory detector — the team should review whether the architecture is optimal.
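A sketch of the LCU calculation, using 730 hours throughout as in the formula above. The function name and result keys are illustrative; the rates are the ones quoted in this section.

```python
LCU_HOURLY = 0.008        # per-LCU hourly rate from the formula above
ALB_BASE_HOURLY = 0.0225  # ALB base hourly rate
HOURS_PER_MONTH = 730


def lcu_finding(avg_lcus):
    """Estimate monthly LCU cost and flag it when it exceeds 2x the base fee."""
    lcu_cost = avg_lcus * LCU_HOURLY * HOURS_PER_MONTH
    base_cost = ALB_BASE_HOURLY * HOURS_PER_MONTH
    flagged = lcu_cost > 2 * base_cost
    return {
        "lcu_cost": lcu_cost,
        "base_cost": base_cost,
        "flagged": flagged,
        # Conservative 30% of the LCU cost, only when flagged
        "savings_potential": 0.30 * lcu_cost if flagged else 0.0,
    }


result = lcu_finding(12.4)
print(result["flagged"], f"${result['savings_potential']:.2f}/month")
```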

Classic Load Balancer Migration

Confidence: LOW
Detection: Load balancer type is classic
Savings: ~$1.82/month base ($18.25 − $16.43) + consolidation savings

How it works:

Detects any Classic Load Balancer (CLB)
Recommends migration to ALB (HTTP/HTTPS) or NLB (TCP/TLS)
CLBs cost $0.025/hr ($18.25/mo) vs ALBs at $0.0225/hr ($16.43/mo)
Consolidating multiple CLBs into one ALB with routing rules saves significantly more

Example:

Load Balancer: legacy-web-clb (classic)
Scheme: internet-facing
Migration Target: ALB (for HTTP/HTTPS workloads)
Monthly Savings: $1.82 (base) + consolidation potential

AWS Reference: Classic Load Balancer Migration Guide

Why LOW confidence: Migration requires testing. This is a housekeeping recommendation, similar to deprecated runtime detection.

Unused CloudFront Distribution

Confidence: HIGH
Detection: Zero requests for 30 days
Savings: Minimal base cost, prevents future data transfer charges

How it works:

We query CloudWatch for CloudFront request metrics
Distributions with zero requests for 30+ days are flagged
CloudFront has minimal costs when idle, but active distributions incur data transfer fees

Example:

Distribution: E1ABCDEFGH2IJK
Origin: my-old-bucket.s3.amazonaws.com
Requests (30 days): 0
Monthly Savings: ~$0.01 (housekeeping)

AWS Reference: CloudFront Pricing

Why HIGH confidence: Zero requests means no users are accessing content through this CDN.

Unused Route53 Hosted Zone

Confidence: HIGH
Detection: Hosted zone with only NS and SOA records
Savings: $0.50/month per zone

How it works:

We list all Route53 hosted zones
Zones with only default NS and SOA records (no custom DNS records) are flagged
Empty zones still incur the $0.50/month hosting fee

Example:

Hosted Zone: old-domain.example.com
Records: 2 (NS, SOA only)
Monthly Savings: $0.50

AWS Reference: Route53 Pricing

Why HIGH confidence: Binary detection. Zone either has custom records or it doesn't.
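A sketch of the empty-zone check over records shaped like the `ResourceRecordSets` list returned by Route 53's `ListResourceRecordSets`. Restricting the match to apex records is an assumption added here so that delegated subdomain NS records count as custom records.

```python
DEFAULT_TYPES = {"NS", "SOA"}


def is_empty_zone(zone_name, record_sets):
    """True if the zone holds only the default NS and SOA records at its apex."""
    apex = zone_name.rstrip(".") + "."  # Route 53 returns fully qualified names
    return all(
        rs["Type"] in DEFAULT_TYPES and rs["Name"] == apex
        for rs in record_sets
    )


# Made-up record sets for illustration
records = [
    {"Name": "old-domain.example.com.", "Type": "NS"},
    {"Name": "old-domain.example.com.", "Type": "SOA"},
]
print(is_empty_zone("old-domain.example.com", records))
```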

Unused Global Accelerator

Confidence: HIGH
Detection: Zero traffic for 30 days
Savings: ~$18.25/month (base hourly charge, $0.025/hour × 730 hours)

How it works:

We query CloudWatch for accelerator traffic metrics
Accelerators with zero traffic for 30+ days are flagged
Global Accelerator charges $0.025/hour regardless of traffic

Example:

Accelerator: arn:aws:globalaccelerator::123456789:accelerator/abc123
Traffic (30 days): 0
Monthly Savings: $18.25

AWS Reference: Global Accelerator Pricing

Why HIGH confidence: Zero traffic means no applications are using this accelerator.

Idle Global Accelerator

Confidence: HIGH
Detection: Deployed accelerator with endpoints but zero processed bytes for 30 days
Savings: ~$25.55/month (accelerator + 2 IPv4 addresses)

How it works:

We enumerate all Global Accelerators and their listeners/endpoint groups
For accelerators with endpoints, we check CloudWatch ProcessedBytesIn and ProcessedBytesOut
Accelerators with zero traffic for 30+ days despite having endpoints are flagged

Example:

Accelerator: arn:aws:globalaccelerator::123456789:accelerator/idle-abc
Endpoints: 3 (across 2 listeners)
ProcessedBytesIn (30d): 0
ProcessedBytesOut (30d): 0
Monthly Savings: $25.55

AWS Reference: Global Accelerator Pricing

Why HIGH confidence: Zero traffic with active endpoints indicates the accelerator is not used in any request path.

Disabled Global Accelerator

Confidence: HIGH
Detection: Accelerator with Enabled=false still incurring fixed hourly charges
Savings: ~$25.55/month (accelerator + 2 IPv4 addresses)

How it works:

We check the Enabled flag on each deployed Global Accelerator
Disabled accelerators continue to reserve static anycast IPs and incur hourly charges
Accelerators that are disabled but not deleted are flagged

Example:

Accelerator: arn:aws:globalaccelerator::123456789:accelerator/disabled-xyz
Status: DEPLOYED
Enabled: false
IP Addresses: [75.2.x.x, 99.83.x.x]
Monthly Savings: $25.55

AWS Reference: Global Accelerator Pricing

Why HIGH confidence: A disabled accelerator cannot route traffic and serves no purpose while still incurring charges.

Orphaned DNS Record

Confidence: MEDIUM
Detection: Route 53 DNS record pointing to a resource that no longer exists
Savings: Minimal (Route 53 bills $0.50/month per hosted zone, not per record; this is a cleanup finding)

How it works:

We scan Route 53 hosted zones for alias and CNAME records
Target resources (EC2, ELB, CloudFront) are verified against current inventory
Records pointing to deregistered or terminated resources are flagged

Example:

Hosted Zone: example.com
Record: api.example.com (CNAME)
Target: old-lb-123.us-east-1.elb.amazonaws.com (DELETED)
Action: Remove or update DNS record

AWS Reference: Route 53 Pricing

Why MEDIUM confidence: Orphaned DNS records may be waiting for a replacement resource. Verify with the application team. Also a security concern (dangling DNS can be hijacked).

Unused VPC Endpoint

Confidence: HIGH
Detection: VPC endpoint with zero data processed in 30 days
Savings: Interface endpoint — $0.01/GB + $0.01/hour/AZ (~$7.30/month/AZ)

How it works:

We check VPC endpoint data processing metrics over 30 days
Interface endpoints with zero bytes processed are unused
Gateway endpoints (S3, DynamoDB) are free — only Interface endpoints incur charges

Example:

Endpoint: vpce-0abc123def456
Service: com.amazonaws.us-east-1.sqs
Type: Interface
AZs: 3
Data Processed (30d): 0 bytes
Monthly Savings: $21.90 ($7.30/AZ × 3)

AWS Reference: VPC Endpoint Pricing

Why HIGH confidence: Zero data processed definitively indicates no traffic through this endpoint.

Serverless Detectors

Unused Lambda Functions

Confidence: HIGH (detection), but LOW savings
Detection: CloudWatch Invocations = 0 for 30 days
Savings: Minimal (~$0.01-$0.08/month for storage)

Important Note: Lambda is pay-per-use. Unused functions cost essentially $0 for execution. We flag these for cleanup purposes, not cost savings.

Example:

Function: my-old-function
Invocations (30 days): 0
Monthly Savings: $0.01 (storage only)
Purpose: Housekeeping

AWS Reference: Lambda Pricing

Why flagged: Unused functions clutter your account and may contain outdated code or security issues.

Deprecated Lambda Runtime

Confidence: LOW
Detection: Function running on a deprecated or end-of-life runtime
Savings: Minimal ($0.01 — housekeeping flag)

Important Note: This is a compliance and security detector, not a cost-savings detector. Deprecated runtimes may lose security patches and eventually become unsupported.

Deprecated Runtimes Detected:

python3.8 — EOL October 2024
python3.7 — EOL November 2023
python2.7 — EOL July 2022
nodejs16.x — EOL June 2024
nodejs14.x — EOL November 2023
nodejs12.x — EOL March 2023
dotnet6 — EOL February 2025
ruby2.7 — EOL December 2023
java8 — EOL December 2023

Example:

Function: my-legacy-handler
Runtime: python3.7
Status: EOL since November 2023
Monthly Savings: $0.01 (housekeeping)
Recommendation: Upgrade to python3.12 or later

AWS Reference: Lambda Runtime Support Policy

Why LOW confidence: Runtime upgrades may require code changes. Manual testing is essential before upgrading.
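The runtime check amounts to a table lookup. The sketch below copies the table from this section; the function name is illustrative.

```python
# Runtime identifiers and EOL dates as listed in this section
DEPRECATED_RUNTIMES = {
    "python3.8": "October 2024",
    "python3.7": "November 2023",
    "python2.7": "July 2022",
    "nodejs16.x": "June 2024",
    "nodejs14.x": "November 2023",
    "nodejs12.x": "March 2023",
    "dotnet6": "February 2025",
    "ruby2.7": "December 2023",
    "java8": "December 2023",
}


def runtime_eol(runtime):
    """Return the listed EOL date for a deprecated runtime, or None if not listed."""
    return DEPRECATED_RUNTIMES.get(runtime)


for name in ("python3.7", "python3.12"):
    eol = runtime_eol(name)
    print(name, f"EOL since {eol}" if eol else "not flagged")
```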

Unused API Gateway

Confidence: HIGH (detection), but LOW savings
Detection: CloudWatch Count = 0 for 30 days
Savings: Minimal ($0.01) unless caching is enabled

Important Note: API Gateway is pay-per-request. Unused APIs cost $0 for requests. Only caching adds fixed costs (~$14/month for 0.5GB cache).

AWS Reference: API Gateway Pricing

Unused AppSync API

Confidence: HIGH (detection), but LOW savings
Detection: Zero GraphQL queries for 30 days
Savings: Minimal (AppSync is pay-per-request)

How it works:

We query CloudWatch for AppSync request metrics
APIs with zero queries for 30+ days are flagged
AppSync charges per query/mutation, so unused APIs cost $0

Example:

API: my-graphql-api
Queries (30 days): 0
Monthly Savings: $0.00
Purpose: Housekeeping

AWS Reference: AppSync Pricing

Why flagged: Unused APIs clutter your account and may have unnecessary permissions.

Idle AppSync Cache

Confidence: MEDIUM
Detection: Cache with < 100 hits in 14 days
Savings: $32–$4,878/month depending on cache size

How it works:

We enumerate all AppSync APIs and check for attached caches via GetApiCache
For each cache, we query CloudWatch CacheHitCount and CacheMissCount over 14 days
Caches with fewer than 100 total hits are flagged as idle
Monthly savings are calculated from the cache instance type's hourly rate

Cache pricing (hourly):

Type	$/hr
SMALL (t2.small)	$0.044
MEDIUM (r4.large)	$0.182
LARGE (r4.xlarge)	$0.365
XLARGE (r4.2xlarge)	$0.730
LARGE_2X (r4.4xlarge)	$1.461
LARGE_4X (r4.8xlarge)	$2.921
LARGE_8X	$4.339
LARGE_12X	$6.775

Example:

API: my-graphql-api
Cache Type: MEDIUM (r4.large)
Cache Hits (14 days): 12
Monthly Savings: $131.04

AWS Reference: AppSync Caching

Why flagged: AppSync caches incur hourly charges regardless of usage. An idle cache provides no performance benefit while adding cost.
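A sketch of the savings lookup using the hourly rates from the table above. The 720-hour month is inferred from the $131.04 example figure, and the function name is illustrative.

```python
# Hourly rates copied from the cache pricing table above
CACHE_HOURLY = {
    "SMALL": 0.044,
    "MEDIUM": 0.182,
    "LARGE": 0.365,
    "XLARGE": 0.730,
    "LARGE_2X": 1.461,
    "LARGE_4X": 2.921,
    "LARGE_8X": 4.339,
    "LARGE_12X": 6.775,
}
HOURS_PER_MONTH = 720  # inferred from the $131.04 example


def idle_cache_savings(cache_type, hits_14d, threshold=100):
    """Monthly cost of a cache with fewer than `threshold` hits, else 0.0."""
    if hits_14d >= threshold:
        return 0.0
    return CACHE_HOURLY[cache_type] * HOURS_PER_MONTH


print(f"${idle_cache_savings('MEDIUM', 12):.2f}/month")
```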

Idle AppSync Subscriptions

Confidence: MEDIUM
Detection: Active WebSocket connections with < 100 GraphQL requests in 14 days
Savings: Estimated from connection-minutes ($0.08 per million connection-minutes)

How it works:

We check CloudWatch ConnectSuccess metric to identify APIs with active WebSocket connections
For those APIs, we check Latency (SampleCount) to see if actual GraphQL operations are occurring
APIs with connections but fewer than 100 requests are flagged
Monthly cost is estimated from ActiveConnections average × connection-minutes

Example:

API: my-realtime-api
Active Connections (avg): 150
GraphQL Requests (14 days): 23
Estimated Monthly Cost: $0.53 (150 connections × 43,800 minutes × $0.08 per million connection-minutes)

AWS Reference: AppSync Real-time Subscriptions

Why flagged: WebSocket connections that rarely receive data suggest abandoned or misconfigured subscription clients, wasting connection-minute charges.

Idle Step Functions State Machine

Confidence: HIGH (detection), but LOW savings
Detection: Zero executions for 30 days
Savings: Minimal (Step Functions is pay-per-transition)

How it works:

We query CloudWatch for Step Functions execution metrics
State machines with zero executions for 30+ days are flagged
Step Functions charges per state transition, so idle machines cost $0

Example:

State Machine: my-workflow
Executions (30 days): 0
Monthly Savings: $0.00
Purpose: Housekeeping

AWS Reference: Step Functions Pricing

Why flagged: Unused workflows may be obsolete or indicate broken integrations.

Step Functions Retry Storm

Confidence: HIGH
Detection: Retry ratio > 25% AND failure rate > 20% over 14 days
Savings: Estimated from avoidable transition volume

How it works:

We collect CloudWatch metrics: ExecutionsFailed, ExecutionsTimedOut, StateTransition
We estimate the proportion of transitions wasted on retry/failure paths
State machines exceeding both thresholds are flagged

Example:

State Machine: order-processor
Failure Rate: 35%
Retry Ratio: 28% of transitions
Estimated Wasted Transitions: 14,000/month
Estimated Savings: $0.35/month

AWS Reference: Step Functions Pricing

Why flagged: Retry loops multiply transitions and downstream compute invocations without business value.
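A sketch of the two-threshold check and the savings estimate. The $0.025 per 1,000 transitions rate is an assumption (the Standard workflow rate that reproduces the example figure); the function name is illustrative.

```python
PRICE_PER_1000_TRANSITIONS = 0.025  # assumed Standard workflow rate


def retry_storm(retry_ratio, failure_rate, wasted_transitions_per_month):
    """Return (flagged, estimated_monthly_savings); both thresholds must be exceeded."""
    flagged = retry_ratio > 0.25 and failure_rate > 0.20
    if not flagged:
        return False, 0.0
    savings = wasted_transitions_per_month / 1000 * PRICE_PER_1000_TRANSITIONS
    return True, savings


flagged, savings = retry_storm(retry_ratio=0.28, failure_rate=0.35,
                               wasted_transitions_per_month=14_000)
print(flagged, f"${savings:.2f}/month")
```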

High Transition Density Workflow

Confidence: MEDIUM
Detection: Average transitions per successful execution > 50 (configurable)
Savings: Estimated from reducible transitions

How it works:

We compute average transitions per successful execution from CloudWatch
Workflows with excessive state granularity are flagged
Conservative 30% reduction estimate applied for savings

Example:

State Machine: data-pipeline
Avg Transitions per Success: 85
Monthly Executions: 2,000
Estimated Savings: $1.28/month

AWS Reference: Step Functions Pricing

Why flagged: Excessive state granularity inflates per-execution cost. Collapsing pass states and simplifying branching reduces transitions.

Express Workflow Duration Waste

Confidence: MEDIUM
Detection: Express workflow with p95 duration > 30s and high execution volume
Savings: Estimated from reducible duration component

How it works:

We identify Express workflows via describe-state-machine
We collect ExecutionTime CloudWatch metrics (Average, Maximum)
Workflows with persistently high duration and sufficient volume are flagged

Example:

State Machine: realtime-processor (EXPRESS)
p95 Duration: 45,000ms
Monthly Executions: 15,000
Estimated Savings: $0.50/month

AWS Reference: Step Functions Pricing

Why flagged: Express billing includes per-request + duration charges. Unnecessary waits and payload overhead inflate duration costs.

Idle Transfer Family Server

Confidence: HIGH
Detection: Zero file transfers for 30 days
Savings: $0.30/hour (~$216/month) base charge

How it works:

We query CloudWatch for Transfer Family file operation metrics
Servers with zero transfers for 30+ days are flagged
Transfer Family charges $0.30/hour regardless of activity

Example:

Server: s-0abc123def456
Protocol: SFTP
File Transfers (30 days): 0
Monthly Savings: $216.00

AWS Reference: Transfer Family Pricing

Why HIGH confidence: Zero file transfers means no clients are using this SFTP/FTPS server.

Idle Transfer Family Server — Zero Activity

Confidence: HIGH
Detection: Server has configured users but zero file transfers for 30 days
Savings: $216/month per protocol ($0.30/hr × 720 hrs)

How it works:

We query Transfer Family for servers with configured users
We check CloudWatch FilesIn/FilesOut metrics for the last 30 days
Servers with users but zero transfers are flagged — the most common Transfer waste pattern

Example:

Server: s-0abc123def456
Users: 3
Protocols: SFTP
File Transfers (30 days): 0
Monthly Savings: $216.00

AWS Reference: Transfer Family Pricing

Why HIGH confidence: Configured users with zero transfers for 30+ days strongly indicates the server is abandoned.

Unused Protocol on Transfer Family Server

Confidence: MEDIUM
Detection: Multi-protocol server with one or more protocols showing zero transfers for 30 days
Savings: $216/month per unused protocol

How it works:

We identify servers with multiple protocols enabled (SFTP, FTPS, FTP, AS2)
We check CloudWatch metrics per protocol for the last 30 days
Protocols with zero transfers are flagged — each unused protocol costs $216/month

Example:

Server: s-0abc123def456
Protocols: SFTP, FTPS
Active: SFTP (142 transfers)
Unused: FTPS (0 transfers)
Monthly Savings: $216.00

AWS Reference: Transfer Family Pricing

Why MEDIUM confidence: Removing a protocol is reversible but could break clients configured for that protocol.
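A sketch of the per-protocol check, assuming transfer counts have already been collected into a dict keyed by protocol name (a hypothetical input shape). The rate and 720-hour month are the ones used in this section.

```python
PROTOCOL_HOURLY = 0.30
HOURS_PER_MONTH = 720  # month length used for Transfer Family in this guide


def unused_protocols(transfers_by_protocol):
    """Return (unused_protocol_names, monthly_savings) for a multi-protocol server."""
    if len(transfers_by_protocol) < 2:
        # Single-protocol servers are covered by the idle server detectors
        return [], 0.0
    unused = sorted(p for p, count in transfers_by_protocol.items() if count == 0)
    return unused, len(unused) * PROTOCOL_HOURLY * HOURS_PER_MONTH


unused, savings = unused_protocols({"SFTP": 142, "FTPS": 0})
print(unused, f"${savings:.2f}/month")
```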

Idle Transfer Family Web App

Confidence: HIGH
Detection: Web App with zero active sessions for 30 days
Savings: $360/month per provisioned unit ($0.50/hr × 720 hrs)

How it works:

We query Transfer Family Web Apps and check provisioned units
We check CloudWatch ActiveSessions metric for the last 30 days
Web Apps with zero sessions are flagged — they run continuously once created

Example:

Web App: webapp-0abc123def
Provisioned Units: 2
Active Sessions (30 days): 0
Monthly Savings: $720.00

AWS Reference: Transfer Family Pricing

Why HIGH confidence: Zero sessions for 30+ days means no users are accessing the Web App.

Analytics Detectors

Old Glue Job

Confidence: MEDIUM
Detection: AWS Glue job not run in 90+ days or never run (if created 30+ days ago)
Savings: Minimal (Glue jobs are pay-per-use)

How it works:

We list all Glue jobs and check their run history
Jobs that have never run (and were created 30+ days ago) are flagged
Jobs that haven't run in 90+ days are also flagged

Example:

Job: etl-pipeline-v1
Last Run: 95 days ago (or never)
Monthly Savings: $0.00 (housekeeping)
Purpose: Cleanup unused ETL jobs

AWS Reference: Glue Pricing

Why MEDIUM confidence: Jobs may be scheduled for future use or disaster recovery.
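The two age rules can be sketched as follows. The function and parameter names are illustrative; timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone


def is_old_glue_job(created_on, last_run_on, now=None):
    """Flag jobs never run (created 30+ days ago) or not run in 90+ days."""
    now = now or datetime.now(timezone.utc)
    if last_run_on is None:
        return now - created_on >= timedelta(days=30)
    return now - last_run_on >= timedelta(days=90)


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
created = now - timedelta(days=400)
print(is_old_glue_job(created, now - timedelta(days=95), now=now))
```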

Idle Glue Crawler

Confidence: MEDIUM
Detection: AWS Glue crawler not run in 90+ days or never run (if created 30+ days ago)
Savings: Minimal (Glue crawlers are pay-per-use)

How it works:

We list all Glue crawlers and check their last run time
Crawlers that haven't run in 90+ days are flagged
Crawlers that have never run (and were created 30+ days ago) are also flagged

Example:

Crawler: data-catalog-crawler
Last Run: 120 days ago
Monthly Savings: $0.00 (housekeeping)
Purpose: Cleanup unused data catalog crawlers

AWS Reference: Glue Pricing

Why MEDIUM confidence: Crawlers may be scheduled periodically or kept for future use.

Idle EMR Cluster

Confidence: HIGH
Detection: Running cluster with zero steps for extended period
Savings: Full cluster cost

How it works:

We identify EMR clusters in RUNNING or WAITING state
Clusters with no active or pending steps are flagged
EMR charges for all EC2 instances in the cluster

Example:

Cluster: j-0ABC123DEF456
State: WAITING
Active Steps: 0
Instance Count: 5 (1 master, 4 core)
Monthly Savings: $730.00

AWS Reference: EMR Pricing

Why HIGH confidence: Clusters in WAITING state with no steps are consuming resources without processing data.

Idle Kinesis Stream

Confidence: HIGH
Detection: Zero records for 14 days
Savings: Shard hours ($0.015/shard-hour)

How it works:

We query CloudWatch for IncomingRecords metrics
Streams with zero records for 14+ days are flagged
Kinesis charges per shard-hour regardless of activity

Example:

Stream: my-data-stream
Shards: 4
Incoming Records (14 days): 0
Monthly Savings: $43.80

AWS Reference: Kinesis Data Streams Pricing

Why HIGH confidence: Zero incoming records means no data is being streamed.

Idle MSK Cluster

Confidence: HIGH
Detection: Zero messages for 7 days
Savings: Full cluster cost (instance-type-aware)

How it works:

We query CloudWatch for Kafka message metrics (MessagesInPerSec, BytesInPerSec)
Clusters with zero messages and zero bytes for 7+ days are flagged
MSK charges for broker instance hours and storage

Example:

Cluster: my-kafka-cluster
Brokers: 3 (kafka.m5.large)
Messages (7 days): 0
Monthly Savings: $459.90

AWS Reference: MSK Pricing

Why HIGH confidence: Zero messages means no producers or consumers are using this Kafka cluster.

Oversized MSK Cluster

Confidence: MEDIUM
Detection: Broker CPU < 20% AND network throughput < 50% of instance capacity for 7 days (cluster is not idle)
Savings: Recommendation only (depends on target instance type)

How it works:

We query CloudWatch for CpuUser, BytesInPerSec, and BytesOutPerSec metrics across all brokers
Network utilization is calculated as (BytesIn + BytesOut) / instance network capacity — each instance type has a known throughput ceiling
If both average CPU is below 20% AND network throughput is below 50% of instance capacity over 7 days, the cluster may be over-provisioned
We flag the cluster for manual review — MSK is network-bound, so we check both signals to avoid false positives

Example:

Cluster: my-kafka-cluster
Brokers: 3 (kafka.m5.2xlarge)
Average CPU: 8.2%
Network Utilization: 4.3%
Messages/sec: 150
Current Cost: $1,839/month
Action: Consider downsizing to kafka.m5.large ($460/month)

AWS Reference: MSK Instance Types

Why MEDIUM confidence: MSK is network-bound, not CPU-bound. Low CPU alone doesn't indicate over-provisioning — a cluster saturating its network at 10% CPU is properly sized. CloudWise requires both low CPU and low network utilization before flagging. Review consumer lag, partition count, and storage throughput before downsizing.
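A sketch of the dual-signal check. The capacity figure in the usage lines is a made-up placeholder, not a published limit for any broker type; the 20% and 50% thresholds are the ones stated above.

```python
def network_utilization(bytes_in_per_sec, bytes_out_per_sec, capacity_bytes_per_sec):
    """(BytesIn + BytesOut) divided by the instance's network capacity."""
    return (bytes_in_per_sec + bytes_out_per_sec) / capacity_bytes_per_sec


def is_oversized(avg_cpu_pct, net_util, cpu_threshold=20.0, net_threshold=0.50):
    """Flag only when both CPU and network utilization are low."""
    return avg_cpu_pct < cpu_threshold and net_util < net_threshold


# Hypothetical throughput and capacity values for illustration
util = network_utilization(30e6, 13e6, capacity_bytes_per_sec=1000e6)
print(f"{util:.1%}", is_oversized(8.2, util))
```

A cluster at 10% CPU but 95% network utilization is not flagged, which is the false positive the second signal is meant to prevent.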

Idle Glue Dev Endpoint

Confidence: HIGH
Detection: Dev endpoint in READY state with no recent activity
Savings: DPU-hour charges (~$0.44/DPU-hour)

How it works:

We identify Glue development endpoints in READY state
Endpoints running without recent notebook activity are flagged
Dev endpoints charge continuously while in READY state

Example:

Endpoint: my-dev-endpoint
Status: READY
DPUs: 5
Last Activity: 7+ days ago
Monthly Savings: $1,584.00 (5 DPUs × $0.44 × 720 hrs)

AWS Reference: Glue Pricing

Why HIGH confidence: Dev endpoints in READY state charge continuously. They should be terminated when not in use.

Oversized Glue Job

Confidence: MEDIUM
Detection: JVM heap utilization < 30% average over 14 days OR short execution with high DPU allocation
Savings: Proportional to excess DPUs × average run duration × frequency

How it works:

We examine each Glue ETL job's DPU allocation (MaxCapacity or NumberOfWorkers)
We query CloudWatch for glue.ALL.jvm.heap.usage to measure worker memory utilization
If average heap usage is below 30%, the job has more workers than it needs
We also apply a heuristic: jobs with ≥ 10 DPUs completing in under 30 minutes are likely oversized
Savings are calculated based on the excess DPUs, average run duration, and estimated run frequency

Example:

Job: daily-customer-etl
Allocated: 20 DPUs (G.1X workers)
Avg JVM Heap: 15%
Avg Duration: 12 minutes
Recommended: 6 DPUs
Estimated Runs: 30/month
Monthly Excess Cost: $36.96 (14 excess DPUs × 0.2 hrs × 30 runs × $0.44)

Recommended Action: Test with reduced DPUs in a development environment. If the job completes successfully within acceptable time, update production.

AWS Reference: Glue Job Properties | Monitoring with CloudWatch

Why MEDIUM confidence: JVM heap usage is a reliable signal, but some jobs have legitimate peak memory needs that don't show in averages. The heuristic fallback (short duration + high DPUs) is less reliable.

Glue Job — Excessive Timeout Risk

Confidence: MEDIUM
Detection: Configured timeout ≥ 10× average execution duration
Savings: Risk mitigation — prevents catastrophic billing from stuck jobs

How it works:

We compare the job's configured Timeout against its average execution duration from run history
Jobs where the timeout exceeds 10× the average duration are flagged
Only jobs with recent successful runs and timeout ≥ 60 minutes are evaluated
We recommend setting timeout to 3× average duration (industry best practice)

Example:

Job: nightly-data-sync
Configured Timeout: 2,880 minutes (48 hours — the default)
Average Duration: 8 minutes
Timeout Ratio: 360×
Recommended Timeout: 24 minutes (3× average)
Risk: If this job hangs, it burns $211 before timing out
       (10 DPUs × 48 hours × $0.44)

Recommended Action: Set the timeout to 3× your average duration. This protects against stuck jobs while allowing for normal execution variance.

Agentic Tier — Automated Fix Available

Click Fix This to update the timeout configuration automatically, with full rollback capability.

AWS Reference: Glue Job Timeout

Why MEDIUM confidence: The timeout-to-duration ratio is a strong signal, but some jobs have legitimate long-tail executions on certain data partitions.
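The ratio check, the recommended timeout, and the worst-case cost from the example can be sketched together. The function name and result keys are illustrative.

```python
DPU_HOURLY = 0.44


def timeout_risk(timeout_min, avg_duration_min, dpus):
    """Evaluate a job's timeout against its average duration."""
    ratio = timeout_min / avg_duration_min
    return {
        "ratio": ratio,
        # Only timeouts of 60+ minutes at 10x or more the average are flagged
        "flagged": timeout_min >= 60 and ratio >= 10,
        "recommended_timeout_min": 3 * avg_duration_min,
        # Cost if the job hangs until the configured timeout
        "max_stuck_cost": dpus * (timeout_min / 60) * DPU_HOURLY,
    }


risk = timeout_risk(timeout_min=2880, avg_duration_min=8, dpus=10)
print(risk["ratio"], risk["recommended_timeout_min"], f"${risk['max_stuck_cost']:.2f}")
```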

Failing Glue Job with Retries

Confidence: HIGH
Detection: ≥ 50% failure rate across recent runs with retries configured
Savings: Direct — eliminates DPU cost from repeated failed runs

How it works:

We examine the last 10 job runs for each Glue job
Jobs with ≥ 50% failure rate and MaxRetries > 0 are flagged
We calculate the total DPU-hours wasted on failed runs
Monthly waste is projected from the observed failure frequency
The most recent error message is captured for root cause analysis

Example:

Job: inventory-etl
Failure Rate: 80% (8 of 10 runs FAILED)
MaxRetries: 3
DPUs: 10
Avg Failed Run Duration: 15 minutes
Monthly Wasted DPU-hours: 90
Monthly Waste: $39.60
Last Error: "S3 bucket 'inventory-raw' does not exist"
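As a rough sketch of the detection rule and the waste calculation, assuming run states are available as a list of strings: the function names are hypothetical, and the figure of 36 failed runs per month is an assumption chosen only because it reproduces the 90 DPU-hours shown in the example.

```python
DPU_HOUR_RATE = 0.44  # USD per DPU-hour, the rate used in this guide


def is_failing_with_retries(run_states, max_retries, threshold=0.5):
    """Flag a job when at least `threshold` of recent runs failed and retries are enabled."""
    failure_rate = run_states.count("FAILED") / len(run_states)
    return failure_rate >= threshold and max_retries > 0


def monthly_waste(failed_runs_per_month, dpus, avg_failed_minutes, rate=DPU_HOUR_RATE):
    """Return (wasted DPU-hours, wasted USD) for one month of failed runs."""
    dpu_hours = failed_runs_per_month * dpus * (avg_failed_minutes / 60)
    return dpu_hours, round(dpu_hours * rate, 2)


recent_runs = ["FAILED"] * 8 + ["SUCCEEDED"] * 2  # last 10 runs of inventory-etl
flagged = is_failing_with_retries(recent_runs, max_retries=3)
# 36 failed runs per month is a hypothetical frequency, not a value from the guide.
dpu_hours, cost = monthly_waste(failed_runs_per_month=36, dpus=10, avg_failed_minutes=15)
print(flagged, dpu_hours, cost)
```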

Recommended Action: Disable retries immediately to stop the cost bleeding, then investigate and fix the root cause error.

Agentic Tier — Automated Fix Available

Click Fix This to disable retries automatically, with full rollback capability.

AWS Reference: Glue Job Retries

Why HIGH confidence: A failure rate of 50% or more across the last 10 runs is a clear pattern. With retries configured, every failed run is repeated and billed again, so the DPU cost is wasted.

Migrate Dev Endpoint to Interactive Sessions

Confidence: LOW (advisory)
Detection: Any active Glue development endpoint
Savings: Difference between 24/7 billing and actual active-use billing

Important Note: AWS has deprecated development endpoints in favor of Interactive Sessions (Glue Studio Job Notebooks). Dev endpoints bill continuously at $0.44/DPU-hour — they cannot be paused or stopped. Interactive Sessions use the same $0.44/DPU-hour rate but automatically stop after an idle timeout (default: 60 minutes).

How it works:

We flag all active Glue development endpoints
We calculate the continuous monthly cost: DPUs × $0.44 × 730 hours
We estimate savings assuming the developer actively uses the endpoint ~4 hours per workday
The difference between 24/7 billing and workday-only billing is the savings opportunity

Example:

Endpoint: dev-spark-env
Status: READY
DPUs: 5
Created: 180 days ago
Continuous Monthly Cost: $1,606
Estimated Active Use: 4 hours/day × 22 workdays = 88 hours/month
Active Monthly Cost: $193.60
Monthly Savings: $1,412.40 (88% reduction)
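The estimate above can be sketched as follows. The function name is hypothetical; the rate, the 730-hour month, and the 4 hours × 22 workdays usage assumption are taken from this section.

```python
DPU_HOUR_RATE = 0.44   # USD per DPU-hour
HOURS_PER_MONTH = 730  # continuous billing hours used in this guide


def dev_endpoint_savings(dpus, active_hours_per_day=4, workdays=22):
    """Compare 24/7 endpoint billing with billing only for estimated active hours."""
    continuous = dpus * DPU_HOUR_RATE * HOURS_PER_MONTH
    active = dpus * DPU_HOUR_RATE * active_hours_per_day * workdays
    savings = continuous - active
    return {
        "continuous": round(continuous, 2),
        "active": round(active, 2),
        "savings": round(savings, 2),
        "reduction_pct": round(100 * savings / continuous),
    }


estimate = dev_endpoint_savings(dpus=5)
print(estimate)
```

Changing active_hours_per_day shows how quickly the savings shrink for teams that use an endpoint most of the day.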

Recommended Action: Create a new Interactive Session in Glue Studio, migrate your development scripts, then delete the dev endpoint.

AWS Reference: Interactive Sessions | Dev Endpoint Deprecation

Why LOW confidence: All dev endpoints are flagged — the savings estimate depends on actual usage patterns. Some teams may use endpoints near-continuously, reducing the savings delta.

Glue Data Catalog Bloat

Confidence: LOW
Detection: Data Catalog object count exceeds the 1M-object free tier, or table versions reach the 500,000 early-warning threshold
Savings: $1.00 per 100,000 objects above the free tier

How it works:

We count total objects in the Data Catalog (databases, tables, table versions, partitions)
Catalogs exceeding 1,000,000 objects are flagged with the calculated overage cost
Catalogs with 500,000+ table versions get an early warning even if under the total threshold
The primary bloat source is table versions — every UpdateTable call creates a new version

Example:

Data Catalog Objects:
  Databases: 12
  Tables: 450
  Table Versions: 1,200,000  ← The problem
  Partitions: 350,000
  Total: 1,550,462
  Overage: 550,462 objects above free tier
  Monthly Cost: $5.50
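A minimal sketch of the overage calculation, assuming the cost scales linearly above the free tier (actual billing granularity may differ). The function and key names are hypothetical.

```python
FREE_TIER_OBJECTS = 1_000_000
PRICE_PER_100K = 1.00              # USD per 100,000 objects above the free tier
VERSION_WARNING_THRESHOLD = 500_000


def catalog_overage(databases, tables, table_versions, partitions):
    """Sum catalog objects and price the portion above the free tier."""
    total = databases + tables + table_versions + partitions
    overage = max(0, total - FREE_TIER_OBJECTS)
    return {
        "total": total,
        "overage": overage,
        "monthly_cost": round(overage / 100_000 * PRICE_PER_100K, 2),
        # Early warning on table versions, independent of the total.
        "version_warning": table_versions >= VERSION_WARNING_THRESHOLD,
    }


report = catalog_overage(databases=12, tables=450,
                         table_versions=1_200_000, partitions=350_000)
print(report)
```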

Recommended Action: Delete old table versions using aws glue batch-delete-table-version. Keep only the latest 5–10 versions per table.

AWS Reference: Data Catalog Pricing | Table Versioning

Why LOW confidence: The cost is accurate but relatively low. The detector's value is catching uncontrolled growth before it becomes expensive.

Idle MQ Broker

Confidence: HIGH
Detection: Zero connections for 14 days
Savings: Full broker cost

How it works:

We query CloudWatch for Amazon MQ connection metrics
Brokers with zero connections for 14+ days are flagged
MQ charges for broker instance hours
Savings account for instance type pricing and HA deployment multipliers

Example:

Broker: my-activemq-broker
Engine: ActiveMQ
Instance Type: mq.t3.micro
Connections (14 days): 0
Monthly Savings: $24.82

AWS Reference: Amazon MQ Pricing

Why HIGH confidence: Zero connections means no applications are connected to this message broker.

Over-Provisioned MQ Broker

Confidence: MEDIUM
Detection: MQ broker with consistently low CPU utilization
Savings: $50–$1,400/month per right-sizing step

How it works:

We query Amazon MQ for all running brokers
CloudWatch CpuUtilization is analyzed over 14 days
Brokers with <15% avg CPU on mq.m5.large or larger are flagged
We identify the next smaller instance type as the downsizing target
Savings multiply by the number of instances (HA = 2×, RabbitMQ cluster = 3×)

Example:

Broker: dev-event-bus
Engine: ActiveMQ
Instance Type: mq.m5.large
Deployment Mode: ACTIVE_STANDBY_MULTI_AZ
CPU Average (14 days): 4.2%
Current Cost: $402.96/month (2 instances × $201.48)
Target: mq.t3.micro ($49.64/month for 2 instances)
Monthly Savings: $353.32
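The deployment multiplier can be illustrated with a small sketch. The per-instance monthly prices are the ones implied by this example ($201.48 and $24.82); the helper name and the lookup table are hypothetical, with instance counts following the HA = 2× and cluster = 3× rule described above.

```python
# Number of billed instances per Amazon MQ deployment mode.
INSTANCES_PER_MODE = {
    "SINGLE_INSTANCE": 1,
    "ACTIVE_STANDBY_MULTI_AZ": 2,
    "CLUSTER_MULTI_AZ": 3,
}


def broker_monthly_cost(per_instance_monthly, deployment_mode):
    """Monthly broker cost after applying the deployment multiplier."""
    return round(per_instance_monthly * INSTANCES_PER_MODE[deployment_mode], 2)


current = broker_monthly_cost(201.48, "ACTIVE_STANDBY_MULTI_AZ")  # mq.m5.large
target = broker_monthly_cost(24.82, "ACTIVE_STANDBY_MULTI_AZ")    # mq.t3.micro
savings = round(current - target, 2)
print(current, target, savings)
```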

Recommended Action: Create a new broker with the smaller instance type, migrate configuration and queue definitions, update application connection strings, verify message flow, then delete the old broker.

Important

Amazon MQ does not support in-place instance type changes. Downsizing requires creating a new broker and migrating. Plan for brief message delivery interruption during cutover.

AWS Reference: Amazon MQ Pricing

Why MEDIUM confidence: CPU is a good proxy for message throughput but doesn't capture memory-bound scenarios (large message batches). Validate with monitoring after downsizing.

Long-Running EMR Cluster

Confidence: MEDIUM
Detection: EMR cluster in RUNNING state > 7 days without auto-termination
Savings: ~30% from using Spot instances or transient clusters

How it works:

We identify EMR clusters that have been running continuously for 7+ days
Clusters without auto-termination enabled run indefinitely after jobs complete
Long-running clusters may be better served by transient (job-scoped) clusters

Example:

Cluster: j-ABC123DEF456
Name: analytics-etl
Running: 14 days
Auto-Terminate: Disabled
Instance Type: m5.xlarge × 5
Monthly Savings: $450.00 (Spot + transient)

AWS Reference: EMR Pricing

Why MEDIUM confidence: Some EMR clusters intentionally run continuously for interactive workloads (Spark Thrift Server, Presto). Verify workload pattern.

Over-Provisioned EMR Cluster

Confidence: MEDIUM
Detection: YARN memory available > 50% averaged over 7 days
Savings: Proportional to instance count reduction

How it works:

We pull CloudWatch YARNMemoryAvailablePercentage for each WAITING/RUNNING cluster
If average available memory exceeds 50%, the CORE group is over-provisioned
Recommended new count = current × (1 − available_pct / 100), minimum 1

Example:

Cluster: j-OVERPROV001
Name: data-pipeline
YARN Memory Available: 72%
CORE Nodes: 8 → 4 recommended
Monthly Savings: $700.00

AWS Reference: EMR Instance Groups

Why MEDIUM confidence: YARN memory is a good proxy for utilisation but workloads may have burst patterns. Monitor after resize.

EMR Missing Auto-Termination

Confidence: MEDIUM
Detection: Keep-alive cluster with no AutoTerminationPolicy
Savings: Prevents indefinite idle costs

How it works:

We check clusters with KeepJobFlowAliveWhenNoSteps=True
If no AutoTerminationPolicy is set, the cluster will never terminate on its own
Recommendation: set a 1-hour idle timeout

Example:

Cluster: j-NOAUTOTERM01
Name: adhoc-queries
Keep Alive: Yes
Auto-Termination: Not configured
Instances: 5

AWS Reference: EMR Auto-Termination

Why MEDIUM confidence: Some clusters are intentionally long-lived (e.g., serving Presto/Hive endpoints). Verify before applying.

EMR Previous-Generation Instances

Confidence: MEDIUM
Detection: Cluster using previous-generation instance families
Savings: ~15–20% cost reduction with current-generation instances

How it works:

We check instance types in all instance groups
If any instance family is in the previous-gen map (m3→m5, m4→m5, c3→c5, c4→c5, r3→r5, r4→r5, i2→i3, d2→d3, p2→p3, g2→g4dn), we flag the cluster
Current-gen instances offer better performance at lower cost

Example:

Cluster: j-PREVGEN001
Name: reports-cluster
Current Type: m4.xlarge ($0.240/hr)
Recommended: m5.xlarge ($0.192/hr)
Savings per Instance: $35.04/month ($0.048/hr × 730 hours)

AWS Reference: EC2 Instance Types

Why MEDIUM confidence: Current-gen instances are generally drop-in replacements but EMR release compatibility should be verified.

EMR Spot Opportunity

Confidence: MEDIUM
Detection: TASK instance group using On-Demand pricing
Savings: ~60–70% cost reduction on task nodes

How it works:

We identify TASK instance groups with Market=ON_DEMAND
Task nodes are fault-tolerant — YARN reschedules work on Spot interruption
Spot pricing is typically 60–70% cheaper than On-Demand

Example:

Cluster: j-SPOTOPPTY01
Task Nodes: 4 × r5.2xlarge (On-Demand)
On-Demand Cost: $0.504/hr per instance
Spot Estimate: ~$0.17/hr per instance
Monthly Savings: $964.00

AWS Reference: EMR Spot Instances

Why MEDIUM confidence: Spot instances can be interrupted. Verify workload is fault-tolerant before switching.

Over-Provisioned Kinesis Stream

Confidence: MEDIUM
Detection: Kinesis Data Stream with per-shard utilization < 20% over 14 days
Savings: Excess shards × $11/shard/month

How it works:

We analyze Kinesis stream IncomingBytes relative to shard capacity
Each shard can handle 1 MB/s input and 2 MB/s output
Streams with < 20% utilization per shard have excess capacity

Example:

Stream: event-ingestion
Shards: 10
Avg Throughput: 1.2 MB/s (12% per-shard utilization)
Recommended: 4 shards
Monthly Savings: $66.00 ($11/shard × 6 excess shards)
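A rough sketch of the utilization and savings arithmetic follows. How the recommended shard count is chosen is not specified here, so it is passed in as an input; the function names are hypothetical.

```python
SHARD_INGEST_MB_PER_S = 1.0  # write capacity of one shard
SHARD_MONTHLY_COST = 11.0    # USD per shard per month, as used in this guide


def shard_utilization(avg_throughput_mb_per_s, shards):
    """Average ingest as a fraction of total provisioned write capacity."""
    return avg_throughput_mb_per_s / (shards * SHARD_INGEST_MB_PER_S)


def excess_shard_savings(current_shards, recommended_shards):
    """Monthly savings from removing shards beyond the recommended count."""
    return max(0, current_shards - recommended_shards) * SHARD_MONTHLY_COST


utilization = shard_utilization(1.2, shards=10)
flagged = utilization < 0.20
savings = excess_shard_savings(current_shards=10, recommended_shards=4)
print(f"{utilization:.0%}", flagged, savings)
```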

AWS Reference: Kinesis Data Streams Pricing

Why MEDIUM confidence: Stream throughput can be bursty. Analyze peak utilization and consider On-Demand mode for variable workloads.

Kinesis On-Demand Downgrade

Confidence: MEDIUM
Detection: On-Demand stream with stable throughput (coefficient of variation < 0.3) for 14+ days, age ≥ 30 days
Savings: On-Demand vs Provisioned cost delta

How it works:

Identifies On-Demand streams with predictable traffic patterns
Calculates coefficient of variation (CV) of daily IncomingBytes over 14 days
CV < 0.3 indicates stable throughput suitable for Provisioned mode
Computes recommended shard count with 30% headroom
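The stability check can be sketched as below. This is an illustration under stated assumptions, not the detector's actual code: it uses the population standard deviation, sizes shards from a peak ingest rate at 1 MB/s per shard, and the function names are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(daily_bytes):
    """Population standard deviation divided by the mean of daily IncomingBytes."""
    mean = statistics.fmean(daily_bytes)
    return statistics.pstdev(daily_bytes) / mean if mean else float("inf")


def recommended_shards(peak_mb_per_s, headroom=0.30, shard_capacity_mb_per_s=1.0):
    """Shard count that covers the peak ingest rate plus headroom (at least 1)."""
    return max(1, math.ceil(peak_mb_per_s * (1 + headroom) / shard_capacity_mb_per_s))


# Fourteen days of hypothetical daily ingest volumes (arbitrary units).
daily = [100, 110, 90, 105, 95, 100, 100] * 2
cv = coefficient_of_variation(daily)
stable = cv < 0.3
shards = recommended_shards(peak_mb_per_s=1.2)
print(round(cv, 3), stable, shards)
```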

Example:

Stream: log-pipeline
Mode: ON_DEMAND
CV: 0.12 (stable)
On-Demand Cost: $58.20/month
Provisioned (2 shards): $21.90/month
Monthly Savings: $36.30

AWS Reference: Kinesis Data Streams Pricing

Kinesis Extended Retention Waste

Confidence: HIGH
Detection: Extended retention (>24h) with zero GetRecords in 14 days
Savings: Shards × $0.020/shard-hour × 730 hours/month

How it works:

Identifies streams with a retention period above the 24-hour default
Checks if any consumer is reading data (GetRecords metric)
If zero reads for 14 days, extended retention is waste

Example:

Stream: clickstream-archive
Retention: 168 hours (7 days)
Shards: 4
GetRecords (14d): 0
Monthly Savings: $58.40 ($0.020 × 4 shards × 730 hours)
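The check and the cost formula can be written out as a short sketch. The function names are hypothetical; the rate and the 730-hour month come from this section.

```python
EXTENDED_RETENTION_RATE = 0.020  # USD per shard-hour
HOURS_PER_MONTH = 730


def retention_is_waste(retention_hours, get_records_14d):
    """Extended retention with no reads in 14 days is treated as waste."""
    return retention_hours > 24 and get_records_14d == 0


def shard_hour_monthly_cost(shards, rate_per_shard_hour, hours=HOURS_PER_MONTH):
    """Monthly cost of a per-shard-hour charge."""
    return round(shards * rate_per_shard_hour * hours, 2)


wasteful = retention_is_waste(retention_hours=168, get_records_14d=0)
monthly = shard_hour_monthly_cost(shards=4, rate_per_shard_hour=EXTENDED_RETENTION_RATE)
print(wasteful, monthly)
```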

Kinesis Enhanced Fan-Out Waste

Confidence: HIGH
Detection: Enhanced fan-out consumer with zero reads for 14 days
Savings: Shards × $0.015/consumer-shard-hour × 730 hours/month

How it works:

Lists enhanced fan-out consumers registered on each stream
Checks per-consumer GetRecords metrics
Consumers with zero reads for 14 days are candidates for deregistration

Example:

Stream: events/analytics-consumer
Shards: 4
Consumer: analytics-consumer (ACTIVE)
GetRecords (14d): 0
Monthly Savings: $43.80 ($0.015 × 4 shards × 730 hours)

Idle Kinesis Firehose

Confidence: HIGH
Detection: Firehose delivery stream with zero IncomingRecords for 14 days
Savings: $0 direct (pay-per-use), but eliminates forgotten infrastructure

How it works:

Lists all Firehose delivery streams
Checks IncomingRecords and IncomingBytes metrics over 14 days
Idle streams are flagged as forgotten infrastructure
Notes if Lambda transforms are configured (may have associated costs)

Example:

Delivery Stream: logs-to-s3-firehose
Status: ACTIVE
Records (14d): 0
Lambda Transform: Yes (may incur invocation costs)

ML/AI Detectors

Idle SageMaker Notebook Instance

Confidence: HIGH
Detection: Notebook instance InService with no kernel activity for 7 days
Savings: Full instance cost

How it works:

We identify SageMaker notebook instances in InService status
We check for kernel activity or user connections over the past 7 days
Idle notebooks continue to charge full instance cost

Example:

Notebook: my-ml-notebook
Instance Type: ml.t3.medium
Status: InService
Activity (7 days): None
Monthly Savings: $37.00

AWS Reference: SageMaker Pricing

Why HIGH confidence: SageMaker notebooks charge continuously when InService, regardless of activity.

Idle SageMaker Endpoint

Confidence: HIGH
Detection: Zero invocations for 7 days
Savings: Full endpoint cost

How it works:

We query CloudWatch for endpoint invocation metrics
Endpoints with zero invocations for 7+ days are flagged
Idle endpoints charge full compute cost continuously

Example:

Endpoint: my-model-endpoint
Instance Type: ml.m5.large
Invocations (7 days): 0
Monthly Savings: $96.00

AWS Reference: SageMaker Pricing

Why HIGH confidence: Zero invocations is unambiguous. The endpoint is not being used.

Oversized SageMaker Endpoint

Confidence: MEDIUM
Detection: Low CPU/memory utilization over 7 days
Savings: Based on right-sized instance

How it works:

We analyze CloudWatch metrics for endpoint CPU and memory utilization
Endpoints with consistently low utilization are flagged
Right-sizing to a smaller instance type reduces costs

Example:

Endpoint: my-inference-endpoint
Instance Type: ml.c5.xlarge
Avg CPU: 15%
Avg Memory: 20%
Monthly Savings: ~$70.00 (if downsized to ml.c5.large)

AWS Reference: SageMaker Pricing

Why MEDIUM confidence: Inference workloads may have periodic spikes. Review before right-sizing.

Stopped SageMaker Notebook Storage

Confidence: HIGH
Detection: Stopped notebook instance with attached EBS volume
Savings: $0.58 – $58/month per notebook (based on EBS volume size)

How it works:

We identify SageMaker notebook instances in Stopped status
Stopped notebooks still incur EBS storage charges ($0.116/GB-month for gp2/gp3)
Notebooks stopped for >7 days are flagged
Deleting the notebook removes the attached EBS volume

Example:

Notebook: dev-ml-notebook
Status: Stopped (30 days)
Volume Size: 50 GB
Monthly Storage Cost: $5.80
Action: Delete notebook instance to remove EBS volume
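A minimal sketch of the storage calculation, using the $0.116/GB-month rate and the 7-day threshold from this section. The function name is hypothetical.

```python
EBS_GB_MONTH_RATE = 0.116  # USD per GB-month, as quoted in this guide


def stopped_notebook_storage_cost(volume_gb, days_stopped, min_days_stopped=7):
    """Monthly EBS cost of a stopped notebook, or None if it was stopped too recently."""
    if days_stopped <= min_days_stopped:
        return None  # not flagged yet
    return round(volume_gb * EBS_GB_MONTH_RATE, 2)


cost = stopped_notebook_storage_cost(volume_gb=50, days_stopped=30)
print(cost)
```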

AWS Reference: SageMaker Pricing

Why HIGH confidence: Stopped status is binary and volume size is exact. No ambiguity.

Previous-Generation SageMaker Instance

Confidence: HIGH
Detection: Notebook or endpoint using previous-gen instance type (ml.m4, ml.c4, ml.t2, ml.r4, ml.p2)
Savings: 10–30% of current cost

How it works:

We check instance types against a known previous-generation prefix list
Each flagged resource includes a specific upgrade recommendation
Covers both notebook instances and endpoint instances
Upgrade map: ml.m4→ml.m5, ml.c4→ml.c5, ml.t2→ml.t3, ml.r4→ml.r5, ml.p2→ml.p3

Example:

Notebook: legacy-training-nb
Instance Type: ml.m4.xlarge ($0.28/hr)
Recommended: ml.m5.xlarge ($0.23/hr)
Monthly Savings: $36.50 (18% reduction)
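The prefix lookup and the savings arithmetic can be sketched like this. The upgrade map and hourly prices are the ones listed in this section; the function names are hypothetical.

```python
UPGRADE_MAP = {
    "ml.m4": "ml.m5",
    "ml.c4": "ml.c5",
    "ml.t2": "ml.t3",
    "ml.r4": "ml.r5",
    "ml.p2": "ml.p3",
}


def recommend_upgrade(instance_type):
    """Return the current-gen equivalent, or None if the type is not previous-gen."""
    family, _, size = instance_type.rpartition(".")
    target_family = UPGRADE_MAP.get(family)
    return f"{target_family}.{size}" if target_family else None


def monthly_savings(current_hourly, target_hourly, hours=730):
    """Return (monthly savings in USD, percentage reduction)."""
    saved = (current_hourly - target_hourly) * hours
    pct = round(100 * (current_hourly - target_hourly) / current_hourly)
    return round(saved, 2), pct


target = recommend_upgrade("ml.m4.xlarge")
saved, pct = monthly_savings(0.28, 0.23)
print(target, saved, pct)
```

Sizes are carried over unchanged, which assumes the same size exists in the target family.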

AWS Reference: SageMaker Pricing

Why HIGH confidence: Instance type is deterministic metadata. No ambiguity in detection.

SageMaker Savings Plan Opportunity

Confidence: HIGH
Detection: AWS Cost Explorer recommendation for SageMaker Savings Plan
Savings: Exact AWS calculation (typically 20–40% vs On-Demand)

How it works:

AWS Cost Explorer analyzes 30 days of SageMaker usage (training, inference, notebooks)
Identifies consistent usage patterns suitable for a SageMaker Savings Plan commitment
Returns exact savings calculation for 1-year or 3-year terms

Example:

Current SageMaker Spend: $850.00/month (On-Demand)
Recommended SP: $550.00/month commitment
Monthly Savings: $300.00 (35%)
Term: 1-year, No Upfront

AWS Reference: SageMaker Savings Plans

Why HIGH confidence: Uses AWS's own Cost Explorer analysis with exact pricing data.

Security Detectors

Unused Secrets Manager Secret

Confidence: HIGH
Detection: Secret not accessed for 90 days
Savings: $0.40/month per secret

How it works:

We check the LastAccessedDate for each secret
Secrets not accessed for 90+ days are flagged
Secrets Manager charges $0.40/month per secret

Example:

Secret: old-api-key
Last Accessed: 120 days ago
Monthly Savings: $0.40
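As a sketch of the 90-day check, assuming each secret is a dictionary with the Name and LastAccessedDate fields that Secrets Manager returns when listing secrets. Secrets that have never been accessed carry no LastAccessedDate; skipping them here is an assumption of this sketch, not documented CloudWise behavior.

```python
from datetime import datetime, timedelta, timezone

SECRET_MONTHLY_COST = 0.40  # USD per secret per month


def unused_secrets(secrets, now, max_idle_days=90):
    """Return names of secrets whose last access is at least `max_idle_days` old."""
    cutoff = now - timedelta(days=max_idle_days)
    return [
        s["Name"]
        for s in secrets
        # Never-accessed secrets have no LastAccessedDate and are skipped here.
        if s.get("LastAccessedDate") is not None and s["LastAccessedDate"] <= cutoff
    ]


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
secrets = [
    {"Name": "old-api-key", "LastAccessedDate": now - timedelta(days=120)},
    {"Name": "active-db-password", "LastAccessedDate": now - timedelta(days=30)},
    {"Name": "never-read"},
]
flagged = unused_secrets(secrets, now)
monthly_savings = round(len(flagged) * SECRET_MONTHLY_COST, 2)
print(flagged, monthly_savings)
```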

AWS Reference: Secrets Manager Pricing

Why HIGH confidence: Access date is tracked by AWS. 90+ days without access indicates unused secret.

Unused KMS Key

Confidence: HIGH
Detection: Customer-managed key not used for 90 days
Savings: $1.00/month per key

How it works:

We check CloudTrail for KMS key usage events
Customer-managed keys not used for 90+ days are flagged
KMS charges $1.00/month per customer-managed key

Example:

Key: alias/old-encryption-key
Key ARN: arn:aws:kms:us-east-1:123456789012:key/abc123
Last Used: 95 days ago
Monthly Savings: $1.00

AWS Reference: KMS Pricing

Why HIGH confidence: Key usage is tracked via CloudTrail. 90+ days without use indicates unused key.

Unencrypted EBS Volume

Confidence: HIGH
Detection: EBS volume with encrypted = false in in-use state
Savings: Zero (security/compliance finding)

How it works:

We check the encrypted attribute on all EBS volumes
Volumes in in-use state without encryption at rest are flagged
EBS encryption has a minimal effect on latency and uses AWS-managed or customer-managed KMS keys

Example:

Volume: vol-0abc123def456
State: in-use
Encrypted: No
Size: 500 GB
Action: Enable default EBS encryption, recreate volume from snapshot
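The filter itself is simple. The sketch below assumes volume records shaped like the entries in the EC2 DescribeVolumes response (VolumeId, State, Encrypted, Size); the function name and sample data are hypothetical.

```python
def unencrypted_in_use(volumes):
    """Return IDs of attached volumes that are not encrypted at rest."""
    return [
        v["VolumeId"]
        for v in volumes
        if v["State"] == "in-use" and not v["Encrypted"]
    ]


volumes = [
    {"VolumeId": "vol-0abc123def456", "State": "in-use", "Encrypted": False, "Size": 500},
    {"VolumeId": "vol-0encrypted0001", "State": "in-use", "Encrypted": True, "Size": 100},
    {"VolumeId": "vol-0detached00001", "State": "available", "Encrypted": False, "Size": 20},
]
findings = unencrypted_in_use(volumes)
print(findings)
```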

AWS Reference: EBS Encryption

Why HIGH confidence: Encryption status is deterministic metadata from the EBS API.

Unencrypted RDS Instance

Confidence: HIGH
Detection: RDS instance with StorageEncrypted = false
Savings: Zero (security/compliance finding)

How it works:

We check the StorageEncrypted attribute on all RDS instances
Unencrypted instances cannot be encrypted in-place — requires snapshot + restore
This is a compliance requirement for most security frameworks (SOC 2, HIPAA, PCI)

Example:

Instance: mydb-production
Engine: mysql 8.0
Encrypted: No
Action: Create encrypted snapshot, restore encrypted instance

AWS Reference: RDS Encryption

Why HIGH confidence: Encryption status is exact API metadata. No estimation needed.

Unencrypted EFS Filesystem

Confidence: HIGH
Detection: EFS filesystem with encrypted = false
Savings: Zero (security/compliance finding)

How it works:

We check the Encrypted attribute on all EFS filesystems
EFS encryption must be enabled at creation — cannot be added to existing filesystems
Migration requires creating a new encrypted filesystem and copying data

Example:

Filesystem: fs-0abc123def456
Encrypted: No
Size: 150 GB
Action: Create encrypted EFS, migrate data with DataSync

AWS Reference: EFS Encryption

Why HIGH confidence: Encryption status is deterministic. No false positives possible.

Unencrypted DocumentDB Cluster

Confidence: HIGH
Detection: DocumentDB cluster with StorageEncrypted = false
Savings: Zero (security/compliance finding)

How it works:

We check the StorageEncrypted attribute on DocumentDB clusters
Unencrypted clusters cannot be encrypted in-place
Migration requires creating an encrypted cluster and restoring from snapshot

Example:

Cluster: my-docdb-cluster
Encrypted: No
Action: Create encrypted cluster, restore from snapshot

AWS Reference: DocumentDB Encryption

Why HIGH confidence: Encryption status is exact metadata from the DocumentDB API.

S3 Without Default Encryption

Confidence: MEDIUM
Detection: S3 bucket without default encryption configuration
Savings: Zero (compliance finding)

How it works:

We check for ServerSideEncryptionConfiguration on each S3 bucket
Buckets without default encryption rely on individual object-level encryption
Since January 2023, AWS enables SSE-S3 by default — this detector catches legacy buckets

Example:

Bucket: legacy-data-bucket
Default Encryption: None
Action: Enable SSE-S3 or SSE-KMS default encryption

AWS Reference: S3 Default Encryption

Why MEDIUM confidence: Legacy buckets may have object-level encryption already applied. Bucket-level policy provides a safety net.

OpenSearch Encryption Not Enabled

Confidence: HIGH
Detection: OpenSearch domain with EncryptionAtRestOptions.Enabled = false
Savings: Zero (security/compliance finding)

How it works:

We check the EncryptionAtRestOptions configuration on OpenSearch domains
Domains without encryption at rest are flagged as a compliance issue
Encryption can be enabled on existing domains (in-place, no downtime)

Example:

Domain: search-logs
Encryption at Rest: Disabled
Action: Enable encryption at rest in domain settings

AWS Reference: OpenSearch Encryption

Why HIGH confidence: Encryption configuration is exact API metadata.

RDS Without Deletion Protection

Confidence: HIGH
Detection: RDS instance with DeletionProtection = false
Savings: Zero (reliability/governance finding)

How it works:

We check the DeletionProtection attribute on all RDS instances
Production databases without this safeguard risk accidental deletion
Enabling deletion protection is a one-click change with no downtime

Example:

Instance: production-db
Deletion Protection: Disabled
Action: Enable in RDS console → Modify → Deletion Protection

AWS Reference: RDS Deletion Protection

Why HIGH confidence: Configuration attribute is exact metadata. All production databases should have this enabled.

DynamoDB Without Deletion Protection

Confidence: HIGH
Detection: DynamoDB table with DeletionProtectionEnabled = false
Savings: Zero (reliability/governance finding)

How it works:

We check the DeletionProtectionEnabled attribute on DynamoDB tables
Tables without protection can be accidentally deleted via API or console
Enabling deletion protection prevents accidental table deletion

Example:

Table: user-sessions
Deletion Protection: Disabled
Action: Enable via UpdateTable API or console

AWS Reference: DynamoDB Deletion Protection

Why HIGH confidence: Attribute is exact boolean metadata from the DynamoDB API.

RDS Publicly Accessible

Confidence: HIGH
Detection: RDS instance with PubliclyAccessible = true
Savings: Zero (security finding)

How it works:

We check the PubliclyAccessible attribute on RDS instances
Publicly accessible databases are reachable from the internet (if security groups allow)
Best practice is to place databases in private subnets with VPC-only access

Example:

Instance: mydb-production
Publicly Accessible: Yes
Action: Modify instance → set Publicly Accessible to No

AWS Reference: RDS Security Best Practices

Why HIGH confidence: Boolean attribute from the RDS API. Combined with open security groups, this is a critical security risk.

Resource Without Backup Coverage

Confidence: MEDIUM
Detection: RDS/DynamoDB resource with no AWS Backup recovery points
Savings: None (governance finding); adding backup coverage costs roughly $0.50–$5/month per resource

How it works:

We cross-reference RDS instances and DynamoDB tables against AWS Backup inventory
Resources with zero recovery points across all vaults are flagged
This is a governance/resilience finding rather than a cost finding

Example:

Resource: arn:aws:rds:us-east-1:123456789012:db:mydb
Recovery Points: 0
Action: Create backup plan covering this resource

AWS Reference: AWS Backup

Why MEDIUM confidence: Some resources use native snapshots (RDS automated backups) rather than AWS Backup. Verify backup strategy.

AWS Compute Optimizer Integration

CloudWise integrates with AWS Compute Optimizer to provide ML-backed rightsizing recommendations. This is a FREE AWS service that analyzes 14 days of CloudWatch metrics to recommend optimal resource configurations.

Oversized EC2 (Compute Optimizer)

Confidence: HIGH
Detection: AWS Compute Optimizer ML analysis
Savings: Exact calculation based on recommended instance type

How it works:

AWS Compute Optimizer analyzes 14 days of CPU, memory, network, and disk metrics
Returns specific recommendations (e.g., "t3.large → t3.medium")
CloudWise surfaces these with exact savings calculations

Example:

Instance: i-0abc123def456
Current: m5.xlarge ($140.16/month)
Recommended: m5.large ($70.08/month)
Monthly Savings: $70.08 (50%)
Confidence: HIGH (AWS Compute Optimizer)

AWS Reference: Compute Optimizer

Why HIGH confidence: AWS ML model analyzes real utilization. Specific target instance provided.

Oversized EBS (Compute Optimizer)

Confidence: HIGH
Detection: AWS Compute Optimizer IOPS/throughput analysis
Savings: Based on recommended volume configuration

How it works:

Compute Optimizer analyzes EBS IOPS and throughput patterns
Recommends optimal volume type and size
Exact savings calculated from current vs. recommended cost

AWS Reference: Compute Optimizer - EBS

Over-Provisioned Lambda (Compute Optimizer)

Confidence: HIGH
Detection: AWS Compute Optimizer memory analysis
Savings: Based on recommended memory configuration

How it works:

Compute Optimizer analyzes Lambda memory utilization
Recommends right-sized memory allocation
Lower memory = lower cost per invocation

AWS Reference: Compute Optimizer - Lambda

Oversized RDS (Compute Optimizer)

Confidence: HIGH
Detection: AWS Compute Optimizer ML analysis of RDS instance
Savings: Exact calculation based on recommended instance type

How it works:

AWS Compute Optimizer analyzes RDS instance CPU, memory, and I/O metrics
Returns specific downsizing recommendations (e.g., "db.r6g.xlarge → db.r6g.large")
CloudWise surfaces these with exact savings calculations

Example:

Instance: mydb-production
Current: db.r6g.xlarge ($0.48/hr)
Recommended: db.r6g.large ($0.24/hr)
Monthly Savings: $175.20

AWS Reference: Compute Optimizer - RDS

Why HIGH confidence: AWS Compute Optimizer uses 14+ days of ML-backed analysis. Recommendations include exact pricing.

Reserved Instance & Savings Plans Recommendations

CloudWise provides official AWS recommendations for commitment-based pricing. These are based on 30 days of actual usage and provide exact savings calculations.

API Cost Note: These recommendations are refreshed weekly to minimize Cost Explorer API costs ($0.01/request). Data is cached for 7 days.

EC2 Reserved Instance Opportunity

Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Exact AWS calculation (typically 30–40%)

How it works:

AWS Cost Explorer analyzes 30 days of EC2 usage patterns
Identifies consistent usage that would benefit from RI commitment
Returns specific purchase recommendations with exact savings

Example:

Recommendation: Purchase 2x t3.large 1-Year No Upfront RI
Region: us-east-1
Monthly Savings: $42.40 (37% savings)
Break-even: 7 months

AWS Reference: EC2 Reserved Instances

Why HIGH confidence: AWS official recommendation based on actual usage patterns.

RDS Reserved Instance Opportunity

Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Exact AWS calculation

How it works:

Analyzes RDS instance usage patterns
Recommends RI purchases for consistent database workloads
Includes engine-specific recommendations (MySQL, PostgreSQL, etc.)

AWS Reference: RDS Reserved Instances

Compute Savings Plan Opportunity

Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Exact AWS calculation (typically 20–30%)

How it works:

Analyzes all compute usage (EC2, Fargate, Lambda)
Recommends hourly commitment level
Compute SP covers all regions and instance families (most flexible)

Example:

Recommendation: $10/hour Compute Savings Plan (1-Year)
Monthly Commitment: $7,300/month
Current On-Demand: $9,500/month
Monthly Savings: $2,200 (23%)
Coverage: EC2, Fargate, Lambda (all regions)
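The figures in this example relate as follows. This is a simplified sketch that treats the commitment as replacing the whole On-Demand bill, which is how the example presents it; the function name is hypothetical.

```python
HOURS_PER_MONTH = 730


def savings_plan_summary(hourly_commitment, on_demand_monthly, hours=HOURS_PER_MONTH):
    """Return (monthly commitment, monthly savings, savings percentage)."""
    monthly_commitment = hourly_commitment * hours
    savings = on_demand_monthly - monthly_commitment
    return monthly_commitment, savings, round(100 * savings / on_demand_monthly)


commitment, savings, pct = savings_plan_summary(hourly_commitment=10, on_demand_monthly=9500)
print(commitment, savings, pct)
```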

AWS Reference: Savings Plans

EC2 Instance Savings Plan Opportunity

Confidence: HIGH
Detection: AWS Cost Explorer 30-day usage analysis
Savings: Higher than Compute SP (locked to instance family/region)

Note: Because of the reduced flexibility, this plan type is only recommended when savings exceed 25%.

AWS Reference: Savings Plans

Unused Reserved Instance

Confidence: HIGH
Detection: EC2 RI utilization < 20% over 30 days (via Cost Explorer Reservation Utilization API)
Savings: Committed RI cost × (1 - utilization%)

How it works:

CloudWise reads your Cost Explorer Reservation Utilization report weekly
Any EC2 RI group with < 20% utilization over the past 30 days is flagged
Monthly waste is calculated as the fraction of committed cost not offsetting on-demand

Example:

Instance type: m4.xlarge
Reserved Instances: 3 (No Upfront 1-year, ~$140/month each)
RI Utilization: 5% (36 hours matched out of 24 × 30 = 720 hours per RI)
Monthly waste: $140 × 3 × 0.95 = $399/month
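The waste formula in this example can be sketched in a few lines. The function name is hypothetical; the 20% threshold and the cost × (1 − utilization) formula come from this section.

```python
def unused_commitment(monthly_cost_each, quantity, utilization, threshold=0.20):
    """Return (flagged, monthly waste) for a group of Reserved Instances."""
    flagged = utilization < threshold
    # Waste is the share of the committed cost that offset no on-demand usage.
    waste = round(monthly_cost_each * quantity * (1 - utilization), 2)
    return flagged, waste


flagged, waste = unused_commitment(monthly_cost_each=140, quantity=3, utilization=0.05)
print(flagged, waste)
```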

Why HIGH confidence: Uses AWS Cost Explorer's own utilization calculation, which tracks exact RI-to-instance matching.

What to do: Determine if the low utilization is permanent (migrate fleet, sell RI) or temporary (workload scaling back up). Standard RIs can be sold on the AWS Reserved Instance Marketplace. Convertible RIs can be exchanged for a different type.

AWS Reference: Reserved Instances in Amazon EC2

Expiring Reserved Instance

Confidence: HIGH
Detection: Active EC2 RI with end date within 90 days
Savings: Monthly cost delta between current RI rate and on-demand equivalent (cost avoidance)

How it works:

CloudWise reads your active EC2 Reserved Instances weekly
Any RI expiring within 90 days is flagged with an urgency level (URGENT ≤30d / WARNING ≤60d / NOTICE ≤90d)
The estimated impact shows what the covered instance(s) would cost at on-demand rates post-expiry

Example:

Instance type: r5.4xlarge
Reserved Instances: 2 (All Upfront 1-year)
Expiry: 22 days from now — URGENT
Current RI rate: ~$320/month each
On-demand equivalent: ~$960/month each
Risk: +$1,280/month if not renewed
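The urgency levels and the risk figure can be expressed as a short sketch. The function names are hypothetical; the thresholds and the example prices are taken from this section.

```python
def urgency(days_to_expiry):
    """Map days until expiry to the urgency level described above."""
    if days_to_expiry <= 30:
        return "URGENT"
    if days_to_expiry <= 60:
        return "WARNING"
    if days_to_expiry <= 90:
        return "NOTICE"
    return None  # more than 90 days out: not flagged


def expiry_risk(ri_monthly_each, on_demand_monthly_each, quantity):
    """Extra monthly cost if the instances fall back to on-demand rates."""
    return (on_demand_monthly_each - ri_monthly_each) * quantity


level = urgency(22)
risk = expiry_risk(ri_monthly_each=320, on_demand_monthly_each=960, quantity=2)
print(level, risk)
```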

Why HIGH confidence: Expiry date is deterministic — describe_reserved_instances returns exact ISO 8601 end timestamps.

What to do: (a) Purchase a replacement RI before expiry, (b) switch to a Compute Savings Plan for more instance-type flexibility, or (c) if the workload is ending, allow expiry and plan to terminate the instance promptly.

Convertible RI Exchange Opportunity

Confidence: MEDIUM
Detection: Convertible EC2 RI on a previous-generation instance family where a current-gen exchange is available
Savings: On-demand price differential between old and new instance type × quantity

How it works:

CloudWise identifies active Convertible RIs tied to previous-gen instance families (m4, r4, c4, etc.)
Maps each to the recommended current-gen equivalent (m7i, r7i, c7i, etc.)
If the target type has a lower on-demand reference price, the exchange is flagged
Exchange is free per AWS policy — no financial outlay required

Example:

RI type: m4.xlarge Convertible (3-year, active, 18 months remaining)
Recommended exchange: m7i.xlarge
m4.xlarge on-demand: $0.192/hr → $140.16/month
m7i.xlarge on-demand: $0.1785/hr → $130.31/month
Savings on reference pricing: $9.85/month per RI
Exchange cost: $0

Why MEDIUM confidence: Pricing differential relies on public on-demand pricing; actual RI discount rates are calculated by AWS at exchange time and may differ slightly.

What to do: Log in to EC2 Console → Reserved Instances → select the RI → Actions → Exchange Reserved Instance. Select the target instance type. Review the exchange ratio (AWS may require more than one old RI to match one new RI in some cases). Confirm the exchange — it takes effect immediately.

Note: Graviton-family targets (t4g, m7g, c7g) are not recommended automatically as ARM64 compatibility requires code-level validation.

Expiring Savings Plan

Confidence: HIGH
Detection: Savings Plan with end date within 90 days
Savings: Monthly commitment amount at risk after expiry

How it works:

We check all active Savings Plans for upcoming expiration dates
Plans expiring within 30 days are flagged as URGENT, within 60 days as WARNING
Early notification allows time to renew or adjust commitment strategy

Example:

Savings Plan: sp-abc123def456
Type: Compute Savings Plan
Commitment: $500.00/month
Expires: 2026-05-15 (37 days away)
Urgency: WARNING
Action: Renew or purchase replacement SP

AWS Reference: Savings Plans

Why HIGH confidence: Expiration dates are exact metadata from the Savings Plans API. No estimation involved.

Unused Savings Plan

Confidence: HIGH
Detection: Savings Plan utilization < 20% over 30 days
Savings: Unused daily commitment × 30

How it works:

We analyze Savings Plan utilization via Cost Explorer over the past 30 days
Plans with utilization below 20% are flagged — the commitment is being paid but not offsetting On-Demand usage
This often indicates workload changes, instance type shifts, or over-commitment

Example:

Savings Plan: sp-abc123def456
Type: EC2 Instance Savings Plan
Commitment: $300.00/month
Utilization: 12%
Unused: $264.00/month
Action: Review commitment strategy
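
The savings formula (unused daily commitment × 30) can be checked with a minimal sketch. The function name is hypothetical, and the $10.00 daily commitment is an assumption derived from the example's $300.00/month.

```python
from typing import Optional


def unused_commitment(
    daily_commitment: float,
    utilization_pct: float,
    threshold_pct: float = 20.0,  # flag only when utilization is below this
    days: int = 30,
) -> Optional[float]:
    """Return the unused commitment over the window, or None if not flagged."""
    if utilization_pct >= threshold_pct:
        return None
    period_commitment = daily_commitment * days
    return round(period_commitment * (100 - utilization_pct) / 100, 2)


# $300.00/month commitment at 12% utilization, as in the example above.
print(unused_commitment(10.00, 12))  # 264.0
```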

AWS Reference: Savings Plans Utilization

Why HIGH confidence: Utilization data comes directly from AWS Cost Explorer with exact dollar amounts.

Savings Plan Coverage Gap

Confidence: MEDIUM
Detection: Significant On-Demand spend not covered by existing Savings Plans
Savings: ~25% of uncovered On-Demand spend (conservative estimate)

How it works:

We analyze Cost Explorer data for compute spend covered vs. uncovered by Savings Plans
Accounts with < 50% SP coverage and > $100/month uncovered spend are flagged
Purchasing additional SP commitment could reduce the uncovered portion by 25–40%

Example:

Total Compute Spend: $2,000/month
SP Coverage: 40% ($800 covered)
Uncovered On-Demand: $1,200/month
Potential SP Savings: $300.00/month (25%)
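
A minimal sketch of the flagging rule and the conservative 25% estimate follows. The function and parameter names are hypothetical; the thresholds are the ones stated in this section.

```python
from typing import Optional


def coverage_gap_savings(
    total_spend: float,
    covered_spend: float,
    savings_rate: float = 0.25,    # conservative estimate from this section
    max_coverage: float = 0.50,    # flag only below 50% coverage
    min_uncovered: float = 100.0,  # and only above $100/month uncovered
) -> Optional[float]:
    """Return estimated monthly savings, or None if the account is not flagged."""
    if total_spend <= 0:
        return None
    uncovered = total_spend - covered_spend
    coverage = covered_spend / total_spend
    if coverage >= max_coverage or uncovered <= min_uncovered:
        return None
    return round(uncovered * savings_rate, 2)


# $2,000/month compute spend with $800 covered, as in the example above.
print(coverage_gap_savings(2000.0, 800.0))  # 300.0
```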

AWS Reference: Savings Plans Coverage

Why MEDIUM confidence: Savings estimate depends on the SP type/term chosen and actual usage stability. The 25% is conservative.

CUR Unused Reservation

Confidence: HIGH
Detection: CUR-based analysis showing Reserved Instance amortization with low utilization
Savings: Unused RI fee from CUR cost breakdown

How it works:

We analyze CUR (Cost and Usage Report) line items for RI amortization
RI fees without corresponding On-Demand equivalent usage indicate waste
This provides more granular insight than the Cost Explorer summary

Example:

RI: ri-abc123def456
Type: m5.xlarge
Amortized Cost: $250.00/month
Utilized: 35%
Waste: $162.50/month
Action: Sell on RI Marketplace or modify

AWS Reference: Understanding CUR Reserved Instance Data

Why HIGH confidence: CUR data provides exact line-item cost allocation. No estimation needed.

CUR Savings Plan Waste

Confidence: MEDIUM
Detection: CUR-based analysis of Savings Plan commitment not fully utilized
Savings: Unused SP commitment from CUR breakdown

How it works:

We analyze CUR line items for Savings Plan amortization and coverage
SP commitment cost without matching covered usage indicates waste
Provides daily granularity on SP utilization trends

Example:

Savings Plan: sp-abc123def456
Type: Compute SP
Monthly Commitment: $400.00
Utilized: 65%
Waste: $140.00/month
Action: Adjust workloads or reduce commitment at renewal

AWS Reference: Understanding CUR Savings Plans

Why MEDIUM confidence: CUR data is exact but utilization patterns may fluctuate month to month.

IPv4 Address Optimization

Since February 1, 2024, AWS has charged $0.005/hour (about $3.65/month at 730 hours) for every public IPv4 address, whether or not it is attached to a running resource.

EIP on Stopped Instance

Confidence: HIGH
Detection: Elastic IP attached to stopped EC2 instance
Savings: $3.65/month per EIP

How it works:

Identifies Elastic IPs attached to stopped instances
Stopped instances don't need public IPs
Release or detach to save $3.65/month per IP

Example:

Elastic IP: 52.123.45.67
Attached to: i-0abc123 (stopped)
Monthly Savings: $3.65
Action: Release EIP or terminate instance

AWS Reference: VPC Pricing - Public IPv4

Multiple EIPs per Instance

Confidence: HIGH
Detection: Instance with more than one Elastic IP attached
Savings: $3.65/month per extra EIP

How it works:

Identifies instances with multiple EIPs attached
Multiple public IPs on a single instance are usually an anti-pattern
Consolidate to single EIP to reduce costs
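
The per-instance savings can be sketched as a simple grouping step. This is an illustration under stated assumptions: the input is a list of (allocation ID, instance ID) pairs with hypothetical IDs, and the monthly rate assumes 730 hours at $0.005/hour.

```python
from collections import Counter

# $0.005/hour over an assumed 730-hour month.
EIP_MONTHLY_COST = round(0.005 * 730, 2)  # 3.65


def extra_eip_savings(associations: list[tuple[str, str]]) -> dict[str, float]:
    """Return monthly savings per instance from keeping a single EIP."""
    counts = Counter(instance_id for _, instance_id in associations)
    return {
        instance_id: round((count - 1) * EIP_MONTHLY_COST, 2)
        for instance_id, count in counts.items()
        if count > 1  # only instances with more than one EIP are flagged
    }


# Hypothetical allocation and instance IDs for illustration.
pairs = [
    ("eipalloc-1", "i-aaa"),
    ("eipalloc-2", "i-aaa"),
    ("eipalloc-3", "i-aaa"),
    ("eipalloc-4", "i-bbb"),
]
print(extra_eip_savings(pairs))  # {'i-aaa': 7.3}
```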

AWS Reference: VPC Pricing

Management Detectors

CloudWatch Log Group Without Retention

Confidence: MEDIUM (small groups) / HIGH (large groups)
Detection: CloudWatch log groups with no retention policy (logs stored forever)
Savings: Varies based on log volume ($0.03/GB-month for storage)

How it works:

We identify all CloudWatch log groups without a retention policy
Log groups without retention store logs indefinitely (growing storage costs)
Larger log groups (>1GB) are flagged with higher confidence

Example:

Log Group: /aws/lambda/my-function
Size: 5.2GB
Retention: Never expires (no policy)
Monthly Savings: $0.16/month (storage) + preventing future growth
Recommendation: Set 30-90 day retention policy
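
The storage figure and the size-based confidence split can be sketched as follows. The function name and return shape are hypothetical; the $0.03/GB-month rate and the 1 GB cutoff come from this section.

```python
from typing import Optional

STORAGE_PRICE_PER_GB_MONTH = 0.03  # rate quoted in this section


def assess_no_retention(stored_gb: float, retention_days: Optional[int]) -> Optional[dict]:
    """Flag a log group with no retention policy and estimate its storage cost."""
    if retention_days is not None:
        return None  # a retention policy is set, so this detector does not apply
    return {
        "monthly_storage_cost": round(stored_gb * STORAGE_PRICE_PER_GB_MONTH, 2),
        "confidence": "HIGH" if stored_gb > 1 else "MEDIUM",
    }


# 5.2 GB log group with no retention policy, as in the example above.
print(assess_no_retention(5.2, None))
```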

AWS Reference: CloudWatch Pricing

Why flagged: Without retention, logs grow indefinitely. Most logs are only useful for 30-90 days.

Excessive Log Group Retention

Confidence: MEDIUM / HIGH (HIGH when >10 GB stored)
Detection: Log group with retention ≥ 365 days and ≥ 100 MB stored data
Savings: Proportional to stored data × retention reduction factor

How it works:

We check each log group's retention policy via the CloudWatch Logs API
Log groups with 365+ day retention and at least 100 MB of stored data are flagged
Savings are calculated based on reducing retention to 30 days: savings = storage_cost × (1 − 30/current_retention_days)

Example:

Log Group: /aws/lambda/api-handler
Retention: 365 days
Stored Data: 50 GB
Current Storage Cost: $1.50/month
Recommended Retention: 30 days
Estimated Savings: $1.38/month (92% reduction)
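
The savings formula can be worked through in a few lines. This is a sketch with hypothetical names; it treats 100 MB as 0.1 GB, which is an assumption about units.

```python
from typing import Optional

STORAGE_PRICE_PER_GB_MONTH = 0.03  # rate quoted in this guide
TARGET_RETENTION_DAYS = 30


def retention_savings(stored_gb: float, current_retention_days: int) -> Optional[dict]:
    """Apply savings = storage_cost × (1 − 30/current_retention_days)."""
    # Assumption: 100 MB is treated as 0.1 GB for the minimum-size check.
    if current_retention_days < 365 or stored_gb < 0.1:
        return None
    storage_cost = stored_gb * STORAGE_PRICE_PER_GB_MONTH
    reduction = 1 - TARGET_RETENTION_DAYS / current_retention_days
    return {
        "storage_cost": round(storage_cost, 2),
        "savings": round(storage_cost * reduction, 2),
        "reduction_pct": round(reduction * 100),
        "confidence": "HIGH" if stored_gb > 10 else "MEDIUM",
    }


# 50 GB stored with 365-day retention, as in the example above.
print(retention_savings(50, 365))
```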

Recommended Action: Reduce retention to 30 days. For compliance, consider S3 export + Glacier instead.

aws logs put-retention-policy \
  --log-group-name '/aws/lambda/api-handler' \
  --retention-in-days 30

Agentic Tier — Automated Fix Available

Click Fix This to reduce retention automatically, with full rollback capability.

AWS Reference: CloudWatch Logs Pricing | PutRetentionPolicy API

Why flagged: Most log searches target recent data. Storing 365+ days of logs in CloudWatch Logs is rarely cost-effective compared to S3 + Glacier for long-term archival.

Empty Log Group

Confidence: LOW
Detection: Log group with 0 bytes stored and no ingestion for 30+ days
Savings: $0.00 (hygiene — quota reclamation)

How it works:

We check each log group's stored bytes from the CloudWatch Logs API
Log groups with exactly 0 bytes are evaluated for staleness
If the log group was created 30+ days ago (or last received logs 30+ days ago), it's flagged
No savings — empty log groups are free. This is a housekeeping detector.

Example:

Log Group: /aws/lambda/deleted-processor
Stored Data: 0 bytes
Created: 180 days ago
Last Event: Never received logs
Status: Orphaned (Lambda function was deleted)
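
The staleness check can be sketched as a predicate. The function and parameter names are hypothetical, and the reference date is arbitrary; how the detector actually reads timestamps from the CloudWatch Logs API is not shown here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale_empty_log_group(
    stored_bytes: int,
    created_at: datetime,
    last_ingestion: Optional[datetime],
    now: datetime,
    stale_days: int = 30,
) -> bool:
    """True when the group holds 0 bytes and has been inactive for stale_days."""
    if stored_bytes != 0:
        return False
    # Fall back to the creation time when the group never received logs.
    last_activity = last_ingestion if last_ingestion is not None else created_at
    return now - last_activity >= timedelta(days=stale_days)


# Arbitrary reference date; the group was created 180 days earlier and never used.
now = datetime(2026, 1, 31, tzinfo=timezone.utc)
created = now - timedelta(days=180)
print(is_stale_empty_log_group(0, created, None, now))  # True
```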

Recommended Action: Delete the empty log group to reduce clutter and free quota (10,000 log groups per region default limit).

aws logs delete-log-group \
  --log-group-name '/aws/lambda/deleted-processor'

Agentic Tier — Automated Fix Available

Click Fix This to delete the empty log group automatically. CloudWatch will recreate the log group automatically if the associated service resumes logging.

AWS Reference: CloudWatch Logs Quotas

Why flagged: While free, orphaned log groups consume regional quota (default: 10,000). Accounts with many microservices, Lambda functions, or ECS tasks can accumulate hundreds of empty log groups over time.

Old Log Group

Confidence: MEDIUM
Detection: CloudWatch log group with > 0.5 GB and no new log events for 90+ days
Savings: Storage cost at $0.03/GB/month

How it works:

We check the last ingestion timestamp for each log group
Groups with no new events for 90+ days and stored volume > 0.5 GB are flagged
Stale data should be archived to S3 or deleted to reduce storage costs

Example:

Log Group: /aws/lambda/old-data-processor
Size: 2.3 GB
Last Event: 120 days ago
Monthly Savings: $0.07
Action: Export to S3 and set retention, or delete

AWS Reference: CloudWatch Logs Pricing

Why MEDIUM confidence: Some log groups are retained for audit/compliance. Verify retention requirements before deleting.

Unused CloudWatch Dashboard

Confidence: MEDIUM
Detection: CloudWatch dashboard beyond the first 3 free dashboards, not modified in 90+ days
Savings: $3.00/month per unused dashboard

How it works:

AWS provides 3 free CloudWatch dashboards — additional dashboards cost $3.00/month each
We identify paid-tier dashboards that haven't been modified in 90+ days
Dashboards that haven't been updated may no longer reflect the current infrastructure

Example:

Dashboard: legacy-monitoring
Last Modified: 145 days ago
Tier: Paid (dashboard #5 of 8)
Monthly Savings: $3.00
Action: Delete or consolidate into active dashboards
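
One way to estimate account-level savings is sketched below. It rests on an assumption that is not stated in this guide: deleting a stale dashboard only saves money while the account still has more than 3 dashboards, so savings are capped at the number of paid dashboards. The function name and sample ages are hypothetical.

```python
FREE_DASHBOARDS = 3
PRICE_PER_DASHBOARD = 3.00  # USD per month beyond the free allotment


def dashboard_savings(days_since_modified: list[int], stale_days: int = 90) -> float:
    """Estimate monthly savings from deleting stale dashboards."""
    paid = max(0, len(days_since_modified) - FREE_DASHBOARDS)
    stale = sum(1 for days in days_since_modified if days >= stale_days)
    # Assumption: savings cannot exceed the number of paid-tier dashboards.
    return min(paid, stale) * PRICE_PER_DASHBOARD


# Eight dashboards, three of them untouched for 90+ days (hypothetical ages).
print(dashboard_savings([5, 10, 145, 200, 3, 1, 95, 20]))  # 9.0
```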

AWS Reference: CloudWatch Dashboards Pricing

Why MEDIUM confidence: Dashboards may still be viewed even if not modified. The "not modified" heuristic may not capture viewing activity.

CloudTrail Optimization

Duplicate CloudTrail Trails

Confidence: HIGH
Detection: Multiple trails logging same events in same region
Savings: $2.00 per 100,000 events for duplicate trails

How it works:

Analyzes CloudTrail configurations across the account
Identifies trails with overlapping event selectors
Recommends consolidation to single trail

Example:

Trail 1: my-trail (logging all management events)
Trail 2: security-trail (logging all management events)
Issue: Duplicate logging of same events
Savings: Depends on event volume
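
Because savings depend on event volume, a rough estimate can be sketched from the $2.00 per 100,000 events rate. The function name and the 1,500,000 events/month volume are hypothetical, and the sketch assumes every trail after the first records the same management events.

```python
PRICE_PER_100K_EVENTS = 2.00  # USD, rate quoted in this section


def duplicate_trail_cost(monthly_events: int, trail_count: int) -> float:
    """Estimate the monthly cost of extra copies of the same events."""
    # Assumption: only copies beyond the first trail are billed as duplicates.
    extra_copies = max(0, trail_count - 1)
    return round(monthly_events / 100_000 * PRICE_PER_100K_EVENTS * extra_copies, 2)


# Hypothetical volume: 1,500,000 management events per month across two trails.
print(duplicate_trail_cost(1_500_000, 2))  # 30.0
```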

AWS Reference: CloudTrail Pricing

CloudTrail S3 Without Lifecycle

Confidence: MEDIUM
Detection: CloudTrail S3 bucket without lifecycle policy
Savings: Depends on log volume and retention

How it works:

Identifies S3 buckets receiving CloudTrail logs
Checks for lifecycle policy to transition/expire old logs
Recommends lifecycle policy to reduce storage costs

AWS Reference: S3 Lifecycle

🤖 AI Copilot Integration

The AI Copilot uses waste detection data to provide intelligent recommendations:

User: "What can I do to reduce my AWS costs?"

AI Copilot: "I found 3 high-confidence waste items that could save you $82.40/month:

1. **Unattached EBS Volume** (vol-abc123) - 500GB gp3 not attached to any instance
   Savings: $40/month
   Action: `aws ec2 delete-volume --volume-id vol-abc123`

2. **Idle NAT Gateway** (nat-xyz789) - No traffic for 7 days
   Savings: $32.40/month
   Action: Delete via VPC console if not needed

3. **2 Old EBS Snapshots** - Over 90 days old
   Savings: $10/month
   Action: Review and delete if no longer needed for recovery

Would you like me to explain any of these in more detail?"

The AI Copilot:

Prioritizes HIGH confidence items first
Explains the detection methodology when asked
Provides ready-to-run AWS CLI commands
Warns about potential impacts before destructive actions

📚 AWS Pricing References

Service	Official Pricing URL
EC2	https://aws.amazon.com/ec2/pricing/on-demand/
EBS	https://aws.amazon.com/ebs/pricing/
RDS	https://aws.amazon.com/rds/pricing/
DynamoDB	https://aws.amazon.com/dynamodb/pricing/
VPC (EIP, NAT)	https://aws.amazon.com/vpc/pricing/
ELB	https://aws.amazon.com/elasticloadbalancing/pricing/
Lambda	https://aws.amazon.com/lambda/pricing/
S3	https://aws.amazon.com/s3/pricing/
ECR	https://aws.amazon.com/ecr/pricing/
EFS	https://aws.amazon.com/efs/pricing/
FSx	https://aws.amazon.com/fsx/pricing/
Glue	https://aws.amazon.com/glue/pricing/
CloudWatch	https://aws.amazon.com/cloudwatch/pricing/
CloudTrail	https://aws.amazon.com/cloudtrail/pricing/
Route53	https://aws.amazon.com/route53/pricing/
CloudFront	https://aws.amazon.com/cloudfront/pricing/
Global Accelerator	https://aws.amazon.com/global-accelerator/pricing/
Kinesis	https://aws.amazon.com/kinesis/data-streams/pricing/
EMR	https://aws.amazon.com/emr/pricing/
MSK	https://aws.amazon.com/msk/pricing/
Amazon MQ	https://aws.amazon.com/amazon-mq/pricing/
ElastiCache	https://aws.amazon.com/elasticache/pricing/
Redshift	https://aws.amazon.com/redshift/pricing/
OpenSearch	https://aws.amazon.com/opensearch-service/pricing/
Neptune	https://aws.amazon.com/neptune/pricing/
DocumentDB	https://aws.amazon.com/documentdb/pricing/
Timestream	https://aws.amazon.com/timestream/pricing/
QLDB	https://aws.amazon.com/qldb/pricing/
SageMaker	https://aws.amazon.com/sagemaker/pricing/
WorkSpaces	https://aws.amazon.com/workspaces/pricing/
Lightsail	https://aws.amazon.com/lightsail/pricing/
Elastic Beanstalk	https://aws.amazon.com/elasticbeanstalk/pricing/
ECS	https://aws.amazon.com/ecs/pricing/
Step Functions	https://aws.amazon.com/step-functions/pricing/
AppSync	https://aws.amazon.com/appsync/pricing/
Transfer Family	https://aws.amazon.com/aws-transfer-family/pricing/
API Gateway	https://aws.amazon.com/api-gateway/pricing/
Secrets Manager	https://aws.amazon.com/secrets-manager/pricing/
KMS	https://aws.amazon.com/kms/pricing/
Compute Optimizer	https://aws.amazon.com/compute-optimizer/
Savings Plans	https://aws.amazon.com/savingsplans/

🔧 Threshold Configuration

CloudWise uses a two-tier threshold system to give you complete control over waste detection and notifications:

Understanding the Two Threshold Systems

┌─────────────────────────────────────────────────────────────────────────┐
│                        TIER 1: DETECTION PHASE                          │
│  Per-Account Setting (aws-accounts table)                               │
│                                                                         │
│  AWS Account Scan → Find $0.20 EBS gp2→gp3 migration opportunity        │
│  Check: Is $0.20 >= min_waste_threshold_usd ($0.01)?                    │
│  Result: YES → Store in database ✓                                      │
└─────────────────────────────────────────────────────────────────────────┘
                                    ↓
┌─────────────────────────────────────────────────────────────────────────┐
│                        TIER 2: NOTIFICATION PHASE                       │
│  Per-User Setting (users table → notification_preferences)              │
│                                                                         │
│  Check user's total waste findings → Total: $45.00/month                │
│  Check: Is $45.00 >= min_savings_threshold ($50.00)?                    │
│  Result: NO → Don't send email notification                             │
└─────────────────────────────────────────────────────────────────────────┘
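
The two checks in the diagram can be sketched as a pair of predicates. The function names are hypothetical; the parameter names mirror the settings shown in the diagram, and both comparisons use `>=` as the diagram does.

```python
def should_store(monthly_savings: float, min_waste_threshold_usd: float = 0.01) -> bool:
    """Tier 1: keep a finding only if it meets the account-level threshold."""
    return monthly_savings >= min_waste_threshold_usd


def should_notify(total_monthly_savings: float, min_savings_threshold: float = 10.00) -> bool:
    """Tier 2: notify only if total stored waste meets the user-level threshold."""
    return total_monthly_savings >= min_savings_threshold


# Values from the diagram above.
print(should_store(0.20))                                 # True
print(should_notify(45.00, min_savings_threshold=50.00))  # False
```

The tiers are independent: a finding can be stored and visible in the findings view without ever triggering a notification.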

Tier 1: Account-Level Detection Threshold

These settings control what waste items are discovered and stored in the database during scans.

Setting	Default	Range	Description
`cloudwatch_enabled`	`true`	on/off	Enable CloudWatch metrics for accurate utilization analysis
`min_waste_threshold_usd`	`$0.01`	$0.01 - $100	Minimum monthly savings to include a waste item

Where to configure:

Go to AWS Accounts page (Settings → AWS Accounts)
Click the Edit (pencil) icon on any connected account
Scroll to Waste Detection Settings section
Adjust the settings and click Update Account

Impact: Items below min_waste_threshold_usd are never stored in the database. With the default of $0.01, virtually all waste is captured.

Tier 2: User-Level Notification Threshold

These settings control when email and Slack notifications are sent about waste findings.

Setting	Default	Range	Description
`min_savings_threshold`	`$10.00`	$1 - $1000	Minimum total monthly savings to trigger notification
`frequency`	`daily`	daily/weekly/monthly	How often to send waste summary notifications
`high_priority_immediate`	`true`	on/off	Send immediately for high-priority findings
`high_priority_threshold`	`$50.00`	$10 - $500	Monthly savings threshold for high priority

Slack notifications: Waste detection can send summaries to a dedicated Slack channel. The message includes total findings, potential monthly savings, top 5 resources, and a direct link to the findings view in your Workspace.

To enable Slack for waste detection:

Go to Settings → Notifications
Find the Waste Detection card
Enable Slack notifications and paste a webhook URL
Click Test Connection to verify

See the Slack Integration Guide for full multi-channel setup.

Where to configure:

Go to Settings → Notifications
Find the Waste Findings section
Adjust your notification preferences
Click Save Changes

Impact: Emails are only sent if total waste across all accounts is at least min_savings_threshold.

Recommended Settings by Use Case

Use Case	Detection Threshold	Notification Threshold	Notes
Catch Everything (default)	$0.01	$10.00	See all waste, get notified for meaningful amounts
High-Volume Accounts	$1.00	$50.00	Reduce noise from micro-optimizations
Enterprise	$5.00	$100.00	Focus only on significant waste
Cost-Conscious Startup	$0.01	$1.00	Every penny counts

Configuration Examples

Example 1: Default (Catch Everything)

Account Detection:   $0.01  → Stores $0.05 EBS snapshot, $2.30 idle EC2
User Notification:   $10.00 → Email sent if total waste ≥ $10/month

Example 2: Enterprise (Reduce Noise)

Account Detection:   $5.00  → Ignores $0.05 EBS snapshot and $2.30 idle EC2 (both below $5.00)
User Notification:   $100.00 → Email only for significant waste

CloudWatch Metrics Toggle

The CloudWatch Metrics toggle affects detection accuracy:

Setting	Detection Method	Confidence
Enabled (recommended)	Uses actual CPU, memory, network metrics	HIGH
Disabled	Uses configuration patterns only	MEDIUM

When to disable CloudWatch:

Accounts with strict IAM policies preventing CloudWatch access
Reducing CloudWatch API costs (minimal impact, ~$0.01/scan)

🔧 Additional Configuration

Waste detection includes additional thresholds configurable by CloudWise support:

Setting	Default	Description
`ec2_idle_cpu_threshold`	5%	CPU below this = idle
`ec2_idle_days`	14	Days to analyze
`rds_idle_days`	14	Days to analyze connections
`snapshot_age_days`	90	Snapshots older than this are flagged

Contact support to adjust these for your organization.
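
To show how the idle settings might interact, here is a minimal sketch. It assumes the detector averages daily CPU utilization over the analysis window, which this guide does not confirm; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class IdleSettings:
    ec2_idle_cpu_threshold: float = 5.0  # percent, default from the table above
    ec2_idle_days: int = 14              # days to analyze


def is_idle(daily_avg_cpu: list[float], settings: IdleSettings = IdleSettings()) -> bool:
    """Assumed rule: mean CPU over the window is below the idle threshold."""
    window = daily_avg_cpu[-settings.ec2_idle_days:]
    if len(window) < settings.ec2_idle_days:
        return False  # not enough history to judge
    return mean(window) < settings.ec2_idle_cpu_threshold


print(is_idle([2.0] * 14))           # True
print(is_idle([2.0] * 13 + [80.0]))  # False
```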

💡 Best Practices

Start with HIGH confidence items - These are guaranteed savings
Review MEDIUM confidence items - Check for exceptions before acting
Use LOW confidence items for cleanup - Not cost savings, but good hygiene
Enable CloudWatch metrics - Required for idle detection
Tag your resources - Helps identify ownership for waste items
Schedule regular reviews - Run waste detection monthly

Last updated: January 2026
