Identify RDS Rightsizing
Cloud/Service/Type
- AWS
- RDS
- Rightsizing
The database instance is considered over-provisioned when CPU utilization has at least 20% margin and memory utilization has at least 10% margin consistently over a 14-day lookback period, with no minimum cost threshold.
Explanation
This automation identifies RDS instances that are consistently running with significant headroom in both CPU and memory resources. By analyzing performance metrics over a two-week period, it provides reliable recommendations that account for normal workload fluctuations while identifying true over-provisioning.
Remediation Direction
Downsize the RDS instance to an appropriately sized instance type. The automation allows for cross-family recommendations (Restrict Same Family = False) but will only recommend current generation instance types (Enforce Current Gen = True) to ensure modern performance characteristics and AWS support.
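The detection rule for this entry can be sketched in Python as follows. This is a minimal illustration, not the automation's actual implementation: it assumes "margin" means 100 minus the daily peak utilization, and the `DailyPeak` type and `is_over_provisioned` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence

CPU_MARGIN_PCT = 20.0     # from the rule: at least 20% CPU margin
MEMORY_MARGIN_PCT = 10.0  # from the rule: at least 10% memory margin
LOOKBACK_DAYS = 14        # from the rule: 14-day lookback


@dataclass
class DailyPeak:
    """Hypothetical container for one day of aggregated metrics."""
    cpu_pct: float     # peak CPU utilization for the day, 0-100
    memory_pct: float  # peak memory utilization for the day, 0-100


def is_over_provisioned(days: Sequence[DailyPeak]) -> bool:
    """True if every day in a full lookback window keeps both margins."""
    if len(days) < LOOKBACK_DAYS:
        return False  # not enough history to call the headroom consistent
    window = days[-LOOKBACK_DAYS:]
    return all(
        100.0 - d.cpu_pct >= CPU_MARGIN_PCT
        and 100.0 - d.memory_pct >= MEMORY_MARGIN_PCT
        for d in window
    )
```

Requiring the margin on every day of the window is one reading of "consistently"; a percentile-based check would be a reasonable alternative.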
Identify Unattached EIP
Cloud/Service/Type
- AWS
- EC2/Networking
- Idle Resource
Elastic IP addresses that are not associated with any running EC2 instance or network interface for any period of time.
Explanation
Unattached Elastic IP addresses incur hourly charges while delivering no value. Unlike many AWS resources that only generate costs when in use, EIPs continue to generate charges even when idle. Since February 2024 AWS bills every public IPv4 address by the hour, whether or not it is attached, and the charge is meant to discourage holding public IPv4 addresses that aren’t being utilized.
Remediation Direction
Release (delete) unattached Elastic IP addresses that are not immediately planned for use. If the EIP needs to be preserved for future use, consider documenting its purpose and planned usage date to justify the ongoing cost. For temporary workloads, consider using auto-assigned public IPs instead of EIPs.
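As a rough illustration of the check, the sketch below filters a response shaped like the EC2 `DescribeAddresses` output for addresses that carry no `AssociationId`. The function name and sample data are hypothetical, and the sketch only covers addresses with no association at all; an address attached to a stopped instance would need an extra lookup of the instance state.

```python
from typing import Any, Dict, List


def unattached_addresses(response: Dict[str, Any]) -> List[str]:
    """Return allocation IDs (or public IPs) of addresses with no association."""
    idle = []
    for addr in response.get("Addresses", []):
        # Associated addresses carry an AssociationId; unattached ones do not.
        if "AssociationId" not in addr:
            idle.append(addr.get("AllocationId", addr.get("PublicIp", "")))
    return idle


# Hypothetical sample in the shape of a DescribeAddresses response.
sample = {
    "Addresses": [
        {"AllocationId": "eipalloc-aaa", "PublicIp": "203.0.113.10",
         "AssociationId": "eipassoc-111", "InstanceId": "i-0123"},
        {"AllocationId": "eipalloc-bbb", "PublicIp": "203.0.113.11"},
    ]
}
```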
Identify Orphaned EBS Volume Risk
Cloud/Service/Type
- AWS
- EC2/EBS
- Best Practice
EBS volumes that have the “DeleteOnTermination” flag set to False, creating risk of orphaned volumes when their associated EC2 instances are terminated.
Explanation
When an EC2 instance is terminated, any attached EBS volumes with the “DeleteOnTermination” flag set to False will persist and continue incurring charges. This commonly leads to forgotten, unused volumes that accumulate costs over time. These orphaned volumes often contain no critical data but continue to generate unnecessary expenses in the environment.
Remediation Direction
Modify the “DeleteOnTermination” attribute to True for volumes that should be removed when their instances are terminated. For volumes containing data that should persist beyond instance lifecycle, implement proper tagging and monitoring to ensure they’re tracked and eventually cleaned up when no longer needed.
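The flag in question appears in each block device mapping of an EC2 `DescribeInstances` response. The following sketch, with a hypothetical function name and sample data, lists the instance and volume pairs whose volume would survive termination.

```python
from typing import Any, Dict, List, Tuple


def volumes_at_risk(response: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (instance_id, volume_id) pairs with DeleteOnTermination false."""
    risky = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            for mapping in instance.get("BlockDeviceMappings", []):
                ebs = mapping.get("Ebs", {})
                # A missing flag is treated as False so it gets reviewed.
                if ebs and not ebs.get("DeleteOnTermination", False):
                    risky.append((instance["InstanceId"], ebs["VolumeId"]))
    return risky


# Hypothetical sample in the shape of a DescribeInstances response.
sample = {
    "Reservations": [{
        "Instances": [{
            "InstanceId": "i-0abc",
            "BlockDeviceMappings": [
                {"DeviceName": "/dev/xvda",
                 "Ebs": {"VolumeId": "vol-root", "DeleteOnTermination": True}},
                {"DeviceName": "/dev/sdf",
                 "Ebs": {"VolumeId": "vol-data", "DeleteOnTermination": False}},
            ],
        }]
    }]
}
```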
Identify RDS Extended Support
Cloud/Service/Type
- AWS
- RDS
- Best Practice
RDS instances running on versions approaching or already in AWS Extended Support, based on a configurable expiration threshold that provides advance notification before extended support begins.
Explanation
When RDS engine versions reach end of standard support, AWS moves them to Extended Support, which incurs significant additional charges. For provisioned instances the fee is billed per vCPU-hour on top of the normal instance price, which often works out to roughly 1.5-3x the standard RDS cost depending on instance size. These instances continue to function but at substantially higher cost. Extended Support is designed as a temporary bridge to provide time for version upgrades, not as a long-term operating model.
Remediation Direction
Plan and execute upgrades to supported RDS engine versions before the extended support date. Prioritize instances based on their expiration dates and business criticality. For complex applications, establish testing environments to validate compatibility with newer database versions and resolve any potential issues before production migration.
ElastiCache RI Recommender
Cloud/Service/Type
- AWS
- ElastiCache
- Cost Optimization
ElastiCache instances running on-demand for extended periods that could benefit from Reserved Instance (RI) pricing, based on consistent usage patterns and predictable workloads.
Explanation
On-demand ElastiCache instances provide flexibility but at premium pricing. For stable, predictable ElastiCache workloads, significant cost savings (up to 60%) can be achieved through Reserved Instance commitments. This automation identifies ElastiCache clusters with consistent usage patterns where the financial benefits of RIs would outweigh the commitment risk.
Remediation Direction
Purchase appropriate ElastiCache Reserved Instances based on the recommendation, considering term length (1 or 3 years) and payment option (no upfront, partial upfront, all upfront) that best align with your organization’s cash flow preferences and commitment tolerance. Maintain regular reviews of RI coverage and utilization to optimize future purchases.
RDS RI Recommender
Cloud/Service/Type
- AWS
- RDS
- Cost Optimization
RDS database instances running on-demand for consistent periods that could benefit from Reserved Instance (RI) pricing, based on stable usage patterns and predictable workloads.
Explanation
On-demand RDS instances offer flexibility but at higher prices. For stable database workloads with predictable usage, Reserved Instances can provide substantial discounts (up to 60% depending on term and payment options). This automation analyzes historical RDS usage patterns to identify instances that would benefit financially from RI commitments based on their steady-state operation.
Remediation Direction
Purchase appropriate RDS Reserved Instances matching the engine type, instance size, and region based on the recommendations. Consider term length (1 or 3 years) and payment option that best align with your financial planning horizon. For multi-AZ deployments, ensure RIs are purchased to cover all instances. Regularly review RI utilization to maintain optimal coverage as your database fleet evolves.
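The trade-off between commitment and flexibility comes down to simple arithmetic: a reservation is paid for every hour of the term, while on-demand is paid only for hours used. The sketch below shows that calculation with hypothetical function names; the prices in the usage note are made up for illustration and are not AWS rates.

```python
def break_even_utilization(on_demand_hourly: float,
                           ri_effective_hourly: float) -> float:
    """Fraction of the term an instance must run for the RI to be cheaper."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return ri_effective_hourly / on_demand_hourly


def term_savings(on_demand_hourly: float, ri_effective_hourly: float,
                 hours_used: float, hours_in_term: float) -> float:
    """Savings from the RI over the term; negative means the RI costs more."""
    on_demand_cost = on_demand_hourly * hours_used    # pay only for usage
    ri_cost = ri_effective_hourly * hours_in_term     # pay for every hour
    return on_demand_cost - ri_cost
```

With an illustrative on-demand rate of 0.20 per hour and an effective RI rate of 0.12 per hour, the instance has to run for about 60% of the term before the reservation pays off.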
Upgrade RDS Storage from GP2 to GP3
Cloud/Service/Type
- AWS
- RDS
- Optimization/Best Practice
RDS instances using older GP2 storage volumes when newer, more flexible GP3 storage is available and can provide potentially better performance characteristics for the same cost.
Explanation
While GP2 and GP3 have the same base storage pricing for RDS, GP3 offers more predictable and flexible performance management. GP3 allows independent provisioning of IOPS and throughput, rather than the automatic but less predictable scaling of GP2. This gives you better control over performance and can prevent unexpected performance issues during peak loads when properly configured.
Remediation Direction
Modify existing RDS instances to use GP3 storage instead of GP2, focusing on optimizing the IOPS and throughput settings based on your workload patterns. This change can be made with minimal downtime during a maintenance window. When migrating, analyze historical performance metrics to properly configure the GP3 performance parameters to match or exceed your current GP2 performance.
Identify Idle CloudWatch Log Groups
Cloud/Service/Type
- AWS
- CloudWatch
- Idle Resource
CloudWatch Log Groups that have not received any new log events for an extended period (typically 30+ days) while still incurring storage costs.
Explanation
CloudWatch Log Groups continue to generate storage charges regardless of whether they’re actively receiving logs. Many log groups are created for temporary workloads or troubleshooting but never cleaned up after use. These idle log groups accumulate over time, especially in environments with automated infrastructure deployment, creating ongoing unnecessary costs.
Remediation Direction
Set appropriate retention periods for log groups based on compliance and operational requirements, then delete idle log groups that are no longer needed. For critical systems, consider exporting historical logs to S3 with lifecycle policies before deletion to reduce costs while maintaining access to historical data if needed in the future.
Identify Idle OpenSearch Domains
Cloud/Service/Type
- AWS
- OpenSearch
- Idle Resource
OpenSearch domains that consistently show minimal resource utilization (below the configured CPU and memory thresholds) during the specified lookback period, indicating they are significantly overprovisioned or not actively used.
Explanation
OpenSearch domains can be expensive resources due to their compute, storage, and memory requirements. Idle or minimally used OpenSearch clusters continue to incur full costs regardless of actual utilization. These often result from test environments, completed projects, or workloads that were migrated but not properly decommissioned.
Remediation Direction
For truly idle domains, consider decommissioning them entirely after confirming they’re no longer needed. For underutilized domains still serving a purpose, downsize the instance types or reduce the number of nodes to match actual workload requirements. If the domain has sporadic usage patterns, evaluate if on-demand querying of data in S3 might be more cost-effective than maintaining a persistent cluster.
Identify Unattached EBS Volumes
Cloud/Service/Type
- AWS
- EC2/EBS
- Idle Resource
EBS volumes that are not attached to any EC2 instance and have remained in this state for a period of time, continuing to incur storage charges without providing any value.
Explanation
Unattached EBS volumes generate the same storage costs as attached volumes. These often result from terminated instances where the DeleteOnTermination flag was set to False, or from manual detachment during troubleshooting that was never followed up with reattachment or deletion. As these volumes accumulate over time, they can represent significant unnecessary costs.
Remediation Direction
Take snapshots of unattached volumes if they might contain valuable data, then delete the volumes. Implement tagging policies to mark volumes with their purpose and expiration date when intentionally keeping them unattached. Consider using automated lifecycle policies to identify and alert on volumes that remain unattached beyond a certain threshold.
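A first pass at finding these volumes can work from a response shaped like EC2 `DescribeVolumes`, where an unattached volume reports the state `available`. The helper names below are hypothetical, and the sketch only reflects the current state; how long a volume has been detached is not part of that response and would have to come from another source such as audit logs.

```python
from typing import Any, Dict, List


def unattached_volumes(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return volumes whose state is 'available' (not attached to anything)."""
    return [v for v in response.get("Volumes", [])
            if v.get("State") == "available"]


def total_size_gib(volumes: List[Dict[str, Any]]) -> int:
    """Sum provisioned size, which is what storage charges are based on."""
    return sum(v.get("Size", 0) for v in volumes)


# Hypothetical sample in the shape of a DescribeVolumes response.
sample = {
    "Volumes": [
        {"VolumeId": "vol-1", "State": "in-use", "Size": 100},
        {"VolumeId": "vol-2", "State": "available", "Size": 500},
        {"VolumeId": "vol-3", "State": "available", "Size": 20},
    ]
}
```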
SageMaker Savings Plans
Cloud/Service/Type
- AWS
- SageMaker
- Cost Optimization
SageMaker workloads consistently running on-demand over the specified lookback period that could benefit from Savings Plans commitments, where the potential savings exceed the defined cost threshold.
Explanation
SageMaker on-demand pricing provides flexibility but at premium rates. For persistent SageMaker workloads with predictable usage patterns, Savings Plans can provide significant discounts (up to 64% depending on term and payment options). This automation analyzes historical SageMaker usage across the specified account scope to identify cost-effective commitment opportunities based on consistent usage.
Remediation Direction
Purchase appropriate SageMaker Savings Plans based on the recommendation, considering the term length (1 or 3 years) and payment option (no upfront, partial upfront, all upfront) that best align with your organization’s financial strategy. Monitor utilization regularly to ensure optimal coverage as workloads evolve, and consider additional purchases as new stable workloads are identified.
Identify Idle Redshift Clusters
Cloud/Service/Type
- AWS
- Redshift
- Idle Resource
Redshift clusters that show minimal query activity or resource utilization over the 30-day lookback period, indicating they are not actively being used while continuing to incur full costs.
Explanation
Idle Redshift clusters can be significant sources of waste as they incur the same costs regardless of actual utilization. These often result from completed analytics projects, development/testing environments, or workloads that were migrated but never properly decommissioned. Redshift’s pricing model is based on provisioned capacity rather than consumption, making idle clusters particularly costly.
Remediation Direction
For truly idle clusters, export any valuable data to S3 and then terminate the cluster. For clusters with sporadic usage, consider pausing them during periods of inactivity or switching to Redshift Serverless to better align costs with actual usage. Evaluate if low-usage clusters could benefit from a reduction in node count or sizing to better match actual workload requirements.
Identify ASG Configuration Optimization
Cloud/Service/Type
- AWS
- EC2/Auto Scaling
- Best Practice/Rightsizing
Auto Scaling Groups with suboptimal configurations identified by AWS Trusted Advisor, such as consistently operating at minimum capacity, having inappropriate scaling thresholds, or using inefficient instance types.
Explanation
Improperly configured ASGs can lead to waste in multiple ways: over-provisioned minimum capacities maintain unnecessary instances during low demand, inefficient scaling policies cause slow responses to demand changes, and outdated instance type selections miss opportunities for better price-performance. AWS Trusted Advisor analyzes historical patterns and instance utilization to identify these configuration issues.
Remediation Direction
Adjust minimum and maximum capacity settings to better align with actual workload patterns. Review and tune scaling policies to respond more efficiently to workload changes, considering both scaling in and out behaviors. Evaluate recommended instance types that provide the same or better performance at lower cost, potentially including newer generation instances with better price-performance ratios.
Identify Idle S3 Buckets
Cloud/Service/Type
- AWS
- S3
- Idle Resource
S3 buckets that have received no GET requests or other access activity for 60 days or more, indicating they contain data that is not being actively used or accessed.
Explanation
Idle S3 buckets generate ongoing storage costs regardless of whether the data is actively used. While individual S3 storage costs may seem small compared to compute resources, they accumulate significantly over time, especially for large data stores. These idle buckets often contain old project data, test files, or backups that are no longer needed but were never cleaned up.
Remediation Direction
Review the content of idle buckets to determine if the data can be deleted or should be retained. For data that must be kept but is rarely accessed, transition objects to less expensive storage classes like S3 Glacier or S3 Glacier Deep Archive. Implement lifecycle policies to automatically move objects to appropriate storage tiers based on age and access patterns, or to delete temporary data after specific retention periods.
Identify RDS Graviton Eligibility
Cloud/Service/Type
- AWS
- RDS
- Optimization
RDS instances running on x86-based instance types that could be migrated to ARM-based Graviton processors for better price-performance without compatibility issues.
Explanation
AWS Graviton processors offer significant cost savings (up to 20-40%) compared to equivalent x86 instances while often delivering better performance for database workloads. This automation identifies RDS instances that are compatible with Graviton architecture and would benefit from migration, considering factors like engine type, version compatibility, and instance family.
Remediation Direction
Plan instance type modifications to move eligible RDS instances to equivalent Graviton-based instance types (instances with “g” in the name, like db.m6g or db.r6g). This change typically requires a maintenance window with some downtime. Test the database performance after migration to ensure it meets or exceeds previous levels, and adjust parameter groups if necessary to optimize for the Graviton architecture.
Compute Savings Plans Recommendation
Cloud/Service/Type
- AWS
- EC2/Lambda/Fargate
- Cost Optimization
Consistent on-demand usage of compute services (EC2, Lambda, and Fargate) that could benefit from Compute Savings Plans commitments based on stable usage patterns across the organization.
Explanation
On-demand pricing for compute services provides maximum flexibility but at premium rates. Compute Savings Plans offer significant discounts (up to 66% depending on term and payment options) in exchange for a commitment to a consistent amount of compute usage measured in dollars per hour. Unlike Reserved Instances, Compute Savings Plans provide flexibility to change instance types, sizes, families, regions, and even services without losing the discount.
Remediation Direction
Purchase appropriate Compute Savings Plans based on the recommendation, considering the term length (1 or 3 years) and payment option that best align with your organization’s financial objectives. Focus initial purchases on covering the most stable baseline usage, then incrementally add more coverage as confidence in long-term usage patterns increases. Monitor utilization rates regularly to maintain optimal coverage as your compute fleet evolves.
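One conservative way to size a first purchase is to commit only to the spend floor, so that the commitment is fully used even in the quietest hour. The sketch below assumes hourly on-demand spend figures are available and that a single blended discount rate applies; both the function name and the `assumed_discount` parameter are hypothetical simplifications, since real discount rates vary by instance family, region, and term.

```python
from typing import Sequence


def baseline_commitment(hourly_on_demand_spend: Sequence[float],
                        assumed_discount: float) -> float:
    """Hourly commitment (USD) that the lowest observed hour would fully use."""
    if not hourly_on_demand_spend:
        raise ValueError("need at least one hourly sample")
    if not 0.0 <= assumed_discount < 1.0:
        raise ValueError("assumed_discount must be in [0, 1)")
    floor_spend = min(hourly_on_demand_spend)
    # The commitment is expressed at Savings Plans rates, so the on-demand
    # floor is scaled down by the assumed discount.
    return floor_spend * (1.0 - assumed_discount)
```

Using the minimum leaves some savings on the table; a low percentile of hourly spend is a common, slightly more aggressive alternative.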
Identify EBS Configuration Optimization
Cloud/Service/Type
- AWS
- EC2/EBS
- Best Practice/Optimization
EBS volumes with configuration inefficiencies identified by AWS Trusted Advisor, such as over-provisioned IOPS, suboptimal volume types for the workload, or unnecessarily large volumes with low utilization.
Explanation
Improperly configured EBS volumes often lead to unnecessary costs through over-provisioning. This includes using Provisioned IOPS (io1/io2) volumes for workloads that don’t require high performance, maintaining high IOPS levels that are never utilized, or provisioning excessive storage capacity. Trusted Advisor analyzes historical performance metrics to identify these mismatches between configuration and actual usage patterns.
Remediation Direction
Modify EBS volumes to more appropriate types based on the actual workload requirements (e.g., switching from Provisioned IOPS to GP3). Adjust provisioned IOPS and throughput levels to better align with observed usage patterns. For volumes with low capacity utilization, consider resizing to appropriate levels after ensuring there are no future growth requirements that justify the current size.
Identify Idle ECR Images
Cloud/Service/Type
- AWS
- ECR
- Idle Resource
Container images in Amazon ECR repositories that have not been pulled or updated for at least 30 days and are at least 30 days old, indicating they are no longer actively used in deployments.
Explanation
Idle ECR images consume storage and incur ongoing costs regardless of whether they’re being used. As organizations build and push new container images, older unused versions accumulate, especially in CI/CD environments with frequent builds. These idle images not only generate unnecessary storage costs but can also make repository management more complex and increase the risk of deploying outdated images.
Remediation Direction
Implement ECR lifecycle policies to automatically expire and delete images based on age and usage patterns. Consider retaining only the most recent versions of images for each major release while purging intermediate builds. Tag important images (like production releases) with specific tags to exempt them from automated cleanup, and document your image retention policies to align with your deployment and rollback requirements.
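A lifecycle policy of the kind described above is a JSON document attached to a repository. The sketch below builds one that expires untagged images 30 days after push; the function name is hypothetical, and selecting on push age and untagged status is a simplification of the detection rule, which also considers pull activity.

```python
import json


def expire_untagged_policy(days: int = 30) -> str:
    """Build an ECR lifecycle policy document that expires old untagged images."""
    policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Expire untagged images older than {days} days",
                "selection": {
                    "tagStatus": "untagged",
                    "countType": "sinceImagePushed",
                    "countUnit": "days",
                    "countNumber": days,
                },
                "action": {"type": "expire"},
            }
        ]
    }
    return json.dumps(policy, indent=2)
```

Because only untagged images are selected, release images that carry a tag are left alone, which matches the advice to protect production releases from automated cleanup.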
Identify Idle Kinesis Streams
Cloud/Service/Type
- AWS
- Kinesis
- Idle Resource
Kinesis streams that show minimal or no data ingestion (PUT records) and consumption (GET records) activity over an extended period while continuing to incur charges for provisioned shards.
Explanation
Kinesis Data Streams are billed based on the number of shards provisioned, regardless of actual throughput utilization. Idle streams with provisioned capacity continue to generate costs even when not processing data. These often result from completed projects, test environments, or data pipelines that were migrated or decommissioned without properly cleaning up the stream resources.
Remediation Direction
Delete truly idle streams after confirming they’re no longer needed by any applications or workflows. For streams with intermittent usage, consider implementing on-demand capacity mode (if available for your use case) or reducing the number of provisioned shards to match actual throughput requirements. Document any intentionally maintained streams with appropriate tagging to prevent accidental deletion during future cleanup efforts.
RDS Aurora IOPS Optimized
Cloud/Service/Type
- AWS
- RDS Aurora
- Optimization
Aurora clusters operating with the standard storage configuration but demonstrating I/O patterns that would benefit from the Aurora I/O-Optimized storage configuration (referred to here as IOPS-optimized), based on their workload characteristics and performance requirements.
Explanation
Aurora offers an I/O-Optimized storage configuration designed for I/O-intensive workloads that can provide better performance predictability for databases with high write throughput requirements. With I/O-Optimized, read and write I/O operations are no longer billed separately, in exchange for higher instance and storage rates. The standard configuration is adequate for many workloads, but databases with heavy write operations or specific I/O patterns may experience better performance and potentially lower costs by switching to the optimized configuration that aligns with their actual usage patterns.
Remediation Direction
Modify eligible Aurora clusters to use the IOPS-optimized storage configuration. This change requires a maintenance window and may cause brief downtime. Before making the change, analyze the I/O patterns carefully to ensure the workload will genuinely benefit from the optimized configuration, as this option is best suited for specific workload types with high, consistent I/O demands rather than general-purpose database workloads.
Identify EC2 Rightsizing
Cloud/Service/Type
- AWS
- EC2
- Rightsizing
EC2 instances consistently operating with low CPU utilization, underutilized network capacity, or suboptimal EBS I/O and throughput patterns over an extended period, indicating they are over-provisioned for their actual workload requirements.
Explanation
Over-provisioned EC2 instances waste cloud spend by providing computing resources that remain largely unused. AWS Compute Optimizer analyzes multiple performance metrics including CPU utilization, network throughput, and EBS I/O patterns to identify instances that could be downsized without impacting performance. Since EC2 charges are based on the provisioned capacity regardless of utilization, right-sizing represents one of the most impactful FinOps optimizations.
Remediation Direction
Modify instance types to appropriately sized options based on the comprehensive historical utilization patterns identified by Compute Optimizer. Consider switching to instance families better aligned with the workload characteristics (compute-optimized, storage-optimized, etc.). For workloads with variable resource needs, evaluate using Auto Scaling with appropriate instance types rather than statically provisioning for peak capacity.
S3 Lifecycle Incomplete Partial Upload
Cloud/Service/Type
- AWS
- S3
- Best Practice
S3 buckets containing incomplete multipart uploads that have been abandoned and left in a partial state, continuing to accrue storage charges while providing no functional value.
Explanation
When multipart uploads to S3 are initiated but not completed or aborted, the partially uploaded data remains in storage and incurs the same charges as complete objects. These incomplete uploads often result from failed transfers, application errors, or interrupted processes. Since they’re not visible in standard object listings, these abandoned uploads can go unnoticed for extended periods while generating ongoing costs.
Remediation Direction
Implement S3 Lifecycle rules to automatically remove incomplete multipart uploads after a specified period (typically 7 days). Configure application logic to properly abort multipart uploads if the process cannot be completed successfully. Regularly audit buckets for existing incomplete uploads and remove them to immediately reduce current storage costs.
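The lifecycle rule described above can be expressed as a small configuration document. The sketch below builds it as a Python dictionary in the shape accepted by the S3 `PutBucketLifecycleConfiguration` API (for example via boto3's `put_bucket_lifecycle_configuration`); the function name and rule ID are hypothetical.

```python
from typing import Any, Dict


def abort_incomplete_uploads_rule(days: int = 7) -> Dict[str, Any]:
    """Lifecycle configuration that aborts multipart uploads left incomplete."""
    return {
        "Rules": [
            {
                "ID": f"abort-incomplete-mpu-{days}d",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix applies to the whole bucket
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }
```

Note that putting a lifecycle configuration replaces whatever configuration the bucket already has, so in practice this rule should be merged into the existing rule list rather than applied on its own.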
Identify Idle Snapshots
Cloud/Service/Type
- AWS
- EC2/EBS
- Idle Resource
EBS snapshots that are at least 90 days old and AMIs that have not been used to launch instances for at least 90 days, indicating they are likely no longer needed for recovery or deployment purposes.
Explanation
Idle snapshots and unused AMIs continue to incur storage charges regardless of whether they’re being actively used. Organizations often create snapshots and AMIs for temporary backups, golden images, or disaster recovery but fail to implement cleanup processes. As these resources accumulate over time, they can represent significant ongoing costs, especially for larger volumes or when stored across multiple regions.
Remediation Direction
Review idle snapshots and AMIs to confirm they are truly no longer needed before deletion. Implement proper tagging for snapshots and AMIs to indicate purpose, expiration date, and associated resources. Create automated lifecycle policies to archive or delete snapshots/AMIs based on age and usage patterns, while ensuring critical recovery points are preserved according to backup retention requirements.
AWS Cost Anomalies
Cloud/Service/Type
- AWS
- Cost Management
- Best Practice
Unexpected or significant deviations from established spending patterns across AWS services, indicating potential waste, misconfiguration, or unauthorized usage that requires investigation.
Explanation
Cost anomalies represent sudden changes in spending that fall outside normal usage patterns for your organization. These anomalies can result from legitimate business needs (like project launches or scaling events) but often indicate waste-generating issues such as misconfigurations, forgotten test resources, or compromised credentials. Early detection of these anomalies helps prevent small issues from becoming significant financial problems.
Remediation Direction
Investigate identified anomalies promptly to determine root causes. Enable and configure AWS Cost Anomaly Detection to automatically identify unusual spending patterns across your organization, setting appropriate thresholds based on your normal spending fluctuations. Implement notification mechanisms to alert appropriate teams when anomalies are detected, and establish clear response procedures for investigating and addressing these alerts.
Identify Idle Load Balancer
Cloud/Service/Type
- AWS
- ELB/ALB/NLB
- Idle Resource
Load balancers that have processed minimal or no traffic (measured by request count or data transfer) over a 336-hour (14-day) lookback period, indicating they are not actively distributing traffic to any backend services.
Explanation
Idle load balancers continue to generate hourly charges regardless of traffic volume. These often result from decommissioned applications where the load balancer was forgotten, temporary testing environments, or staging systems that are no longer in use. Each idle load balancer typically costs $16-25 per month plus data processing charges, making them expensive resources to leave running unnecessarily.
Remediation Direction
Delete truly idle load balancers after confirming they’re not serving any critical applications. Document load balancer configurations before deletion in case they need to be recreated in the future. Implement proper tagging strategies to identify purpose, associated application, and responsible team to facilitate regular reviews of load balancer utilization across the environment.
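The 336-hour lookback check can be sketched as a pure function over metric datapoints. This is an illustration under assumptions: the datapoints are `(timestamp, value)` pairs for a traffic metric such as request count, missing hours are treated as zero traffic because such metrics can be sparse, and a load balancer younger than the lookback window is never flagged. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Tuple

LOOKBACK = timedelta(hours=336)  # 14 days, as in the rule


def is_idle(created_at: datetime,
            datapoints: Iterable[Tuple[datetime, float]],
            now: datetime,
            max_total: float = 0.0) -> bool:
    """True if total traffic inside the lookback window is at most max_total."""
    window_start = now - LOOKBACK
    if created_at > window_start:
        return False  # too new to judge over a full lookback period
    total = sum(value for ts, value in datapoints
                if window_start <= ts <= now)
    return total <= max_total
```

Raising `max_total` above zero turns the check from "no traffic" into "minimal traffic", which is how the rule's wording leaves room for health checks or stray requests.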
Identify Idle RDS
Cloud/Service/Type
- AWS
- RDS
- Idle Resource
RDS instances with zero maximum database connections over a 14-day lookback period, indicating they are not being accessed or utilized by any applications while continuing to incur full charges.
Explanation
Idle RDS instances generate the same costs as active databases regardless of usage. These often result from completed projects, development environments, or applications that were migrated without proper decommissioning. Since RDS pricing is based on provisioned capacity rather than actual utilization, idle instances represent pure waste with no corresponding business value.
Remediation Direction
Take a final snapshot of idle RDS instances before termination if data retention might be needed. For databases that must be preserved but aren’t actively used, consider stopping the instance as a short-term measure (RDS automatically restarts a stopped instance after 7 days, and storage is still billed while it is stopped), or exporting the data to S3 and terminating the instance. Implement proper instance tagging to document purpose, expiration dates, and contacts to facilitate easier identification of truly idle resources.
Identify EKS Clusters Extended Support
Cloud/Service/Type
- AWS
- EKS
- Best Practice
EKS clusters running on Kubernetes versions approaching or already in the extended support period, with an expiration threshold of 100 days to provide advance notification before the version enters extended support.
Explanation
When Kubernetes versions in EKS reach end of standard support, AWS moves them to extended support, which incurs additional charges while providing a limited window for upgrades. At the time of writing, AWS lists the cluster fee at $0.60 per cluster-hour in extended support versus $0.10 in standard support, a sixfold increase in the control plane charge. These additional charges are designed to encourage timely upgrades rather than serve as a long-term operational model.
Remediation Direction
Plan and execute Kubernetes version upgrades before the extended support date. Review application compatibility with newer Kubernetes versions and address any potential issues before upgrading production clusters. Consider a phased approach by upgrading non-production clusters first to identify any compatibility challenges. Document the upgrade process and create a regular cadence for Kubernetes version reviews to avoid future extended support scenarios.
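The 100-day notification threshold amounts to a date comparison. The sketch below is a minimal illustration with hypothetical function and status names; the end-of-standard-support date for each Kubernetes version has to be taken from the EKS release calendar and is passed in as an input here.

```python
from datetime import date

THRESHOLD_DAYS = 100  # from the rule: notify 100 days ahead


def support_status(end_of_standard_support: date, today: date) -> str:
    """Classify a cluster version relative to its end of standard support."""
    days_left = (end_of_standard_support - today).days
    if days_left < 0:
        return "in_extended_support"
    if days_left <= THRESHOLD_DAYS:
        return "approaching"
    return "ok"
```

Sorting clusters by `days_left` gives a natural upgrade priority order, with clusters already in extended support first.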
Identify EC2 Stopped Instances
Cloud/Service/Type
- AWS
- EC2
- Idle Resource
EC2 instances that are at least 30 days old and are currently in a stopped state, generating ongoing charges for attached EBS volumes and provisioned Elastic IP addresses without providing any compute value.
Explanation
Stopped EC2 instances do not incur compute charges but continue to generate costs for associated resources, primarily EBS volumes. These instances often indicate temporary shutdowns that were never revisited, failed projects, or test environments that were abandoned. Long-term stopped instances represent an opportunity cost by tying up allocated resources and generating maintenance overhead.
Remediation Direction
Evaluate whether stopped instances are truly needed before taking action. For instances no longer required, take final AMI or EBS snapshots if necessary, then terminate the instances to eliminate all associated charges. For instances needed in the future, consider creating AMIs and terminating the instances rather than keeping them in a stopped state. Implement proper tagging to document purpose and planned usage dates for any intentionally stopped instances.
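A simple version of this check can run over a response shaped like EC2 `DescribeInstances`. The sketch below uses `LaunchTime` as a stand-in for instance age, which is an assumption: that field reflects the most recent launch, so it approximates how long ago the instance was last started rather than when it was first created. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def long_stopped_instances(response: Dict[str, Any],
                           now: datetime,
                           min_age: timedelta = timedelta(days=30)) -> List[str]:
    """Return IDs of stopped instances whose LaunchTime is at least min_age old."""
    found = []
    for reservation in response.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            stopped = inst.get("State", {}).get("Name") == "stopped"
            old_enough = now - inst["LaunchTime"] >= min_age
            if stopped and old_enough:
                found.append(inst["InstanceId"])
    return found
```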
Identify Idle NAT Gateway
Cloud/Service/Type
- AWS
- VPC/Networking
- Idle Resource
NAT Gateways that have processed no data transfer over a 336-hour (14-day) lookback period, indicating they are not actively being used by resources in private subnets to access the internet.
Explanation
Idle NAT Gateways incur hourly charges regardless of actual usage. Each deployed NAT Gateway costs approximately $32-45 per month plus data processing fees, making them relatively expensive networking components to maintain when not actively utilized. These idle gateways often result from decommissioned workloads, test environments, or over-provisioned network architectures where private subnets no longer contain active resources.
Remediation Direction
Delete NAT Gateways in VPCs with no active resources requiring outbound internet access. For environments with intermittent workloads, consider implementing automation to create and remove NAT Gateways on-demand rather than keeping them provisioned continuously. Ensure proper architecture documentation and tagging to track which NAT Gateways are essential for which workloads.
Identify CloudTrail Optimizations
Cloud/Service/Type
- AWS
- CloudTrail
- Best Practice/Cost Optimization
CloudTrail configurations that generate unnecessary costs through duplicate trails, excessive data ingestion, or inefficient storage and retention configurations.
Explanation
CloudTrail inefficiencies commonly include multiple overlapping trails recording the same events across accounts or regions, unnecessarily logging read-only API calls, storing logs directly in S3 without lifecycle policies, or maintaining longer retention periods than required by compliance needs. Since CloudTrail costs are based on the volume of events recorded and storage consumed, optimizing these configurations can provide meaningful cost savings without compromising security or compliance.
Remediation Direction
Consolidate multiple trails into organization trails where possible to reduce duplication. Configure trails to be selective about the event types they capture, filtering out high-volume, low-value events when not required for compliance. Implement appropriate S3 lifecycle policies to transition older logs to cheaper storage tiers or expire them according to your retention requirements. Consider using CloudTrail Insights selectively rather than enabling it broadly across all trails.
Identify S3 Early Delete Charges
Cloud/Service/Type- AWS
- S3
- Best Practice/Cost Optimization
S3 objects deleted before the minimum storage duration of their storage class has elapsed, resulting in prorated early deletion fees. Explanation
Several S3 storage classes impose minimum storage durations, including Glacier (90 days), Glacier Deep Archive (180 days), and Standard-IA and One Zone-IA (both 30 days). Intelligent-Tiering no longer carries a minimum storage duration. Deleting objects before these minimum durations results in charges for the remaining commitment period. These charges commonly occur due to improper lifecycle configurations, misunderstanding of storage tier requirements, or temporary data being moved to inappropriate storage classes. Remediation Direction
Audit your S3 billing for early deletion charges and identify the affected buckets and prefixes. Review and adjust lifecycle policies to ensure objects are only transitioned to storage tiers with minimum durations when they will remain stored for the full commitment period. Use Standard storage for objects with unpredictable or short retention needs. Implement tagging strategies to clearly identify expected data lifespans and align storage class decisions with these expectations.
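The prorated fee can be estimated with a short calculation. The sketch below is illustrative only: the function name, the 30-day month used for proration, and the price passed in are assumptions, and the table covers only the Glacier and Infrequent Access classes named above.

```python
# Minimum storage durations in days for the Glacier and Infrequent Access
# classes named above, keyed by the S3 StorageClass value.
MIN_STORAGE_DAYS = {
    "GLACIER": 90,
    "DEEP_ARCHIVE": 180,
    "STANDARD_IA": 30,
    "ONEZONE_IA": 30,
}


def early_delete_charge(storage_class, days_stored, size_gb, price_per_gb_month):
    """Estimate the prorated fee for deleting an object early.

    The fee covers the days remaining in the minimum duration, billed at the
    class's storage rate. A 30-day month is assumed for proration.
    """
    minimum = MIN_STORAGE_DAYS.get(storage_class, 0)  # 0 = no minimum
    remaining_days = max(0, minimum - days_stored)
    return size_gb * price_per_gb_month * remaining_days / 30
```

For example, 100 GB deleted from Glacier after 30 days leaves 60 days of the commitment unpaid, which is two months of storage at whatever rate applies in your region.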
Identify Idle VPC Endpoints
Cloud/Service/Type- AWS
- VPC/Networking
- Idle Resource
VPC Endpoints that show no data transfer activity over an extended period, indicating they are not actively being used by resources within the VPC to access the associated AWS services. Explanation
Idle VPC Endpoints continue to incur hourly charges regardless of actual usage. Interface endpoints (powered by AWS PrivateLink) cost approximately $7-10 per endpoint per month per Availability Zone, plus data processing fees. These idle endpoints often result from decommissioned workloads, test environments, or architectural changes where the services they were configured to access are no longer being used within the VPC. Remediation Direction
Delete unused VPC Endpoints after confirming they’re not essential for any active workloads. Document endpoint configurations before deletion in case they need to be recreated in the future. Implement proper tagging for VPC Endpoints to identify purpose, associated application, and responsible team to facilitate regular reviews of endpoint utilization across the environment.
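To size the waste from an idle interface endpoint, multiply the hourly rate by the hours in a month and the number of Availability Zones. The sketch below assumes a rate of $0.01 per hour per Availability Zone and a 730-hour month; both figures and the function name are assumptions, so substitute the rate for your region.

```python
HOURS_PER_MONTH = 730  # average hours in a month (assumption)


def idle_endpoint_monthly_cost(az_count, hourly_rate_per_az=0.01):
    """Fixed monthly charge for an interface endpoint that moves no data.

    Data processing fees are excluded because an idle endpoint processes
    nothing. The default hourly rate is an assumed example value.
    """
    return az_count * hourly_rate_per_az * HOURS_PER_MONTH
```

An endpoint deployed across three Availability Zones therefore costs about three times the single-zone figure, even when no traffic flows through it.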
CloudFront Compression Optimization
Cloud/Service/Type- AWS
- CloudFront
- Best Practice/Optimization
CloudFront distributions where the compression conditions are not fully met. This covers distributions using legacy cache settings without compression enabled, and distributions using cache policies where compression is disabled, TTLs are not greater than 0, or either Brotli or Gzip compression is not enabled. Explanation
This automation audits CloudFront distributions for optimal compression configuration. Improper compression settings lead to higher data transfer costs and slower content delivery. The automation collects detailed information about compression-related configuration gaps, identifying which specific compression conditions are not met for each distribution (TTLs, compression enablement, supported algorithms). Remediation Direction
Address the specific compression configuration gaps identified by the automation. For distributions using cache policies, ensure compression is enabled and both Brotli and Gzip algorithms are supported. For distributions using legacy cache settings, verify compression is properly configured. Ensure TTL settings are appropriate for the content being served to maximize caching and compression benefits.
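A per-distribution gap check along these lines can be sketched as follows. It is a hedged illustration rather than the automation's implementation: the function name and gap messages are hypothetical, the input dicts are assumed to follow the shape of CloudFront's `DefaultCacheBehavior` and `CachePolicyConfig` structures, and treating "all TTLs are zero" as the failing TTL case is an assumption.

```python
def compression_gaps(cache_behavior, cache_policy_config=None):
    """List the compression conditions a cache behavior fails to meet.

    cache_behavior: dict shaped like a CloudFront cache behavior.
    cache_policy_config: the attached CachePolicyConfig dict, or None when
    the behavior uses legacy cache settings.
    """
    gaps = []
    if not cache_behavior.get("Compress", False):
        gaps.append("compression disabled on cache behavior")
    if cache_policy_config is None:
        # Legacy settings: only the Compress flag is checked in this sketch.
        return gaps
    # Assumption: a policy whose TTLs are all zero disables caching, so
    # compression would not take effect.
    ttls = [cache_policy_config.get(k, 0) for k in ("MinTTL", "DefaultTTL", "MaxTTL")]
    if max(ttls) == 0:
        gaps.append("all TTLs are 0")
    params = cache_policy_config.get("ParametersInCacheKeyAndForwardedToOrigin", {})
    if not params.get("EnableAcceptEncodingGzip", False):
        gaps.append("Gzip not enabled")
    if not params.get("EnableAcceptEncodingBrotli", False):
        gaps.append("Brotli not enabled")
    return gaps
```

Returning a list of named gaps, rather than a single pass/fail flag, mirrors the description above: each distribution can be reported with the specific conditions it misses.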
Identify Idle ElastiCache Clusters
Cloud/Service/Type- AWS
- ElastiCache
- Idle Resource
ElastiCache clusters that show minimal or no activity (low CPU utilization, connection count, or cache hits) over a 30-day lookback period, indicating they are not actively being used by applications. Explanation
Idle ElastiCache clusters continue to incur full costs regardless of actual utilization. These often result from decommissioned applications, test environments, or development projects that were never properly cleaned up. Since ElastiCache pricing is based on node hours rather than actual usage, idle clusters represent pure waste with no corresponding business value. Remediation Direction
Delete truly idle clusters after confirming they’re not serving any critical applications. For clusters with intermittent usage, consider downsizing the node types or reducing the number of nodes. Evaluate whether on-demand database caching might be more cost-effective than maintaining persistent cache clusters for workloads with sparse or unpredictable usage patterns.
Identify Lambda Graviton Savings
Cloud/Service/Type- AWS
- Lambda
- Optimization
Lambda functions running on x86 architecture that could be migrated to ARM-based AWS Graviton processors for better price-performance ratio. Explanation
AWS Graviton processors offer up to 34% better price-performance than x86 for Lambda workloads. This automation identifies x86 Lambda functions written in Graviton-compatible runtimes (Python, Node.js, Java, .NET Core, Ruby, etc.) that would benefit from ARM migration, especially functions with significant execution time or high invocation counts. Remediation Direction
Update the Lambda function configuration to use the ARM64 architecture option. Test functions thoroughly after migration to ensure compatibility, especially for any functions using binary dependencies or native extensions. For functions with dependencies on x86-specific libraries, evaluate alternative libraries or containerized Lambda options that support ARM64.
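Candidate selection can be sketched as a filter over function configurations. The example below assumes each configuration is a dict with the `Runtime` and `Architectures` keys that Lambda returns for a function; the function name and the prefix list are hypothetical, and prefix matching is deliberately coarse because not every version of a runtime supports arm64.

```python
# Runtime prefixes assumed to be Graviton-compatible for this sketch.
# Verify the exact runtime version against current Lambda documentation.
ARM_COMPATIBLE_PREFIXES = ("python", "nodejs", "java", "dotnet", "ruby")


def is_graviton_candidate(function_config):
    """True for an x86_64 function on a runtime assumed to support arm64.

    Container-image functions have no Runtime value and are skipped here,
    since their compatibility depends on the image contents.
    """
    architectures = function_config.get("Architectures", ["x86_64"])
    runtime = function_config.get("Runtime", "")
    return "x86_64" in architectures and runtime.startswith(ARM_COMPATIBLE_PREFIXES)
```

A function that passes this filter is only a candidate: binary dependencies and native extensions still need to be tested on arm64 before switching.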
Identify S3 INT Eligibility
Cloud/Service/Type- AWS
- S3
- Optimization
S3 buckets with objects stored in Standard storage class that have unpredictable access patterns and would benefit from Intelligent-Tiering, where projected savings (using a 30% standard savings assumption) would outweigh the monitoring and automation costs. Explanation
S3 Intelligent-Tiering automatically moves objects between access tiers based on actual usage patterns, optimizing costs without performance impact or operational overhead. This automation scans the number of objects per bucket using CloudWatch metrics to calculate the annual monitoring cost, then compares this against projected savings from intelligent tier transitions to identify eligible buckets where net savings would be positive. Remediation Direction
Apply S3 Intelligent-Tiering configuration to identified eligible buckets, either through direct storage class changes or lifecycle policies. Consider selectively applying Intelligent-Tiering to object prefixes with the highest potential savings rather than entire buckets. For small objects where monitoring costs might outweigh savings, maintain them in S3 Standard rather than including them in Intelligent-Tiering.
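The eligibility comparison reduces to simple arithmetic. The sketch below is an assumption-laden illustration: the function names are hypothetical, the monitoring fee of $0.0025 per 1,000 objects per month is an assumed list price, and the 30% rate is the standard savings assumption stated above.

```python
MONITORING_FEE_PER_1000_OBJECTS = 0.0025  # USD per month (assumed price)
SAVINGS_RATE = 0.30  # standard savings assumption from the description


def intelligent_tiering_net_savings(object_count, annual_standard_storage_cost):
    """Projected annual savings minus the annual monitoring cost."""
    annual_monitoring = object_count / 1000 * MONITORING_FEE_PER_1000_OBJECTS * 12
    projected_savings = annual_standard_storage_cost * SAVINGS_RATE
    return projected_savings - annual_monitoring


def is_eligible(object_count, annual_standard_storage_cost):
    """A bucket is eligible when net savings would be positive."""
    return intelligent_tiering_net_savings(object_count, annual_standard_storage_cost) > 0
```

Buckets holding many small objects tend to fail this test because the monitoring cost scales with object count while the savings scale with bytes stored.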
OpenSearch RI Recommender
Cloud/Service/Type- AWS
- OpenSearch
- Cost Optimization