The average company wastes 32% of its cloud spend. For large enterprises running $5M+/year in cloud, that's over $1.5M per year going directly to AWS, Azure, or GCP for resources nobody actually uses.
This isn't a technology problem. It's a visibility and accountability problem. Once you can see where the waste is, eliminating it is straightforward. Here's exactly how to find it and fix it.
Where Cloud Waste Hides
Before you can optimize, you need to find the waste. Cloud waste comes from five primary sources:
1. Idle and Underutilized Resources
The biggest single category. Includes:
- EC2/VM instances running at under 10% CPU for 30+ days
- Unattached EBS volumes and snapshots older than 90 days
- Idle load balancers with zero traffic
- Unused Elastic IPs (charged even when unattached)
- Dev/test resources nobody remembered to shut down
Typical savings: 15–25% of total cloud spend
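A minimal sketch of the detection logic, assuming daily CPU averages are already exported from your monitoring stack (the instance records, field names, and thresholds here are illustrative, not tied to any specific cloud API):

```python
from statistics import mean

# Illustrative thresholds from the text: under 10% CPU for 30+ days.
IDLE_CPU_PCT = 10.0
MIN_DAYS = 30

def find_idle(instances):
    """Return IDs of instances whose average CPU stays below the idle threshold."""
    idle = []
    for inst in instances:
        samples = inst["daily_cpu_pct"]
        if len(samples) >= MIN_DAYS and mean(samples) < IDLE_CPU_PCT:
            idle.append(inst["id"])
    return idle

fleet = [
    {"id": "i-web-01", "daily_cpu_pct": [4.0] * 30},   # idle candidate
    {"id": "i-db-01",  "daily_cpu_pct": [55.0] * 30},  # busy
    {"id": "i-new-01", "daily_cpu_pct": [2.0] * 5},    # too little history yet
]
print(find_idle(fleet))  # → ['i-web-01']
```

Note that `i-new-01` is excluded: requiring a full 30 days of history avoids flagging freshly launched instances that simply haven't ramped up yet.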
2. Oversized Instances (Right-Sizing Opportunities)
Most teams provision instances based on peak capacity assumptions that never materialize. A database instance provisioned for "future growth" in 2022 running at 8% CPU today is pure waste.
- Right-size based on actual P95 utilization, not theoretical peaks
- Use Compute Optimizer (AWS), Azure Advisor, or Recommender (GCP) signals
- Prioritize instances with >$500/month cost and <20% average CPU
Typical savings: 10–20% of compute costs
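The P95 rule above can be sketched as a simple step-down recommendation. The size ladder and the 20% threshold here are illustrative assumptions; in practice you'd feed in the signals from Compute Optimizer, Azure Advisor, or GCP Recommender:

```python
def p95(samples):
    """95th-percentile of a list of utilization samples (nearest-rank)."""
    s = sorted(samples)
    return s[min(len(s) - 1, round(0.95 * (len(s) - 1)))]

SIZE_LADDER = ["xlarge", "large", "medium", "small"]  # hypothetical family

def rightsize(current_size, cpu_samples, threshold_pct=20.0):
    """Step down one size when P95 CPU is under the threshold, else keep."""
    if p95(cpu_samples) >= threshold_pct:
        return current_size
    i = SIZE_LADDER.index(current_size)
    return SIZE_LADDER[min(i + 1, len(SIZE_LADDER) - 1)]

# 95% of samples at 8% CPU, a few spikes to 15% — P95 stays under 20%.
print(rightsize("xlarge", [8.0] * 95 + [15.0] * 5))  # → large
```

Using P95 rather than the mean is deliberate: it tolerates occasional spikes while still catching instances that are oversized nearly all of the time.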
3. On-Demand vs. Reserved vs. Spot Pricing
On-demand pricing carries a significant premium. For workloads with predictable baselines, it's money left on the table.
- Reserved Instances (RI) / Savings Plans: 30–72% discount vs. on-demand for 1–3 year commitments
- Spot Instances: Up to 90% discount for fault-tolerant, interruptible workloads
- Mix: Cover baseline with RIs, burst with spot, keep on-demand minimal
Typical savings: 20–40% of compute costs for teams moving from all on-demand
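The blended-pricing math is easy to model. The discount rates below are illustrative mid-range figures consistent with the ranges above, not quoted prices; substitute your actual rates:

```python
ON_DEMAND_RATE = 1.00   # normalized cost per instance-hour
RI_DISCOUNT = 0.40      # assumed RI / Savings Plan discount
SPOT_DISCOUNT = 0.70    # assumed spot discount

def blended_cost(total_hours, ri_share, spot_share):
    """Cost of covering baseline with RIs, burst with spot, rest on-demand."""
    od_share = 1.0 - ri_share - spot_share
    return total_hours * ON_DEMAND_RATE * (
        ri_share * (1 - RI_DISCOUNT)
        + spot_share * (1 - SPOT_DISCOUNT)
        + od_share
    )

all_on_demand = blended_cost(10_000, 0.0, 0.0)
mixed = blended_cost(10_000, 0.7, 0.2)   # 70% RI baseline, 20% spot burst
print(f"savings: {1 - mixed / all_on_demand:.0%}")  # → savings: 42%
```

Under these assumed discounts, moving from all on-demand to a 70/20/10 mix cuts compute cost by roughly 40%, which is why this lever lands near the top of the range above.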
4. Data Transfer and Egress Costs
Cloud providers charge for data that leaves their network. Common culprits:
- Cross-region data transfer (deploy in one region where possible)
- CloudFront/CDN not configured — direct S3 egress is expensive
- Logging and monitoring sending full payloads to expensive destinations
Typical savings: 5–15% reduction with proper traffic optimization
5. Orphaned Services and "Zombie" Infrastructure
Every migration, failed POC, and abandoned prototype leaves resources behind. Without a full asset inventory, you can't see what you're paying for. Common findings:
- NAT Gateways in regions with zero workloads
- Elasticsearch clusters from decommissioned logging stacks
- Old RDS instances never decommissioned after app migrations
- Build artifacts and log archives in expensive storage classes
Step-by-Step Optimization Process
Step 1: Get Full Asset Visibility
You cannot optimize what you cannot see. Run a full cloud asset discovery before touching anything. For AWS, use Cost Explorer + Trusted Advisor + Config. For multi-cloud, use a unified governance platform that pulls from all three clouds simultaneously.
Goal: A single inventory of every resource, its cost, its usage metrics, and its owner (team/project).
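A minimal sketch of what that unified inventory can look like once each cloud's export is normalized into one structure. The field names and sample records are hypothetical; the point is one schema across providers, with missing owners surfaced immediately:

```python
def build_inventory(*cloud_exports):
    """Merge per-cloud resource exports into one inventory keyed by resource ID."""
    inventory = {}
    for export in cloud_exports:
        for rec in export:
            inventory[rec["id"]] = {
                "cloud": rec["cloud"],
                "monthly_cost": rec["monthly_cost"],
                "owner": rec.get("owner"),        # None => unowned, review it
                "avg_cpu_pct": rec.get("avg_cpu_pct"),
            }
    return inventory

aws = [{"id": "i-0abc", "cloud": "aws", "monthly_cost": 310.0, "owner": "data"}]
gcp = [{"id": "vm-7", "cloud": "gcp", "monthly_cost": 95.0}]

inv = build_inventory(aws, gcp)
unowned = [rid for rid, r in inv.items() if r["owner"] is None]
print(unowned)  # → ['vm-7']
```

The unowned list is usually the first cleanup queue: resources with cost but no accountable team are the likeliest zombies.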
Step 2: Tag Everything
Cost allocation without tagging is impossible. Implement a mandatory tag policy:
- Environment: prod / staging / dev / test
- Team: engineering / data / security / finance
- Project: the project or product this resource serves
- Owner: Slack handle or email of the responsible engineer
Use AWS Organizations SCPs, Azure Policy, or GCP Organization Policies to enforce tagging at resource creation. Untagged resources should be automatically flagged and routed to a cleanup queue.
Rule of thumb: If you can't tell who owns a resource within 30 seconds, it's probably waste. If the owner has left the company, it's almost certainly waste.
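The tag policy above translates directly into a validation check you can run against every resource before (or after) creation. This is a sketch of the policy logic itself, not any provider's policy engine; wire the same rules into SCPs, Azure Policy, or GCP Organization Policies for enforcement:

```python
REQUIRED_TAGS = {"Environment", "Team", "Project", "Owner"}
ALLOWED_ENVIRONMENTS = {"prod", "staging", "dev", "test"}

def tag_violations(resource_tags):
    """Return a list of policy violations for one resource's tags."""
    problems = [f"missing tag: {t}"
                for t in sorted(REQUIRED_TAGS - resource_tags.keys())]
    env = resource_tags.get("Environment")
    if env is not None and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"invalid Environment: {env}")
    return problems

# A resource missing Project/Owner and using a non-standard environment name:
print(tag_violations({"Environment": "qa", "Team": "data"}))
```

Any resource with a non-empty violation list goes to the cleanup queue; a fully tagged resource returns an empty list and passes.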
Step 3: Quick Wins in the First 30 Days
Start here for fast ROI:
- Delete unattached EBS volumes (100% savings — you pay for storage even when detached)
- Release unattached Elastic IPs ($0.005/hr adds up across hundreds)
- Terminate dev/test instances outside business hours (auto-scheduling can save 60–70% of dev compute)
- Move infrequently accessed S3 data to Infrequent Access or Glacier tiers
- Remove idle load balancers (each ALB carries an hourly base charge plus LCU fees that accrue even with zero traffic)
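The dev/test scheduling claim is just arithmetic. Assuming a 12-hour weekday window (the exact window is your choice), the savings fall right in the 60–70% range cited above:

```python
HOURS_PER_WEEK = 24 * 7            # 168 hours running 24x7
BUSINESS_HOURS_PER_WEEK = 12 * 5   # e.g. 7am-7pm, weekdays only

def scheduling_savings_pct():
    """Fraction of dev/test compute saved by running only in business hours."""
    return 1 - BUSINESS_HOURS_PER_WEEK / HOURS_PER_WEEK

print(f"{scheduling_savings_pct():.0%}")  # → 64%
```

A tighter window (8 hours) pushes savings above 75%; the trade-off is engineer friction when someone needs an environment off-hours, so most teams pair the schedule with a one-click override.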
Step 4: Medium-Term Wins (30–90 Days)
- Right-size your top 20 most expensive instances based on actual utilization data
- Purchase Reserved Instances for your predictable baseline workloads
- Implement auto-scaling for workloads with variable traffic patterns
- Review and right-size managed services (RDS, ElastiCache, Elasticsearch)
Step 5: Governance and Prevention (Ongoing)
Optimization without governance creates waste as fast as you eliminate it. Build the system that prevents waste from accumulating:
- Budget alerts: Set AWS budgets / Azure cost alerts at 80% and 100% of monthly target
- Anomaly detection: Alert on spend increases >20% week-over-week
- Scheduled cleanup jobs: Weekly automated detection of orphaned resources
- Cost allocation showback: Show each team their cloud bill — accountability reduces waste
- Quarterly RI reviews: Ensure reserved instances match current workload needs
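The week-over-week anomaly rule above is simple enough to sketch directly. The per-team spend structure here is illustrative; in practice the inputs come from Cost Explorer or your cost-allocation tags:

```python
WOW_THRESHOLD = 0.20  # alert on >20% week-over-week increase

def spend_anomalies(weekly_spend):
    """Return (team, pct_increase) pairs where this week's spend jumped >20%."""
    alerts = []
    for team, (last_week, this_week) in weekly_spend.items():
        if last_week > 0:
            increase = this_week / last_week - 1
            if increase > WOW_THRESHOLD:
                alerts.append((team, round(increase, 2)))
    return alerts

spend = {"data": (10_000, 13_500), "web": (8_000, 8_200)}
print(spend_anomalies(spend))  # → [('data', 0.35)]
```

Teams with a zero-spend baseline are skipped here to avoid division by zero; a new team appearing with nonzero spend deserves its own, separate alert rule.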
Multi-Cloud Cost Optimization Specifics
AWS
- Use Compute Savings Plans for flexibility across EC2, Fargate, and Lambda (vs. rigid EC2 RIs)
- Enable S3 Intelligent-Tiering for automatic tiering of objects not accessed for 30+ days
- Review NAT Gateway costs — often 2–5% of AWS bill; use VPC endpoints to reduce traffic
- Use Spot Instances for batch jobs, ML training, and CI/CD runners (70–90% discount)
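As one concrete example of the S3 tiering point, this is a sketch of an S3 lifecycle rule that transitions objects to Intelligent-Tiering after 30 days. The rule schema follows the S3 lifecycle API; the rule ID and bucket name are hypothetical:

```python
# Lifecycle configuration payload in the shape S3 expects.
lifecycle = {
    "Rules": [
        {
            "ID": "tier-cold-objects",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = whole bucket
            "Transitions": [
                {"Days": 30, "StorageClass": "INTELLIGENT_TIERING"}
            ],
        }
    ]
}

# Applied with boto3 (not executed here; requires credentials and a real bucket):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle)
print(lifecycle["Rules"][0]["ID"])
```

Once applied, S3 moves cold objects automatically, so you stop paying Standard-class rates for data nobody has touched in a month.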
Azure
- Use Azure Hybrid Benefit with existing Windows Server/SQL Server licenses — up to 40% savings
- Enable Azure Dev/Test pricing for non-production subscriptions (60–70% discount on Windows VMs)
- Review Azure Blob lifecycle policies — automatically tier cold data to Archive storage
GCP
- Committed Use Discounts (CUDs): 1-year or 3-year commitments for 37–55% discount
- Sustained Use Discounts: Automatic discounts for instances running 25%+ of the month
- Use Spot VMs (formerly Preemptible) for batch and fault-tolerant workloads
Building a FinOps Culture
Technology alone doesn't solve cloud waste. You need a FinOps culture — where engineers think about cost as a first-class engineering concern, not someone else's problem.
- Unit economics: Define cost per unit of business value (cost per user, cost per transaction, cost per API call)
- Cost in sprint planning: Make cost impact a story point consideration, not an afterthought
- Showback, not chargeback: Show teams their costs without punishing them — build understanding first
- FinOps champion: Every team should have one person responsible for their cloud cost health
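Unit economics is just a division, but making it explicit keeps every cost conversation grounded in business value. The figures below are illustrative:

```python
def unit_cost(monthly_cloud_cost, monthly_units):
    """Cloud cost per unit of business value (user, transaction, API call)."""
    return monthly_cloud_cost / monthly_units

# Hypothetical: $250k/month cloud bill, 1.2M monthly active users, 40M transactions.
cost_per_user = unit_cost(250_000, 1_200_000)
cost_per_txn = unit_cost(250_000, 40_000_000)
print(round(cost_per_user, 4), round(cost_per_txn, 6))  # → 0.2083 0.00625
```

Tracking these ratios over time is more useful than the raw bill: total spend can rise while cost per user falls, which is healthy growth rather than waste.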
How ElevatedIQ Accelerates Cloud Cost Optimization
ElevatedIQ's FinOps platform continuously monitors your cloud spend across AWS, Azure, and GCP and automatically identifies:
- Every idle and underutilized resource, with its owner and monthly cost
- Right-sizing recommendations with confidence scores based on P95 utilization
- Reserved Instance gaps and coverage analysis
- Anomalous spend increases with root cause attribution to specific resources
- Scheduling opportunities for dev/test environments
Our clients recover an average of $142,800/month within their first 90 days, with no engineering effort spent on identification.