Cost Optimization in Cloud Infrastructure: Real-World Strategies
Cloud costs can spiral out of control quickly. What starts as a few hundred dollars a month can become thousands if not managed carefully. After optimizing cloud infrastructure that was costing $10K+/month, I've learned strategies that actually work.
Understanding Cloud Costs
Where Money Goes
- Compute - EC2, Lambda, containers
- Storage - S3, EBS, databases
- Network - Data transfer, load balancers
- Databases - RDS, DynamoDB, ElastiCache
- Monitoring - CloudWatch, logging
Compute Optimization
Right-Sizing Instances
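// Monitor actual usage (instanceId, startTime and endTime are defined elsewhere)
const metrics = await cloudwatch.getMetricStatistics({
  MetricName: 'CPUUtilization',
  Namespace: 'AWS/EC2',
  Dimensions: [{ Name: 'InstanceId', Value: instanceId }],
  StartTime: startTime,
  EndTime: endTime,
  Period: 3600, // One datapoint per hour
  Statistics: ['Average']
});
// Average the hourly datapoints
const datapoints = metrics.Datapoints ?? [];
const averageCPU = datapoints.reduce((sum, d) => sum + d.Average, 0) / (datapoints.length || 1);
// If average CPU < 30%, consider smaller instance
if (datapoints.length > 0 && averageCPU < 30) {
  console.log('Consider downsizing instance');
}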
Strategy: Start small, scale up if needed. Monitor and adjust.
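The 30% rule above looks only at the average, which can hide short bursts. Here is a minimal sketch of a recommendation helper that also checks the busiest hour. The function name `recommendRightSizing` and the 80% peak guard are my own assumptions, not AWS guidance; tune both for your workload.

```javascript
// Hypothetical helper: the name and the 80% peak guard are illustrative assumptions.
// Expects datapoints shaped like CloudWatch results: [{ Average: number }, ...]
function recommendRightSizing(datapoints, { avgThreshold = 30, peakThreshold = 80 } = {}) {
  if (datapoints.length === 0) {
    return { action: 'insufficient-data', average: null, peak: null };
  }
  const values = datapoints.map((d) => d.Average);
  const average = values.reduce((sum, v) => sum + v, 0) / values.length;
  const peak = Math.max(...values);
  // Suggest downsizing only when both the average and the busiest hour leave headroom
  const action = average < avgThreshold && peak < peakThreshold ? 'consider-downsizing' : 'keep';
  return { action, average, peak };
}

const sample = [{ Average: 12 }, { Average: 18 }, { Average: 25 }, { Average: 21 }];
console.log(recommendRightSizing(sample));
```

An instance that averages under 30% but spikes past the peak guard gets `keep`, which is usually the safer default.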
Reserved Instances
// Reserved instances save 30-70%
// Good for: Predictable workloads
// Bad for: Variable workloads
// Calculate savings
const onDemandCost = 0.10 * 730; // $0.10/hour * 730 hours/month
const reservedCost = 0.05 * 730; // $0.05/hour (1-year reserved)
const savings = onDemandCost - reservedCost; // $36.50/month
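The calculation above assumes the instance runs all 730 hours. A reservation is billed whether or not the instance is running, so it helps to know the break-even point. This sketch uses the same illustrative rates; `compareReserved` is a hypothetical helper and it ignores upfront payment options.

```javascript
const HOURS_PER_MONTH = 730;

// Hypothetical helper: compares on-demand and reserved cost for a given usage level.
function compareReserved(onDemandRate, reservedRate, hoursUsed = HOURS_PER_MONTH) {
  const onDemandCost = onDemandRate * hoursUsed; // pay only for hours used
  const reservedCost = reservedRate * HOURS_PER_MONTH; // billed for every hour of the month
  return {
    onDemandCost,
    reservedCost,
    savings: onDemandCost - reservedCost,
    // Usage below this many hours per month makes on-demand the cheaper option
    breakEvenHours: (reservedRate / onDemandRate) * HOURS_PER_MONTH
  };
}

console.log(compareReserved(0.10, 0.05)); // always-on instance
console.log(compareReserved(0.10, 0.05, 200)); // instance used 200 hours a month
```

With these rates the break-even is half the month: an instance that runs less than that is cheaper on-demand.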
Spot Instances
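// Spot instances: 50-90% cheaper than on-demand
// Good for: Fault-tolerant workloads
// Bad for: Critical, always-on services
// Use for batch processing, testing
const spotRequest = await ec2.requestSpotInstances({
  SpotPrice: '0.05', // Maximum hourly price you are willing to pay
  InstanceCount: 1,
  Type: 'one-time',
  LaunchSpecification: {
    ImageId: 'ami-xxxxxxxx', // Placeholder: replace with your AMI
    InstanceType: 't3.medium' // Placeholder instance type
  }
});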
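The "good for / bad for" notes in the reserved and spot sections can be folded into one small decision helper. This is only a sketch: `choosePurchaseOption` and its two flags are hypothetical, and real workloads often mix options (for example, a reserved baseline plus spot for bursts).

```javascript
// Hypothetical helper; the two workload flags are illustrative assumptions.
function choosePurchaseOption({ faultTolerant, predictable }) {
  // Work that can be interrupted and retried can take the spot discount
  if (faultTolerant) return 'spot';
  // Steady, always-on usage justifies a commitment
  if (predictable) return 'reserved';
  // Variable and critical: pay on-demand rates for flexibility
  return 'on-demand';
}

console.log(choosePurchaseOption({ faultTolerant: true, predictable: false })); // batch job
console.log(choosePurchaseOption({ faultTolerant: false, predictable: true })); // production API
```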
Auto-Scaling
// Scale down during low traffic
const autoScalingGroup = {
  MinSize: 2, // Minimum instances
  MaxSize: 10, // Maximum instances
  DesiredCapacity: 3 // Initial target; the scaling policy adjusts it
};
// Scale based on CPU (attached to the group with putScalingPolicy)
const scalingPolicy = {
  PolicyType: 'TargetTrackingScaling',
  TargetTrackingConfiguration: {
    TargetValue: 70.0, // Target 70% average CPU
    PredefinedMetricSpecification: {
      PredefinedMetricType: 'ASGAverageCPUUtilization'
    }
  }
};
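To build intuition for what a 70% CPU target does, here is a simplified model of the capacity math. It is an approximation I'm assuming for illustration, not the actual AWS algorithm, which also applies cooldowns, warm-up periods, and more conservative scale-in behavior.

```javascript
// Simplified model (an assumption): scale capacity in proportion to how far
// the metric is from the target, then clamp to the group's min and max.
function desiredCapacity({ current, metricValue, targetValue, minSize, maxSize }) {
  const raw = Math.ceil(current * (metricValue / targetValue));
  return Math.min(maxSize, Math.max(minSize, raw));
}

const group = { current: 3, targetValue: 70, minSize: 2, maxSize: 10 };
console.log(desiredCapacity({ ...group, metricValue: 35 })); // quiet period
console.log(desiredCapacity({ ...group, metricValue: 95 })); // traffic spike
```

At 35% CPU the model drops from 3 instances to the minimum of 2, which is where the off-hours savings come from.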
Storage Optimization
S3 Storage Classes
// Use appropriate storage class
// Approximate prices in $/GB-month; actual rates vary by region and change over time
const storageClasses = {
  // Frequently accessed
  STANDARD: 0.023, // $0.023/GB-month
  // Infrequently accessed
  STANDARD_IA: 0.0125, // $0.0125/GB-month (cheaper, but adds retrieval fees)
  // Archive
  GLACIER: 0.004, // $0.004/GB-month (much cheaper)
  // Deep archive
  DEEP_ARCHIVE: 0.00099 // $0.00099/GB-month (cheapest)
};
// Lifecycle policy
const lifecyclePolicy = {
  Rules: [{
    ID: 'Move to Glacier',
    Status: 'Enabled',
    Filter: { Prefix: '' }, // Empty prefix applies the rule to every object
    Transitions: [{
      Days: 90, // After 90 days
      StorageClass: 'GLACIER'
    }]
  }]
};
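To see what a lifecycle transition is worth, multiply the data volume by the price difference. This sketch reuses the illustrative per-GB prices from this section; `monthlyStorageCost` and `lifecycleSavings` are hypothetical helpers, and retrieval fees, request costs, and minimum storage durations are deliberately left out.

```javascript
// Illustrative prices in $/GB-month, copied from this section; check current pricing.
const PRICE_PER_GB_MONTH = {
  STANDARD: 0.023,
  STANDARD_IA: 0.0125,
  GLACIER: 0.004,
  DEEP_ARCHIVE: 0.00099
};

function monthlyStorageCost(gigabytes, storageClass) {
  const price = PRICE_PER_GB_MONTH[storageClass];
  if (price === undefined) {
    throw new Error(`Unknown storage class: ${storageClass}`);
  }
  return gigabytes * price;
}

// Monthly storage savings from moving data between classes (storage charges only)
function lifecycleSavings(gigabytes, fromClass, toClass) {
  return monthlyStorageCost(gigabytes, fromClass) - monthlyStorageCost(gigabytes, toClass);
}

console.log(lifecycleSavings(1000, 'STANDARD', 'GLACIER').toFixed(2));
```

For 1,000 GB, that is roughly $23 a month in STANDARD against roughly $4 in GLACIER.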
Clean Up Unused Resources
// Find unused EBS volumes
async function findUnusedVolumes() {
  // A volume in the 'available' state is not attached to any instance
  const { Volumes = [] } = await ec2.describeVolumes({
    Filters: [{ Name: 'status', Values: ['available'] }]
  });
  // Results are paginated; follow NextToken if the account has many volumes
  return Volumes;
}
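Once you have a list of unattached volumes, putting a dollar figure on it makes the cleanup easier to prioritize. A minimal sketch, assuming each volume object carries `VolumeId` and `Size` (in GiB) as in the EC2 API response. The default of $0.08 per GB-month is my assumption for a gp3-style volume; substitute your own rate.

```javascript
// Hypothetical helper: estimates what unattached volumes cost per month.
// pricePerGbMonth is an assumed rate; check pricing for your region and volume type.
function estimateUnusedVolumeCost(volumes, pricePerGbMonth = 0.08) {
  const totalGiB = volumes.reduce((sum, v) => sum + v.Size, 0);
  return {
    volumeIds: volumes.map((v) => v.VolumeId),
    totalGiB,
    monthlyCost: totalGiB * pricePerGbMonth
  };
}

// Made-up volumes for illustration
const unused = [
  { VolumeId: 'vol-aaa', Size: 100 },
  { VolumeId: 'vol-bbb', Size: 50 }
];
console.log(estimateUnusedVolumeCost(unused));
```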
Database Optimization
Right-Size Databases
// Monitor database metrics
const dbMetrics = {
  CPUUtilization: 45, // %
  FreeableMemory: 2048, // MB (CloudWatch reports this metric in bytes)
  DatabaseConnections: 50 // Current connections
};
// If CPU stays below 50% and freeable memory stays high, consider a smaller instance
Use Read Replicas
// Read replicas for read-heavy workloads
// Often cheaper than scaling up the primary database
const readReplica = {
DBInstanceIdentifier: 'db-replica',
SourceDBInstanceIdentifier: 'db-primary',
PubliclyAccessible: false
};
Connection Pooling
// Reduce database connections (assuming node-postgres)
const { Pool } = require('pg');
const pool = new Pool({
  max: 20, // Instead of 100 connections
  min: 5,
  idleTimeoutMillis: 30000
});
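Pool size matters more as you add application instances, because every instance opens its own pool against the same database. A rough sizing sketch follows; `maxPoolSizePerInstance` is a hypothetical helper, and the numbers (a 100-connection database limit, 20 connections held back for admin and migration tools) are assumptions for illustration.

```javascript
// Hypothetical helper: largest per-instance pool that keeps the total under the database limit.
function maxPoolSizePerInstance(dbMaxConnections, appInstances, reservedConnections = 0) {
  if (appInstances <= 0) {
    throw new Error('appInstances must be positive');
  }
  const available = Math.max(0, dbMaxConnections - reservedConnections);
  return Math.floor(available / appInstances);
}

// 100-connection limit, 4 app instances, 20 connections held in reserve
console.log(maxPoolSizePerInstance(100, 4, 20));
```

Keeping the total inside the limit of a smaller database class is what lets pooling translate into a cheaper instance.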
Network Optimization
Data Transfer Costs
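// Minimize cross-region transfer
// Use CloudFront for static assets
// Compress responses (Express with the compression middleware)
const express = require('express');
const compression = require('compression');
const app = express();
app.use(compression()); // Reduce data transfer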
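To get a feel for how much compression can cut transfer volume, you can measure it locally with Node's built-in `zlib`. This is a sketch using a made-up, highly repetitive JSON payload; real responses will compress less or more depending on content.

```javascript
const zlib = require('zlib');

// Measures how much gzip shrinks a text payload
function compressionRatio(text) {
  const original = Buffer.byteLength(text, 'utf8');
  const compressed = zlib.gzipSync(text).length;
  return { original, compressed, ratio: compressed / original };
}

// Made-up API response: 500 similar records
const payload = JSON.stringify(
  Array.from({ length: 500 }, (_, i) => ({ id: i, status: 'active', region: 'us-east-1' }))
);
console.log(compressionRatio(payload));
```

Since data transfer is billed per GB, the ratio maps directly onto the transfer portion of the bill for compressible responses.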
Use CDN
// CloudFront reduces origin requests
// Caches at edge locations
// Reduces data transfer costs
Monitoring and Alerting
Cost Monitoring
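// Set up billing alerts (billing metrics are published only in us-east-1)
const billingAlarm = {
  AlarmName: 'MonthlyCostAlarm',
  Namespace: 'AWS/Billing',
  MetricName: 'EstimatedCharges',
  Dimensions: [{ Name: 'Currency', Value: 'USD' }],
  Statistic: 'Maximum',
  Period: 21600, // Evaluate every 6 hours
  EvaluationPeriods: 1,
  Threshold: 1000, // Alert if > $1000
  ComparisonOperator: 'GreaterThanThreshold'
};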
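A threshold alarm only fires after the money is spent. A simple linear projection of month-to-date spend gives earlier warning. This is a sketch under the assumption that spend is roughly even across the month; `projectMonthEnd` and `shouldAlert` are hypothetical helpers.

```javascript
// Linear projection: assumes spend is spread evenly across the month
function projectMonthEnd(monthToDate, dayOfMonth, daysInMonth) {
  return (monthToDate / dayOfMonth) * daysInMonth;
}

function shouldAlert(monthToDate, dayOfMonth, daysInMonth, threshold) {
  return projectMonthEnd(monthToDate, dayOfMonth, daysInMonth) > threshold;
}

// $400 spent by day 10 of a 30-day month, against a $1000 budget
console.log(projectMonthEnd(400, 10, 30));
console.log(shouldAlert(400, 10, 30, 1000));
```

Here the account is still well under $1000 on day 10, but the projection already exceeds the budget.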
Resource Tagging
// Tag resources for cost tracking
const tags = {
Environment: 'production',
Team: 'backend',
Project: 'api-service',
CostCenter: 'engineering'
};
// Query costs by tag
const costs = await costExplorer.getCostAndUsage({
  TimePeriod: { Start: '2025-01-01', End: '2025-02-01' }, // End date is exclusive
  Granularity: 'MONTHLY',
  Metrics: ['UnblendedCost'],
  GroupBy: [{ Type: 'TAG', Key: 'Project' }]
});
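The Cost Explorer response is nested, so a small reducer helps turn it into per-project totals. This sketch assumes the query requested the `UnblendedCost` metric and that group keys arrive in the `TagKey$TagValue` form; `costsByTag` is a hypothetical helper and the sample response is made up.

```javascript
// Hypothetical helper: sums cost per tag value across all returned periods
function costsByTag(response, metric = 'UnblendedCost') {
  const totals = {};
  for (const period of response.ResultsByTime ?? []) {
    for (const group of period.Groups ?? []) {
      const rawKey = group.Keys[0]; // e.g. 'Project$api-service'
      const tagValue = rawKey.slice(rawKey.indexOf('$') + 1) || '(untagged)';
      const amount = Number(group.Metrics[metric].Amount); // amounts arrive as strings
      totals[tagValue] = (totals[tagValue] ?? 0) + amount;
    }
  }
  return totals;
}

// Made-up response in the shape returned by getCostAndUsage
const sampleResponse = {
  ResultsByTime: [
    {
      Groups: [
        { Keys: ['Project$api-service'], Metrics: { UnblendedCost: { Amount: '1200.5', Unit: 'USD' } } },
        { Keys: ['Project$'], Metrics: { UnblendedCost: { Amount: '300.25', Unit: 'USD' } } }
      ]
    },
    {
      Groups: [
        { Keys: ['Project$api-service'], Metrics: { UnblendedCost: { Amount: '100', Unit: 'USD' } } }
      ]
    }
  ]
};
console.log(costsByTag(sampleResponse));
```

The `(untagged)` bucket is worth watching: a large number there means the tagging policy has gaps.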
Real-World Example
Challenge: Cloud costs $8,000/month, need to reduce by 50%.
Optimizations Applied:
- Right-sized instances - Reduced from t3.large to t3.medium (saved $2,000/month)
- Reserved instances - 1-year reserved for predictable workloads (saved $1,500/month)
- S3 lifecycle policies - Moved old data to Glacier (saved $500/month)
- Auto-scaling - Scale down during off-hours (saved $1,000/month)
- Database optimization - Read replicas, connection pooling (saved $500/month)
- Cleanup - Removed unused resources (saved $500/month)
Result:
- Cost: $8,000 → $3,000/month
- Performance: No degradation
- Reliability: Improved (better monitoring)
Cost Optimization Checklist
- Right-size compute instances
- Use reserved instances for predictable workloads
- Implement auto-scaling
- Use appropriate S3 storage classes
- Clean up unused resources
- Optimize database instances
- Use CDN for static assets
- Monitor and alert on costs
- Tag resources for tracking
- Review costs regularly
Best Practices
- Monitor continuously - Know where money goes
- Right-size everything - Don't over-provision
- Use appropriate services - Cheapest that meets needs
- Automate scaling - Scale down when not needed
- Review regularly - Costs change over time
- Tag resources - Track costs by project/team
- Set budgets - Alert when approaching limits
- Clean up regularly - Remove unused resources
Conclusion
Cloud cost optimization is an ongoing process. The key is to:
- Monitor - Know your costs
- Right-size - Match resources to needs
- Automate - Scale based on demand
- Review - Regular cost audits
Remember: Every dollar saved is a dollar earned. Small optimizations add up to significant savings over time.
What cloud cost optimization strategies have you used? What savings have you achieved?