Managing Cloud Hosting Costs (AWS)
Author: Hari P S
Introduction
AWS is one of the most scalable and reliable hosting environments in the world. It offers a wide range of infrastructure building blocks with which we can architect and implement an infrastructure that meets almost any requirement. In addition, these environments are dynamically scalable and can grow and shrink based on pre-planned schedules or usage metrics.
The main challenge for new adopters of AWS is managing costs. The best place to start estimating costs is the AWS Pricing Calculator. However, given the number of products, it is easy to get lost among the options.
As users get used to the ease with which new infrastructure can be added on AWS, they can forget the housekeeping tasks and leave behind resources such as EC2 instances, EBS volumes and even unused Elastic IPs, all of which add up on the monthly bill.
Over the years we have found quite a few techniques to keep AWS costs in check.
Compute Optimization
Scheduling of running instances
We can schedule the up-time of instances. Usage of compute instances (like EC2) is billed based on the time that they are running. For most EC2 instances, we have automated scripts that shut the instances down at night. Infrequently used instances (like demo machines) are not started automatically in the morning, as the team that works on demos has permission to start them through the AWS Console Mobile App. We still schedule a shutdown for them, as many people forget to shut the machine down once the work is over.
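A nightly shutdown of this kind can be sketched roughly as follows. This is a minimal sketch, not our actual script. It assumes that instances opt in through a hypothetical tag AutoStop=true, and that the ec2 argument is a boto3 EC2 client (for example boto3.client("ec2")) invoked from a scheduler such as cron. Only the describe_instances and stop_instances calls are used.

```python
def running_instances_to_stop(ec2, tag_key="AutoStop", tag_value="true"):
    """Return IDs of running instances that carry the (hypothetical) opt-in tag."""
    # A single call is used for brevity; a large account would need pagination.
    response = ec2.describe_instances(
        Filters=[
            {"Name": "instance-state-name", "Values": ["running"]},
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
        ]
    )
    return [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]


def stop_tagged_instances(ec2, dry_run=True):
    """Stop every tagged running instance; with dry_run=True only report them."""
    instance_ids = running_instances_to_stop(ec2)
    if instance_ids and not dry_run:
        ec2.stop_instances(InstanceIds=instance_ids)
    return instance_ids


# Possible usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   stop_tagged_instances(boto3.client("ec2"), dry_run=False)
```

Keeping dry_run on by default makes it easy to review the list with instance owners before anything is actually stopped.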
Reserved and spot instances
Instance pricing starts from the standard on-demand price. While these prices vary from region to region, the differences are usually not significant. There are a couple of other options for reducing infrastructure cost by choosing the right kind of instance.
Reserved Instances: Reserved instances can be used if the instance is required for a long period (a 1-year or 3-year term). While upfront payment (full or partial) provides additional discounts, a smaller discount (nearly 20%) is available for just making a commitment, with no upfront fees.
Spot Instances: If you have fault-tolerant, stateless workloads, you can use spot instances, where you name the maximum price you are willing to pay for an instance. AWS periodically computes the spot price for an instance type, and while the spot price is below your threshold the instance is available for you to run. Discounts of up to 90% are possible. Spot instances can be used in many scenarios, for example running additional instances for big-data or high-performance computing workloads, or running non-time-critical CI/CD infrastructure.
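To put the discounts above side by side, here is a small worked calculation. The on-demand rate of USD 0.10 per hour is a hypothetical figure chosen for easy arithmetic, and 730 hours per month is a common averaging assumption; the 20% and 90% discounts are the figures quoted above, with 90% being the best case for spot.

```python
HOURS_PER_MONTH = 730  # average hours in a month (assumption for illustration)


def monthly_cost(hourly_rate, discount=0.0, hours=HOURS_PER_MONTH):
    """Monthly cost of one always-on instance after applying a discount fraction."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return round(hourly_rate * (1.0 - discount) * hours, 2)


# Hypothetical on-demand rate of USD 0.10 per hour.
on_demand = monthly_cost(0.10)                 # no discount
reserved = monthly_cost(0.10, discount=0.20)   # no-upfront commitment, about 20% off
spot_best = monthly_cost(0.10, discount=0.90)  # best-case spot discount

print(f"on-demand {on_demand}, reserved {reserved}, spot (best case) {spot_best}")
```

The comparison ignores interruptions: a spot instance can be reclaimed by AWS, which is why it only suits workloads that tolerate being stopped.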
Cleaning up and Auditing resources
In a development organization like ours, where lots of team members ask for cloud resources or create them on their own but do not always remember to release them after usage, it is important for the Cloud Infrastructure Management team to review resources on a periodic basis.
Things that I review and verify with owners are:
- Compute resources that have been shut down for a long time
- EBS volumes not attached to any EC2 instance
- Unused Elastic IPs
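Part of this review can be automated. The sketch below lists unattached EBS volumes and Elastic IPs that are not associated with anything, so that they can be verified with their owners. It is an illustration rather than our actual tooling, and it assumes the ec2 argument is a boto3 EC2 client; the function names are hypothetical.

```python
def unattached_volumes(ec2):
    """Return (volume_id, size_gib) for EBS volumes not attached to any instance."""
    # Volumes in the "available" state exist (and are billed) but are not attached.
    response = ec2.describe_volumes(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [(volume["VolumeId"], volume["Size"]) for volume in response["Volumes"]]


def unassociated_elastic_ips(ec2):
    """Return public IPs of Elastic IPs that are allocated but not in use."""
    response = ec2.describe_addresses()
    return [
        address["PublicIp"]
        for address in response["Addresses"]
        if not address.get("AssociationId") and not address.get("InstanceId")
    ]


# Possible usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(unattached_volumes(ec2), unassociated_elastic_ips(ec2))
```

The functions only report candidates; deletion is deliberately left as a manual step after confirming with the owner.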
Using the right compute instance
Developers often do not know all the instance types available and ask for a resource based on what worked well for them previously, which can result in instances that are overkill for their workload.
So we normally ask them how much RAM, CPU and storage they require and allocate resources accordingly. The T-series instances are good for development use as they provide bursts of higher processing power, and it is normally worth choosing the latest generation of an instance type.
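Matching a stated requirement to an instance size can be sketched as picking the cheapest catalog entry that satisfies both the CPU and the RAM request. The catalog below is a small illustrative subset of the burstable t3 family; the hourly rates are assumptions for illustration only and should be checked against current AWS pricing for your region.

```python
# (name, vCPUs, memory in GiB, illustrative USD per hour) for a few t3 sizes.
T3_CATALOG = [
    ("t3.micro", 2, 1, 0.0104),
    ("t3.small", 2, 2, 0.0208),
    ("t3.medium", 2, 4, 0.0416),
    ("t3.large", 2, 8, 0.0832),
    ("t3.xlarge", 4, 16, 0.1664),
]


def smallest_fit(vcpus_needed, memory_gib_needed, catalog=T3_CATALOG):
    """Return the cheapest instance name meeting both requirements, or None."""
    candidates = [
        entry
        for entry in catalog
        if entry[1] >= vcpus_needed and entry[2] >= memory_gib_needed
    ]
    if not candidates:
        return None  # nothing in this catalog is big enough
    return min(candidates, key=lambda entry: entry[3])[0]


print(smallest_fit(2, 3))    # a request for 2 vCPUs and 3 GiB of RAM
print(smallest_fit(16, 64))  # a request larger than anything in the catalog
```

A None result is a useful signal in itself: it means the request falls outside the usual development sizes and deserves a conversation before anything is provisioned.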
For stateless applications, Auto Scaling is a good option, as it allows the number of resources in use to grow or shrink based on usage.
Another cost-reduction practice that we always consider is hosting a small application on a single EC2 instance instead of separate web and RDS servers (which developers sometimes ask for). Sometimes even a Lightsail instance (USD 3.5 / month) is sufficient for small applications.
Storage resource usage optimization
Cleaning up and Auditing resources
As in the case of compute resources, an audit of storage resources allows us to identify those that are no longer required.
Things that I review are:
- Multiple snapshots/images for the same instance
- S3 buckets (looking for temporary resources that were created and have not been accessed for a long time)
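The first check can be sketched as grouping snapshots by their source volume and flagging everything except the newest one. The function name and the keep parameter are hypothetical; the input is assumed to be the Snapshots list returned by a boto3 EC2 client's describe_snapshots(OwnerIds=["self"]) call, where each entry has SnapshotId, VolumeId and StartTime fields.

```python
from collections import defaultdict


def redundant_snapshots(snapshots, keep=1):
    """Return snapshot IDs beyond the newest `keep` for each source volume."""
    by_volume = defaultdict(list)
    for snapshot in snapshots:
        by_volume[snapshot["VolumeId"]].append(snapshot)

    redundant = []
    for volume_id in sorted(by_volume):
        newest_first = sorted(
            by_volume[volume_id], key=lambda s: s["StartTime"], reverse=True
        )
        # Everything after the first `keep` entries is a candidate for review.
        redundant.extend(s["SnapshotId"] for s in newest_first[keep:])
    return redundant


# Possible usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   snaps = boto3.client("ec2").describe_snapshots(OwnerIds=["self"])["Snapshots"]
#   print(redundant_snapshots(snaps, keep=2))
```

The result is a review list, not a deletion list: a snapshot that backs a registered image, for example, still needs to be kept.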
Choosing the right S3 storage classes
By default, objects in S3 are stored in the S3 Standard storage class. However, there are other options, especially if you store a lot of data. S3 Intelligent-Tiering is almost always a better option (if the objects being stored are individually large), as it automatically moves objects from the Standard tier to the Infrequent Access tier when they are not being accessed.
Defining archival and retention policy
Data archival means moving older objects to an archive location. S3 provides two options here, S3 Glacier and S3 Glacier Deep Archive, which are cheaper but slower to retrieve from (up to 12 hours for Deep Archive). S3 Lifecycle rules can be used to make sure that these governance rules are applied to the various types of objects stored on S3.
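An archival policy of this kind is expressed as a lifecycle configuration on the bucket. The sketch below builds one such configuration as a plain dictionary in the shape accepted by the boto3 S3 client's put_bucket_lifecycle_configuration call. The function name, the prefix, and the day thresholds (90, 180, and the optional expiry) are hypothetical values for illustration, not a recommended policy.

```python
def archival_rule(prefix, glacier_days=90, deep_archive_days=180, expire_days=None):
    """Build a lifecycle configuration that archives objects under `prefix`."""
    if not 0 < glacier_days < deep_archive_days:
        raise ValueError("glacier_days must be positive and below deep_archive_days")
    rule = {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": glacier_days, "StorageClass": "GLACIER"},
            {"Days": deep_archive_days, "StorageClass": "DEEP_ARCHIVE"},
        ],
    }
    if expire_days is not None:
        # Retention: delete objects once they are older than expire_days.
        rule["Expiration"] = {"Days": expire_days}
    return {"Rules": [rule]}


config = archival_rule("logs/", expire_days=3650)

# Possible usage, assuming boto3 is installed and "my-bucket" is your bucket:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=config
#   )
```

Building the configuration separately from applying it keeps the policy reviewable and testable before it touches a real bucket.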
Automated transition of Snapshots to S3 Glacier Deep Archive
A normal snapshot costs $0.05 per GB-month, whereas S3 Glacier Deep Archive costs $0.00099 per GB-month. While lifecycle policies can transition the objects that we store on S3 to Glacier and Deep Archive, snapshots are stored on S3 but not in our own storage areas, and so cannot be transitioned in this fashion.
To work around this issue, we converted the snapshot into an image, launched an instance from the image, detached the disk and made a raw copy of the disk into a file using dd. The file was then stored in one of our own S3 buckets and archived as per the S3 lifecycle policy.
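The size of the saving follows directly from the two per-GB rates quoted above. The calculation below is a rough sketch: it assumes the archived file is the same size as the billed snapshot data (a raw disk copy may be larger unless it is compressed) and it ignores retrieval charges and minimum storage durations for Deep Archive.

```python
SNAPSHOT_USD_PER_GB_MONTH = 0.05         # snapshot rate quoted in the text
DEEP_ARCHIVE_USD_PER_GB_MONTH = 0.00099  # Deep Archive rate quoted in the text


def monthly_storage_saving(size_gb):
    """Approximate monthly saving from holding size_gb in Deep Archive instead."""
    snapshot_cost = size_gb * SNAPSHOT_USD_PER_GB_MONTH
    archive_cost = size_gb * DEEP_ARCHIVE_USD_PER_GB_MONTH
    return round(snapshot_cost - archive_cost, 2)


# Example: 1000 GB of snapshot data costs about USD 50 per month as snapshots
# and about USD 0.99 per month in Deep Archive.
print(monthly_storage_saving(1000))
```

At these rates the archive costs roughly 2% of the snapshot price, which is what makes the manual conversion worthwhile for data that is rarely restored.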
AWS features
Another resource for managing the costs of different services is AWS Trusted Advisor. The online tool validates only some configurations for customers on Basic/Developer support and provides a more comprehensive validation for Business and Enterprise support customers. Of course, all AWS customers have access to the Trusted Advisor Best Practices Checklist, which provides advice not just on cost optimization but also on performance, security, fault tolerance and service limits.