Scaling AWS EC2 Instances

Migrating workloads AWS comes with many advantages. You can operate workloads in new ways. When you only pay for what you use and add capacity within minutes, the world of auto-scaling opens up.

When your workload is idling, you remove capacity and lower your expenses. When the workload is busy, you add capacity to keep the users happy. An excellent idea that works perfectly in theory. In practice, many pitfalls are waiting for you.

Vertical vs Horizontal scaling

Vertical scaling refers to the practice of upgrading the hardware of a machine that runs your workloads. Instead of running on 16 cores, you can run on 32 cores. The problem here is that you can not add as many cores as you wish. Infinite core CPUs are not yet invented. Besides that, CPUs with a high core density are usually more expensive than many CPUs with a lower core count.

On AWS, it is not possible to change the hardware specs of a running EC2 instance. Vertical scaling always comes with a short downtime. You have to stop the EC2 instance, change the instance type, and start the EC2 instance again.

Horizontal scaling is about adding more machines, usually by using cheap commodity hardware.

Auto Scaling Groups

The naming Auto Scaling Group (ASG) is misleading. The surprising fact is this: ASGs don’t provide auto-scaling capabilities. An ASG has one job: Keep the number of running EC2 instances in sync with a desired number of instances. The ASG itself does not modify the desired number of instances. An external trigger is responsible for adjusting the desired number of instances. By increasing the number, the ASG will add EC2 instances (scale-out). By decreasing the number, the ASL will terminate EC2 instances (scale-in).

AWS provides four options for Auto Scaling:

1- Simple Scaling: This option relies on a CloudWatch Alarm. The alarm watches a metric like network or CPU utilization or the requests per target behind a load balancer. If the metric reaches a predefined threshold, a scaling policy is triggered.

2- Step Scaling: Step scaling also required a CloudWatch Alarm. But you can react differently based on how bad things are. You could add one instance if the CPU is utilized more than 60%. But if the CPU utilization grows to 70%, you want to add two instances. If it rises to 80%, you add four instances. If it grows to 90%, you add eight.

3- Target Tracking: With target tracking, you ask AWS to keep a metric within a specific range. For example, you can tell AWS to keep your CPU utilization inside an ASG at 50%. AWS will adjust the number of instances in a way to reach the target. No CloudWatch Alarm is required here. AWS will create alarms on your behalf.

4- AWS Auto Scaling: AWS Auto Scaling can be confusing. It also provides target tracking, which is called dynamic scaling, but t. The difference is that you also get predictive scaling. To do so, AWS Auto Scaling takes the past 14 days into account to predict the next two days. This works great for workloads that are less used on weekends or during the night. Black Friday and large end-of-month batch jobs will not be detected.

In summary, no matter which option you choose, don’t forget to run extensive load tests to validate your configuration. Load testing is very time-consuming. You have to wait a lot to see how the system behaves while load patterns change.