Create both Development and Production-Ready AWS EKS Clusters using AWS CDK

As a Xerris Solutions Architect, I sometimes get customers asking how to maintain a Kubernetes cluster in AWS in the easiest way possible. Kubernetes is becoming the de facto standard for running container workloads and provides many benefits over traditional virtual machine-based architectures: it lets you scale compute resources seamlessly while enabling fast development and deployment cycles and quick rollbacks. The cost of this has traditionally been the high level of administration that comes with maintaining your own cluster, but AWS has a managed service that aims to ease exactly this problem.

AWS EKS and CDK

AWS EKS is a fully managed Kubernetes service that frees you from day-to-day cluster maintenance and instead lets you focus on the applications running on your cluster. It is also deeply integrated into the AWS ecosystem, with services such as Amazon CloudWatch, Auto Scaling groups, AWS Identity and Access Management (IAM), and Amazon Virtual Private Cloud (VPC). In addition, AWS automatically applies the latest security patches to your cluster's control plane, so known vulnerabilities there are taken care of for you.

When it comes to deploying your cluster you have a couple of options, including CloudFormation and Terraform, but the tool I am going to use here is the AWS Cloud Development Kit (CDK). It lets you describe your infrastructure using familiar programming languages like C# or Python.

In this post, we will set up both a cost-effective development cluster and a highly available production cluster from scratch using the CDK and C#.

Prerequisites

Before we start, we need to set up our CLI. Install the AWS CLI on your machine and configure it with a user who has been granted AdministratorAccess; the CDK needs broad permissions to create the infrastructure. In a production environment this should be limited to only the permissions actually required. Next, install the CDK, which can be done through npm:

npm install -g aws-cdk
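
If you want a quick sanity check before going further (not part of the original walkthrough), the following commands confirm the CDK is installed and your credentials resolve to the expected account:

cdk --version
aws sts get-caller-identity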

Now let’s create a CDK project:

cdk init app --language csharp

Open the project in Visual Studio, install the CDK NuGet packages, and we can get started. It is going to be a lot of code, but at the end it should all come together into a neat and maintainable way to manage your infrastructure.
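
If you prefer the command line over Visual Studio, the modules used throughout this post (they match the namespaces in the code below, assuming CDK v1 packages) can be added with the dotnet CLI:

dotnet add package Amazon.CDK
dotnet add package Amazon.CDK.AWS.EC2
dotnet add package Amazon.CDK.AWS.ECR
dotnet add package Amazon.CDK.AWS.EKS
dotnet add package Amazon.CDK.AWS.IAM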

VPC Setup

First, let's set up the VPC that our cluster will reside in. We give it large subnets to leave room for future growth and spread them across four AZs for maximum availability. With the configuration below, each AZ gets a /26 public subnet (64 addresses) and a much larger /20 private subnet (4,096 addresses) for the worker nodes and pods.

#nullable enable
using Amazon.CDK;
using Amazon.CDK.AWS.EC2;

namespace KubernetesDemo
{
    public class Network : Construct
    {
        public Vpc Vpc { get; }

        public Network(Construct scope, string id) : base(scope, id)
        {
            ISubnetConfiguration[]? subnetConfiguration = {
                new SubnetConfiguration {
                    SubnetType = SubnetType.PUBLIC,
                    Name = "public",
                    CidrMask = 26
                },
                new SubnetConfiguration {
                    SubnetType = SubnetType.PRIVATE,
                    Name = "private",
                    CidrMask = 20,
                },
            };

            Vpc = new Vpc(scope, "galaxy", new VpcProps
            {
                Cidr = "10.0.0.0/16",
                MaxAzs = 4,
                SubnetConfiguration = subnetConfiguration
            });
        }
    }
}

ECR Setup

We will need a place to store our container images, so we will create an ECR repository for each deployment environment we plan to run in.

First, a simple enum listing the environments we deploy to:

using System;

namespace KubernetesDemo
{
    public enum DeploymentEnvironment
    {
        Sandbox,
        Dev,
        Stage,
        Prod
    }
}

And the repository construct itself:

using Amazon.CDK;
using Amazon.CDK.AWS.ECR;

namespace KubernetesDemo
{
    public class ContainerRegistry : Construct
    {
        public Repository Registry { get; set; }

        public ContainerRegistry(Construct scope, string id, DeploymentEnvironment env) : base(scope, id)
        {
            var repositoryName = $"demo-{env.ToString().ToLower()}";
            Registry = new Repository(this, repositoryName, new RepositoryProps
            {
                RepositoryName = repositoryName,
                RemovalPolicy = RemovalPolicy.DESTROY
            });
        }
    }
}
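
As an optional tweak that is not part of the original construct, you could also export the repository URI as a stack output so the push target is easy to grab after deployment. A minimal sketch that would sit at the end of the ContainerRegistry constructor:

// Hypothetical addition: surface the ECR push URI after `cdk deploy`.
new CfnOutput(this, $"{repositoryName}-uri", new CfnOutputProps
{
    Description = "ECR repository URI to push images to",
    Value = Registry.RepositoryUri
});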

EKS Setup

Now we have to actually create the cluster. This involves setting up an administrator IAM role for accessing the cluster, as well as stack outputs that let us quickly extract those values and log in after the stack has finished creating.

using Amazon.CDK;
using Amazon.CDK.AWS.EC2;
using Amazon.CDK.AWS.ECR;
using Amazon.CDK.AWS.EKS;
using Amazon.CDK.AWS.IAM;

namespace KubernetesDemo
{
    public class KubernetesCluster : Construct
    {
        public Cluster Master { get; }

        public KubernetesCluster(Construct scope, string id, Vpc vpc, string clusterName = "yoda") : base(scope, id)
        {
            var clusterAdmin = SetupSecurity(scope, id);
            Master = new Cluster(this, $"{id}-eks-cluster", new ClusterProps
            {
                ClusterName = clusterName,
                Version = KubernetesVersion.V1_16,
                DefaultCapacity = 0,
                MastersRole = clusterAdmin,
                Vpc = vpc
            });
            new CfnOutput(scope, $"{id}-kube-cluster-name", new CfnOutputProps
            {
                Description = "name of cluster",
                Value = clusterName
            });
        }

        private static Role SetupSecurity(Construct scope, string id)
        {
            var clusterAdmin = new Role(scope, $"{id}-cluster-administrator", new RoleProps
            {
                RoleName = $"{id}-cluster-administrator",
                AssumedBy = new AccountRootPrincipal(),
                Description = "super admin"
            });
            new CfnOutput(scope, $"{id}-kube-role-arn", new CfnOutputProps
            {
                Description = "value of the cluster admin IAM role",
                Value = clusterAdmin.RoleArn
            });
            return clusterAdmin;
        }
    }
}
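
By default only the MastersRole above (and the identity that created the cluster) can administer the cluster. If you need to let additional IAM users in, the CDK exposes the aws-auth ConfigMap on the cluster object. Below is a hedged sketch that would go inside the KubernetesCluster constructor; the "ci-deployer" user name is purely an example, not something defined anywhere in this post:

// Hypothetical: map an existing IAM user into system:masters via the aws-auth ConfigMap.
var deployer = User.FromUserName(this, "ci-deployer", "ci-deployer");
Master.AwsAuth.AddUserMapping(deployer, new AwsAuthMapping
{
    Groups = new[] { "system:masters" },
    Username = "ci-deployer"
});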

Node Group Setup

Here we set up an abstract class that both our development and production node groups can inherit from. It also defines the basic autoscaling IAM policy that both will use.

using Amazon.CDK;
using Amazon.CDK.AWS.IAM;

namespace KubernetesDemo
{
    public abstract class AbstractNodeGroup : Construct
    {
        protected static readonly PolicyStatement autoScalePolicy = new PolicyStatement(new PolicyStatementProps
        {
            Effect = Effect.ALLOW,
            Actions = new[]
            {
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            },
            Resources = new[] { "*" }
        });

        protected AbstractNodeGroup(Construct scope, string id) : base(scope, id)
        {
        }
    }
}

Development Node Group Setup

Spot Instances are a great way to save money on workloads that can tolerate being terminated without notice. While a production environment might not fit that description, a development workload is a perfect candidate. Here we calculate our maximum spot price from the on-demand list price and a discount, then launch m5d.large nodes in an auto scaling group. With the numbers below, a $0.113/hour list price and a 70% discount give a maximum bid of roughly $0.034 per node-hour.

using System;
using System.Globalization;
using Amazon.CDK;
using Amazon.CDK.AWS.EC2;
using Amazon.CDK.AWS.EKS;

namespace KubernetesDemo
{
    public class SpotFleetWorkerGroup : AbstractNodeGroup
    {
        public SpotFleetWorkerGroup(Construct scope, string id, Cluster cluster) : base(scope, id)
        {
            const string instanceType = "m5d.large";
            var listPrice = new decimal(0.113);
            var discount = new decimal(0.70);
            //Get current price listed at https://www.ec2instances.info/?selected=t3.large
            var spotPrice = listPrice - listPrice * discount;
            var finalSpotPrice = spotPrice.ToString(CultureInfo.InvariantCulture);

            var spotFleet = cluster.AddAutoScalingGroupCapacity($"{id}-spot", new AutoScalingGroupCapacityOptions
            {
                SpotPrice = finalSpotPrice,
                InstanceType = new InstanceType(instanceType),
                MaxCapacity = 6,
                MinCapacity = 2,
            });
            spotFleet.Role.AddToPrincipalPolicy(autoScalePolicy);
        }
    }
}

Production Node Group Setup

For our production node group we want to maximize availability and reliability. We do this by creating a node group in every private subnet, one per AZ in our VPC. We also define the cluster-autoscaler manifest, which we will apply after the cluster has been created.

using System;
using System.Collections.Generic;
using Amazon.CDK;
using Amazon.CDK.AWS.EC2;
using Amazon.CDK.AWS.EKS;
using Amazon.CDK.AWS.IAM;

namespace KubernetesDemo
{
    public class AutoScalerNodeGroup : AbstractNodeGroup
    {
        public AutoScalerNodeGroup(Construct scope, string id, string nodeGroupName, Cluster cluster,
            string instanceType = "c5d.large") : base(scope, $"{id}-{nodeGroupName}")
        {
            for (var index = 0; index < cluster.Vpc.PrivateSubnets.Length; index++)
            {
                var subnet = cluster.Vpc.PrivateSubnets[index];
                CreateNodeGroup($"{id}-{nodeGroupName}", cluster, nodeGroupName, instanceType, subnet, (AZ)index, 1);
            }
        }

        private static void CreateNodeGroup(string id, Cluster cluster, string nodeGroupName, string instanceType,
            ISubnet subnet, AZ az, int actualMinSize = 1, int actualMaxSize = 5)
        {
            var nodegroup = cluster.AddNodegroupCapacity($"{id}-{az}", new NodegroupProps
            {
                NodegroupName = $"{cluster.ClusterName}-{nodeGroupName}-{az}",
                InstanceType = new InstanceType(instanceType),
                MinSize = actualMinSize,
                MaxSize = actualMaxSize,
                Subnets = new SubnetSelection { Subnets = new[] { subnet } },
                Tags = new Dictionary<string, string>
                {
                    {"k8s.io/cluster-autoscaler/enabled", ""}
                }
            });
            nodegroup.Role.AddManagedPolicy(ManagedPolicy.FromAwsManagedPolicyName("CloudWatchAgentServerPolicy"));
            nodegroup.Role.AddToPrincipalPolicy(autoScalePolicy);
        }

        private enum AZ
        {
            a, b, c, d, e, f, g, h, i, j, k
        }
    }
}
And here is the cluster-autoscaler manifest (autoscaler.yaml) that we will apply once the cluster is up:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: 'false'
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.16.5
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"

Putting it all Together

Here we put all the pieces together into two stacks: a development stack with the Spot Instance node group, and a production stack with the highly available node group.

using Amazon.CDK;
using Amazon.CDK.AWS.EC2;

namespace KubernetesDemo
{
    public class DemoStack : Amazon.CDK.Stack
    {
        public readonly Vpc Vpc;

        internal DemoStack(Construct scope, string id, DeploymentEnvironment env, IStackProps props = null) :
            base(scope, id, props)
        {
            // Network
            Vpc = new Network(this, $"{id}-supernetwork").Vpc;

            // Kubernetes
            var ecr = new ContainerRegistry(this, $"{id}-container-registry", env);
            var k8 = new KubernetesCluster(this, $"{id}-cluster", Vpc, "democluster");
            if (env == DeploymentEnvironment.Dev || env == DeploymentEnvironment.Sandbox)
            {
                var sandboxCluster = new SpotFleetWorkerGroup(this, id, k8.Master);
            }
            else
            {
                var apiCluster1 = new AutoScalerNodeGroup(this, id, "api-cluster-v1", k8.Master);
                apiCluster1.Node.AddDependency(k8);
            }
        }
    }
}

And finally, the entry point that creates a stack for each environment:

using System;
using System.Collections.Generic;
using Amazon.CDK;
using Amazon.CDK.AWS.EC2;

namespace KubernetesDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            var app = new App();
            CreateDemoStack(app, DeploymentEnvironment.Dev);
            CreateDemoStack(app, DeploymentEnvironment.Prod);
            app.Synth();
        }

        private static void CreateDemoStack(App app, DeploymentEnvironment env)
        {
            var stackName = $"{env.ToString().ToLower()}-demo";
            var kube = new DemoStack(app, stackName, env,
                new StackProps
                {
                    StackName = stackName,
                    Env = new Amazon.CDK.Environment
                    {
                        Region = "us-west-2",
                        Account = "" // set this to your AWS account ID
                    },
                    Tags = new Dictionary<string, string>
                    {
                        {"k8s.io/cluster-autoscaler/enabled", "enabled"},
                        {"k8s.io/cluster-autoscaler/yoda", "enabled"}
                    }
                });
        }
    }
}

Deploying our Infrastructure

To deploy the development environment, all you need to do is run:

cdk bootstrap
cdk deploy dev-demo

Or, for the production environment:

cdk bootstrap
cdk deploy prod-demo
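
Either way, it is worth previewing the CloudFormation changes before deploying; this is optional, just a good habit:

cdk diff prod-demo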

Once everything is deployed, the stack outputs contain what you need to configure kubectl to connect to your cluster. The command will be similar to:

aws eks update-kubeconfig --name democluster --role-arn arn:aws:iam::123456789:role/dev-demo-cluster-cluster-administrator --region us-west-2
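
A quick check that your kubeconfig works and the worker nodes have joined (optional sanity check):

kubectl get nodes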

Then to start the autoscaler:

kubectl apply -f autoscaler.yaml
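
To verify the autoscaler came up, you can check its Deployment and tail its logs; these are standard kubectl commands, nothing specific to this setup:

kubectl -n kube-system get deployment cluster-autoscaler
kubectl -n kube-system logs -f deployment/cluster-autoscaler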




Deleting our Infrastructure

Once you are done, you can delete everything by running the commands below. Note that CloudFormation cannot delete an ECR repository that still contains images, so empty the demo repositories first even though they are created with RemovalPolicy.DESTROY.

cdk destroy dev-demo
cdk destroy prod-demo




Conclusions

Now that we have our cluster up and running, we can look at tools to deploy our images, like Flux, or monitoring tools like Prometheus. That is out of scope for this post, but the sky is the limit with Kubernetes, and its extensibility leads to lots of great workflows. Thank you for reading; you can find the full CDK code here. If you want to find out more about scaling up your infrastructure using Kubernetes, feel free to get in touch with us at Xerris and we can help you craft innovative, cloud-focused solutions for your business.
