AWS Data Lake with Terraform - Parts 4-5 of 6
Up to this point in this series we have discussed data ingestion, data collection and data analysis. I bet you have been wondering how we can protect this infrastructure and its data.
In parts 4 and 5 we will learn how to secure and monitor this project using some powerful AWS services: IAM, KMS and CloudWatch.
Let’s first start by understanding the benefits of IAM security and how it can help to secure our project.
What is IAM and what are some of its benefits?
IAM stands for Identity and Access Management. IAM is an important AWS service that enables you to control access to, and use of, your AWS resources and services in one place.
IAM also manages several kinds of identities and related objects, including users, groups, roles and policies.
In this section we will focus only on roles and policies. Let’s start by answering a few questions.
What is an IAM role and what are some of the benefits?
An IAM role is an identity that has permissions to make AWS service requests. As simple as that.
Some of the benefits are:
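# S3 bucket that receives the delivered logs.
# var.bucket_name is expected to be declared elsewhere in the configuration.
resource "aws_s3_bucket" "data_logs" {
  bucket = var.bucket_name
}

# Role that the Firehose service is allowed to assume.
resource "aws_iam_role" "f_role" {
  name = "f_role"

  # Trust policy: only firehose.amazonaws.com may call sts:AssumeRole.
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "firehose.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}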
What about Role limitation?
What is a policy?
Policy structure components:
Actions: define which actions are allowed on an AWS service.
Resources: define which resources the actions can be performed on.
Effect: defines whether the user or role is allowed or denied the actions on the resources. Everything is denied by default, so you need to allow access explicitly.
# Inline policy attached to the Firehose role defined earlier.
# Both statements target the data_logs bucket.
resource "aws_iam_role_policy" "f_delivery_policy" {
  role = aws_iam_role.f_role.id

  policy = <<EOT
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "${aws_s3_bucket.data_logs.arn}"
    },
    {
      "Action": [
        "s3:*"
      ],
      "Effect": "Allow",
      "Resource": "${aws_s3_bucket.data_logs.arn}"
    }
  ]
}
EOT
}
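The Actions, Resources and Effect components described above can also be written in HCL with the `aws_iam_policy_document` data source instead of a JSON heredoc. The block below is only a sketch under assumptions: the data source name `f_delivery`, the policy name, the statement IDs and the narrower permissions are hypothetical and are not taken from the original policy. It reuses `aws_s3_bucket.data_logs` and `aws_iam_role.f_role` from this post.

```hcl
# Hypothetical, narrower variant of the delivery policy (names are illustrative).
data "aws_iam_policy_document" "f_delivery" {
  # s3:ListAllMyBuckets is an account-level action, so it is scoped to "*".
  statement {
    sid       = "ListBuckets"
    effect    = "Allow"
    actions   = ["s3:ListAllMyBuckets"]
    resources = ["*"]
  }

  # s3:PutObject acts on objects, so the resource is the bucket ARN plus "/*".
  statement {
    sid       = "WriteLogs"
    effect    = "Allow"
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.data_logs.arn}/*"]
  }
}

# Attach the rendered JSON to the role.
resource "aws_iam_role_policy" "f_delivery_policy_doc" {
  name   = "f-delivery-policy-doc"
  role   = aws_iam_role.f_role.id
  policy = data.aws_iam_policy_document.f_delivery.json
}
```

One reason to consider this form is that `terraform validate` can catch a mistyped resource reference or a malformed statement, which is harder to spot inside a heredoc string.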
As we can see, roles and policies are versatile features of IAM that can be used in many different scenarios.
The challenge here is how to protect data in transit. Do not worry, AWS has your back.
Let me introduce you to AWS Key Management Service (KMS).
What is KMS and what are some of KMS benefits?
KMS is a fully managed service that supports encryption of your data at rest or in transit.
How does KMS work?
KMS allows you to create keys to encrypt your data and provides fully managed, highly available key storage. You can encrypt your data within your applications and across accounts. Another important aspect of KMS is cost: you pay a small amount per key and per request, and AWS managed keys are stored in your account at no charge.
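As an illustration of creating a key and using it to encrypt data, the sketch below defines a customer managed key and sets it as the default encryption for the `data_logs` bucket from this post. The resource names, the description, the deletion window and the rotation setting are assumptions made for the example. It also assumes AWS provider v4 or later, where bucket encryption is a standalone resource.

```hcl
# Hypothetical customer managed key; all settings here are illustrative.
resource "aws_kms_key" "data_logs_key" {
  description             = "Key for encrypting the data lake log bucket"
  deletion_window_in_days = 10
  enable_key_rotation     = true
}

# Default server-side encryption for the data_logs bucket defined in this post.
resource "aws_s3_bucket_server_side_encryption_configuration" "data_logs" {
  bucket = aws_s3_bucket.data_logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.data_logs_key.arn
    }
  }
}
```

With a setup like this, any role that writes to the bucket would most likely also need permission to use the key (for example `kms:GenerateDataKey`), so the role policy would have to be extended accordingly.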
What are some of KMS benefits?
Finally, how to control the flow of your infrastructure using CloudWatch
As before, let’s begin by answering a few questions.
What is CloudWatch and what are some of CloudWatch benefits?
Amazon CloudWatch is a monitoring service that allows you to collect metrics from your AWS resources and applications.
CloudWatch does not stop at monitoring. You can also create alarms that continuously watch performance, health checks and billing. This allows you to act proactively when you approach a budget or exceed a threshold set by your department or administrative team.
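To make the alarm idea concrete, here is a minimal sketch of a CPU alarm. The alarm name, the 80% threshold and the evaluation settings are assumptions chosen for illustration, and the instance ID is the same placeholder used in the dashboard example in this post.

```hcl
# Hypothetical alarm: name, threshold and periods are illustrative values.
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "ec2-cpu-high"
  alarm_description   = "Average CPU above 80% for two consecutive 5-minute periods"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  period              = 300
  evaluation_periods  = 2
  comparison_operator = "GreaterThanThreshold"
  threshold           = 80

  dimensions = {
    InstanceId = "i-012345" # placeholder instance ID
  }

  # alarm_actions could list an SNS topic ARN here to notify a team.
}
```

The alarm only changes state when the average stays above the threshold for both periods, which helps avoid notifications for short spikes.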
CloudWatch monitoring model:
Benefits:
resource "aws_cloudwatch_dashboard" "thresholds_control" {
  dashboard_name = "admin-dashboard"

  dashboard_body = <<EOF
{
  "widgets": [
    {
      "type": "metric",
      "x": 0,
      "y": 0,
      "width": 12,
      "height": 6,
      "properties": {
        "metrics": [
          [
            "AWS/EC2",
            "CPUUtilization",
            "InstanceId",
            "i-012345"
          ]
        ],
        "period": 300,
        "stat": "Average",
        "region": "us-east-1",
        "title": "EC2 Instance CPU"
      }
    },
    {
      "type": "text",
      "x": 0,
      "y": 7,
      "width": 3,
      "height": 3,
      "properties": {
        "markdown": "We are monitoring"
      }
    }
  ]
}
EOF
}

On the other hand, Terraform offers you a state file where you can read the configuration of your resources. If you prefer visualizations, Terraform also builds a dependency graph for you behind the scenes.
If you are not familiar with this feature, let me share it with you.
What is a dependency graph?
A dependency graph is a directed graph representing dependencies of several objects towards each other. It is possible to derive an evaluation order or the absence of an evaluation order that respects the given dependencies from the dependency graph.
Source: Wikipedia, Dependency graph
Terraform dependency-graph-sample
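To show where the edges of that graph come from, the sketch below contrasts an implicit dependency (created by referencing another resource) with an explicit one (declared with `depends_on`). The resource names `example_implicit` and `example_explicit`, the log group name and the policy contents are hypothetical. `aws_iam_role.f_role` is the role defined earlier in this post.

```hcl
# Implicit edge: referencing aws_iam_role.f_role.id tells Terraform
# to create the role before this policy.
resource "aws_iam_role_policy" "example_implicit" {
  name = "example-implicit"
  role = aws_iam_role.f_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["logs:PutLogEvents"]
        Resource = "*"
      }
    ]
  })
}

# Explicit edge: nothing in this resource refers to the policy,
# so depends_on adds the ordering by hand.
resource "aws_cloudwatch_log_group" "example_explicit" {
  name       = "/example/firehose"
  depends_on = [aws_iam_role_policy.example_implicit]
}
```

Running `terraform graph` in the project directory prints this graph in DOT format, which a tool such as Graphviz can turn into an image.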
