Create a Managed Cassandra Database (AWS Keyspaces) using AWS CDK
Apache Cassandra is an open-source, distributed NoSQL database, popular for its linear scalability, proven fault tolerance, and high performance.
Cassandra provides the Cassandra Query Language (CQL), an SQL-like language, to create and update the database schema and to access data. CQL syntax is very similar to SQL, but there are some limitations, such as no joins and only limited aggregation.
Learn more about CQL: https://www.guru99.com/cassandra-query-language-cql-insert-update-delete-read-data.html
- Keyspace: defines how a dataset is replicated, for example in which datacenters and how many copies. Keyspaces contain tables.
- Table: defines the typed schema for a collection of partitions. New columns can be added to Cassandra tables flexibly, with zero downtime. Tables contain partitions, which contain rows, which contain columns.
- Partition: defines the mandatory part of the primary key all rows in Cassandra must have. All performant queries supply the partition key in the query.
- Row: contains a collection of columns identified by a unique primary key made up of the partition key and optionally additional clustering keys.
- Column: a single datum with a type, which belongs to a row.
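To make the hierarchy above a bit more concrete, here is a small sketch in plain Python that models it with dataclasses. It is only an illustration of how the primary key is composed from the partition key and optional clustering keys; the class names and the example schema (`shop`, `orders_by_customer`) are hypothetical and not part of Cassandra or any driver.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    """A single typed datum definition, e.g. ("customer_id", "uuid")."""
    name: str
    cql_type: str


@dataclass
class Table:
    """A typed schema: partition key, optional clustering keys, other columns."""
    name: str
    partition_key: list[Column]
    clustering_keys: list[Column] = field(default_factory=list)
    regular_columns: list[Column] = field(default_factory=list)

    def primary_key(self) -> list[str]:
        # The primary key is the partition key followed by any clustering keys.
        return [c.name for c in self.partition_key + self.clustering_keys]


@dataclass
class Keyspace:
    """A keyspace groups tables (replication settings are omitted here)."""
    name: str
    tables: dict[str, Table] = field(default_factory=dict)


# Hypothetical example schema, for illustration only.
orders = Table(
    name="orders_by_customer",
    partition_key=[Column("customer_id", "uuid")],
    clustering_keys=[Column("order_time", "timestamp")],
    regular_columns=[Column("total", "decimal")],
)
shop = Keyspace(name="shop", tables={orders.name: orders})

print(orders.primary_key())  # ['customer_id', 'order_time']
```

A query that supplies `customer_id` hits a single partition, which is why performant queries always include the partition key.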
Amazon Keyspaces is a managed, Apache Cassandra-compatible database service. With Amazon Keyspaces, you can easily run your Cassandra workloads on AWS using the same Cassandra application code and developer tools. It is a serverless service, so it removes the burden of provisioning, patching, and maintaining servers, and you pay only for what you use. Amazon Keyspaces automatically scales tables up and down in response to application traffic.
- Virtually unlimited throughput and storage
- Data is encrypted by default
- Enables you to back up your table data continuously using point-in-time recovery
AWS CDK helps you provision cloud infrastructure resources in AWS faster, in your favorite language. I decided to use Python to provision the infrastructure for this tutorial, and it's super easy.
cdk init app --language python
Note: cdk init uses the name of the project folder to name various elements of the project, including classes, subfolders, and files.
I ran the cdk init command in an empty folder called cassandra_cdk.
# For Linux environments
source .venv/bin/activate
The .venv folder is your Python virtual environment directory.
python -m pip install -r requirements.txt
Set credentials in the AWS credentials profile file on your local system, located at:
~/.aws/credentials on Linux, macOS, or Unix
C:\Users\USERNAME\.aws\credentials on Windows
[default]
aws_access_key_id = your_access_key_id
aws_secret_access_key = your_secret_access_key
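If you want to double-check that the file is in the expected format before running any cdk commands, a small sketch like the one below can help. It only uses Python's standard configparser module; the helper name `load_profile` is my own and is not part of the AWS CLI or CDK.

```python
import configparser
from pathlib import Path


def load_profile(path: Path, profile: str = "default") -> dict:
    """Return the key/value pairs of one profile from an AWS credentials file."""
    # interpolation=None so a '%' inside a secret is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently ignored by configparser
    if not parser.has_section(profile):
        raise ValueError(f"profile {profile!r} not found in {path}")
    return dict(parser[profile])


if __name__ == "__main__":
    creds_path = Path.home() / ".aws" / "credentials"
    try:
        # Print only the key names, never the secret values.
        print("keys found:", sorted(load_profile(creds_path)))
    except ValueError as exc:
        print(exc)
```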
Alternatively, set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
To set these variables on Linux, macOS, or Unix, use:
export AWS_ACCESS_KEY_ID=your_access_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_access_key
To set these variables on Windows, use:
set AWS_ACCESS_KEY_ID=your_access_key_id
set AWS_SECRET_ACCESS_KEY=your_secret_access_key
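A quick way to confirm that both variables are actually visible to the Python process that will run your CDK app is a check like the following. This is just a convenience sketch; the function name `missing_credentials` is hypothetical.

```python
import os

REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Both credential variables are set.")
```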
For an EC2 instance, specify an IAM role and then give your EC2 instance access to that role.
6) Bootstrapping: provisioning the initial resources, such as an S3 bucket for storing files and IAM roles that grant the permissions needed to perform deployments.
cdk bootstrap
After the stack is deployed with cdk deploy, you will notice in the AWS Management Console that the keyspace has been provisioned and the table we previously defined in the CDK code has been created.
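A stack of the kind referred to above can be sketched roughly as follows. This is not the exact code from the repository; it is a minimal sketch that assumes AWS CDK v2 (the aws-cdk-lib package) and its low-level `CfnKeyspace` and `CfnTable` constructs. The keyspace name, table name, and columns are hypothetical placeholders.

```python
# Assumption: AWS CDK v2 (`aws-cdk-lib`). With CDK v1 the import paths differ.
# All names and columns below are hypothetical placeholders.
KEYSPACE_NAME = "demo_keyspace"
TABLE_NAME = "demo_table"
PARTITION_KEY = [("id", "text")]
REGULAR_COLUMNS = [("name", "text"), ("created_at", "timestamp")]


def add_keyspace_resources(scope):
    """Add a keyspace and one table to the given CDK stack (or other scope)."""
    # Imported here so the constants above stay usable without CDK installed.
    from aws_cdk import aws_cassandra as cassandra

    def columns(pairs):
        return [
            cassandra.CfnTable.ColumnProperty(column_name=n, column_type=t)
            for n, t in pairs
        ]

    keyspace = cassandra.CfnKeyspace(scope, "Keyspace", keyspace_name=KEYSPACE_NAME)

    table = cassandra.CfnTable(
        scope,
        "Table",
        keyspace_name=KEYSPACE_NAME,
        table_name=TABLE_NAME,
        partition_key_columns=columns(PARTITION_KEY),
        regular_columns=columns(REGULAR_COLUMNS),
        # Pay-per-request capacity, matching the serverless model described earlier.
        billing_mode=cassandra.CfnTable.BillingModeProperty(mode="ON_DEMAND"),
        point_in_time_recovery_enabled=True,
    )
    # The table refers to the keyspace only by name, so declare the ordering explicitly.
    table.node.add_dependency(keyspace)
    return keyspace, table
```

In the project generated by cdk init, you would call `add_keyspace_resources(self)` from the `__init__` method of the generated stack class.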
cqlsh is usually bundled with Cassandra. To make things easier, I will use the cassandra:3.11.7 Docker image to run CQL queries through the cqlsh utility.
Pull the docker image
docker pull cassandra:3.11.7
Create a container from the image and start bash, so that you can run commands in it.
docker run -it <Image id> /bin/bash
Amazon Keyspaces only accepts secure connections using Transport Layer Security (TLS). Therefore, to connect using SSL/TLS, download the Starfield digital certificate:
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note down the certfile path.
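The certificate path matters because cqlsh needs to know where the certificate lives when it is started with --ssl. One common way to tell it is a cqlshrc file with an [ssl] section. The sketch below generates such a file with Python's configparser; the section and key names follow the cqlshrc layout as I understand it, so treat them as assumptions and compare against the Amazon Keyspaces documentation. The helper name `write_cqlshrc` is hypothetical.

```python
import configparser
from pathlib import Path


def write_cqlshrc(certfile: Path, target: Path) -> None:
    """Write a minimal cqlshrc that points cqlsh at the downloaded certificate."""
    config = configparser.ConfigParser(interpolation=None)
    config["connection"] = {
        "port": "9142",
        "factory": "cqlshlib.ssl.ssl_transport_factory",
    }
    config["ssl"] = {
        "validate": "true",
        "certfile": str(certfile),
    }
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w") as fh:
        config.write(fh)


# Example call (inside the container, cqlsh typically reads ~/.cassandra/cqlshrc):
# write_cqlshrc(Path("sf-class2-root.crt").resolve(),
#               Path.home() / ".cassandra" / "cqlshrc")
```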
Before connecting, generate AWS Keyspace credentials for your IAM user by following the guide "How to Generate AWS Keyspace credentials for your IAM User". From this step, you will obtain a username and password for Keyspace.
Then choose the correct service endpoint for your region from the "List of service endpoints for Keyspace".
Run the following command to connect to the Cassandra database:
cqlsh <keyspace service endpoint> 9142 -u "<generated-keyspace-username>" -p "<generated-keyspace-password>" --ssl
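If you connect to more than one region, it can be handy to assemble this command programmatically. The sketch below assumes the usual endpoint naming pattern `cassandra.<region>.amazonaws.com`, which does not hold for every partition, so confirm the endpoint against the official list. The function names and the sample username and password are placeholders.

```python
import shlex


def keyspaces_endpoint(region: str) -> str:
    """Assumed naming pattern for standard regions; verify against the endpoint list."""
    return f"cassandra.{region}.amazonaws.com"


def cqlsh_command(region: str, username: str, password: str, port: int = 9142) -> list:
    """Build the cqlsh argument list shown above for a given region."""
    return [
        "cqlsh", keyspaces_endpoint(region), str(port),
        "-u", username,
        "-p", password,
        "--ssl",
    ]


# Placeholder credentials, for illustration only.
cmd = cqlsh_command("us-east-1", "alice-at-111122223333", "example-password")
print(shlex.join(cmd))
# cqlsh cassandra.us-east-1.amazonaws.com 9142 -u alice-at-111122223333 -p example-password --ssl
```

Building the command as a list (rather than one string) means it can be passed straight to `subprocess.run` without worrying about shell quoting of the password.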
After you execute this command, you will be dropped into the cqlsh command-line shell, where you can run queries against the Keyspace (Cassandra) database.
cdk destroy
Github repository link: https://github.com/chathra222/aws-managed-cassandra-cdk