24
intro to 4 types of NoSQL databases
There is no one fixed solution for choosing a database. This blog covers the 4 types of NoSQL databases which are Key-value pairs, Columnar, Document, and Graph. Along with some examples of where to use each type of database.
What do you mean by a database?
A database is any type of mechanism used for storing, managing, and retrieving information. It is a repository or collection of data.
A database’s implementation and how data is structured will determine how well an application will perform as it scales. There are two primary types of databases, relational and non-relational.
Each database type is optimized to support a specific type of workload. Matching an application with the appropriate database type is essential for highly performant and cost-efficient operation.
Why to use Non relational data bases?
While relational databases are highly-structured repositories of information, non-relational databases do not use a fixed table structure. They are schema-less.
Since it doesn’t use a predefined schema that is enforced by a database engine, a non-relational database can use structured, semi-structured, and unstructured data without difficulty.

NoSQL databases are popular with developers because they do not require an upfront schema design; they are able to build code without waiting for a database to be designed and built.
How to choose between Consistency, Availability and Partition tolerance(CAP theorem)?
Most NoSQL databases relax ACID constraints found in relational databases. NoSQL solutions were developed around the purpose of providing high availability and scalability in a distributed environment. To do this, either consistency or durability has to be sacrificed according to CAP theorem.

By relaxing consistency, distributed systems can be highly available and durable. It's possible for data to be inconsistent; a query might return old or stale data. You might hear this phenomenon described as being eventually consistent. Over time, data that is spread across storage nodes will replicate and become consistent. What makes this behavior acceptable is that developers can anticipate this eventual consistency and allow for it.
Scaling a NoSQL database is easier and less expensive than scaling a relational database because the scaling is horizontal instead of vertical. NoSQL databases generally trade consistency for performance and scalability.
NoSQL databases are often run in clusters of computing nodes. Data is partitioned across multiple computers so that each computer can perform a specific task independently of the others. Each node performs its task without having to share CPU, memory, or storage with other nodes. This is known as a shared-nothing architecture.

Different types of NoSQL databases
There are a variety of No SQL data base applications ranging from key value to graph based data bases. A single database can't be efficient in all the cases and we have to choose a data base efficient for our use case,
1.Key value data store:
Saves data as a group of key value pairs, which are made up of two data items that are linked. The link between the items is a "key" which acts as an identifier for an item within the data and the "value"that is the data that has been identified.

Key value systems treat the data as a single opaque collection which may have different fields for every record. Hence, they generally use much less memory while saving and storing the same amount of data, which in turn, increases the performance for certain workloads. In each key value pair,
The value is stored as a blob requiring no upfront data modeling. This offers considerable flexibility and more closely follows modern concepts like object-oriented programming. As optional values are not represented by placeholders, it leads to less memory used to store. The storage of the value as a blob removes the need to index the data to improve performance. You cannot filter or control whats returned from a request based on the value because the value is opaque.
When to use key value stores ?
Key-value stores handle size well and are good at processing a constant stream of read/write operations with low latency making them perfect for,
Advantages of Key value stores,
Disadvantages of Key value stores,
2.Columnar data store:
While a relational database is optimized for storing rows of data, typically for transactional applications, a columnar database is optimized for fast retrieval of columns of data, typically in analytical applications.
Column-oriented storage for database tables is an important factor in analytic query performance because it drastically reduces the overall disk I/O requirements and reduces the amount of data you need to load from the disk.

Like other NoSQL databases, column-oriented databases are designed to scale “out” using distributed clusters of low-cost hardware to increase throughput, making them ideal for data warehousing and Big Data processing.
Terminologies used in a columnar data store?
- Tables contain partitions, which contain partitions, which contain columns.

Most column stores are traditionally loaded in batches that can optimize the column store process and the data is pretty much always compressed on disk to overcome the increased amount of data to be stored. Hence we mostly use Column data stores for data warehousing and data processing, which is evident in services such as Amazon Redshift.
Advantages of column data stores
Disadvantages of column data store
3.Document Data store:
A document database is a type of non relational database that is designed to store and query data as JSON-like documents. Document databases make it easier for developers to store and query data in a database by using the same document-model format they use in their application code.
They are flexible, semi structured, and hierarchical nature of documents and document databases allows them to evolve with applications’ needs. A document database is, at its core, a key/value store with one major exception. Instead of just storing any blob in it, a document db requires that the data will be store in a format that the database can understand.

The metadata also allows the document format to be changed at any time without affecting the existing records. Fields can be added at any time, anywhere, with no need to change the physical storage.
Terminologies in document data store?

The document model works well with use cases such as catalogs, user profiles, and content management systems where each document is unique and evolves over time.
Advantages of Document Data stores:
Disadvantages of Document data stores:
4.Graph data store:
A graph database is a database designed to treat the relationships between data as equally important to the data itself. Accessing nodes and relationships in a native graph database is an efficient, constant-time operation and allows you to quickly traverse millions of connections per second per core.
Independent of the total size of your dataset, graph databases excel at managing highly-connected data and complex queries.

Data Modeling ?
Nodes are the entities in the graph. They can hold any number of attributes called properties. Nodes can be tagged with labels, representing their different roles in your domain. Node labels may also serve to attach metadata to certain nodes.
Relationships provide directed, named, semantically-relevant connections between two node entities. A relationship always has a direction, a type, a start node, and an end node. Like nodes, relationships can also have properties.
Developing with graph technology aligns perfectly with today’s agile, test-driven development practices, allowing your graph-database-backed application to evolve with your changing business requirements. As, you are the one dictating changes and taking charge.
Managing data as graphs is a particularly good fit when the use case involves modifying schemas and accommodating new features, data points, or sources.
Due to the efficient way relationships are stored, two nodes can share any number or type of relationships without sacrificing performance. Although they are stored in a specific direction, relationships can always be navigated efficiently in either direction.

Graph databases are most useful when handling data containing relationships that have important associated context. Some examples are,
Advantages of Graph data stores,
Disadvantages of Graph data stores,
24