NoSQL - Wide column, Graph and Search databases

Wide column databases

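A wide column database uses tables, rows, and columns, but unlike an RDBMS, the names and format of the columns can vary from row to row in the same table. A wide-column store can be interpreted as a two-dimensional key-value store. Examples of NoSQL wide column databases are Apache Cassandra, DataStax Enterprise (a commercial distribution built on Cassandra), Apache HBase and Google Bigtable. Amazon Redshift and Snowflake are sometimes listed in this category as well, but they are column-oriented SQL data warehouses rather than wide column stores.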
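The "two-dimensional key-value store" view can be sketched in a few lines of Python. This is only an in-memory illustration of the data model, not how Cassandra or any other product is implemented; the class name `WideColumnTable` and the sample sensor rows are hypothetical.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide column table: row key -> column name -> value."""

    def __init__(self):
        # Outer key is the row key, inner key is the column name.
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        return self._rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        return sorted(self._rows.get(row_key, {}))


table = WideColumnTable()
table.put("sensor-1", "temperature", 21.5)
table.put("sensor-1", "humidity", 40)
table.put("sensor-2", "temperature", 19.0)
table.put("sensor-2", "vibration", 0.02)  # a column that sensor-1 does not have

print(table.columns("sensor-1"))           # ['humidity', 'temperature']
print(table.columns("sensor-2"))           # ['temperature', 'vibration']
print(table.get("sensor-1", "vibration"))  # None
```

Both rows live in the same table, yet each carries its own set of columns, which is the property that distinguishes this model from a fixed relational schema.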

Genuine column-oriented databases adopt a columnar data layout in which each column is stored separately on disk. Wide-column databases often support the notion of column families, which are stored separately from one another, but they do not necessarily store each individual column separately on disk.

Instead, each column family typically contains multiple columns that are used together, similar to a traditional relational database table. Within a given column family, all data is stored row by row, so that the columns for a given row are stored together.
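The difference between the two layouts can be made concrete with a small sketch. It assumes a hypothetical machine table and two made-up column families (`meta` and `metrics`); real storage engines add files, compression, and sorting on top, so this only shows how the values are grouped.

```python
rows = [
    {"id": 1, "name": "pump", "temp": 71.0, "rpm": 1200},
    {"id": 2, "name": "fan", "temp": 35.5, "rpm": 900},
    {"id": 3, "name": "press", "temp": 88.2, "rpm": 300},
]

# Hypothetical column families: columns that are used together are grouped.
families = {"meta": ["name"], "metrics": ["temp", "rpm"]}


def column_family_layout(rows, families):
    """One store per family; inside a family, values are kept row by row."""
    return {
        family: [(r["id"], tuple(r[c] for c in cols)) for r in rows]
        for family, cols in families.items()
    }


def columnar_layout(rows):
    """One store per column, as in a genuine column-oriented database."""
    columns = [c for c in rows[0] if c != "id"]
    return {c: [r[c] for r in rows] for c in columns}


by_family = column_family_layout(rows, families)
by_column = columnar_layout(rows)

print(by_family["metrics"])
# [(1, (71.0, 1200)), (2, (35.5, 900)), (3, (88.2, 300))]
print(by_column["temp"])
# [71.0, 35.5, 88.2]
print(max(by_column["temp"]))  # 88.2
```

In the columnar layout, an aggregate over `temp` touches only the `temp` values, while in the column family layout it has to read `rpm` along with it.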

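Column-oriented databases are well suited for workloads where applications need to read a few columns of many rows at once, which makes them a very good fit for OLAP applications. That’s also the reason why Amazon Redshift and Snowflake, which use columnar storage, are considered modern data warehouse databases. Wide column stores such as Cassandra are instead optimized for high write throughput and fast lookups by row key.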

When to apply?

Advantages:

  • Highly scalable and performant with large datasets

  • Flexible columnar model

  • Good write performance

Disadvantages:

  • Complex management and data modeling

  • No real joins or transactions

  • Consistency can be limited (depending on configuration)

Use cases:

  • IoT data at massive scale

  • Data warehousing for large datasets

Graph databases

Graph databases are purpose-built to store and navigate relationships. They use nodes to store data entities and edges to store relationships between entities. Examples of such NoSQL databases are Neo4j, ArangoDB and Amazon Neptune.

An edge always has a start node, end node, type, and direction. An edge can describe parent-child relationships, actions, ownership, and the like. There is no limit to the number and kind of relationships a node can have.
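A minimal sketch of this edge structure, assuming a simple in-memory model: the `Node` and `Edge` classes and the relationship type names are hypothetical and not taken from any particular graph database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    label: str


@dataclass(frozen=True)
class Edge:
    start: Node  # the edge points away from this node
    end: Node    # and towards this one, which gives it a direction
    type: str    # e.g. "PARENT_OF", "OWNS", "DRIVES"


alice = Node("n1", "Person")
bob = Node("n2", "Person")
car = Node("n3", "Vehicle")

# A node can take part in any number and kind of relationships.
edges = [
    Edge(alice, bob, "PARENT_OF"),
    Edge(alice, car, "OWNS"),
    Edge(bob, car, "DRIVES"),
]

outgoing = [e.type for e in edges if e.start == alice]
print(outgoing)  # ['PARENT_OF', 'OWNS']
```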

A graph in a graph database can be traversed along specific edge types or across the entire graph. Traversing relationships is very fast because the relationships between nodes are not computed with joins at query time but are persisted in the database.
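The idea of persisted relationships can be sketched with an adjacency structure that is filled when edges are written, so a traversal only follows stored links. This is an illustrative breadth-first traversal under that assumption; the function names `add_edge` and `traverse` and the sample data are hypothetical.

```python
from collections import defaultdict, deque

# adjacency[node][edge_type] -> neighbours, stored when the edge is written
# rather than derived by a join at query time.
adjacency = defaultdict(lambda: defaultdict(list))


def add_edge(start, edge_type, end):
    adjacency[start][edge_type].append(end)


def traverse(start, edge_types=None):
    """Breadth-first traversal; edge_types=None follows every edge type."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        order.append(node)
        for edge_type, neighbours in adjacency.get(node, {}).items():
            if edge_types is not None and edge_type not in edge_types:
                continue
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return order


add_edge("alice", "KNOWS", "bob")
add_edge("bob", "KNOWS", "carol")
add_edge("alice", "OWNS", "car")
add_edge("carol", "OWNS", "bike")

print(traverse("alice", {"KNOWS"}))  # ['alice', 'bob', 'carol']
print(traverse("alice"))             # ['alice', 'bob', 'car', 'carol', 'bike']
```

The first call follows only `KNOWS` edges, the second walks the entire reachable graph.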

These databases have advantages for use cases such as social networking, recommendation engines, and any system where you need to create relationships between data and quickly query these relationships.

When to apply?

Advantages:

  • Strong in relationships and traversals

  • Flexible model

  • Efficient for network data (social graphs, routes, knowledge networks)

Disadvantages:

  • Not suitable for large, flat datasets

  • Steep learning curve

  • Less support in traditional BI tools

Use cases:

  • Social networks (who knows whom)

  • Recommendation systems

  • Fraud detection (transaction connections)

Search-engine databases

A NoSQL search-engine database is a nonrelational database dedicated to searching data content. It builds indexes over the data so that items with shared characteristics, such as documents containing the same terms, can be found quickly. Search-engine databases are optimized for data that may be structured, semi-structured, or unstructured. They typically offer specialized features such as full-text search, complex search expressions, and ranking of search results.
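How an index enables full-text search and ranking can be illustrated with a toy inverted index. This is a sketch under simplifying assumptions: the `InvertedIndex` class is hypothetical, tokenization is a plain regular expression, and results are ranked by raw term frequency, whereas real search engines use more elaborate analysis and scoring.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        # term -> {doc_id: number of occurrences in that document}
        self._postings = defaultdict(Counter)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self._postings[term][doc_id] += 1

    def search(self, query):
        """Return ids of documents containing every query term, best first."""
        terms = tokenize(query)
        if not terms:
            return []
        scores = Counter()
        matching = None
        for term in terms:
            docs = self._postings.get(term, {})
            matching = set(docs) if matching is None else matching & set(docs)
            for doc_id, freq in docs.items():
                scores[doc_id] += freq
        # Highest summed term frequency first; ties broken by document id.
        return sorted(matching, key=lambda d: (-scores[d], d))


index = InvertedIndex()
index.add("doc1", "Pump vibration alarm on line 3")
index.add("doc2", "Vibration sensor replaced, vibration back to normal")
index.add("doc3", "Scheduled maintenance for the conveyor")

print(index.search("vibration"))        # ['doc2', 'doc1']
print(index.search("vibration alarm"))  # ['doc1']
print(index.search("robot"))            # []
```

A query never scans the document texts themselves; it only looks up the terms in the index, which is why search stays fast as the amount of content grows.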

They are often used in combination with other databases to provide fast search capabilities in stored content.

When to apply?

Advantages:

  • Full-text search, fuzzy matching, auto-completion

  • Very fast search results

  • Scalable and distributed

  • Aggregations for analytics

Disadvantages:

  • No ACID transactions

  • Complex management (index management)

  • Data must be indexed (duplication with source database)

Use cases:

  • Search functionality on websites, applications, or text-related objects

  • Log analysis (e.g., via ELK stack)

  • Product catalogs with search filters