How and when to index data in Cassandra for fast and efficient retrieval? – A simple explanation

Cassandra is a multi-node, peer-to-peer distributed system that stores data across all nodes in the cluster. Every table in Cassandra is physically stored in multiple SSTable files spread across one or more nodes. Rows are distributed around the cluster based on a hash of the partition key, which is the first part of the primary key. …
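
To make the partition-key hashing concrete, here is a minimal sketch of how a row might be mapped to a node on a token ring. It is an illustration under stated assumptions, not Cassandra's actual implementation: MD5 stands in for the partitioner's hash function, each node is given a single hand-picked token, and the names `token_for`, `TokenRing`, and the node labels are hypothetical.

```python
import bisect
import hashlib
from typing import Dict


def token_for(partition_key: str) -> int:
    """Hash a partition key to a 64-bit token (MD5 is a stand-in hash here)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class TokenRing:
    """Toy token ring: each node owns the range ending at its own token."""

    def __init__(self, node_tokens: Dict[str, int]):
        # Keep (token, node) pairs sorted by token so we can binary-search.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def owner(self, partition_key: str) -> str:
        tok = token_for(partition_key)
        # First node whose token is >= the key's token owns the row.
        idx = bisect.bisect_left(self._tokens, tok)
        if idx == len(self._tokens):
            idx = 0  # wrap around to the start of the ring
        return self._ring[idx][1]


# Hypothetical four-node cluster with evenly spaced tokens.
ring = TokenRing({
    "node-a": 2**62,
    "node-b": 2**63,
    "node-c": 3 * 2**62,
    "node-d": 2**64 - 1,
})

for key in ("employee:1001", "employee:1002", "employee:1003"):
    print(key, "->", ring.owner(key))
```

Because placement depends only on the hash of the partition key, a query that supplies the full partition key can be routed straight to the owning node, while a query on any other column has no such shortcut.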

Migrating SQL applications to Cassandra – Pattern #3

Pattern #3: Get rid of all NOT operators from SQL WHERE clauses. For example, consider a typical SQL query to find all employees who have NOT completed a mandatory information security training: select ee1.* from employee ee1 where NOT exists ( select 'true' from employee_training et1 where et1.employeeID = ee1.employeeID and …

Migrating SQL applications to Cassandra – Pattern #2

Pattern #2: Get rid of all EXISTS and IN from SQL WHERE clauses. For example, consider a typical SQL query to find all employees who have completed a mandatory information security training: select ee1.* from employee ee1 where exists ( select 'true' from employee_training et1 where et1.employeeID = ee1.employeeID and …

Ohioedge 2.0 Architecture

Ohioedge 2.0 is a linearly scalable architecture with no single point of failure (SPOF), built using Cassandra, ZooKeeper and Kafka. In addition, the Ohioedge document component uses the pithos.io service to save and retrieve documents from the underlying Cassandra data store. At the core of Ohioedge is a schema-driven request-to-response transformation engine, Ohioedge Builder. Out of the box, it enables CRUD and relationship …