Migrating SQL applications to Cassandra – Strategy

Today’s typical Java application consists of SQL statements defined either using Data Access Object pattern or Java Persistence Query Language (JPQL). These SQLs can be categorized into three categories:

  • Simple SQL: These are one table SQLs like select * from Employee where employeeID=1
  • Simple multi-table SQL: These are simple SQLs but involve more than one table, for example select e1.* from Employee e1 where exists (select 'true' from Course c1 where c1.taker = e1.employeeID and c1.name = 'Cassandra basics')
  • Complex SQL: These are multi-table, multi-join SQLs with complex logic of EXISTS, NOT EXISTS, etc. For example, select e1.* from Employee e1 where exists (select 'true' from Course c1 where c1.taker = e1.employeeID and c1.name = 'Cassandra basics') and NOT EXISTS (select 'true' Grade g1 where g1.courseID = c1.courseID and g1.grade = 'Failed'))

Strategy for migrating simple SQL is straight-forward – use Cassandra Java driver to create/retrieve data. This has minimal impact on the application.

Simple multi-table SQL and complex SQL can be handled in two ways – option 1) middle-tier graph construction or option 2) use of Cassandra jdbc driver such as Apache Presto or Apache Drill. Although these two may seem different, architecturally they are the same – Apache Presto or Apache Drill would essentially be doing the same thing – building a graph of objects on cluster nodes and converting it to SQL compliant result set. Using JDBC has minimal impact on the application.

There are two reasons why we may want to use the first option of executing multiple queries and composing your own object graph : 1) performance: with your own graph construction you can achieve significantly higher performance which may be the pressing need and 2) avoid duplicate retrievals: you may need the data, immediately or in sub-sequent calls, that is actually pulled by the jdbc driver internally (to prepare the resultset) but is not returned (for architectural reasons) as part of the resultset.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s