Cassandra: can I index and query along a third dimension?
I want to put a third-dimension criterion on queries in Cassandra. It already allows efficient 2-D queries because it is not just a key-value store but a key-key-value store. That is:
Simple key-value store:
Key-key-value store:
So the attraction of Cassandra is that, given a value of keyA, I can perform efficient range queries along keyB, because the values are stored contiguously.
Now, is it possible, given keyA and keyB, to have an index along a third dimension, keyC, so that I can limit the values returned based on keyC?
So essentially:
Basically, given a keyA (keyA-1) and a range of keyB (keyB-2 through keyB-4), I want to return only the values corresponding to keyC-3, shown in green above.
I know this is possible, because even a simple key-value store can have multiple indices. The question is whether it is efficient. Can I still perform fast range queries along keyB?
My use case is time series: I want to store minute-resolution and daily-resolution data for the same series. keyA is the series I want, keyB is the day, and keyC is the minute. I want this because storing only by minute would mean that, if I needed daily data, I would be pulling far more data out and over the network (24*60 = 1440 minutes per day, and I want just one of them), using more memory, and doing lots of client-side aggregation.
I know I could store minute and daily data in separate tables, but that would limit flexibility somewhat, not to mention the cleanliness of the schema.
If this is not easy or efficient in Cassandra, is it possible in Riak TS?
"Basically, given a keyA (keyA-1) and a range of keyB (keyB-2 through keyB-4), I want to return only the values corresponding to keyC-3, shown in green above."
Yes, this is possible with the following table structure:
    CREATE TABLE data (
        keya text,
        keyc text,
        keyb int,
        val double,
        PRIMARY KEY ((keya), keyc, keyb)
    );

    SELECT * FROM data WHERE keya = 'xxx' AND keyc = 'yyy' AND keyb >= aaa AND keyb <= bbb;
The table abstraction can be seen as:
Map<keyA, SortedMap<keyC, SortedMap<keyB, val>>>
"keyA is the series I want, keyB is the day, and keyC is the minute"
So essentially, with the above table, you can answer the query "give me the values for series S (keyA), minute M (keyC), and day (keyB) between X and Y" very efficiently, because it results in a sequential scan over contiguous rows.
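To make the nested sorted-map picture concrete, here is a minimal in-memory sketch in Python. It is only a toy model of the layout, not how Cassandra is implemented; the `Partition` class, its method names, and the dummy values are hypothetical. Rows inside one keyA partition are kept sorted by (keyC, keyB), so fixing keyC and bounding keyB selects one contiguous slice.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class Partition:
    """Toy model of one keyA partition, rows sorted by (keyc, keyb)."""

    def __init__(self):
        self.rows = []  # sorted list of (keyc, keyb, val) tuples

    def insert(self, keyc, keyb, val):
        # insort keeps the list ordered, mimicking clustering order.
        insort(self.rows, (keyc, keyb, val))

    def range_scan(self, keyc, keyb_lo, keyb_hi):
        """Return rows with the given keyc and keyb_lo <= keyb <= keyb_hi."""
        # A 2-tuple sorts before any 3-tuple that shares its prefix,
        # so these two lookups bracket exactly the wanted rows.
        lo = bisect_left(self.rows, (keyc, keyb_lo))
        hi = bisect_left(self.rows, (keyc, keyb_hi + 1))  # keyb is an int
        return self.rows[lo:hi]  # one contiguous slice


# Map<keyA, SortedMap<keyC, SortedMap<keyB, val>>>
table = defaultdict(Partition)
for keyb in range(1, 6):
    for keyc in ("keyC-1", "keyC-2", "keyC-3"):
        table["keyA-1"].insert(keyc, keyb, float(keyb))  # dummy values

result = table["keyA-1"].range_scan("keyC-3", 2, 4)
print(result)
```

The slice contains only the keyC-3 rows for keyB 2 through 4, which is the selection described in the question. Note that the ordering matters: with keyC placed before keyB, an equality on keyC plus a range on keyB stays contiguous.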
The problem is that the partition, being based only on the series ID (keyA), can grow arbitrarily large.
One solution is to split it by year, e.g. by using a composite partition key: PRIMARY KEY ((keya, year), keyc, keyb).
This imposes a constraint on your queries: you must provide both the series ID and the year every time.
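One consequence of the year split is that a day range crossing a year boundary has to be issued as one query per (series, year) partition. A rough client-side sketch of that fan-out follows. It assumes, purely for illustration, that keyB is stored as an integer of the form YYYYMMDD; the function name `split_by_year` is hypothetical.

```python
def split_by_year(series_id, day_lo, day_hi):
    """Split an inclusive YYYYMMDD day range into per-year sub-ranges.

    Returns a list of ((series_id, year), lo, hi) tuples, one for each
    (keya, year) partition the range touches.
    """
    if day_lo > day_hi:
        return []
    queries = []
    for year in range(day_lo // 10000, day_hi // 10000 + 1):
        lo = max(day_lo, year * 10000 + 101)   # clamp to Jan 1 of this year
        hi = min(day_hi, year * 10000 + 1231)  # clamp to Dec 31 of this year
        queries.append(((series_id, year), lo, hi))
    return queries


plan = split_by_year("s1", 20151215, 20170110)
for partition_key, lo, hi in plan:
    print(partition_key, lo, hi)
```

Each tuple would map to one SELECT with keya and year fixed and keyb bounded by lo and hi; the client then concatenates the results in year order.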