We didn't rely on it, or on the ordering of messages received from Kafka. Timestamps and transaction IDs were generated by the client app / Kafka publisher and were part of the message put onto the Kafka topic. When one of the parallel Kafka consumers consumes that message and saves a row into the HBase table, that original timestamp + transaction ID becomes part of the rowkey string, the other parts being the attributes we wanted to index (secondary indexes are supported in HBase/Phoenix but we didn't use them much; essentially the composite rowkey is the index). Then when querying, HBase works as a parallel scanning machine and can do a time-range scan + filtering + aggregation very fast.
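As a rough sketch of the idea (not our actual code: the table name, column family, key layout, and attribute names here are all made up for illustration), a consumer could build the composite rowkey from an indexed attribute plus the client-generated timestamp and transaction ID, and a query then becomes a start/stop-key scan with the plain HBase client API:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CompositeRowKeyExample {

    // Hypothetical rowkey layout: indexed attribute first, then the
    // client-generated event timestamp and transaction id from the Kafka message.
    static byte[] rowKey(String accountId, long eventTimeMillis, String txnId) {
        return Bytes.toBytes(accountId + "|" + eventTimeMillis + "|" + txnId);
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("events"))) {

            // Write path: the Kafka consumer stores the message under the composite key.
            Put put = new Put(rowKey("acct-42", 1700000000000L, "txn-0001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"), Bytes.toBytes("19.99"));
            table.put(put);

            // Read path: a time-range query for one account becomes a
            // start/stop-key scan over the same composite layout.
            Scan scan = new Scan()
                    .withStartRow(Bytes.toBytes("acct-42|" + 1700000000000L))
                    .withStopRow(Bytes.toBytes("acct-42|" + 1700003600000L));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }
}
```

(Millisecond timestamps stay fixed-width for the foreseeable future, so the string encoding sorts correctly; with variable-width values you'd want zero-padding or a binary encoding.)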
On a separate note, we didn't use joins even though they are supported in Phoenix; the data was completely denormalized into one big table.
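Through Phoenix the same pattern looks roughly like the sketch below (again illustrative only: the ZooKeeper host, table, and columns are assumptions). One wide denormalized table whose composite PRIMARY KEY maps to the HBase rowkey, and a time-range + filter + aggregation query instead of joins:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixDenormalizedExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical ZooKeeper quorum; Phoenix is reached over JDBC.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")) {

            // One wide, denormalized table: the composite PRIMARY KEY becomes
            // the HBase rowkey, so the key itself acts as the index.
            try (Statement stmt = conn.createStatement()) {
                stmt.execute(
                    "CREATE TABLE IF NOT EXISTS events (" +
                    "  account_id VARCHAR NOT NULL," +
                    "  event_time TIMESTAMP NOT NULL," +   // client-generated timestamp
                    "  txn_id     VARCHAR NOT NULL," +     // client-generated transaction id
                    "  merchant   VARCHAR," +
                    "  amount     DECIMAL," +
                    "  CONSTRAINT pk PRIMARY KEY (account_id, event_time, txn_id))");
            }

            // Time-range scan + filter + aggregation, executed as parallel
            // scans on the region servers rather than as a join.
            String sql = "SELECT merchant, SUM(amount) FROM events " +
                         "WHERE account_id = ? AND event_time BETWEEN ? AND ? " +
                         "GROUP BY merchant";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, "acct-42");
                ps.setTimestamp(2, java.sql.Timestamp.valueOf("2024-01-01 00:00:00"));
                ps.setTimestamp(3, java.sql.Timestamp.valueOf("2024-01-02 00:00:00"));
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + " -> " + rs.getBigDecimal(2));
                    }
                }
            }
        }
    }
}
```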