Hacker News

This is a good tutorial.

The problem of out-of-order data becomes more challenging as ingest throughput requirements increase if your storage representation must guarantee a strict total order. In high-throughput designs this is often handled by relaxing the strict total ordering requirement for how the data is represented in storage. As long as the time-series has an approximate total order at ingest time, there are many techniques for inexpensively reconstructing a strict total order at query time.
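One minimal way to realize this (my sketch, not the article's): append datapoints in arrival order and re-establish the total order with a sort at query time. When ingest order is approximately sorted, Python's Timsort exploits the pre-existing runs, so the query-time sort stays cheap.

```python
# Sketch: no ordering guarantee in storage, strict total order at query time.
points = []

def ingest(ts, value):
    points.append((ts, value))  # stored in arrival order

def query():
    return sorted(points)  # reconstruct the strict total order lazily

for ts, v in [(1, 1.0), (2, 2.0), (1.5, 1.5), (3, 3.0)]:
    ingest(ts, v)
assert [ts for ts, _ in query()] == [1, 1.5, 2, 3]
```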



Right, exactly. As a point of reference, within M3DB each unique time series has a list of “in-order” compressed timestamp/float64 tuple streams. When a datapoint is written, the series finds an encoder it can append to while keeping the stream in order (timestamp ascending); if no such stream exists, a new stream is created and becomes writeable for any datapoints that arrive with timestamps greater than the last written point.
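A hypothetical sketch of that write path (the names are mine, not M3DB's, and lists stand in for compressed encoders): each series keeps a list of ascending streams, a write goes to the first stream it can extend, and otherwise opens a new sibling stream.

```python
class Series:
    def __init__(self):
        self.streams = []  # each stream: list of (timestamp, value), ascending

    def write(self, ts, value):
        # Find a stream whose last timestamp is less than the new one.
        for stream in self.streams:
            if not stream or stream[-1][0] < ts:
                stream.append((ts, value))
                return
        # Out of order w.r.t. every existing stream: open a sibling stream.
        self.streams.append([(ts, value)])

s = Series()
for ts, v in [(10, 1.0), (20, 2.0), (15, 1.5), (30, 3.0)]:
    s.write(ts, v)
assert len(s.streams) == 2  # the late (15, 1.5) forced a sibling stream
```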

At query time these streams are read by peeking at the next timestamp of every written stream for a block of time, then repeatedly taking the datapoint with the lowest timestamp across the streams.
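That read path is a k-way merge over sorted inputs, which `heapq.merge` in Python's standard library implements directly (a hedged sketch, using plain lists for the streams):

```python
import heapq

streams = [
    [(10, 1.0), (20, 2.0), (30, 3.0)],  # original in-order stream
    [(15, 1.5), (25, 2.5)],             # sibling stream from out-of-order writes
]
# Repeatedly emit the datapoint with the lowest next timestamp.
merged = list(heapq.merge(*streams))
assert [ts for ts, _ in merged] == [10, 15, 20, 25, 30]
```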

M3DB also runs a background tick, targeted to complete each run within a few minutes, to amortize CPU. During this process each series merges any sibling streams created by out-of-order writes into a single in-order stream. This uses the same process as query-time in-order reads: the datapoints are read in order and written out to a new single compressed stream. This way the extra computation caused by out-of-order writes is amortized, and only if a large percentage of series are written in time-descending order do you end up with significant overhead at write and read time. It also reduces the cost of persisting the current mutable data to a volume on disk (whether for a snapshot or for persisting data for a completed time window).
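An illustrative-only compaction pass, assuming the stream layout sketched above: any series holding sibling streams gets them k-way merged back into one in-order stream, so later reads and flushes see a single stream.

```python
import heapq

def tick(series_streams):
    """series_streams: dict of series id -> list of ascending streams.
    Compact every series that accumulated sibling streams."""
    for sid, streams in series_streams.items():
        if len(streams) > 1:
            series_streams[sid] = [list(heapq.merge(*streams))]

db = {"cpu.user": [[(10, 1.0), (30, 3.0)], [(20, 2.0)]]}
tick(db)
assert db["cpu.user"] == [[(10, 1.0), (20, 2.0), (30, 3.0)]]
```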


For Interana I think we end up doing this by batching writes and sorts, and not really having a strict guarantee on when imported data actually shows up in query results.
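A plausible batch-and-sort sketch of that idea (not Interana's actual code): buffer incoming writes, sort each batch when it is flushed, and only make data queryable after its batch flushes.

```python
buffer, segments = [], []

def write(ts, value):
    buffer.append((ts, value))  # not yet visible to queries

def flush():
    global buffer
    if buffer:
        segments.append(sorted(buffer))  # each flushed segment is in order
        buffer = []

write(20, 2.0)
write(10, 1.0)
flush()  # imported data becomes queryable here
assert segments == [[(10, 1.0), (20, 2.0)]]
```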


Whatever you’re doing, it works great :) We used Interana at a previous company I worked at, and the combo of query flexibility and performance was excellent, really liked the product!


TY! Twitter has since acquired the Interana engineering team & IP, so we’re now doing the same thing at Twitter.


Ah interesting, like as an internal tool there?


Yes.



