I recently did some research on data historians, which are effectively just time series databases. "Time series database" is the name for a genre of databases optimized for storing large quantities of data as it is captured over time: think of sensors taking readings every second, or even many times per second. A data historian is typically used in facilities like process plants to store this large quantity of data in a reliable yet efficient way. Because there are so many values, the storage has to be optimized to keep the database size from exploding. This is accomplished in a number of ways, often in combination. Since long series of readings will often have very similar or identical values, the server can take advantage of these small deltas to store less data. Plain old compression is often used in conjunction with these other methods.
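To make the delta idea a bit more concrete, here is a minimal sketch in Python. It is not how any particular historian product works; the function names, the made-up sensor readings, and the choice of zlib as the "plain old compression" are all my own assumptions for illustration. It stores the first reading plus the differences between consecutive readings, then compresses the result.

```python
import struct
import zlib


def delta_encode(values):
    """Return the first value followed by the difference between each pair of neighbours."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original readings by keeping a running sum of the deltas."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values


def pack(ints):
    """Pack a list of integers as little-endian 32-bit signed values."""
    return struct.pack(f"<{len(ints)}i", *ints)


# Hypothetical data: one reading per second from a sensor whose value
# (stored as an integer, e.g. hundredths of a degree) drifts very slowly.
readings = [2150 + (i // 50) for i in range(1000)]

raw = pack(readings)
deltas = delta_encode(readings)

compressed_raw = zlib.compress(raw)
compressed_deltas = zlib.compress(pack(deltas))

print("raw bytes:              ", len(raw))
print("compressed raw bytes:   ", len(compressed_raw))
print("compressed delta bytes: ", len(compressed_deltas))
```

Because most of the deltas are zero, the encoded series is extremely repetitive, which is exactly the kind of input a general-purpose compressor handles well. Real historians likely use more specialized encodings and also have to deal with timestamps and floating point values, which this sketch ignores.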
Since a data historian has historically been a very expensive, specialized asset, I was curious whether the open source world had tried to tackle this one. I found a couple of TSDBs out there, but the most mature one I found was at http://opentsdb.net, which appears to be used by the StumbleUpon website. Interesting stuff.