SLM-Tree（Log Structured Merge Tree 日志结构合并树），是 Apache HBase 和 Apache Cassandra 使用的数据结构
对于 Cassandra 来说，数据写入步骤如下：
- 记录数据到 commit log
- 写入数据到 memtable
- 将 memtable 中的数据刷到 SSTable
- SSTable 合并
Commit log 故障恢复
当向 Cassandra 写入数据时，Cassandra 会先将数据追加到硬盘的 commit log 中。这样，当 Cassandra 从故障中恢复，通过重播 commit log，即可恢复之前 memtable 的数据。
The leveled compaction strategy creates SSTables of a fixed, relatively small size (160 MB by default) that are grouped into levels. Within each level, SSTables are guaranteed to be non-overlapping. Each level (L0, L1, L2 and so on) is 10 times as large as the previous. Disk I/O is more uniform and predictable on higher than on lower levels as SSTables are continuously being compacted into progressively larger levels. At each level, row keys are merged into non-overlapping SSTables in the next level. This process can improve performance for reads, because Cassandra can determine which SSTables in each level to check for the existence of row key data. This compaction strategy is modeled after Google's LevelDB implementation.
The default compaction strategy. This strategy triggers a minor compaction when there are a number of similar sized SSTables on disk as configured by the table subproperty, min_threshold. A minor compaction does not involve all the tables in a keyspace.
This strategy is an alternative for time series data. TWCS compacts SSTables using a series of time windows. While with a time window, TWCS compacts all SSTables flushed from memory into larger SSTables using STCS. At the end of the time window, all of these SSTables are compacted into a single SSTable. Then the next time window starts and the process repeats. The duration of the time window is the only setting required.
- 《Cassandra: The Definitive Guide, 2nd Edition》, O'Reilly Media, Eben Hewitt and Jeff Carpenter
- 《Designing Data-Intensive Applications》, O'Reilly Media, Martin Kleppmann
- Configuring memtable thresholds | Apache Cassandra 3.0
- Configuring compaction | Apache Cassandra 3.0