What types of tombstones does Cassandra support?

Time: 2015-01-05 09:12:30

Tags: cassandra cassandra-2.0 tombstone

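What types of tombstones does Cassandra (version 2) support? According to this article, it supports (in CQL terms):

  • A specific column of a row.
  • A static column.
  • All rows of a partition key.

Have I missed any other types of tombstones? What about deleting a specific (CQL) row? Is there any special tombstone support for deleting by clustering key or a similar range? This information would help with planning a schema that avoids excessive tombstones.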

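2 Answers:

Answer 0: (score: 7)

A tombstone is a marker placed in a row to indicate a deletion. Tombstones can exist at different levels: on a column, on a range of columns, or on an entire row. The example below shows the ordinary tombstone types (range tombstones are not covered here).

When planning your schema, model your tables around the queries you will run rather than around a single table; you may well end up with the same data duplicated across several tables. Those tables are then optimized to serve the incoming reads and writes. The link below should give you some good background on Cassandra data modeling: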

http://www.datastax.com/resources/data-modeling

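My example: I created a table, inserted some data, and then used nodetool flush to generate some sstables. With the sstable2json tool you can see the deleted rows. A whole-row delete looks slightly different from a single-column delete, but essentially it is still just a marker:

The table with all the data: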

$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-1-Data.db 
[
{"key": "3136","columns": [["","",1417814256390000], ["col2","26",1417814256390000], ["col3","36",1417814256390000], ["id","id16",1417814256390000]]},
{"key": "3133","columns": [["","",1417814218246000], ["col2","23",1417814218246000], ["col3","33",1417814218246000], ["id","id13",1417814218246000]]},
{"key": "3135","columns": [["","",1417814244766000], ["col2","25",1417814244766000], ["col3","35",1417814244766000], ["id","id15",1417814244766000]]},
{"key": "3134","columns": [["","",1417814230711000], ["col2","24",1417814230711000], ["col3","34",1417814230711000], ["id","id14",1417814230711000]]},
{"key": "3132","columns": [["","",1417814207910000], ["col2","22",1417814207910000], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000], ["id","id11",1417814197094000]]},
{"key": "31","columns": [["","",1417814185270000], ["col2","2",1417814185270000], ["col3","3",1417814185270000], ["id","id1",1417814185270000]]}
]
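
The "key" values in the dump above appear to be the hex-encoded bytes of the partition key (col1), so 3131 would correspond to '11'. A minimal Python sketch of that reading, assuming the keys are UTF-8 text; decode_key is a hypothetical helper, not part of any Cassandra tool:

```python
def decode_key(hex_key: str) -> str:
    """Decode a hex-encoded partition key, assuming it holds UTF-8 text."""
    return bytes.fromhex(hex_key).decode("utf-8")


# Keys copied from the sstable2json output above.
for hex_key in ["3136", "3133", "3135", "3134", "3132", "3131", "31"]:
    print(hex_key, "->", decode_key(hex_key))
```

This is why the deletes that follow, which filter on col1 = '1', '11' and '12', show up under the keys 31, 3131 and 3132.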

Here are the first deletes in cqlsh:

cqlsh:results> delete from ts1 WHERE col1 = '1';
cqlsh:results> delete id from ts1 WHERE col1 = '11';

The resulting sstable after flushing:

[datastax@DSE3 ~]$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-2-Data.db 
[
{"key": "3131","columns": [["id","54822130",1417814320400000,"d"]]},
{"key": "31","metadata": {"deletionInfo": {"markedForDeleteAt":1417814302304000,"localDeletionTime":1417814302}},"columns": []}
]

Here is the next delete in cqlsh:

cqlsh:results> delete col2 from ts1 WHERE col1 = '12';

The resulting sstable after flushing:

[datastax@DSE3 ~]$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-3-Data.db 
[
{"key": "3132","columns": [["col2","5482220b",1417814539434000,"d"]]}
]
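
In the "d" entries shown so far, the value field (54822130, 5482220b) looks like the local deletion time in seconds written as hex, which lines up with the microsecond write timestamp next to it. A small Python sketch to check that reading; decode_local_deletion_time is a hypothetical helper and the interpretation is an assumption based on these dumps:

```python
def decode_local_deletion_time(hex_value: str) -> int:
    """Interpret a tombstone's hex value as seconds since the epoch (assumed)."""
    return int(hex_value, 16)


# (column name, hex value, timestamp in microseconds) copied from the dumps above.
tombstones = [
    ("id", "54822130", 1417814320400000),
    ("col2", "5482220b", 1417814539434000),
]

for name, hex_value, ts_micros in tombstones:
    seconds = decode_local_deletion_time(hex_value)
    # Compare against the write timestamp truncated to whole seconds.
    print(name, seconds, seconds == ts_micros // 1_000_000)
```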

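When compaction occurs, all of these sstables are combined into a single sstable. The deleted data is still present but marked for deletion, as we can see again after running a compaction (look for the d marker next to the timestamp):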

[datastax@DSE3 ~]$ ./dse-4.5.1/bin/nodetool compact
[datastax@DSE3 ~]$ ~/dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-4-Data.db 
[
{"key": "3136","columns": [["","",1417814256390000], ["col2","26",1417814256390000], ["col3","36",1417814256390000], ["id","id16",1417814256390000]]},
{"key": "3133","columns": [["","",1417814218246000], ["col2","23",1417814218246000], ["col3","33",1417814218246000], ["id","id13",1417814218246000]]},
{"key": "3135","columns": [["","",1417814244766000], ["col2","25",1417814244766000], ["col3","35",1417814244766000], ["id","id15",1417814244766000]]},
{"key": "3134","columns": [["","",1417814230711000], ["col2","24",1417814230711000], ["col3","34",1417814230711000], ["id","id14",1417814230711000]]},
{"key": "3132","columns": [["","",1417814207910000], ["col2","5482220b",1417814539434000,"d"], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000], ["id","54822130",1417814320400000,"d"]]},
{"key": "31","metadata": {"deletionInfo": {"markedForDeleteAt":1417814302304000,"localDeletionTime":1417814302}},"columns": []}
]
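
To spot the markers in a dump like this without reading it by eye, the JSON can be scanned for row-level deletionInfo metadata and for columns carrying a "d" or "t" flag. A minimal Python sketch, assuming the sstable2json layout shown above; find_tombstones is a hypothetical helper and SAMPLE is a subset of the rows from the compacted sstable:

```python
import json

# Three rows copied from the compacted sstable dump above.
SAMPLE = """[
{"key": "3132","columns": [["","",1417814207910000], ["col2","5482220b",1417814539434000,"d"], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000], ["id","54822130",1417814320400000,"d"]]},
{"key": "31","metadata": {"deletionInfo": {"markedForDeleteAt":1417814302304000,"localDeletionTime":1417814302}},"columns": []}
]"""


def find_tombstones(rows):
    """Return (key, kind, column) tuples for every tombstone found in the rows."""
    found = []
    for row in rows:
        # A whole-row delete is recorded in the row's metadata.
        if "deletionInfo" in row.get("metadata", {}):
            found.append((row["key"], "row", None))
        for col in row.get("columns", []):
            # Column entries with a 4th element of "d" or "t" are tombstones.
            if len(col) > 3 and col[3] in ("d", "t"):
                kind = "column" if col[3] == "d" else "range"
                found.append((row["key"], kind, col[0]))
    return found


for tombstone in find_tombstones(json.loads(SAMPLE)):
    print(tombstone)
```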

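The table will now stay like this until gc_grace_seconds has elapsed; on the next compaction after that, the deleted data actually disappears. Watch what happens when we lower gc_grace_seconds and then run a compaction: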

cqlsh> ALTER TABLE results.ts1 WITH gc_grace_seconds=500;
cqlsh> exit
[datastax@DSE3 ~]$ ./dse-4.5.1/bin/nodetool compact results;

[datastax@DSE3 ~]$ ./dse-4.5.1/resources/cassandra/bin/sstable2json ./dse-data/results/ts1/results-ts1-jb-5-Data.db 
[
{"key": "3136","columns": [["","",1417814256390000], ["col2","26",1417814256390000], ["col3","36",1417814256390000], ["id","id16",1417814256390000]]},
{"key": "3133","columns": [["","",1417814218246000], ["col2","23",1417814218246000], ["col3","33",1417814218246000], ["id","id13",1417814218246000]]},
{"key": "3135","columns": [["","",1417814244766000], ["col2","25",1417814244766000], ["col3","35",1417814244766000], ["id","id15",1417814244766000]]},
{"key": "3134","columns": [["","",1417814230711000], ["col2","24",1417814230711000], ["col3","34",1417814230711000], ["id","id14",1417814230711000]]},
{"key": "3132","columns": [["","",1417814207910000], ["col3","32",1417814207910000], ["id","id12",1417814207910000]]},
{"key": "3131","columns": [["","",1417814197094000], ["col2","21",1417814197094000], ["col3","31",1417814197094000]]}
]

Notice how the row with key 31 is gone, as are col2 in the row with key 3132 and id in the row with key 3131.

For clarity, my table schema:
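
The timing rule can be sketched as a simple comparison: a tombstone becomes old enough to drop once its local deletion time is earlier than "now minus gc_grace_seconds". The Python below is a simplified illustration only; is_purgeable and the "now" values are hypothetical, and a real compaction also has to consider whether other sstables may still hold data that the tombstone shadows:

```python
def is_purgeable(local_deletion_time: int, gc_grace_seconds: int, now: int) -> bool:
    """Simplified check: the tombstone predates the gc grace cutoff."""
    return local_deletion_time < now - gc_grace_seconds


# localDeletionTime of the row tombstone for key 31, from the dump above.
ROW_TOMBSTONE_TIME = 1417814302

# Hypothetical clock: ten minutes after the delete.
now = ROW_TOMBSTONE_TIME + 600

# With the table default of 864000 seconds (10 days) the tombstone must be kept.
print(is_purgeable(ROW_TOMBSTONE_TIME, 864000, now))

# After lowering gc_grace_seconds to 500 it is old enough to be dropped.
print(is_purgeable(ROW_TOMBSTONE_TIME, 500, now))
```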

cqlsh:results> DESCRIBE TABLE ts1 ;

CREATE TABLE ts1 (
  col1 text,
  col2 text,
  col3 text,
  id text,
  PRIMARY KEY ((col1))
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

As a footnote, the tombstone markers in the sstable2json output are as follows:

e - expired TTL

d - deleted value (tombstone)

t - deleted range of values (range tombstone)

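Answer 1: (score: 2)

Adding to @markc's answer, there is also a column-range tombstone that shows up whenever collections are used. We have a set<text> column named "tags", and every time we insert a row we get one of these (even if we only set the column to null, as in this case):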

["1381316637599609:45787829:tags:_","1381316637599609:45787829:tags:!",1438264650252000,"t",1438264650],

We think "t" stands for tombstone. This blog post details another example of this kind of tombstone.
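
The entry above can be pulled apart to make its shape easier to see. This Python sketch assumes the five fields are, in order, the range start, the range end, the deletion timestamp in microseconds, the marker, and the local deletion time in seconds; that field order is an inference from this one line, not something confirmed here:

```python
import json

# The range tombstone entry quoted above (without the trailing comma).
ENTRY = (
    '["1381316637599609:45787829:tags:_",'
    '"1381316637599609:45787829:tags:!",'
    '1438264650252000,"t",1438264650]'
)

start, end, marked_for_delete_at, marker, local_deletion_time = json.loads(ENTRY)

# The collection column name is the second-to-last component of the bound.
column = start.split(":")[-2]

print("column:", column)
print("marker:", marker)
print("range:", start, "to", end)
# The microsecond timestamp and the seconds value describe the same instant.
print("same second:", marked_for_delete_at // 1_000_000 == local_deletion_time)
```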