[rocksandra] support cassandra partition deletion #3874

Open
wants to merge 1 commit into
base: main

wpc
Contributor

@wpc wpc commented May 18, 2018

To support partition deletion in Rocksandra, we create a separate partition meta cf in each database and pass the db and cf handles into the compaction filter. The compaction filter is in charge of dropping deleted data based on the deletion info it reads from the partition meta cf. This PR is the first step and only releases the disk space. The next step will change the Cassandra merge operator to convert partition-deleted rows into tombstones.

  • add a merge operator for partition metadata (currently partition
    deletion info only)
  • read partition deletion info in the Cassandra compaction filter and
    drop rows whose partition has been deleted
  • run make format on the Cassandra-related test files
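
The filtering rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in this PR: `PartitionMeta`, `should_drop`, and `compact` are hypothetical names, the partition meta cf is modeled as an in-memory dict, the merge is assumed to keep the newest deletion, and a row is assumed to be shadowed when its write timestamp is less than or equal to the partition's `marked_for_delete_at`.

```python
from typing import Dict, List, Optional, Tuple

# (partition key, write timestamp, value); a simplified, hypothetical row shape.
Row = Tuple[bytes, int, bytes]


class PartitionMeta:
    """Hypothetical in-memory stand-in for the partition meta cf."""

    def __init__(self) -> None:
        self._marked_for_delete_at: Dict[bytes, int] = {}

    def merge(self, partition_key: bytes, marked_for_delete_at: int) -> None:
        # Assumed merge semantics: the newest partition deletion wins.
        current = self._marked_for_delete_at.get(partition_key)
        if current is None or marked_for_delete_at > current:
            self._marked_for_delete_at[partition_key] = marked_for_delete_at

    def lookup(self, partition_key: bytes) -> Optional[int]:
        return self._marked_for_delete_at.get(partition_key)


def should_drop(meta: PartitionMeta, partition_key: bytes, timestamp: int) -> bool:
    # Assumed rule: drop a row written at or before the partition deletion.
    deleted_at = meta.lookup(partition_key)
    return deleted_at is not None and timestamp <= deleted_at


def compact(meta: PartitionMeta, rows: List[Row]) -> List[Row]:
    # Keep only rows that are not shadowed by a partition deletion.
    return [row for row in rows if not should_drop(meta, row[0], row[1])]


meta = PartitionMeta()
meta.merge(b"p1", 100)
meta.merge(b"p1", 90)  # an older deletion does not override a newer one
rows = [(b"p1", 95, b"a"), (b"p1", 120, b"b"), (b"p2", 50, b"c")]
survivors = compact(meta, rows)
```

In this sketch the row written at timestamp 95 is dropped, while the row written after the deletion and the row in the untouched partition survive.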

Contributor

@facebook-github-bot facebook-github-bot left a comment

@wpc has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@wpc has updated the pull request.

* add a merge operator for partition metadata (currently partition
  deletion info only)
* read partition deletion info in the Cassandra compaction filter and
  drop rows whose partition has been deleted
@facebook-github-bot
Contributor

@wpc has updated the pull request.

@wpc
Contributor Author

wpc commented May 21, 2018

Updated the PR to make sure the iterator on the partition meta cf is deleted after use.

DikangGu pushed a commit to Instagram/cassandra that referenced this pull request Sep 2, 2018
Summary:
To support partition-level deletion, we create a partition meta cf in each RocksDB instance and store partition deletion info in it. On the RocksDB side, the compaction filter reads partition deletion info from this cf and drops data based on marked_for_delete_at. (facebook/rocksdb#3874)

Streaming for partition metadata will be in a separate diff.

Test Plan:
Functional
==========

partition dump after deletion
```
--- metadata:
0x816099270c387b3c989d7ecf13053ec1      0x5b0254a800056cb0504844ab
--- rows:
0x816099270c387b3c989d7ecf13053ec180000000238e8b0e      0x7fffffff8000000000000000000000056b954457139f00000010be10e78051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000046a34f04      0x7fffffff8000000000000000000000056b955bb808da000000109a01d60051a711e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000c1b912cd      0x7fffffff8000000000000000000000056b952ddfd27100000010e51ae98051a511e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000db4233f7      0x7fffffff8000000000000000000000056b9533f4a2a3000000101a273c0051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000f7465f43      0x7fffffff8000000000000000000000056b954625e8fa00000010d386118051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000116af6e77      0x7fffffff8000000000000000000000056b959d045dbf000000103e85178051aa11e88080808080808080
```

partition dump after full compaction finishes
```
--- metadata:
0x816099270c387b3c989d7ecf13053ec1      0x5b0254a800056cb0504844ab
--- rows:
```
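
One way to sanity-check the metadata entry in the dumps above is to decode its 12-byte value. The sketch below assumes (this is not stated in the PR) that the value is a big-endian 32-bit local deletion time in seconds followed by a 64-bit `marked_for_delete_at` in microseconds, mirroring Cassandra's deletion-time fields, and that each row key begins with the partition key stored in the metadata entry. `decode_partition_deletion` is a hypothetical helper name.

```python
import struct
from typing import Tuple

# Values copied from the dump above, without the 0x prefix.
META_KEY_HEX = "816099270c387b3c989d7ecf13053ec1"
META_VALUE_HEX = "5b0254a800056cb0504844ab"
ROW_KEY_HEX = "816099270c387b3c989d7ecf13053ec180000000238e8b0e"


def decode_partition_deletion(value_hex: str) -> Tuple[int, int]:
    """Split the value into (local_deletion_time, marked_for_delete_at).

    Assumed layout: 4-byte big-endian seconds, then 8-byte big-endian
    microseconds. This layout is an assumption made for illustration.
    """
    raw = bytes.fromhex(value_hex)
    if len(raw) != 12:
        raise ValueError("expected a 12-byte partition deletion value")
    return struct.unpack(">iq", raw)


local_deletion_time, marked_for_delete_at = decode_partition_deletion(META_VALUE_HEX)

# Assumed key layout: the row key starts with the partition key bytes.
row_in_partition = bytes.fromhex(ROW_KEY_HEX).startswith(bytes.fromhex(META_KEY_HEX))
```

Under these assumptions the two fields agree to the second (the microsecond timestamp divided by 1,000,000 equals the local deletion time), and both fall on 21 May 2018 UTC, which is consistent with when the test was run.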
Performance
==========
No obvious CPU/IO regression
https://fburl.com/ods/e9m1tdr2
https://fburl.com/ods/o5j049hb

Reviewers: svemuri, dikang, sdev, #ig-cassandra

Reviewed By: dikang

Subscribers: fdeliege, trunkagent

Differential Revision: https://phabricator.intern.facebook.com/D8063994

Signature: 8063994:1527988646:7d236751d82d4fee40e5b0ca3dd1da94d8e97e57
wpc added a commit to wpc/cassandra that referenced this pull request Jan 29, 2019
Summary:
To support partition-level deletion, we create a partition meta cf in each RocksDB instance and store partition deletion info in it. On the RocksDB side, the compaction filter reads partition deletion info from this cf and drops data based on marked_for_delete_at. (facebook/rocksdb#3874)

Streaming for partition metadata will be in a separate diff.

Test Plan:
Functional
==========

Tested in the storyarchive cluster.

partition dump after deletion
```
[23:32:32 root@priv_prn/instagram/cassandra-data-storyarchiverocks/25 /var/log/cassandra]$ nodetool dumppartition storyarchive reel_media_viewer_by_ts_perm_compact_001 1773713256096284353
--- metadata:
0x816099270c387b3c989d7ecf13053ec1      0x5b0254a800056cb0504844ab
--- rows:
0x816099270c387b3c989d7ecf13053ec180000000238e8b0e      0x7fffffff8000000000000000000000056b954457139f00000010be10e78051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000046a34f04      0x7fffffff8000000000000000000000056b955bb808da000000109a01d60051a711e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000c1b912cd      0x7fffffff8000000000000000000000056b952ddfd27100000010e51ae98051a511e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000db4233f7      0x7fffffff8000000000000000000000056b9533f4a2a3000000101a273c0051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000f7465f43      0x7fffffff8000000000000000000000056b954625e8fa00000010d386118051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000116af6e77      0x7fffffff8000000000000000000000056b959d045dbf000000103e85178051aa11e88080808080808080
```

partition dump after full compaction finishes
```
[01:19:46 root@priv_prn/instagram/cassandra-data-storyarchiverocks/25 /var/log/cassandra]$ nodetool dumppartition storyarchive reel_media_viewer_by_ts_perm_compact_001 1773713256096284353
--- metadata:
0x816099270c387b3c989d7ecf13053ec1      0x5b0254a800056cb0504844ab
--- rows:
```
Performance
==========
Tested on priv_ftw/instagram/cassandra-data-feedviewstaterocks/15 (high compaction, no deletion), deployed at 5/31 2:54pm
No obvious CPU/IO regression
https://fburl.com/ods/e9m1tdr2
https://fburl.com/ods/o5j049hb

Reviewers: svemuri, dikang, sdev, #ig-cassandra

Reviewed By: dikang

Subscribers: fdeliege, trunkagent

Differential Revision: https://phabricator.intern.facebook.com/D8063994

Signature: 8063994:1527988646:7d236751d82d4fee40e5b0ca3dd1da94d8e97e57
@siying siying self-assigned this Sep 10, 2019
@siying
Contributor

siying commented Sep 10, 2019

@wpc Do we still need it?

@jay-zhuang
Contributor

> @wpc Do we still need it?

Yes, this is still needed.

@jay-zhuang
Contributor

@siying, I rebased the code onto master and fixed the test failures in #5898. Please review that one and close this PR.

@vjnadimpalli vjnadimpalli self-assigned this Oct 10, 2019
@vjnadimpalli
Contributor

vjnadimpalli commented Oct 11, 2019

@wpc @cooldoger the change LGTM! I haven't gone into details of cassandra specific logic, hopefully someone on your team can review that. Please rebase and make sure all tests pass and I'd be happy to land it.

5 participants