
Speed up kill tasks by deleting segments in batch #14131

Conversation

@TSFenwick (Contributor) commented Apr 20, 2023

Description

Create a new kill method in DataSegmentKiller that takes a list of DataSegments to kill. This enables implementers of DataSegmentKiller to leverage any batch delete that the underlying storage system may offer. DataSegmentKiller has a default method that iterates through all segments and calls delete on each one, preserving backwards compatibility. Nothing new is needed to leverage batch delete for S3: we already use the same S3 API call for other S3 deletions, so I have full confidence in this change, and I have tested it using S3.
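Below is a minimal sketch of the interface change described above; the method shape is illustrative and may not match the merged code exactly.

import java.util.List;
import org.apache.druid.segment.loading.SegmentLoadingException;
import org.apache.druid.timeline.DataSegment;

public interface DataSegmentKiller
{
  // Existing single-segment kill.
  void kill(DataSegment segment) throws SegmentLoadingException;

  // New batch form. The default implementation preserves the old behavior by
  // deleting one segment at a time, so existing implementations keep working;
  // implementations such as S3 can override it with a native batch delete.
  default void kill(List<DataSegment> segments) throws SegmentLoadingException
  {
    for (DataSegment segment : segments) {
      kill(segment);
    }
  }
}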

Manual test case scenarios

Testing results

Incorrect Access Key

{
  "id": "api-issued_kill_testPerms2_ppcbonpl_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T20:00:16.984Z",
  "groupId": "api-issued_kill_testPerms2_ppcbonpl_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T20:00:16.984Z",
  "type": "kill",
  "createdTime": "2023-06-29T20:00:17.034Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "FAILED",
  "status": "FAILED",
  "runnerStatusCode": "WAITING",
  "duration": 7556,
  "location": {
    "host": "localhost",
    "port": 8100,
    "tlsPort": -1
  },
  "dataSource": "testPerms2",
  "errorMsg": "org.apache.druid.segment.loading.SegmentLoadingException: Couldn't delete segments from s3 see logs ..."
}

log_invalid_access_key.txt

Successful no errors - single bucket

{
  "id": "api-issued_kill_testPerms4_kghaihhc_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T19:47:29.211Z",
  "groupId": "api-issued_kill_testPerms4_kghaihhc_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T19:47:29.211Z",
  "type": "kill",
  "createdTime": "2023-06-29T19:47:29.212Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "SUCCESS",
  "status": "SUCCESS",
  "runnerStatusCode": "WAITING",
  "duration": 15724,
  "location": {
    "host": "localhost",
    "port": 8100,
    "tlsPort": -1
  },
  "dataSource": "testPerms4",
  "errorMsg": null
}

log_delete_no_issues.txt

Unable to delete indices

{
  "id": "api-issued_kill_testPerms3_kjakfgaj_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T20:52:04.801Z",
  "groupId": "api-issued_kill_testPerms3_kjakfgaj_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T20:52:04.801Z",
  "type": "kill",
  "createdTime": "2023-06-29T20:52:04.804Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "FAILED",
  "status": "FAILED",
  "runnerStatusCode": "WAITING",
  "duration": 7226,
  "location": {
    "host": "localhost",
    "port": 8100,
    "tlsPort": -1
  },
  "dataSource": "testPerms3",
  "errorMsg": "org.apache.druid.segment.loading.SegmentLoadingException: Couldn't delete segments from s3 see logs ..."
}

log_delete_no_perms_on_keys.txt

Successful delete across multiple buckets

{
  "id": "api-issued_kill_testPerms7_fmpfgjpj_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T21:46:12.768Z",
  "groupId": "api-issued_kill_testPerms7_fmpfgjpj_1000-01-01T00:00:00.000Z_2023-06-28T00:00:00.000Z_2023-06-29T21:46:12.768Z",
  "type": "kill",
  "createdTime": "2023-06-29T21:46:12.779Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "SUCCESS",
  "status": "SUCCESS",
  "runnerStatusCode": "WAITING",
  "duration": 10321,
  "location": {
    "host": "localhost",
    "port": 8100,
    "tlsPort": -1
  },
  "dataSource": "testPerms7",
  "errorMsg": null
}

log_delete_across_multiple_buckets.txt

Rough perf test results
~1025 segments deleted in 13 seconds with batch delete; the normal one-at-a-time delete took 5 minutes 40 seconds.

Release note

Deleting multiple segments stored in S3 is now much faster, since the kill task leverages S3 batch deletion.


Key changed/added classes in this PR
  • S3DataSegmentKiller.java
  • OmniDataSegmentKiller.java
  • DataSegmentKiller.java
  • KillUnusedSegmentsTask.java

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@TSFenwick changed the title from "WIP allow for batched delete of segments instead of deleting segment data…" to "s3 batch delete segments instead of 1 at a time" on Apr 26, 2023
@TSFenwick marked this pull request as ready for review on April 26, 2023 20:12
@suneet-s changed the title from "s3 batch delete segments instead of 1 at a time" to "Speed up kill tasks by deleting segments in batch" on Apr 26, 2023
@suneet-s (Contributor)

Added Design Review as this introduces a change to the DataSegmentKiller interface which is an extension point.

@TSFenwick force-pushed the feature-bulk-delete-files-when-killing-unused-segments branch from b8a8285 to 6e5040e on April 26, 2023 23:32
@suneet-s (Contributor) left a comment

+1 on the API design. I think there are some improvements that can be made to the S3 implementation - let me know what you think. I also think that this behavior should be enabled by default in KillUnusedSegmentsTask since it is the more efficient way to delete segments. Is there a reason not to enable it by default?

Please also describe the manual testing that was done with this change. Specifically, I would like to know:

  • What happens when you try to delete segment files that have already been removed from S3?
  • What happens when deleting one segment or descriptor in the batch fails for some reason?

Can you also share what the log messages look like in the failure cases? Since we are batching the requests, do we need to be more clever about listing all the keys that failed to delete in the logs?

Overall, this is a very exciting change! Thank you for implementing this!

@@ -129,8 +129,12 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception

     // Kill segments
     toolbox.getTaskActionClient().submit(new SegmentNukeAction(new HashSet<>(unusedSegments)));
-    for (DataSegment segment : unusedSegments) {
-      toolbox.getDataSegmentKiller().kill(segment);
+    if (getContextValue("batchDelete", false)) {
Contributor

If we keep this feature flag, it should be enabled by default as this is the better method. Since there is no change in behavior with the flag enabled vs disabled, I don't see a reason to default disable this behavior when batch deleting segments is the more optimal way to delete segments. We should also document this in the task context documentation and mention that this feature flag will be going away in the very near future.

Is there a way to configure the auto-kill to specify this context so that Druid operators who have auto-kill enabled can enable / disable this behavior for kill tasks generated from the auto-kill functionality in Druid?

Contributor Author

Ah, I was thinking the other way around: have it be opt-in at first, then opt-out, just to give the more safety-conscious users a chance to migrate to it. I never got around to considering removing it entirely. I'm just hung up on the whole feature flag and how no other extension can take advantage of it, so it's more of a best-effort feature flag.

I would need to look into auto-kill and how that works.

Contributor

IMO enabling this new feature by default is best, as it's better and the risk isn't super high.

I'm not even sure we need a feature flag at all. We should be able to get the risk low enough with appropriate testing that a flag isn't needed.

If we're gonna have one, then yeah, it needs to be settable for people that use autokill.

Contributor

As to testing: one thing to think about is whether this uses any new S3 APIs and therefore may require new IAM permissions. If it would require new IAM permissions then we definitely need a feature flag and we need to call it out in the docs and release notes. Or, better, adjust the design to not use new APIs and not need new IAM permissions.

Contributor

If we decide to go the "not require new APIs" route — which IMO is best — then we should have a note in the interface that says that the multi-arg form of kill must not require additional permissions on the deep storage beyond what the single-arg form of kill requires. Otherwise, when a multi-arg impl is contributed for another deep storage, it could break people in production that don't have the right set of permissions.

Contributor Author

@gianm I decided to enable this by default and not have a feature flag for it at all. The API used to delete multiple keys from a bucket is already used elsewhere in the extension.

I added the verbiage stating that the batch form of kill must not require additional permissions.
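For reference, here is a rough sketch of the kind of multi-key delete this relies on, using the AWS SDK v1 call that the S3 extension already uses elsewhere; the helper name deleteKeysForBucket and its shape are illustrative, not the PR's exact code.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;

import java.util.List;

class S3BatchDeleteSketch
{
  // Deletes the given keys from a single bucket in one request. This uses the same
  // s3:DeleteObject permission as single-object deletes, so no new IAM permissions are needed.
  static void deleteKeysForBucket(AmazonS3 s3Client, String bucket, List<String> keys)
  {
    // A single DeleteObjects request can remove up to 1,000 keys; larger lists must be chunked.
    final DeleteObjectsRequest request = new DeleteObjectsRequest(bucket)
        .withKeys(keys.toArray(new String[0]));
    try {
      s3Client.deleteObjects(request);
    }
    catch (MultiObjectDeleteException e) {
      // Partial failures are reported per key, which is what gets surfaced in the task logs.
      for (MultiObjectDeleteException.DeleteError error : e.getErrors()) {
        System.err.println("Failed to delete " + error.getKey() + ": " + error.getMessage());
      }
    }
  }
}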

Commits:
• … one by one
  create new batchDelete method in DataSegmentKiller that has default functionality of iterating through all segments and calling delete on them. This will enable a slow rollout of other deep storage implementations to move to a batched delete on their own time.
• cleaned up code
• just need to add tests and docs for this functionality
@TSFenwick force-pushed the feature-bulk-delete-files-when-killing-unused-segments branch from 18d5f45 to 6b5c887 on April 28, 2023 15:34
@gianm (Contributor) left a comment

The new method signature looks good to me. I had some notes about the feature flag and testing and the required set of permissions; see the comments in thread https://github.com/apache/druid/pull/14131/files#r1178437209.

Commit: …couldn't be deleted if there was a client error or server error
@suneet-s (Contributor) left a comment

Reading through the logs, I noticed that the log lines reporting the files to be deleted and the files that failed to delete are very long and hard to read. However, I do not have a suggestion to make it better, as an operator will likely want to see all the files that will be deleted in the logs. Perhaps a newline separator between the file names in the list, instead of a comma, would make it easier to read. ¯\_(ツ)_/¯

I also noticed that to generate these messages we have to make copies of lists and create new objects. This can introduce additional memory pressure for the kill task that didn't exist before. I think this is acceptable as it is likely there is enough additional heap for the peons to deal with this.

LGTM after CI passes the IntelliJ inspections. The code coverage failures reported in the processing module tests are not concerning to me, as this is a very simple code path that doesn't seem like it would benefit from unit tests.

Error:  extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentKillerTest.java:355 -- The declared exception SegmentLoadingException is never thrown
Error:  extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentKillerTest.java:392 -- The declared exception SegmentLoadingException is never thrown
Error:  extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentKillerTest.java:331 -- The declared exception SegmentLoadingException is never thrown
Error: Process completed with exit code 1.

@suneet-s (Contributor)

@gianm / @rohangarg any other comments on this patch?

@TSFenwick requested a review from gianm on June 30, 2023 18:33
// this was a shortcut to handle the many different ways there could potentially be failures and handle them
// reasonably
throw new SegmentLoadingException(
    "Couldn't delete segments from s3 see logs for more details"
);
Contributor

Please improve the grammar (missing punctuation) and capitalization (S3).

Also, can we interpolate some useful information into this error message? Such as the number of segments, the first segment ID (or first N IDs) that couldn't be deleted, the nature of the reason why it couldn't be deleted, etc. This info could come back from deleteKeysForBucket rather than a boolean.
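For illustration only, the kind of change being suggested might look like the following, assuming deleteKeysForBucket were changed to return the keys that failed rather than a boolean; the helper and message wording here are hypothetical.

// Hypothetical helper: failedKeys is what deleteKeysForBucket would return instead of a boolean.
static void throwIfAnyFailed(String bucket, List<String> failedKeys) throws SegmentLoadingException
{
  if (!failedKeys.isEmpty()) {
    // Assumes SegmentLoadingException's format-string constructor; surfaces the count, the
    // bucket, and the first few failed keys without dumping the entire list into the status.
    throw new SegmentLoadingException(
        "Couldn't delete [%d] key(s) from S3 bucket [%s]. First failed keys: %s. See task logs for the full list.",
        failedKeys.size(),
        bucket,
        failedKeys.subList(0, Math.min(5, failedKeys.size()))
    );
  }
}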

Contributor Author

Makes sense, will do.

Contributor Author

@gianm The reason I didn't do that is that all the information gets truncated in the task status for the kill task. So I figured it wasn't worth the complication of corralling that state and presenting it in a meaningful way if it just gets truncated.

This is how it looks in reality in the status object.

"errorMsg": "org.apache.druid.segment.loading.SegmentLoadingException: Couldn't delete from bucket: [tomfenwick-t..."

That being said, I can still make those changes.

@abhishekrb19 mentioned this pull request on Jul 21, 2023
@jasonk000 (Contributor) commented Jul 21, 2023

@TSFenwick we'd spotted the same in #14634; please @ me if you'd like any help to close this out, extra testing, etc., as this would save me a lot of effort too (~100 million segments pending deletion). Great work.

@jasonk000 (Contributor)

fyi, a related change to the SQL layer #14639

@maytasm (Contributor) left a comment

LGTM

@jasonk000 (Contributor) left a comment

Added a few notes

@maytasm (Contributor) commented Jul 24, 2023

This PR now has three +1s from three committers.

@TSFenwick Can you check the failing tests?

@rohangarg @gianm Do you have any other comments or feel that this PR should not be merged?

@rohangarg (Member) left a comment

LGTM once CI passes. I would also recommend rebasing on master while fixing the failed tests.

@TSFenwick (Contributor Author) commented Jul 25, 2023

@rohangarg @maytasm I fixed the one broken test. I was told by @suneet-s and @abhishekagarwal87 not to worry about the failing code coverage on the default method; see the past test runs below:

https://github.com/apache/druid/actions/runs/5650168274/job/15306385499#step:11:2167
https://github.com/apache/druid/actions/runs/5650168274/job/15306385132#step:11:2167

@maytasm merged commit 9a9038c into apache:master on Jul 27, 2023
63 of 65 checks passed
@LakshSingla added this to the 28.0 milestone on Oct 12, 2023