Delete index when cloud snapshotting needs esArchiver retry #39919
Comments
Pinging @elastic/kibana-operations
Pinging @elastic/kibana-test-triage
@jinmu03 we need someone from the ops team to look at this issue. Thanks.
I don't know what we are going to do about this. Does Cloud have an API or anything to disable snapshotting? The issue is that ES does not allow you to delete an index while it's being snapshotted, and we do that all the time in functional tests. It's possible we could re-architect things to not require deleting, but that's a pretty massive change.
As noted above, Spencer added something for this in esArchiver. The original code was looking for status 500, but we seem to be getting 400 now, so I thought handling that might fix it. I am not familiar with esArchiver, so I'm not sure; maybe @spalger can comment on whether it would fix it. Possibly this file? https://github.com/elastic/kibana/blob/master/src/es_archiver/lib/indices/delete_index.js
I took a closer look; it seems we just need to increase the retries done in es_archiver's delete_index. I have put in a PR to update it. Tested on 7.5.0.
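For anyone following along, here is a rough sketch of what "retrying the delete" could look like. This is not the code in `delete_index.js`; `deleteWithRetry`, `retries`, `delayMs`, and `shouldRetry` are hypothetical names made up for illustration, and the default values are arbitrary.

```javascript
// Hypothetical sketch, NOT the actual esArchiver implementation.
// Calls deleteFn() and, if it rejects with an error that shouldRetry()
// accepts, waits delayMs and tries again, up to `retries` extra attempts.
async function deleteWithRetry(deleteFn, options = {}) {
  const {
    retries = 5, // assumed default, not taken from Kibana
    delayMs = 1000, // assumed default, not taken from Kibana
    shouldRetry = () => true,
  } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await deleteFn();
    } catch (error) {
      // Give up once the retry budget is spent or the error is not retryable.
      if (attempt >= retries || !shouldRetry(error)) {
        throw error;
      }
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

The caller would pass a function that performs the index delete and a `shouldRetry` predicate that only accepts the snapshot-in-progress error, so unrelated failures still surface immediately.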
On cloud I see this error once during our test runs:
{"path":"/.kibana_1%2C.kibana_2","query":{},"statusCode":400,"response":"{"error":{"root_cause":[{"type":"snapshot_in_progress_exception","reason":"Cannot delete indices that are being snapshotted: [[.kibana_1/QgsxgBUARzWvK3c9daDqKw], [.kibana_2/adK-0Z5NSyqjShoJNOd2tQ]]. Try again after snapshot finishes or cancel the currently running snapshot."}],"type":"snapshot_in_progress_exception","reason":"Cannot delete indices that are being snapshotted: [[.kibana_1/QgsxgBUARzWvK3c9daDqKw], [.kibana_2/adK-0Z5NSyqjShoJNOd2tQ]]. Try again after snapshot finishes or cancel the currently running snapshot."},"status":400}"}
In #39381 when it occurred, Lee mentioned it might be related to an esArchiver PR: #18624
I noticed it checks for status code 500, but status 400 is being returned. Maybe we should also check for a 400 status code?
See kibana/src/es_archiver/lib/indices/delete_index.js, line 112 in 138438f.
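To make the suggestion concrete, a check along these lines would match the error on either status code. This is only a sketch: `isSnapshotInProgressError` is a hypothetical helper name, and the error shape (`statusCode` plus a JSON string in `response`) is assumed from the logged error above rather than taken from the esArchiver source.

```javascript
// Hypothetical helper, NOT the code at the referenced line.
// Assumes the error object carries `statusCode` and a `response` that is
// either a JSON string or an already-parsed object, as in the log above.
function isSnapshotInProgressError(error) {
  const status = error && error.statusCode;
  // Accept both the status the existing check looks for (500)
  // and the one Cloud is returning now (400).
  if (status !== 400 && status !== 500) {
    return false;
  }

  let body = error.response;
  if (typeof body === 'string') {
    try {
      body = JSON.parse(body);
    } catch (parseError) {
      return false; // not a JSON body, so not the error we are looking for
    }
  }

  const type = body && body.error && body.error.type;
  return type === 'snapshot_in_progress_exception';
}
```

Matching on the exception type as well as the status code keeps unrelated 400 responses, such as a malformed request, from being treated as retryable.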