[ML] Allow force stopping failed and stopping DF analytics #54650

Conversation

dimitris-athanasiou
Contributor

Force stopping a failed job used to work, but it
now puts the job in the `stopping` state and hangs.
In addition, force stopping a job that is already
`stopping` is not handled.

This commit addresses both issues with force
stopping data frame analytics. It aligns the
approach with the one followed for anomaly detection
jobs.

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

        failed.add(analyticsId);
        break;
    default:
        break;
Contributor

I know you've only moved this in this PR, but maybe it's still best to add an assert in the default case just in case someone adds a new state but doesn't add it here?

Contributor Author

Done
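
For readers following this thread, here is a minimal sketch of what a guarded `default` case could look like. It is not the code from this PR: the `State` enum, the `Ids` holder, and the `classify` method are hypothetical stand-ins, and which states count as "started" is an assumption. The review asked for an `assert`; the sketch throws `AssertionError` directly so the guard also fires when assertions are disabled.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class AnalyticsStateSketch {

    // Hypothetical stand-in for the analytics task states (assumption, not the real enum).
    enum State { STARTING, STARTED, REINDEXING, ANALYZING, STOPPING, STOPPED, FAILED }

    // Hypothetical holder for the ids grouped by what a stop request must do with them.
    static final class Ids {
        final Set<String> started = new TreeSet<>();
        final Set<String> stopping = new TreeSet<>();
        final Set<String> failed = new TreeSet<>();
    }

    static Ids classify(Map<String, State> stateById) {
        Ids ids = new Ids();
        for (Map.Entry<String, State> entry : stateById.entrySet()) {
            String analyticsId = entry.getKey();
            switch (entry.getValue()) {
                case STARTING:
                case STARTED:
                case REINDEXING:
                case ANALYZING:
                    ids.started.add(analyticsId);
                    break;
                case STOPPING:
                    ids.stopping.add(analyticsId);
                    break;
                case STOPPED:
                    // Already stopped: nothing to do.
                    break;
                case FAILED:
                    ids.failed.add(analyticsId);
                    break;
                default:
                    // Fails loudly if a new state is added without being handled here.
                    throw new AssertionError("unknown state [" + entry.getValue() + "]");
            }
        }
        return ids;
    }
}
```

Listing every known state explicitly is what makes the `default` branch useful: it is reached only by a state that nobody has handled yet.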

Contributor

@droberts195 droberts195 left a comment

LGTM

@dimitris-athanasiou dimitris-athanasiou merged commit 6fd18f7 into elastic:master Apr 3, 2020
@dimitris-athanasiou dimitris-athanasiou deleted the fix-force-stopping-df-analytics branch April 3, 2020 11:06
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Apr 3, 2020
…tic#54650)

Backport of elastic#54650
dimitris-athanasiou added a commit that referenced this pull request Apr 3, 2020
…) (#54712)

Backport of #54650
dimitris-athanasiou added a commit that referenced this pull request Apr 3, 2020
…) (#54715)

Backport of #54650
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Apr 16, 2020
After elastic#54650 we catch `TaskCancelledException` when we wait for
reindexing to complete as it may be thrown. However, when that happens
we do not mark the task as completed. This results in the stop request
never returning and the failures we saw in elastic#55068.

Closes elastic#55068
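
The failure mode described in this commit message can be sketched roughly as follows. This is an illustration only, not the Elasticsearch code: `Task`, `markAsCompleted`, and `waitForReindexing` are hypothetical names, a `CompletableFuture` stands in for the reindexing request, and `java.util.concurrent.CancellationException` stands in for `TaskCancelledException`.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

final class ReindexCancelSketch {

    // Hypothetical task handle: a stop request returns only once the task is completed.
    static final class Task {
        private final CountDownLatch completed = new CountDownLatch(1);

        void markAsCompleted() {
            completed.countDown();
        }

        boolean isCompleted() {
            return completed.getCount() == 0;
        }
    }

    // Waits for the (stand-in) reindexing step to finish.
    static void waitForReindexing(CompletableFuture<Void> reindexing, Task task) {
        try {
            reindexing.join();
            // A real job would move on to the next phase here.
        } catch (CancellationException e) {
            // The cancellation is expected when the job is being stopped, so it is
            // swallowed. The task must still be marked as completed; otherwise
            // whoever waits for the stop to finish would wait forever.
            task.markAsCompleted();
        }
    }
}
```

Dropping the `task.markAsCompleted()` call from the `catch` block reproduces the shape of the bug: the exception is handled, but nothing ever signals that the task has finished.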
dimitris-athanasiou added a commit that referenced this pull request Apr 16, 2020
…55286)

Closes #55068
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Apr 16, 2020
…xing (elastic#55286)

Closes elastic#55068

Backport of elastic#55286
dimitris-athanasiou added a commit that referenced this pull request Apr 16, 2020
…xing (#55286) (#55290)

Closes #55068

Backport of #55286
dimitris-athanasiou added a commit that referenced this pull request Apr 16, 2020
…xing (#55286) (#55295)

Closes #55068

Backport of #55286