Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Add an option to ML get APIs to return only the fields required by the corresponding put API #63055

Closed
droberts195 opened this issue Sep 30, 2020 · 1 comment · Fixed by #63092
Assignees
Labels
>enhancement :ml Machine learning

Comments

@droberts195
Copy link
Contributor

Over the years many users have been frustrated that it is hard to programmatically get an ML job config from one cluster and then put it into a different cluster because the get jobs API returns augmented config documents containing more fields than the config required by the put job API. The put job API will reject configs that contain unknown fields. This creates a burden on the client implementing the transfer process to know which fields need to be kept, and this is especially difficult because the list varies from release to release.

We can address this by adding an option to the get jobs endpoint to return configs containing all the fields required by put job, but no more.

The same option and filtering functionality should be added to every other get endpoint we have that returns more than the corresponding put endpoint requires.

@droberts195 droberts195 added >enhancement :ml Machine learning labels Sep 30, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent benwtrent self-assigned this Sep 30, 2020
benwtrent added a commit that referenced this issue Oct 2, 2020
This adds the new `for_export` flag to the following APIs:

- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The flag is designed for cloning or exporting configuration objects to later be put into the same cluster or a separate cluster. 

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that would effectively require changing to be of use (e.g. datafeed job_id)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)


closes #63055
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Oct 19, 2020
When exporting and cloning ml configurations in a cluster it can be
frustrating to remove all the fields that were generated by
the plugin. Especially as the number of these fields change
from version to version.

This flag, remove_generated, allows the GET config APIs to return
configurations with these generated fields removed.

relates to elastic#63055
benwtrent added a commit that referenced this issue Oct 20, 2020
…in GET config APIs (#63899)

When exporting and cloning ml configurations in a cluster it can be
frustrating to remove all the fields that were generated by
the plugin. Especially as the number of these fields change
from version to version.

This flag, exclude_generated, allows the GET config APIs to return
configurations with these generated fields removed.

APIs supporting this flag: 
- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

relates to #63055
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Oct 20, 2020
…63092)

This adds the new `for_export` flag to the following APIs:

- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The flag is designed for cloning or exporting configuration objects to later be put into the same cluster or a separate cluster.

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that would effectively require changing to be of use (e.g. datafeed job_id)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

closes elastic#63055
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Oct 20, 2020
…in GET config APIs (elastic#63899)

When exporting and cloning ml configurations in a cluster it can be
frustrating to remove all the fields that were generated by
the plugin. Especially as the number of these fields change
from version to version.

This flag, exclude_generated, allows the GET config APIs to return
configurations with these generated fields removed.

APIs supporting this flag:
- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

relates to elastic#63055
benwtrent added a commit that referenced this issue Oct 20, 2020
This adds a new flag `exclude_generated` for GET transform API.

This flag is useful for when a transform needs to be cloned within a cluster or exported/imported between clusters.

It removes certain fields that are not able to be set via the PUT api (e.g. version, create_time).

relates #63055
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Oct 20, 2020
…63093)

This adds a new flag `exclude_generated` for GET transform API.

This flag is useful for when a transform needs to be cloned within a cluster or exported/imported between clusters.

It removes certain fields that are not able to be set via the PUT api (e.g. version, create_time).

relates elastic#63055
benwtrent added a commit that referenced this issue Oct 20, 2020
…63947)

This adds a new flag `exclude_generated` for GET transform API.

This flag is useful for when a transform needs to be cloned within a cluster or exported/imported between clusters.

It removes certain fields that are not able to be set via the PUT api (e.g. version, create_time).

relates #63055
benwtrent added a commit that referenced this issue Oct 20, 2020
…ields in GET config APIs (#63899)(#63092) (#63177)

* [ML] adding for_export flag for ml plugin GET resource APIs (#63092)

This adds the new `for_export` flag to the following APIs:

- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The flag is designed for cloning or exporting configuration objects to later be put into the same cluster or a separate cluster.

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that would effectively require changing to be of use (e.g. datafeed job_id)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

closes #63055

* [ML] adding new flag exclude_generated that removes generated fields in GET config APIs (#63899)

When exporting and cloning ml configurations in a cluster it can be
frustrating to remove all the fields that were generated by
the plugin. Especially as the number of these fields change
from version to version.

This flag, exclude_generated, allows the GET config APIs to return
configurations with these generated fields removed.

APIs supporting this flag:
- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

relates to #63055
pugnascotia pushed a commit to pugnascotia/elasticsearch that referenced this issue Oct 21, 2020
…in GET config APIs (elastic#63899)

When exporting and cloning ml configurations in a cluster it can be
frustrating to remove all the fields that were generated by
the plugin. Especially as the number of these fields change
from version to version.

This flag, exclude_generated, allows the GET config APIs to return
configurations with these generated fields removed.

APIs supporting this flag: 
- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

relates to elastic#63055
pugnascotia pushed a commit to pugnascotia/elasticsearch that referenced this issue Oct 21, 2020
…63093)

This adds a new flag `exclude_generated` for GET transform API.

This flag is useful for when a transform needs to be cloned within a cluster or exported/imported between clusters.

It removes certain fields that are not able to be set via the PUT api (e.g. version, create_time).

relates elastic#63055
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :ml Machine learning
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants