Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PYTHON] Add new --auto_unique_labels option to StandardOptions #28984

Merged
merged 8 commits into from
Oct 24, 2023

Conversation

hjtran
Copy link
Contributor

@hjtran hjtran commented Oct 13, 2023

Motivating Issue

The python SDK requires users specify a unique label for every transform. Often times, the first time a transform is applied there is a default label that will be unique, but any future applications of the transform will require that the user specify a new and unique label. This can be quite tedious since there are many (?) scenarios where a user won't actually care about the labels. A couple scenarios that came up in my experience:

  • Writing unit tests with multiple assert_that()s requires unique label for each assert
  • Writing a pipeline that branches and applies the same transform to each branch
  • Writing a linear pipeline where I want to reuse the same transform (e.g. LogElements or Deduplicate)

Change Summary

This change adds a new --auto_unique_labels standard option. The option defaults to off so there's no change in the default behavior. If the option is set, then whenever a transform is applied with a non-unique label, a new label is generated that includes an automatically incremented suffix. For example, if Deduplicate is applied twice, then the second deduplicate will have a label of Deduplicate_1. Additional Deduplicates would have a label of Deduplicate_n.

Testing

Wrote a new unit test that tests the new behavior. The rest of the pipeline_test.py tests still pass except for one: test_display_data. It also fails on my fork's master though so I'm assuming it's unrelated (EDIT: confirmed unrelated, it's just failing b/c it doesn't expect the URN to change when pipeline_test.py is invoked directly, in which case the URN references __main__)

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

@hjtran
Copy link
Contributor Author

hjtran commented Oct 13, 2023

reviewer: @robertwb

@github-actions
Copy link
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@codecov
Copy link

codecov bot commented Oct 13, 2023

Codecov Report

Merging #28984 (c7d1305) into master (38720a3) will decrease coverage by 33.86%.
Report is 452 commits behind head on master.
The diff coverage is 27.27%.

@@             Coverage Diff             @@
##           master   #28984       +/-   ##
===========================================
- Coverage   72.22%   38.37%   -33.86%     
===========================================
  Files         684      686        +2     
  Lines      100856   101673      +817     
===========================================
- Hits        72846    39013    -33833     
- Misses      26434    61084    +34650     
  Partials     1576     1576               
Flag Coverage Δ
python 29.98% <27.27%> (-52.84%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...dks/python/apache_beam/options/pipeline_options.py 64.84% <100.00%> (-29.69%) ⬇️
sdks/python/apache_beam/pipeline.py 19.79% <20.00%> (-72.07%) ⬇️

... and 314 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@robertwb robertwb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@robertwb robertwb merged commit 48722f1 into apache:master Oct 24, 2023
82 of 84 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants