[Python] Hugging Face pipeline support #27399

Merged: riteshghorse merged 35 commits into apache:master on Aug 2, 2023

@riteshghorse riteshghorse commented Jul 7, 2023

This PR adds Hugging Face pipeline support to RunInference by adding a new model handler for it. It builds on top of #26632.

Example Job with Question Answering Pipeline on Dataflow
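
The description above says the PR adds a new model handler so that RunInference can drive a Hugging Face pipeline. As a rough, library-free sketch of what such a handler does (load a pipeline once for a named task, then apply an inference function to each batch), consider the following. Every name here (`Task`, `PipelineHandlerSketch`, `fake_pipeline_factory`, `default_inference_fn`) is hypothetical and chosen for illustration; none of it is the API merged in this PR, and the stand-in factory replaces a real pipeline loader so the example runs without dependencies.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Optional, Sequence, Union


class Task(str, Enum):
    """Hypothetical task enum; the values are plain task-name strings."""
    QUESTION_ANSWERING = "question-answering"
    TEXT_CLASSIFICATION = "text-classification"


# A "pipeline" here is any callable that maps a list of inputs to predictions.
Pipeline = Callable[..., List[Dict[str, Any]]]


def default_inference_fn(
    batch: Sequence[Any], pipeline: Pipeline, inference_args: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Default behavior (an assumption): call the pipeline once per batch."""
    return pipeline(list(batch), **inference_args)


class PipelineHandlerSketch:
    """Illustrative handler: loads a pipeline for a task and runs batches."""

    def __init__(
        self,
        task: Union[str, Task],
        load_pipeline: Callable[[str], Pipeline],
        inference_fn: Callable[..., List[Dict[str, Any]]] = default_inference_fn,
        inference_args: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Task(...) raises ValueError for unknown task names.
        self._task = Task(task)
        self._load_pipeline = load_pipeline
        self._inference_fn = inference_fn
        self._inference_args = dict(inference_args or {})

    def load_model(self) -> Pipeline:
        """Build the pipeline once; a runner would cache and reuse it."""
        return self._load_pipeline(self._task.value)

    def run_inference(
        self,
        batch: Sequence[Any],
        pipeline: Pipeline,
        inference_args: Optional[Dict[str, Any]] = None,
    ) -> List[Dict[str, Any]]:
        """Merge per-call arguments over the defaults and run the batch."""
        merged = {**self._inference_args, **(inference_args or {})}
        return list(self._inference_fn(batch, pipeline, merged))


def fake_pipeline_factory(task: str) -> Pipeline:
    """Stand-in for a real pipeline loader; echoes its inputs."""
    def pipeline(inputs: List[Any], **kwargs: Any) -> List[Dict[str, Any]]:
        top_k = kwargs.get("top_k", 1)
        return [{"task": task, "input": x, "top_k": top_k} for x in inputs]
    return pipeline


handler = PipelineHandlerSketch(Task.QUESTION_ANSWERING, fake_pipeline_factory)
model = handler.load_model()
results = handler.run_inference(["first input", "second input"], model, {"top_k": 2})
```

The commit list for this PR mentions an enum for tasks and an inference function; the sketch includes analogous pieces only to show how they could fit together. In a real job, the stand-in factory would be replaced by an actual pipeline loader and the handler would be passed to RunInference.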


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

See CI.md for more information about GitHub Actions CI.

codecov bot commented Jul 7, 2023

Codecov Report

Merging #27399 (7db987b) into master (1526263) will decrease coverage by 0.26%.
Report is 117 commits behind head on master.
The diff coverage is 2.42%.

@@            Coverage Diff             @@
##           master   #27399      +/-   ##
==========================================
- Coverage   71.09%   70.84%   -0.26%     
==========================================
  Files         859      862       +3     
  Lines      104555   104939     +384     
==========================================
+ Hits        74338    74347       +9     
- Misses      28660    29035     +375     
  Partials     1557     1557              
| Flag | Coverage Δ |
| --- | --- |
| python | 79.85% <2.42%> (-0.44%) ⬇️ |

| Files Changed | Coverage Δ |
| --- | --- |
| ...amples/inference/huggingface_question_answering.py | 0.00% <0.00%> (ø) |
| .../apache_beam/ml/inference/huggingface_inference.py | 0.00% <0.00%> (ø) |
| ...xamples/inference/huggingface_language_modeling.py | 13.23% <13.23%> (ø) |

... and 16 files with indirect coverage changes
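
For readers who want to check the arithmetic in the coverage diff, here is a small sketch that recomputes the two headline percentages from the reported counts. It assumes coverage is hits divided by total lines and that the displayed value is truncated, not rounded, to two decimals; truncation is an observation that fits these particular numbers, not a claim about how Codecov documents its output. The function name `coverage_pct` is hypothetical.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage percentage truncated to two decimals (assumed display rule)."""
    # Integer arithmetic avoids floating-point surprises before truncation.
    return (hits * 10000 // lines) / 100


# Counts copied from the coverage diff in the report.
master = {"lines": 104555, "hits": 74338, "misses": 28660, "partials": 1557}
pr = {"lines": 104939, "hits": 74347, "misses": 29035, "partials": 1557}

# Hits, misses, and partials should add up to the total line count.
for row in (master, pr):
    assert row["hits"] + row["misses"] + row["partials"] == row["lines"]

master_pct = coverage_pct(master["hits"], master["lines"])  # 71.09
pr_pct = coverage_pct(pr["hits"], pr["lines"])              # 70.84

# Per-column deltas, matching the +384 / +9 / +375 column in the diff.
deltas = {key: pr[key] - master[key] for key in master}
```

The deltas make the cause of the drop visible: 384 lines were added but only 9 of them were hit, which is consistent with the 2.42% diff coverage reported above.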


@riteshghorse riteshghorse marked this pull request as ready for review July 24, 2023 19:30
@riteshghorse riteshghorse changed the title [WIP] Hugging Face pipeline support [Python] Hugging Face pipeline support Jul 24, 2023
@riteshghorse
Contributor Author

Run Python 3.8 PostCommit

@riteshghorse
Contributor Author

The Hugging Face integration tests passed on the Python 3.8 PostCommit suite.

@riteshghorse
Contributor Author

Run Python 3.8 PostCommit

@riteshghorse
Contributor Author

riteshghorse commented Jul 28, 2023

PTAL, this is ready for review. The Hugging Face integration tests passed. Some other tests unrelated to this PR are failing; I opened #27734 for those.

@riteshghorse
Contributor Author

Run Python 3.8 PostCommit

@riteshghorse
Contributor Author

assign to next reviewer

@riteshghorse
Contributor Author

R: @damccorm

@github-actions
Contributor

github-actions bot commented Aug 1, 2023

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Contributor

@damccorm damccorm left a comment

I left some comments, but the core pieces look good to me.

@damccorm
Contributor

damccorm commented Aug 2, 2023

Run Python 3.8 PostCommit

Contributor

@damccorm damccorm left a comment

LGTM once all suites pass

@riteshghorse
Contributor Author

The tests passed, but the results are not reflected here. I opened #27808 for the failing PreCommits, which are not related to this change.

Merging this PR!

@riteshghorse riteshghorse merged commit 1b76101 into apache:master Aug 2, 2023
71 of 74 checks passed
@liferoad
Collaborator

liferoad commented Aug 2, 2023

Can you resolve my comments?

@riteshghorse
Contributor Author

I've added the enum you suggested. Are there any comments you accidentally haven't published yet?

@liferoad
Collaborator

liferoad commented Aug 2, 2023

Interesting. I probably forgot to submit them. Sorry about that.

@riteshghorse
Contributor Author

riteshghorse commented Aug 2, 2023

Thought so! I wanted to get this in for 2.50, so Danny reviewed it while you were OOO.

Some of your comments are already addressed in later commits. I'll send you a short PR for the rest of them.

@liferoad
Collaborator

liferoad commented Aug 2, 2023

Thanks, feel free to resolve my comments.

bvolpato pushed a commit to bvolpato/beam that referenced this pull request Aug 3, 2023
* automodel first pass

* new model

* updated model handler api

* add model_class param

* update doc comments

* updated integration test and example

* unit test, modified params

* add test setup for hugging face tests

* fix lints

* fix import order

* refactor, doc, lints

* refactor, doc comments

* change test file

* update types

* add hugging face pipeline support

* integration test for pipeline

* add doc, gs link

* test raises exception

* fix python lints

* add inference fn

* update doc

* docs, lint

* docs, lint

* remove optional from inference_fn

* add enum for tasks

* update pydoc

* update pydoc

* doc, formatting changes

* fix doc

* fix optional in doc

* pin model version
bullet03 pushed a commit to akvelon/beam that referenced this pull request Aug 11, 2023
@tvalentyn tvalentyn added this to the 2.51.0 Release milestone Sep 19, 2023