Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing offline mode for pipeline (when inferring task). #21113

Merged
merged 5 commits into from
Jan 17, 2023

Conversation

Narsil
Copy link
Contributor

@Narsil Narsil commented Jan 13, 2023

What does this PR do?

pipe = pipeline(model="xx")

Was actually using network even when TRANSFORMERS_OFFLINE=1 was used.

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@Narsil Narsil requested a review from sgugger January 13, 2023 18:17
@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jan 13, 2023

The documentation is not available anymore as the PR was closed or merged.

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

Since I'm looking at the test as a result, I have comments not directly linked to your changes but that should be addressed (in another PR if you prefer).

The weird things is that:

  • mocking socket tests an internet connection failing, this is independent from TRANFORMERS_OFFLINE and will work if the model is cached (see comment in test_offline_mode).
  • adding TRANSFORMERS_OFFLINE tells the library to not connect online. This can be used even if there is internet. This should never call any request so there is no need to mock anything (of mock to check that there was indeed no call).

So the current test_offline_mode should be split in two, with one test just doing TRANSFORMERS_OFFLINE=1 and somehow testing there are no calls to the Hub, and a second test test_cached_is_used_when_offline where we do the mock but don't touch TRANSFORMERS_OFFLINE. The test you're adding should be like test_offline_mode and not mock anything (unless it's to check there are no calls to the internet).

src/transformers/pipelines/__init__.py Outdated Show resolved Hide resolved
@Narsil Narsil requested a review from sgugger January 15, 2023 09:38
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
@Narsil
Copy link
Contributor Author

Narsil commented Jan 16, 2023

So the current test_offline_mode should be split in two, with one test just doing TRANSFORMERS_OFFLINE=1 and somehow testing there are no calls to the Hub, and a second test test_cached_is_used_when_offline where we do the mock but don't touch TRANSFORMERS_OFFLINE. The test you're adding should be like test_offline_mode and not mock anything (unless it's to check there are no calls to the internet).

For the added test this is indeed the only thing proving it fails when the code doesn't check the offline mode.

For the other tests if I understand correctly it's TRANSFORMERS_OFFLINE=1 -> Users asks us to not touch internet, regardless of if internet is available or not. We should FAIL if we're hitting the internet (hence the mock).

If internet is not available, regardless of TRANSFORMERS_OFFLINE we should default to the cached version.

That's ok for the from_pretrained but I don't think this is doable with the pipeline task, because it's not included directly in the model + config, right ? Only the README.md has that information, of which we do not have a cached version, correct ? (Don't think we should either).

If that's correct, then I'm ok with splitting the tests, but the mock should still probably be in both tests, 1 to fake a buggy internet, the other to make sure we trigger an failure when we actually use internet even after been explicitly asked not to do it, no ?

(We could change the mock errors strings to reflect that difference)

@sgugger
Copy link
Collaborator

sgugger commented Jan 16, 2023

Agreed! And yes the pipeline task for those tests needs to be passed, it can't be retrieved in offline mode/internet fails.

@Narsil
Copy link
Contributor Author

Narsil commented Jan 16, 2023

Made the changes is that what you had in mind ?

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's perfect. Thanks a lot!

Comment on lines 73 to 75
# this test is a bit tricky since TRANSFORMERS_OFFLINE can only be changed before
# `transformers` is loaded, and it's too late for inside pytest - so we are changing it
# while running an external program
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we don't need this comment anymore since we are not touching the env variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@Narsil Narsil merged commit 25ddd91 into huggingface:main Jan 17, 2023
venkat-natchi pushed a commit to venkat-natchi/transformers that referenced this pull request Jan 22, 2023
…21113)

* Fixing offline mode for pipeline (when inferring task).

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updating test to reflect change in exception.

* Fixing offline mode.

* Clean.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
miyu386 pushed a commit to miyu386/transformers that referenced this pull request Feb 9, 2023
…21113)

* Fixing offline mode for pipeline (when inferring task).

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updating test to reflect change in exception.

* Fixing offline mode.

* Clean.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants