quant_pre_process on Windows will raise PermissionError when trying to remove temporary directory #17627

Closed
guotuofeng opened this issue Sep 20, 2023 · 5 comments
Labels
model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)
platform:windows (issues related to the Windows platform)
quantization (issues related to quantization)

Comments

@guotuofeng
Contributor

Describe the issue

In ORT 1.16.0, quant_pre_process raises the following exception when Olive runs on Windows.

The same quant_pre_process call works fine with ORT 1.15.1.

Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\shutil.py", line 616, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\CLOUDT~1\\AppData\\Local\\Temp\\pre.quant.om44m0mg\\bert.embeddings.LayerNorm.bias'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\tempfile.py", line 802, in onerror
    _os.unlink(path)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\CLOUDT~1\\AppData\\Local\\Temp\\pre.quant.om44m0mg\\bert.embeddings.LayerNorm.bias'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\a\_work\1\s\***\passes\onnx\quantization.py", line 442, in _quant_preprocess
    quant_pre_process(
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\site-packages\onnxruntime\quantization\shape_inference.py", line 133, in quant_pre_process
    model = onnx.load(inferred_model_path)
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\tempfile.py", line 827, in __exit__
    self.cleanup()
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\tempfile.py", line 831, in cleanup
    self._rmtree(self.name)
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\tempfile.py", line 813, in _rmtree
    _shutil.rmtree(name, onerror=onerror)
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\shutil.py", line 740, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\shutil.py", line 618, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\tempfile.py", line 805, in onerror
    cls._rmtree(path)
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\tempfile.py", line 813, in _rmtree
    _shutil.rmtree(name, onerror=onerror)
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\shutil.py", line 740, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\shutil.py", line 599, in _rmtree_unsafe
    onerror(os.scandir, path, sys.exc_info())
  File "C:\hostedtoolcache\windows\Python\3.8.16\x64\lib\shutil.py", line 596, in _rmtree_unsafe
    with os.scandir(path) as scandir_it:
NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\CLOUDT~1\\AppData\\Local\\Temp\\pre.quant.om44m0mg\\bert.embeddings.LayerNorm.bias'

To reproduce

https://aiinfra.visualstudio.com/PublicPackages/_build/results?buildId=357498&view=logs&jobId=aaa819ba-d3b3-5f6a-9167-927525a92668&j=aaa819ba-d3b3-5f6a-9167-927525a92668&t=11258177-43ed-56c7-2af4-75f6baa69786

Urgency

The issue cannot be reproduced locally, but it reproduces 100% of the time in CI pipelines.

Platform

Windows

OS Version

10

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@github-actions github-actions bot added model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. platform:windows issues related to the Windows platform quantization issues related to quantization labels Sep 20, 2023
@guotuofeng
Contributor Author

I am currently using microsoft/Olive@86c461a to work around the issue.

guotuofeng added a commit to microsoft/Olive that referenced this issue Sep 20, 2023
## Describe your changes
- move the onnxruntime import for calibrate to vitis execution to fix
the onnxruntime 1.6 compatibility issue with openvino
- pin resnet ORT version to wait
microsoft/onnxruntime#17619 to be fixed.
- disable vitis test for ORT 1.16 since the calibrator API is changed.
- pin ort extension to 0.8.0 since the check_model will fail with latest
version.
- copy quant_pre_process code to olive to work-around the Windows temp
folder clean permission error.
microsoft/onnxruntime#17627

## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [x] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [x] Format your code by running `pre-commit run --all-files`
- [ ] Is this a user-facing change? If yes, give a description of this
change to be included in the release notes.

## (Optional) Issue link
@xadupre xadupre self-assigned this Sep 20, 2023
@trajepl
Contributor

trajepl commented Sep 21, 2023

It seems that when the temporary directory context exits, the ORT session has not been cleaned up yet and is still holding the external data file open.
I ran a few quick experiments:

  1. Save the model with an external data file.
  2. Create a session and keep it alive.
    • I cannot remove the external data file manually on Windows because it is still open.
  3. del sess:
    • I can remove the external data file successfully.
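
The experiment above can be sketched with the standard library alone. This is only an illustration, not the change made in #17697: `FileHoldingSession` is a hypothetical stand-in for any object (such as an ORT session) that keeps the external data file open, and `run_in_temp_dir` and the file name are made up for the example. The idea is to release the handle before the `TemporaryDirectory` context exits, so that cleanup on Windows does not hit a file that is still open.

```python
import os
import tempfile


class FileHoldingSession:
    """Hypothetical stand-in for a session object that keeps a file open."""

    def __init__(self, path):
        # Holding this handle is what blocks deletion of the file on Windows.
        self._handle = open(path, "rb")

    def close(self):
        self._handle.close()


def run_in_temp_dir():
    """Create a temp dir, hold a file open inside it, and release it in time."""
    with tempfile.TemporaryDirectory(prefix="pre.quant.") as temp_dir:
        data_path = os.path.join(temp_dir, "external_data.bin")
        with open(data_path, "wb") as f:
            f.write(b"\x00" * 16)

        sess = FileHoldingSession(data_path)
        # ... work with the session here ...

        # Release the file handle BEFORE leaving the `with` block; otherwise
        # the directory cleanup on Windows can fail with PermissionError.
        sess.close()
        del sess
    return temp_dir


removed_dir = run_in_temp_dir()
print(os.path.exists(removed_dir))
```

On Linux and macOS an open file can usually be unlinked anyway, which may be why this kind of problem only shows up on Windows.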

@yufenglee
Member

@trajepl, could you please create a PR to fix the issue?

@trajepl
Contributor

trajepl commented Sep 26, 2023

@trajepl, could you please create a PR to fix the issue?

Sure. I created one against the main branch: #17697. Please take a look~ :)

Also, I am not sure how to write a sufficient test case for this issue, so I have only tested it on the Olive side: microsoft/Olive#609

snnn pushed a commit that referenced this issue Sep 26, 2023
### Description

Fix for this issue, which raises a FileNotAccessd error on Windows
when the TemporaryDirectory context finishes.
#17627

### Motivation and Context
#17627
snnn pushed a commit that referenced this issue Sep 29, 2023
kleiti pushed a commit to kleiti/onnxruntime that referenced this issue Mar 22, 2024