Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the OEM encoding for bat files #943

Closed
wants to merge 1 commit into from

Conversation

pfmoore
Copy link
Member

@pfmoore pfmoore commented Jul 23, 2016

No description provided.

@pfmoore
Copy link
Member Author

pfmoore commented Jul 23, 2016

Note: if the path to the virtualenv contains characters that can't be encoded in the OEM characterset, this PR will cause the creation of the virtualenv to fail, because of an encoding error in the creation of activate.bat. This needs to be fixed - but in this case activate.bat simply can't work, as bat files must be encoded in the OEM code page.

I suspect the best we can do in this case would be to write the bat files with the "replace" error handler, and issue a warning that the user will have to manually handle the PATH, as the activate scripts won't work.

@yan12125
Copy link

There's a more generic approach in my PR #900. I use sys.getfilesystemencoding(), while I'm not sure whether its value is always the same as sys.stdout.encoding or not.

@pfmoore
Copy link
Member Author

pfmoore commented Jul 26, 2016

Thanks for the pointer, I hadn't spotted your PR. I've added comments there, but no, you can't use sys.getfilesystemencoding() for batch files on Windows.

@Ivoz
Copy link

Ivoz commented Aug 1, 2016

@pfmoore do you know if the same logic should apply to .ps1 files, or are they ok as UTF-8?

@pfmoore
Copy link
Member Author

pfmoore commented Aug 1, 2016

.ps1 files can be UTF-8, but they need a BOM.

@Ivoz
Copy link

Ivoz commented Aug 5, 2016

Since all 3 references to content in the function body are to call it with encode(encoding), why not write binary_content = content.encode(encoding) at the top of the function and then reference binary_content in those 3 places. Just bit of basic getting-rid-of-repetition.

@techtonik
Copy link

if the path to the path to the virtualenv contains characters that can't be encoded in the OEM characterset

If those characters are invalid, then virtualenv dir won't be created before any files are created there.

if is_win:

If directory was created, then the file path is right and should be written as-is. If it is not in bytes, then I guess any encoding can damage it if it remaps bytes. Does cp437 remap bytes?

But it may be that #950 is more important and people simple don't create virtualenvs with non-ascii names on Windows.

Use any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:
The following reserved characters:
< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
* Integer value zero, sometimes referred to as the ASCII NUL character.
* Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#file_and_directory_names

@yan12125
Copy link

@techtonik For #950, my pull request #900 may work. It's messy and imperfect, but it should work in simple cases.

#900 contains lots of fixes, and #943 is part of it. It would be great if #943 is merged first. My pull request can be cleaner.

@techtonik
Copy link

It's messy and imperfect, but it should work in simple cases.

If that's the reason it won't be merged, then it doesn't help. =)

@yan12125
Copy link

I guess it's because my PR should be separated into multiple parts.

@yan12125
Copy link

By the way, this PR looks quite fine. Just replace sys.stdout.encoding with ctypes.windll.kernel32.GetOEMCP(). The former exists only if stdout is a tty.

Copy link
Contributor

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will require a rebase on master, and some test to validate the changes. Thanks!

@gaborbernat
Copy link
Contributor

Closing due to inactivity.

@pfmoore pfmoore deleted the bat_encoding branch September 26, 2020 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants