Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quote path settings containing # and spaces #1698

Closed
wants to merge 3 commits into from
Closed

Conversation

jayvdb
Copy link

@jayvdb jayvdb commented Oct 19, 2020

This is limited to settings of type "path".
While it is limited to # and (space), other characters which break shlex could be added.

Fixes #763 and #924

Contribution checklist:

(also see CONTRIBUTING.rst for details)

  • wrote descriptive pull request text
  • added/updated test(s)
  • updated/extended the documentation
  • added relevant issue keyword
    in message body
  • added news fragment in changelog folder
    • fragment name: <issue number>.<type>.rst for example (588.bugfix.rst)
    • <type> is must be one of bugfix, feature, deprecation,breaking, doc, misc
    • if PR has no issue: consider creating one first or change it to the PR number after creating the PR
    • "sign" fragment with "by :user:<your username>"
    • please use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files - by :user:superuser."
    • also see examples
  • added yourself to CONTRIBUTORS (preserving alphabetical order)

@@ -390,7 +391,7 @@ def get(self, name, default=None):
return os.environ.get(name, default)
self._lookupstack.append(name)
try:
res = self.reader._replace(val)
res = self.reader._replace(val, unquote_path=False)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How quoted paths should appear in the setenv results is an aspect I havent explored yet, and would likely build on #1691

Despite stripping quotes as early as possible, I expect there are quite a few corner cases where the quotes are going to causes new complications, which is why I have ensured the quotes are only used when # is in the path, which already didnt work, so corner cases breaking isnt a regression. Adding to the quoted path chars is less clear, as that probably worked for many simple cases, and only failed in corner cases as seen at #924.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests for setting paths in setenv have been added.

Spaces in paths also added.

Merging #1691 first would allow for more exploration of possible edge cases using escaping.

@jayvdb jayvdb force-pushed the i763 branch 2 times, most recently from 7ce627e to 3ecdd5f Compare October 19, 2020 17:49
@jayvdb jayvdb changed the title Quote paths containing # Quote path settings containing # and spaces Oct 19, 2020
@@ -1895,7 +1915,7 @@ def getargvlist(cls, reader, value, replace=True):
current_command += line

if is_section_substitution(current_command):
replaced = reader._replace(current_command, crossonly=True)
replaced = reader._replace(current_command, crossonly=True, unquote_path=False)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unquote_path=False is added liberally to aid reviewing, so all invocations of _replace(..) can be seen.
IMO they should removed before being merged.

new_word = reader._replace(word, unquote_path=False)
new_word = reader._replace(new_word, unquote_path=False)
if not has_dual_quote:
new_word = new_word.replace("''", "'")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has the potential for changing the behaviour of any existing tox.ini which used quotes and also had quotes inserted via substitution. The most obvious case would be an env var of ' and commands using the envvar and quotes. Then, does stripping dual quotes change the behaviour of those previous usages? In theory, this removal would be simply merging quoted strings. Need to think more about possible breakages.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this simple replace would break \\''. Easy fixed with a re.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\\'' now fixed. No test coverage yet. Other obscure oddities may still exist.

@@ -1864,6 +1878,12 @@ def _replace_substitution(self, match):
val = self._substitute_from_other_section(sub_key)
if callable(val):
val = val()
if isinstance(val, PathBase):
val = str(val)
# XXX handle ' and " in paths
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has partially been done.
I was thinking of switching quoting char to avoid clashes with '. The quoting could even be done with an invalid filename char (e.g. non-printables), as long as it makes it through the regex engine.

argv = conf.commands
assert argv[0] == ["cmd1", "hello"]

if sys.platform != "win32":
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting windows specific behaviour is happening here, cutting the dir at #. May be a bug in test harness code, given that the next stanza is OK.

@jayvdb jayvdb force-pushed the i763 branch 3 times, most recently from 1a9f5ac to 266b1ab Compare October 20, 2020 07:21
@@ -0,0 +1 @@
Allow {/} to refer to os.sep. - by :user:`jayvdb`
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is #1701

Without this, the tests need to do hacks to convert / to \ on Windows. It gets messy because the bugs related to this are about real native paths, including constructed path based on preset paths.

Copy link
Member

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rebase this?

@jayvdb
Copy link
Author

jayvdb commented Oct 21, 2020

Sure, but now that #1691 is merged (thx), I'd like to also explore how all of these chars appear in the commands subprocess runtime environment variables.
Also now worth verifying that { in paths doesnt do anything odd.

@jayvdb
Copy link
Author

jayvdb commented Oct 23, 2020

#1713 solves many of these issues, in as much as that type=path args pass through without escaping except anything inside {..}. There are a few tests from this PR which were omitted, related to round tripping paths through setenv, but it feels it can solve the underlying issues much cleaner and clearer (and thus easier to document), but will have some problems unsolved.

@gaborbernat
Copy link
Member

@jayvdb this is merge conflicting.

@gaborbernat
Copy link
Member

I'll close this for now until you can attend to the PR again. Once you've done we can reopen.

@gaborbernat gaborbernat closed this Nov 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tox fails when running in a path containing a hash
2 participants