Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix space in graph line string simpler approach #2644

Conversation

ColemanTom
Copy link
Contributor

@ColemanTom ColemanTom commented Apr 30, 2018

This fixes #2534. It is a much simpler and less strict approach than was used in #2628 (which can be closed in favour of this one if preferred). The simplest approach of looking for TaskID.NAME_RE does not work because you can have hyphens with numbers after them quite reasonably, at least in a parameter. I couldn't think of a clean way to only have this checkout outside of parameter expansion, or to ignore triggers like succeed-all, so it hits them all. Thus, why I had to put in extra regex to ignore -[0-9] with spaces around the minus sign, those are obviously fine to have happen.

I will note though, that because of the approach described above, if something like task -23 or task + 23 is done, it won't catch it, and it will just pass. But, given a task name must start with a \w, it is probably not too bad, and less of a problem than currently exists. I don't have this failure mode covered in the tests.

I'll also admit here, I don't actually have myself setup to run the full test battery, so I've only run these unit tests and will be relying on Travis to point out any problems.

@ColemanTom
Copy link
Contributor Author

Out of curiousity, what is Codacy using for its review? pycodestyle doesn't show the issue it mentions for me. Should I just make the test method that it shows up an @classmethod?

@hjoliver
Copy link
Member

If you expand the issue identified in Codacy, it explains more about it, and right at the bottom it says PyLint is the culprit. Yes, looks like you can just make it a class method.

@hjoliver
Copy link
Member

Travis CI is fine for the tests in this case.

@hjoliver
Copy link
Member

My initial feeling - without a close look yet - is I prefer this simpler PR on grounds of clarity and maintainability.

@ColemanTom ColemanTom force-pushed the fix_space_in_graph_line_string_simpler_approach branch from 49ab749 to 7854c5c Compare April 30, 2018 03:31
@ColemanTom
Copy link
Contributor Author

I agree. The other one was not maintainable. But, it was very interesting for me to make. I definitely learnt a bit about regex.

I have updated with @classmethod and am waiting to see the travis-ci results.

@ColemanTom ColemanTom force-pushed the fix_space_in_graph_line_string_simpler_approach branch from 7854c5c to eff6b3c Compare April 30, 2018 04:39
@ColemanTom
Copy link
Contributor Author

I'm confused about the pycodestyle failure in the travis ci process. From what I can see looking at the suggested approaches for indentation, it should work? When I run pycodestyle on my local it doesn't warn about anything either.

@matthewrmshin matthewrmshin added this to the next release milestone Apr 30, 2018
@matthewrmshin matthewrmshin added the bug Something is wrong :( label Apr 30, 2018
@matthewrmshin
Copy link
Contributor

(I don't understand the Travis CI failure either - pycodestyle on it may be a bit broken again. I have just restarted it one more time to see if it happens again.)

@matthewrmshin
Copy link
Contributor

All looking good again.

Copy link
Contributor

@matthewrmshin matthewrmshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. Otherwise looking good. Thanks.

# Apparently this is the fastest way to strip all whitespace!:
line = "".join(line.split())
if not line:
modifiedLine = self.__class__.REC_COMMENT.sub('', line)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -270,6 +290,18 @@ def parse_graph(self, graph_string):
# If debugging, print the final result here:
# self.print_triggers()

@classmethod
def _report_invalid_lines(cls, lines):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add doc string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I had sent this question already, but apparently I didn't.

Is there a preferred pydoc style? reST? Google? Numpy? I think I've seen Google a few times, but this file I don't think matches any of the standard ones?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We follow PEP8 for style. The docstring section https://www.python.org/dev/peps/pep-0008/#documentation-strings links out to PEP257. However, we're not very consistent on exact docstring format (blanks lines before and after final triple-quotes for example) - if you write one that looks more or less like others in graph_parser.py that'll be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've put in something.

(?!\s*[\-+]\s*[0-9]) # allow spaces before/after -+ if numbers follow
\s+ # do not allow 'task<space>task'
{task_suf}
'''.format(task=TaskID.NAME_RE, task_suf=TaskID.NAME_SUFFIX_RE), re.X)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether it is worth using .format here, which may complicate things.

@hjoliver
Copy link
Member

hjoliver commented May 1, 2018

Note we might want to go with #2628 instead of this, for #2646 (comment)

Currently, the graph parsing will not raise an error when there are
spaces between task names, as spaces get stripped out
`foo => bar baz` becomes `foo => barbaz`.
It can be captured if that task does not exist, but if there was a
task with that name, it could lead to some odd results.

This change takes a simple approach to stop the space separated task
validation problems, and also to prevent bad spaces in parameters and
around triggers. It will not capture all potential validation problems,
some of which are covered by later validation checks.

It was not possible to just do task names and to ignore paramters and
triggers, as they generally follow the same format.
@ColemanTom ColemanTom force-pushed the fix_space_in_graph_line_string_simpler_approach branch from eaf63f4 to 912074c Compare May 3, 2018 00:59
Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

@hjoliver hjoliver merged commit 2259df5 into cylc:master May 3, 2018
@ColemanTom ColemanTom deleted the fix_space_in_graph_line_string_simpler_approach branch March 12, 2019 04:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :(
Projects
None yet
Development

Successfully merging this pull request may close these issues.

graph parsing: space removal bug
3 participants