Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option for bypassing commit filtering for asv publish. #748

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

germanjoey
Copy link

Hello!

My team's been using asv for our project for the last couple of months and we really love it. However, we've noticed a problem where our graphs would occasionally lose all of their past data despite the results files still remaining present. Some investigation revealed the culprit: asv publish was filtering out benchmarks from commits that didn't appear via git rev-parse --first-parent branch. Our repo has a structure such that a condition like that doesn't make sense for us, so we'd rather just have all benchmarks shown.

As for the option's name, I went with "no-commit-filtering" to maintain the same naming scheme as "no-pull", but am happy to change it to whatever y'all think makes sense.

Joe Ryan added 4 commits September 30, 2018 21:05
The "no-commit-filtering" option will ensure that every commit that has a
corresponding benchmark will appear in the output graphs, instead of just
commits that are reachable via "git rev-parse --first-parent" from one of
the listed branches.
… repo.

The commit structure is from test_repo.py, which also has a nice diagram of it.
@pv
Copy link
Collaborator

pv commented Oct 3, 2018

The concept here is more about history "linearization" than "filtering". The graphs and analysis applied to them requires a linear ordering of commits, and there's probably no one best way to do this with significantly branched history.

I guess what you intend here is having Git.get_branch_commits run without the --first-parent flag --- instead of putting all commits to all branches as this PR seems to do now (which doesn't make sense to me)? Or is the repository without long-lived branches?

Ideally, I'd like to see a bit more thought out proposal here with some room left for the different ways one might like to do the linearization.

@germanjoey
Copy link
Author

Hello, I'm looking at the AppVeyor test failure but I don't quite understand it. It doesn't look like it's related to the code in this PR... can anyone confirm this? If there's some problem with the PR that I need to address, I'd like to get it taken care of. :)

@pv
Copy link
Collaborator

pv commented Oct 3, 2018

The appveyor one looks like unrelated maybe temporary conda failure, I don't understand it either.

@germanjoey
Copy link
Author

I guess what you intend here is having Git.get_branch_commits run without the --first-parent flag --- instead of putting all commits to all branches as this PR seems to do now (which doesn't make sense to me)?

Yeah, that's essentially what our goal was. Our thought was that "if we have a benchmark, we want it plotted" because we only had nightly benchmarks run against one particular branch. However, as you say, if a project had benchmarks on more than one branch, our feature certainly wouldn't be very useful!

Ideally, I'd like to see a bit more thought out proposal here with some room left for the different ways one might like to do the linearization.

Thinking about this issue a bit more after reading your comment, it does sound like the best way to address the problem of "what to plot when a project has various benchmark results in a highly branched repo" is to change the behavior of "get_hashes_from_range" and then git/mercurial deal with it.

I notice that both asv/plugins/git.py and asv/plugins/mercurial.pyhave hardcoded default arguments;--first-parentfor git, and{0}, -revfor mercurial. So, perhaps we could change the--no-commit-filteringflag to a--commit-filterargument that defaults toNone. For each branch in conf.branches, asv publish` would then:

a.) if --commit-filter is not specified, the respective defaults for each repo are used
b.) otherwise, the value for --commit-filter is used instead.

If the loading results stage was split into two parts, so that commits could be reordered according to what was returned by get_branch_commits, this would even address the ordering problem that you mentioned as the user would then be able to change the sort order according to what flags were passed. (e.g. by passing --date-order or --author-date-order or --topo-order to git rev-list`).

What do you think?

@pv
Copy link
Collaborator

pv commented Oct 13, 2018

Yes, I'd prefer to do it in get_branch_commits or get_hashes_from_range.

As for the user interface, I'd prefer a flag along the lines of --commit-history=first-parent/date-order, ... --- or a config file option, as I think for a given repository you don't need to change the value of the option very often.

PR #751 seems to be trying to do the same thing as this...

@pv pv added the needs-work The PR cannot be merged as is, further work required (explained in comments) label Oct 31, 2018
@trexfeathers
Copy link

It is clearly a difficult problem to provide flexibility while ensuring a linear commit history. But what about relying on the user to provide arguments resulting in a linear history, with a warning in the API?

asv run

  --commit-history ARGS
                        Use this string of `git rev-list` arguments when deriving
                        the commit sequence from `range`. MUST result in a
                        LINEAR commit sequence, otherwise unpredictable
                        behaviour will follow.  Defaults to "--first-parent".

asv publish

  --commit-history ARGS
                        Use this string of `git rev-list` arguments when deriving
                        the commit sequence from `range`, and when finding
                        commits belonging to a published branch. MUST
                        result in a LINEAR commit sequence, otherwise
                        unpredictable behaviour will follow.  Defaults to
                        "--first-parent".

There may be other relevant asv ... commands too, I'm not sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-work The PR cannot be merged as is, further work required (explained in comments)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants