Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Spans from HIR -- 1/N -- Span collection #72878

Closed
wants to merge 10 commits into from

Conversation

cjgillot
Copy link
Contributor

@cjgillot cjgillot commented Jun 1, 2020

First step in rust-lang/compiler-team#294
Split out of #72015

This PR collects the spans in the AST when lowering it to HIR.
This collection constructs a HirId -> Span map as each HirId is assigned.
This map is rendered accessible to the world through a query,
and replaces the implementation of tcx.hir().span(...).

This PR needs a perf run.

r? @pnkfelix

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 1, 2020
@jonas-schievink
Copy link
Contributor

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Jun 1, 2020

⌛ Trying commit 547e5b78d962249e8b4f7152b5878a84eee00acf with merge 93a501ef44530cdc593a276b6b29f34bf28ad520...

@bors
Copy link
Contributor

bors commented Jun 1, 2020

☀️ Try build successful - checks-azure
Build commit: 93a501ef44530cdc593a276b6b29f34bf28ad520 (93a501ef44530cdc593a276b6b29f34bf28ad520)

@rust-timer
Copy link
Collaborator

Queued 93a501ef44530cdc593a276b6b29f34bf28ad520 with parent b85e3fe, future comparison URL.

@@ -195,7 +200,7 @@ impl<'a, 'hir> NodeCollector<'a, 'hir> {
(self.map, svh)
}

fn insert_entry(&mut self, id: HirId, entry: Entry<'hir>, hash: Fingerprint) {
fn insert_entry(&mut self, span: Span, id: HirId, entry: Entry<'hir>, hash: Fingerprint) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. Why does the span come before the id in this API? The id is the "key" in the map, right? So I would have the span come at some point after it in the parameter list, just like the other "payloads"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the API from the insert function. This span parameter is only used to assert the consistency of the span map. It will be removed later.

@@ -648,14 +668,14 @@ impl<'a, 'hir> LoweringContext<'a, 'hir> {
})
}

fn next_id(&mut self) -> hir::HirId {
fn next_id(&mut self, span: Span) -> hir::HirId {
Copy link
Member

@pnkfelix pnkfelix Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to decide whether it is not intuitive for fn next_id to take a span as a parameter.

I think I understand the intent you have here: Every call to next_id is matched up with some portion of the HIR that is being generated via the lowering of the AST, and each such portion of the HIR is meant to always have a span associated with it. So in that sense, it makes sense to pass along the span so that you can enforce the invariant that all HIR entries have a span.

But at the same time, when I read the invocations written as self.next_id(span), my own intuitive reading is "this is generating an id from the span somehow" (which is certainly not what is happening).

I'll keep musing on this. In the meantime, I would suggest you add a doc-comment to fn next_id saying what the span parameter is. (I also played with the idea of suggesting renaming the method, but I wasn't really happy with any of the alternatives I came up with...)

@pnkfelix
Copy link
Member

pnkfelix commented Jun 1, 2020

I've looked at the first two commits, and they lead me to one overall question:

The end goal here is to refactoring the data structure to remove a field from a set of structures and replace that field with a mapping elsewhere. Normally I'm all for trying to work incrementally and making each commit as small as possible. In this case, it seems like you have made one PR dedicated to adding the necessary mapping, but not removing the span field yet.

But when it comes to something like this, where it might make sense, even in terms of simplifying the review, to do both the removal and the addition in one step? This would presumably make it trivial to confirm "yes, we clearly have added all the instances of the mapping where we need it, because every spot where the field access was removed, there is a corresponding access to the new mapping." ?

Or does that make the PR so huge as to make the review process unmanageable?


Another reason to do both in one PR is that it will produce more realistic end numbers for the perf runs. Right now, this PR's version of the compiler is going to pay in memory usage to maintain both the separate mapping of HirID to Span, as well as the spans embedded in the HIR itself. So that means some amount of wasted memory due to that redundancy, and that presumably will cause some amount of a speed hit that will go away later when the embedded span fields are removed. It would be potentially better to couple both those changes into one PR so that we get a more accurate idea of the real end effect on performance.

@pnkfelix pnkfelix added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 1, 2020
@pnkfelix
Copy link
Member

pnkfelix commented Jun 1, 2020

(The question in my previous comment is important enough, in terms of it implying a big revision to this PR, that I'm pausing my review to wait for @cjgillot to answer the question. Thus I changed the S-waiting tags accordingly.)

@cjgillot
Copy link
Contributor Author

cjgillot commented Jun 1, 2020

During the development of this PR, I found easiest to do the collection first, put some assertions I did not change the behaviour, and then proceed with each HIR type one by one.

The current end state is at #72015. That PR is already quite large (+3200, -2700). Furthermore, it does not remove all spans yet: HIR types that do not have a HirId are in progress.

The goal splitting is also to manage the effect on performance. I expect the reduction in invalidation to happen very late. In the mean time, using an exterior map gated behind a query will create slowdown. Those intermediate slowdowns deserve investigation. Otherwise, I would just be pushing the problem into another data structure.

Does this make sense to you @pnkfelix? How would you rather proceed?

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 93a501ef44530cdc593a276b6b29f34bf28ad520, comparison URL.

@pnkfelix
Copy link
Member

pnkfelix commented Jun 4, 2020

@cjgillot okay, I'll trust your judgement about how to structure the changes.

@pnkfelix pnkfelix added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 4, 2020
),
VariantData::Unit(id) => hir::VariantData::Unit(self.lower_node_id(id, DUMMY_SP)),
VariantData::Unit(id) => hir::VariantData::Unit(self.lower_node_id(id, span)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These fixes on lines 722 and 724 seem important, or at least worth considering trying to land on their own in a separate PR?

That is, it seems like this could cause an interesting change to the diagnostic output in some cases, and it would be good to vet that change independently of the others changes in this PR, which I had hoped would be a refactoring....

(I understand the temptation to fold fixes like this in as a drive-by change, especially since I'm not sure how best to make this change without making the other changes to e.g. thread the span argument around. But I'm worried about coupling it with such a big change overall, especially one that we might choose to revert later on as we gather more experience with the change...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behaviour does not change here. This choice of span for the VariantData is the one already implemented in hir::map::NodeCollector::visit_variant.

_ => {}
};
let stored_span = self.spans.entry(hir_id).or_insert(span);
if *stored_span == DUMMY_SP {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep using is_dummy() here? (And !is_dummy() below?)

I don't have a preference personally, but I also wonder when I see a change from an existing code style like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this particular instance of comparison with DUMMY_SP was removed in a follow-up commit. But the question still stands over all, since the PR in general does inject more comparisons with DUMMY_SP, for better or for worse.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases, the HIR owners are assigned a HirId before the AST traversal. When this HirId is created, we do not have access to the proper Span. I use DUMMY_SP as a placeholder for delayed initialisation. I will add a comment about that.

@pnkfelix
Copy link
Member

pnkfelix commented Jun 5, 2020

Okay, I've left my feedback. I think this all seems relatively fine; I'm not 100% sure if the ordering of the changes is ideal (e.g. would it be better to switch to an IndexVec earlier in the commit series?)

The biggest issue I have is that I was expecting this to be a pure-refactoring, but there are places where dummy spans are being replaced with spans that are newly threaded through.

I personally advise that we try to avoid coupling large refactorings with drive-by enduser-visible improvements like that. I would prefer to land the improvements in a separate PR.

(Of course, with the bors turn-around time the way it is, separate PR's may end up coupled together anyway in a rollup. But I'd prefer that decision be made at rollup time, rather than at PR authorship time, unless there is some strong reason that the drive-by improvements be coupled with the refactoring.)

((And its also possible that I am wrong that the drive-by improvements are end-user visible? The fact that none of the ui tests needed updated is a hint that the drive-by improvements don't end up in any diagnostics? Or it could just be a sign of weakness in our ui testing...))

@cjgillot
Copy link
Contributor Author

cjgillot commented Jun 7, 2020

I re-shuffled the commits to avoid the back and forth with the modifications. The only differences are a few comments.

@cjgillot cjgillot force-pushed the nospan-lowering branch 2 times, most recently from 2840a46 to 90d802e Compare June 7, 2020 13:34
@cjgillot
Copy link
Contributor Author

cjgillot commented Jan 6, 2021

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@bors
Copy link
Contributor

bors commented Jan 6, 2021

⌛ Trying commit 7b9999c with merge 434d20b9fd160c5b91331a8761accd9640a75aec...

@bors
Copy link
Contributor

bors commented Jan 6, 2021

☀️ Try build successful - checks-actions
Build commit: 434d20b9fd160c5b91331a8761accd9640a75aec (434d20b9fd160c5b91331a8761accd9640a75aec)

@rust-timer
Copy link
Collaborator

Queued 434d20b9fd160c5b91331a8761accd9640a75aec with parent 8fec6c7, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 6, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (434d20b9fd160c5b91331a8761accd9640a75aec): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 7, 2021
@bjorn3
Copy link
Member

bjorn3 commented Jan 7, 2021

Regressions of up to 5.6%.

@camelid camelid added A-HIR Area: The high-level intermediate representation (HIR) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 22, 2021
@Dylan-DPC-zz
Copy link

@cjgillot waiting on conflicts to be resolved

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 14, 2021
@crlf0710 crlf0710 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 5, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 22, 2021
@cjgillot cjgillot closed this Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-HIR Area: The high-level intermediate representation (HIR) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.