Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc_codegen_llvm: give names to non-alloca variable values. #64149

Merged
merged 1 commit into from
Sep 7, 2019

Conversation

eddyb
Copy link
Member

@eddyb eddyb commented Sep 4, 2019

These names only matter when looking at LLVM IR, but they can help.

When one value is used for multiple variables, I decided to combine the names.
I chose , as a separator but maybe = or (space) are more appropriate.
(LLVM names can contain any characters - if necessary they end up having quotes)

As an example, this function:

#[no_mangle]
pub fn test(a: u32, b: u32) -> u32 {
    let c = a + b;
    let d = c;
    let e = d * a;
    e
}

Used to produce this LLVM IR:

define i32 @test(i32 %a, i32 %b) unnamed_addr #0 {
start:
  %0 = add i32 %a, %b
  %1 = mul i32 %0, %a
  ret i32 %1
}

But after this PR you get this:

define i32 @test(i32 %a, i32 %b) unnamed_addr #0 {
start:
  %"c,d" = add i32 %a, %b
  %e = mul i32 %"c,d", %a
  ret i32 %e
}

cc @nagisa @rkruppe

@rust-highfive
Copy link
Collaborator

r? @zackmdavis

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 4, 2019
@hanna-kruppe
Copy link
Contributor

I'm excited to have fewer unnamed values. However, I'm not sure how I feel about squashing all names together in a (potentially) long string.

While I can see it being sometimes valuable to immediately see that e.g. two source level variables are mapped to the same SSA values, it could also be irrelevant noise. At the same time, LLVM optimizations won't even try to do something similar (they'll just pick one of the multiple equal values, and not even necessarily preserve the name if transforming the instructions at all). While this limits the noise, it also means the "ah, these two variables got combined into one" effect is much less reliable when you look at IR after some LLVM optimizations, which in my experience is >95% of the time.

I think it would be good enough if we kept using only one name, at least if together with other changes to ensure it is a meaningful name (i.e., avoid nonsense like the arg0 in that one test), ideally one coming from the source code where possible.

@nagisa
Copy link
Member

nagisa commented Sep 4, 2019

I’m ambivalent. If this is helping @eddyb to do their debugging, I’m all for making this happen. I can always use fewer-names or whatever the argument is called to clean the IR up.

@eddyb
Copy link
Member Author

eddyb commented Sep 4, 2019

The reason I did this wasn't so much about reading LLVM IR, tbh, but rather I wanted to try out making something debuginfo-related more uniform (value names, in this case) between arguments and other variables.

I suppose a more interesting thing to do here would be to add llvm.dbg.value calls, but I want to do that after #56231 (whereas what I'm working on now should land before #56231, to make it easier to move to the new representation).

If this is undesirable, I can probably continue without, or maybe just remove the second commit.

@hanna-kruppe
Copy link
Contributor

The first commit is good IMO, both for IR readability and for debuginfo. If nobody cares strongly for the second commit then let's just drop the second one and merge the first. (Well, and probably rewrite the FIXME that presupposes we'll want to combine names).

Copy link
Contributor

@hanna-kruppe hanna-kruppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me except one possible nit, if the "arg or inst" check can be an assertion please make it one.

let old_name = unsafe {
CStr::from_ptr(llvm::LLVMGetValueName(value))
};
match old_name.to_str() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was gonna suggest writing this match as if old_name.is_empty() for readability (+ to avoid having to care about UTF-8) but actually CStr doesn't have that 🤷‍♀

FYI while annoyed about this I also noticed that we can use better APIs for touching LLVM value names (see #64223) but that's unrelated to this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes I forgot to bring that up! I also noticed those functions.

llvm::LLVMIsAInstruction(value).is_some()
};
if !param_or_inst {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first instinct would be to make this a bug!(), not a silent nop. Are there any callers (now or after more debuginfo work) that might good reason need to pass in something other than an instruction or argument?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yeah, Constants (which would be a nop), or, worse, GlobalValues (functions/statics), which would get their mangled names replaced.

That is, I'm pretty sure let x = foo as fn(); and let y = &STATIC; would both allow accidental renaming of foo's or STATIC's symbol names without this return.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. Well, silent nop it is then.

@hanna-kruppe
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Sep 6, 2019

📌 Commit eedf555 has been approved by rkruppe

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2019
Centril added a commit to Centril/rust that referenced this pull request Sep 7, 2019
rustc_codegen_llvm: give names to non-alloca variable values.

These names only matter when looking at LLVM IR, but they can help.

When one value is used for multiple variables, I decided to combine the names.
I chose `,` as a separator but maybe `=` or ` ` (space) are more appropriate.
(LLVM names can contain any characters - if necessary they end up having quotes)

As an example, this function:
```rust
#[no_mangle]
pub fn test(a: u32, b: u32) -> u32 {
    let c = a + b;
    let d = c;
    let e = d * a;
    e
}
```
Used to produce this LLVM IR:
```llvm
define i32 @test(i32 %a, i32 %b) unnamed_addr #0 {
start:
  %0 = add i32 %a, %b
  %1 = mul i32 %0, %a
  ret i32 %1
}
```
But after this PR you get this:
```llvm
define i32 @test(i32 %a, i32 %b) unnamed_addr #0 {
start:
  %"c,d" = add i32 %a, %b
  %e = mul i32 %"c,d", %a
  ret i32 %e
}
```

cc @nagisa @rkruppe
bors added a commit that referenced this pull request Sep 7, 2019
Rollup of 10 pull requests

Successful merges:

 - #63919 (Use hygiene for AST passes)
 - #63927 (Filter linkcheck spurious failure)
 - #64149 (rustc_codegen_llvm: give names to non-alloca variable values.)
 - #64192 (Bail out when encountering likely missing turbofish in parser)
 - #64231 (Move the HIR CFG to `rustc_ast_borrowck`)
 - #64233 (Correct pluralisation of various diagnostic messages)
 - #64236 (reduce visibility)
 - #64240 (Include compiler-rt in the source tarball)
 - #64241 ([doc] Added more prereqs and note about default directory)
 - #64243 (Move injection of attributes from command line to `libsyntax_ext`)

Failed merges:

r? @ghost
@bors bors merged commit eedf555 into rust-lang:master Sep 7, 2019
@eddyb eddyb deleted the llvm-var-names branch September 11, 2019 03:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants