Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add #[inline] modifier to TypeId::of #92135

Merged
merged 1 commit into from
Dec 24, 2021
Merged

Add #[inline] modifier to TypeId::of #92135

merged 1 commit into from
Dec 24, 2021

Conversation

AngelicosPhosphoros
Copy link
Contributor

It was already inlined but it happened only in 4th InlinerPass on my testcase.
With #[inline] modifier it happens on 2nd pass.

Closes #74362

It was already inlined but it happened only in 4th InlinerPass on my testcase.
With `#[inline]` modifier it happens on 2nd pass.

Closes #74362
@rust-highfive
Copy link
Collaborator

r? @yaahc

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 20, 2021
@AngelicosPhosphoros
Copy link
Contributor Author

AngelicosPhosphoros commented Dec 20, 2021

I tested using this code.

code

I really have macros which generates something similar, btw. I use | operator because it is faster on debug builds.

use std::any::TypeId;

struct A0;
struct A1;
struct A2;
struct A3;
struct A4;
struct A5;
struct A6;
struct A7;
struct A8;
struct A9;
struct A10;
struct A11;
struct A12;
struct A13;
struct A14;
struct A15;
struct A16;
struct A17;
struct A18;
struct A19;

pub fn is_unique_pub() -> bool {
    let t0 = TypeId::of::<A0>();
    let t1 = TypeId::of::<A1>();
    let t2 = TypeId::of::<A2>();
    let t3 = TypeId::of::<A3>();
    let t4 = TypeId::of::<A4>();
    let t5 = TypeId::of::<A5>();
    let t6 = TypeId::of::<A6>();
    let t7 = TypeId::of::<A7>();
    let t8 = TypeId::of::<A8>();
    let t9 = TypeId::of::<A9>();
    let t10 = TypeId::of::<A10>();
    let t11 = TypeId::of::<A11>();
    let t12 = TypeId::of::<A12>();
    let t13 = TypeId::of::<A13>();
    let t14 = TypeId::of::<A14>();
    let t15 = TypeId::of::<A15>();
    let t16 = TypeId::of::<A16>();
    let t17 = TypeId::of::<A17>();
    let t18 = TypeId::of::<A18>();
    let t19 = TypeId::of::<A19>();

    let b1 = t1 == t0;
    let b2 = (t2 == t0) | (t2 == t1);
    let b3 = (t3 == t0) | (t3 == t1) | (t3 == t2);
    let b4 = (t4 == t0) | (t4 == t1) | (t4 == t2) | (t4 == t3);
    let b5 = (t5 == t0) | (t5 == t1) | (t5 == t2) | (t5 == t3) | (t5 == t4);
    let b6 = (t6 == t0) | (t6 == t1) | (t6 == t2) | (t6 == t3) | (t6 == t4) | (t6 == t5);
    let b7 =
        (t7 == t0) | (t7 == t1) | (t7 == t2) | (t7 == t3) | (t7 == t4) | (t7 == t5) | (t7 == t6);
    let b8 = (t8 == t0)
        | (t8 == t1)
        | (t8 == t2)
        | (t8 == t3)
        | (t8 == t4)
        | (t8 == t5)
        | (t8 == t6)
        | (t8 == t7);
    let b9 = (t9 == t0)
        | (t9 == t1)
        | (t9 == t2)
        | (t9 == t3)
        | (t9 == t4)
        | (t9 == t5)
        | (t9 == t6)
        | (t9 == t7)
        | (t9 == t8);
    let b10 = (t10 == t0)
        | (t10 == t1)
        | (t10 == t2)
        | (t10 == t3)
        | (t10 == t4)
        | (t10 == t5)
        | (t10 == t6)
        | (t10 == t7)
        | (t10 == t8)
        | (t10 == t9);
    let b11 = (t11 == t0)
        | (t11 == t1)
        | (t11 == t2)
        | (t11 == t3)
        | (t11 == t4)
        | (t11 == t5)
        | (t11 == t6)
        | (t11 == t7)
        | (t11 == t8)
        | (t11 == t9)
        | (t11 == t10);
    let b12 = (t12 == t0)
        | (t12 == t1)
        | (t12 == t2)
        | (t12 == t3)
        | (t12 == t4)
        | (t12 == t5)
        | (t12 == t6)
        | (t12 == t7)
        | (t12 == t8)
        | (t12 == t9)
        | (t12 == t10)
        | (t12 == t11);
    let b13 = (t13 == t0)
        | (t13 == t1)
        | (t13 == t2)
        | (t13 == t3)
        | (t13 == t4)
        | (t13 == t5)
        | (t13 == t6)
        | (t13 == t7)
        | (t13 == t8)
        | (t13 == t9)
        | (t13 == t10)
        | (t13 == t11)
        | (t13 == t12);
    let b14 = (t14 == t0)
        | (t14 == t1)
        | (t14 == t2)
        | (t14 == t3)
        | (t14 == t4)
        | (t14 == t5)
        | (t14 == t6)
        | (t14 == t7)
        | (t14 == t8)
        | (t14 == t9)
        | (t14 == t10)
        | (t14 == t11)
        | (t14 == t12)
        | (t14 == t13);
    let b15 = (t15 == t0)
        | (t15 == t1)
        | (t15 == t2)
        | (t15 == t3)
        | (t15 == t4)
        | (t15 == t5)
        | (t15 == t6)
        | (t15 == t7)
        | (t15 == t8)
        | (t15 == t9)
        | (t15 == t10)
        | (t15 == t11)
        | (t15 == t12)
        | (t15 == t13)
        | (t15 == t14);
    let b16 = (t16 == t0)
        | (t16 == t1)
        | (t16 == t2)
        | (t16 == t3)
        | (t16 == t4)
        | (t16 == t5)
        | (t16 == t6)
        | (t16 == t7)
        | (t16 == t8)
        | (t16 == t9)
        | (t16 == t10)
        | (t16 == t11)
        | (t16 == t12)
        | (t16 == t13)
        | (t16 == t14)
        | (t16 == t15);
    let b17 = (t17 == t0)
        | (t17 == t1)
        | (t17 == t2)
        | (t17 == t3)
        | (t17 == t4)
        | (t17 == t5)
        | (t17 == t6)
        | (t17 == t7)
        | (t17 == t8)
        | (t17 == t9)
        | (t17 == t10)
        | (t17 == t11)
        | (t17 == t12)
        | (t17 == t13)
        | (t17 == t14)
        | (t17 == t15)
        | (t17 == t16);
    let b18 = (t18 == t0)
        | (t18 == t1)
        | (t18 == t2)
        | (t18 == t3)
        | (t18 == t4)
        | (t18 == t5)
        | (t18 == t6)
        | (t18 == t7)
        | (t18 == t8)
        | (t18 == t9)
        | (t18 == t10)
        | (t18 == t11)
        | (t18 == t12)
        | (t18 == t13)
        | (t18 == t14)
        | (t18 == t15)
        | (t18 == t16)
        | (t18 == t17);
    let b19 = (t19 == t0)
        | (t19 == t1)
        | (t19 == t2)
        | (t19 == t3)
        | (t19 == t4)
        | (t19 == t5)
        | (t19 == t6)
        | (t19 == t7)
        | (t19 == t8)
        | (t19 == t9)
        | (t19 == t10)
        | (t19 == t11)
        | (t19 == t12)
        | (t19 == t13)
        | (t19 == t14)
        | (t19 == t15)
        | (t19 == t16)
        | (t19 == t17)
        | (t19 == t18);
    b1 | b2
        | b3
        | b4
        | b5
        | b6
        | b7
        | b8
        | b9
        | b10
        | b11
        | b12
        | b13
        | b14
        | b15
        | b16
        | b17
        | b18
        | b19
}

I generated log of compilation using command rustc +stage1 check_type_id.rs --crate-type=rlib -Copt-level=3 -Cllvm-args='-print-after-all' -Z no-parallel-llvm 2>old.ll.

Old log contains 98537 lines. New log (where TypeId::of inlined earlier) contains only 35163 lines.
I think, more early inlining would allow compiler catch more opportunities to optimize code.

You can see my log in attachment.

compilation_log.zip

I don't expect any regresses because this function was already inlined anyway.

@joshtriplett
Copy link
Member

Seems reasonable. This might even be a good candidate for inline(always).

@bors d+

@AngelicosPhosphoros
Copy link
Contributor Author

AngelicosPhosphoros commented Dec 24, 2021

@joshtriplett

This might even be a good candidate for inline(always).

After NewPM landing, #[inline] wouldn't have any difference with #[inline(always)] so I think it is not needed.

@bors d+

Didn't you mean r+?

@dtolnay
Copy link
Member

dtolnay commented Dec 24, 2021

@bors r+

@bors
Copy link
Contributor

bors commented Dec 24, 2021

📌 Commit 756d163 has been approved by dtolnay

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 24, 2021
@bors
Copy link
Contributor

bors commented Dec 24, 2021

⌛ Testing commit 756d163 with merge 3236b66769132290b3e35ed5440aa57aa14fb342...

@rust-log-analyzer
Copy link
Collaborator

The job x86_64-msvc-2 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
9 note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
10 


The actual run.stderr differed from the expected run.stderr.
Actual run.stderr saved to D:\a\rust\rust\build\x86_64-pc-windows-msvc\test\ui\panics\panic-short-backtrace-windows-x86_64\panic-short-backtrace-windows-x86_64.run.stderr
error: 1 errors occurred comparing run output.
status: exit code: 101
status: exit code: 101
command: PATH="D:\a\rust\rust\build\x86_64-pc-windows-msvc\stage2\lib\rustlib\x86_64-pc-windows-msvc\lib;C:\Program Files (x86)\Windows Kits\10\bin\x64;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22000.0\x64;C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64;C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64;D:\a\rust\rust\build\x86_64-pc-windows-msvc\stage0-bootstrap-tools\x86_64-pc-windows-msvc\release\deps;D:\a\rust\rust\build\x86_64-pc-windows-msvc\stage0\bin;D:\a\rust\rust\ninja;D:\a\rust\rust\msys2\mingw64\bin;C:\hostedtoolcache\windows\Python\3.10.1\x64\Scripts;C:\hostedtoolcache\windows\Python\3.10.1\x64;C:\msys64\usr\bin;D:\a\rust\rust\sccache;C:\Program Files\MongoDB\Server\5.0\bin;C:\aliyun-cli;C:\vcpkg;C:\cf-cli;C:\Program Files (x86)\NSIS;C:\tools\zstd;C:\Program Files\Mercurial;C:\hostedtoolcache\windows\stack\2.7.3\x64;C:\cabal\bin;C:\ghcup\bin;C:\tools\ghc-9.2.1\bin;C:\Program Files\dotnet;C:\mysql\bin;C:\Program Files\R\R-4.1.2\bin\x64;C:\SeleniumWebDrivers\GeckoDriver;C:\Program Files (x86)\sbt\bin;C:\Program Files (x86)\GitHub CLI;C:\Program Files\Git\bin;C:\Program Files (x86)\pipx_bin;C:\hostedtoolcache\windows\go\1.15.15\x64\bin;C:\hostedtoolcache\windows\Python\3.7.9\x64\Scripts;C:\hostedtoolcache\windows\Python\3.7.9\x64;C:\hostedtoolcache\windows\Ruby\2.5.9\x64\bin;C:\tools\kotlinc\bin;C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\8.0.312-7\x64\bin;C:\npm\prefix;C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\wbin;C:\ProgramData\kind;C:\Program Files\Eclipse Foundation\jdk-8.0.302.8-hotspot\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\ProgramData\Chocolatey\bin;C:\Program Files\Docker;C:\Program Files\PowerShell\7;C:\Program Files\Microsoft\Web Platform Installer;C:\Program Files\dotnet;C:\Program Files\Microsoft SQL Server\130\Tools\Binn;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit;C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\130\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\150\DTS\Binn;C:\Program Files\nodejs;C:\Program Files\OpenSSL\bin;C:\Strawberry\c\bin;C:\Strawberry\perl\site\bin;C:\Strawberry\perl\bin;C:\ProgramData\chocolatey\lib\pulumi\tools\Pulumi\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files\CMake\bin;C:\ProgramData\chocolatey\lib\maven\apache-maven-3.8.4\bin;C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code;C:\Program Files\Microsoft SDKs\Service Fabric\Tools\ServiceFabricLocalClusterManager;C:\Program Files\Git\cmd;C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin;C:\tools\php;C:\Program Files (x86)\sbt\bin;C:\SeleniumWebDrivers\ChromeDriver;C:\SeleniumWebDrivers\EdgeDriver;C:\Program Files\Amazon\AWSCLIV2;C:\Program Files\Amazon\SessionManagerPlugin\bin;C:\Program Files\Amazon\AWSSAMCLI\bin;C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\bin;C:\Program Files (x86)\Microsoft BizTalk Server;C:\Program Files\LLVM\bin;C:\Users\runneradmin\.dotnet\tools;C:\Users\runneradmin\.cargo\bin;C:\Users\runneradmin\AppData\Local\Microsoft\WindowsApps" "D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\test\\ui\\panics\\panic-short-backtrace-windows-x86_64\\a.exe"
------------------------------------------

------------------------------------------
stderr:
stderr:
------------------------------------------
thread 'main' panicked at 'd was called', D:\a\rust\rust\src/test\ui\panics\panic-short-backtrace-windows-x86_64.rs:48:5
   0: std::panicking::begin_panic
   1: d
   2: c
   3: b
---

Some tests failed in compiletest suite=ui mode=ui host=x86_64-pc-windows-msvc target=x86_64-pc-windows-msvc


command did not execute successfully: "D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\stage0-tools-bin\\compiletest.exe" "--compile-lib-path" "D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\stage2\\bin" "--run-lib-path" "D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\stage2\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "--rustc-path" "D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\stage2\\bin\\rustc.exe" "--src-base" "D:\\a\\rust\\rust\\src/test\\ui" "--build-base" "D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\test\\ui" "--stage-id" "stage2-x86_64-pc-windows-msvc" "--suite" "ui" "--mode" "ui" "--target" "x86_64-pc-windows-msvc" "--host" "x86_64-pc-windows-msvc" "--llvm-filecheck" "D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\llvm\\build\\bin\\FileCheck.exe" "--nodejs" "C:\\Program Files\\nodejs\\node" "--npm" "C:\\Program Files\\nodejs\\npm" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0  -Lnative=D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\native\\rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0  -Lnative=D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\native\\rust-test-helpers" "--docck-python" "C:\\hostedtoolcache\\windows\\Python\\3.10.1\\x64\\python3.exe" "--lldb-python" "C:\\hostedtoolcache\\windows\\Python\\3.10.1\\x64\\python3.exe" "--gdb" "C:\\msys64\\usr\\bin\\gdb" "--llvm-version" "13.0.0-rust-1.59.0-nightly" "--llvm-components" "aarch64 aarch64asmparser aarch64codegen aarch64desc aarch64disassembler aarch64info aarch64utils aggressiveinstcombine all all-targets analysis arm armasmparser armcodegen armdesc armdisassembler arminfo armutils asmparser asmprinter avr avrasmparser avrcodegen avrdesc avrdisassembler avrinfo binaryformat bitreader bitstreamreader bitwriter bpf bpfasmparser bpfcodegen bpfdesc bpfdisassembler bpfinfo cfguard codegen core coroutines coverage debuginfocodeview debuginfodwarf debuginfogsym debuginfomsf debuginfopdb demangle dlltooldriver dwarflinker dwp engine executionengine extensions filecheck frontendopenacc frontendopenmp fuzzmutate globalisel hexagon hexagonasmparser hexagoncodegen hexagondesc hexagondisassembler hexagoninfo instcombine instrumentation interfacestub interpreter ipo irreader jitlink libdriver lineeditor linker lto m68k m68kasmparser m68kcodegen m68kdesc m68kdisassembler m68kinfo mc mca mcdisassembler mcjit mcparser mips mipsasmparser mipscodegen mipsdesc mipsdisassembler mipsinfo mirparser msp430 msp430asmparser msp430codegen msp430desc msp430disassembler msp430info native nativecodegen nvptx nvptxcodegen nvptxdesc nvptxinfo objcarcopts object objectyaml option orcjit orcshared orctargetprocess passes powerpc powerpcasmparser powerpccodegen powerpcdesc powerpcdisassembler powerpcinfo profiledata remarks riscv riscvasmparser riscvcodegen riscvdesc riscvdisassembler riscvinfo runtimedyld scalaropts selectiondag sparc sparcasmparser sparccodegen sparcdesc sparcdisassembler sparcinfo support symbolize systemz systemzasmparser systemzcodegen systemzdesc systemzdisassembler systemzinfo tablegen target textapi transformutils vectorize webassembly webassemblyasmparser webassemblycodegen webassemblydesc webassemblydisassembler webassemblyinfo webassemblyutils windowsmanifest x86 x86asmparser x86codegen x86desc x86disassembler x86info xray" "--cc" "" "--cxx" "" "--cflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--channel" "nightly" "--color" "always"


Build completed unsuccessfully in 0:38:06
Build completed unsuccessfully in 0:38:06
make: *** [Makefile:74: ci-subset-2] Error 1

@bors
Copy link
Contributor

bors commented Dec 24, 2021

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 24, 2021
@dtolnay
Copy link
Member

dtolnay commented Dec 24, 2021

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 24, 2021
@dtolnay dtolnay assigned dtolnay and unassigned yaahc Dec 24, 2021
@bors
Copy link
Contributor

bors commented Dec 24, 2021

⌛ Testing commit 756d163 with merge 475b00a...

@joshtriplett
Copy link
Member

@AngelicosPhosphoros d+ gives you the ability to r+ it yourself when ready. d is "delegate".

@bors
Copy link
Contributor

bors commented Dec 24, 2021

☀️ Test successful - checks-actions
Approved by: dtolnay
Pushing 475b00a to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 24, 2021
@bors bors merged commit 475b00a into rust-lang:master Dec 24, 2021
@rustbot rustbot added this to the 1.59.0 milestone Dec 24, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (475b00a): comparison url.

Summary: This change led to large relevant regressions 😿 in compiler performance.

  • Large regression in instruction counts (up to 4.4% on full builds of keccak)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

@rustbot rustbot added the perf-regression Performance regression. label Dec 25, 2021
@AngelicosPhosphoros
Copy link
Contributor Author

O_o

@AngelicosPhosphoros AngelicosPhosphoros deleted the typeid_inline_74362 branch December 25, 2021 00:52
@AngelicosPhosphoros
Copy link
Contributor Author

I don't understand how this change can affect keccak because it doesn't ever use TypeId.

@joshtriplett
Copy link
Member

It appears to affect several computation-heavy programs. The question is, does the additional compile time buy us runtime performance?

Is this generating better code, or just taking longer?

@AngelicosPhosphoros
Copy link
Contributor Author

I would check that today.

@AngelicosPhosphoros
Copy link
Contributor Author

Did it really became slower?
I tested a compiler using this script on keccak benchmark from perf repository (which degraded mostly according to perf results).

Script code
import sys
import subprocess


def parse_stdout_time(output)->float:
    output = output.strip()
    last = output.split()[-1]
    last = last.decode('ascii')
    assert last[-1] == 's', last
    last = last[:-1]
    parsed = float(last)
    return parsed

def do_check()->float:
    res = subprocess.run(["cargo", "+stage1", "check"], capture_output=True)
    return parse_stdout_time(res.stderr)

def do_build()->float:
    res = subprocess.run(["cargo", "+stage1", "bench", "--no-run"], capture_output=True)
    return parse_stdout_time(res.stderr)

def do_clean():
    subprocess.run(["cargo", "+stage1", "clean"], capture_output=True)

NUM_TRIES = 51

def print_agg(measures):
    mean = sum(measures) / len(measures)
    median = sorted(measures)[len(measures)//2]
    variance = sum((x-mean)*(x-mean) for x in measures)/len(measures)
    stddev = variance**(1/2)
    labels = ['mean', 'median', 'stddev', 'variance']
    values = [mean, median, stddev, variance]
    f = '\t'.join(f'{s}: {v}' for s, v in zip(labels, values))
    print(f"Got {f}")

def try_many_times(command):
    do_clean()
    measures = []
    for i in range(NUM_TRIES):
        print(f"Running test #{i}")
        measures.append(command())
        print(f'#{i} finished in\t{measures[-1]}')
        do_clean()
    print_agg(measures)


if '--check' in sys.argv:
    print("Would run check")
    command = do_check
elif '--build' in sys.argv:
    print("Would run bench --no-run")
    command = do_build
else:
    print("Please, pass either '--check' or '--build'")
    exit(1)

try_many_times(command)
So results from 51 invokations.

Results on older compiler:
check: Got mean: 7.4715686274509805 median: 7.46 stddev: 0.047748218281905536 variance: 0.0022798923490964977
bench --no-run: Got mean: 8.36529411764706 median: 8.36 stddev: 0.042902211706618225 variance: 0.0018405997693194901

Results with my change:
check: Got mean: 7.447647058823529 median: 7.45 stddev: 0.02517298858090916 variance: 0.000633679354094583
bench --no-run: Got mean: 8.36470588235294 median: 8.36 stddev: 0.04026154401140422 variance: 0.001620991926182239

I don't see any slowdown here.
Can we rerun perf on this PR somehow?

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Dec 25, 2021

Wall-time comparisons are pretty likely to look roughly neutral -- the instruction count regressions are small enough that wall times are not likely to significantly change. I am able to reproduce these locally though (in cachegrind), and since as you mention the TypeId::of method isn't being directly used, that likely means there's no runtime improvement as a result of this change.

FWIW in the example you gave, it seems to me that a better change than just marking as inline may be to modify the MIR const prop pass to be able to evaluate intrinsics, which may be enough that the LLVM IR has constants directly emitted rather than function calls for this case.

@AngelicosPhosphoros
Copy link
Contributor Author

FWIW in the example you gave, it seems to me that a better change than just marking as inline may be to modify the MIR const prop pass to be able to evaluate intrinsics, which may be enough that the LLVM IR has constants directly emitted rather than function calls for this case.

It is inlined if I enable MIR inlining but enabling MIR inlining leads to worse performance overall for some reason.
Also, adding MIR optimization requires much more effort than adding #[inline] :)

@AngelicosPhosphoros
Copy link
Contributor Author

I am able to reproduce these locally though (in cachegrind),

May you please share some guide how to do this?

I run command from perf page: ./target/release/collector diff_local cachegrind +aad4f1039fd5d6bf961ed08eebc6eb69b577f1be +475b00aa4037461b506539a06d15ca6091b461a7 --include keccak --builds Check --runs Full

Got this results:

cat results/cgann-aad4f1039fd5d6bf961ed08eebc6eb69b577f1be-475b00aa4037461b506539a06d15ca6091b461a7-keccak-Check-Full
--------------------------------------------------------------------------------
Files compared:   results/cgfilt-aad4f1039fd5d6bf961ed08eebc6eb69b577f1be-keccak-Check-Full; results/cgfilt-475b00aa4037461b506539a06d15ca6091b461a7-keccak-Check-Full
Command:          /home/angel/.rustup/toolchains/aad4f1039fd5d6bf961ed08eebc6eb69b577f1be/bin/rustc --crate-name keccak src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,future-incompat --crate-type lib --emit=dep-info,metadata -C embed-bitcode=no -C debuginfo=2 -C metadata=7e4bc056058c3e31 -C extra-filename=-7e4bc056058c3e31 --out-dir /tmp/.tmpeGwhH8/target/debug/deps -C linker=clang++-11 -L dependency=/tmp/.tmpeGwhH8/target/debug/deps -Adeprecated -Aunknown-lints -Zincremental-verify-ich; /home/angel/.rustup/toolchains/475b00aa4037461b506539a06d15ca6091b461a7/bin/rustc --crate-name keccak src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,future-incompat --crate-type lib --emit=dep-info,metadata -C embed-bitcode=no -C debuginfo=2 -C metadata=7e4bc056058c3e31 -C extra-filename=-7e4bc056058c3e31 --out-dir /tmp/.tmp1P7Vd0/target/debug/deps -C linker=clang++-11 -L dependency=/tmp/.tmp1P7Vd0/target/debug/deps -Adeprecated -Aunknown-lints -Zincremental-verify-ich
Data file:        results/cgdiff-aad4f1039fd5d6bf961ed08eebc6eb69b577f1be-475b00aa4037461b506539a06d15ca6091b461a7-keccak-Check-Full
Events recorded:  Ir
Events shown:     Ir
Event sort order: Ir
Thresholds:       0.1
Include dirs:
User annotated:
Auto-annotation:  off

--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
960,870,160  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir           file:function
--------------------------------------------------------------------------------
963,910,535  ???:<rustc_data_structures::obligation_forest::ObligationForest<rustc_trait_selection::traits::fulfill::PendingPredicateObligation>>::process_obligations::<rustc_trait_selection::traits::fulfill::FulfillProcessor, rustc_data_structures::obligation_forest::Outcome<rustc_trait_selection::traits::fulfill::PendingPredicateObligation, rustc_infer::traits::FulfillmentErrorCode>>
 -1,120,020  ???:<rustc_middle::mir::Statement as rustc_serialize::serialize::Encodable<rustc_metadata::rmeta::encoder::EncodeContext>>::encode

@Mark-Simulacrum
Copy link
Member

That's pretty much the same as what I see - it would take more investigation to identify why this patch is leading to that regression, but in general it indicates that the regression on the benchmark is at least somewhat real.

@AngelicosPhosphoros
Copy link
Contributor Author

Created PR to revert this.
#92291

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 29, 2021
…ert_92135, r=joshtriplett

Reverts rust-lang#92135 because perf regression

Please, start a perf test for this.

r? `@joshtriplett` You approved original PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mark TypeId::of as inline to avoid monomorphizing unnecessary code
10 participants