Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove HelperMethodFrames from Object methods #106497

Merged
merged 4 commits into from
Aug 16, 2024

Conversation

AaronRobinsonMSFT
Copy link
Member

Split GetHashCode into fast/slow managed functions.
Split GetType into fast/slow managed functions.

Split GetHashCode into fast/slow managed functions.
Split GetType into fast/slow managed functions.
Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

AaronRobinsonMSFT and others added 2 commits August 15, 2024 13:33
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
@AaronRobinsonMSFT
Copy link
Member Author

@EgorBot -arm64 -x64 -perf

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class Bench
{
    [Benchmark]
    public void WB()
    {
        Foo foo = new Foo();
        for (long i = 0; i < 200000000; i++)
            foo.GetHashCode();
    }
}

internal class Foo
{
}

@EgorBot
Copy link

EgorBot commented Aug 15, 2024

Benchmark results on Arm64

BDN_Artifacts.zip

@AaronRobinsonMSFT
Copy link
Member Author

@EgorBot -arm64 -perf

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class Bench
{
    [Benchmark]
    public void WB()
    {
        Foo foo = new Foo();
        for (long i = 0; i < 200000000; i++)
            foo.GetHashCode();
    }
}

internal class Foo
{
}

@EgorBot
Copy link

EgorBot commented Aug 15, 2024

Benchmark results on Arm64
BenchmarkDotNet v0.14.0, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Unknown processor
  Job-MGOTHK : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
  Job-NOQPBJ : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
Method Toolchain Mean Error Ratio
WB Main 667.8 ms 0.06 ms 1.00
WB PR 734.6 ms 0.08 ms 1.10

BDN_Artifacts.zip

Flame graphs: Main vs PR 🔥
Hot asm: Main vs PR
Hot functions: Main vs PR

For clean perf results, make sure you have just one [Benchmark] in your app.

@EgorBo
Copy link
Member

EgorBo commented Aug 16, 2024

@EgorBot -intel -arm64 --disasm

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Bench>(args: args);

public class Bench
{
    object o = new object();

    [Benchmark]
    public void ObjGetHashCode() => o.GetHashCode();
    
    [Benchmark]
    public Type ObjGetType() => o.GetType();

    [Benchmark]
    public void HoistGetType()
    {
        object _o = o;
        for (long i = 0; i < 100000; i++)
            _ = _o.GetType(); // should be hoisted
    }

    static Bench B {get;} = new Bench();

    [Benchmark]
    public Type ConstantFoldGetType() => B.GetType(); // should be optimized to typeof(B)
}

@dotnet dotnet deleted a comment from EgorBot Aug 16, 2024
@dotnet dotnet deleted a comment from EgorBot Aug 16, 2024
@dotnet dotnet deleted a comment from EgorBot Aug 16, 2024
@EgorBot
Copy link

EgorBot commented Aug 16, 2024

Benchmark results on Intel
BenchmarkDotNet v0.14.0, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
  Job-REILCE : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-URRUAI : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Method Toolchain Mean Error Ratio Code Size
ObjGetHashCode Main 2.3967 ns 0.0188 ns 1.00 43 B
ObjGetHashCode PR 1.4386 ns 0.0019 ns 0.60 180 B
ObjGetType Main 1.4154 ns 0.0010 ns 1.00 16 B
ObjGetType PR 2.0540 ns 0.0086 ns 1.45 49 B
HoistGetType Main 28,689.4815 ns 3.3192 ns 1.00 25 B
HoistGetType PR 28,704.7140 ns 5.8637 ns 1.00 58 B
ConstantFoldGetType Main 0.2684 ns 0.0003 ns 1.00 11 B
ConstantFoldGetType PR 0.3045 ns 0.0005 ns 1.13 11 B

BDN_Artifacts.zip

@EgorBot
Copy link

EgorBot commented Aug 16, 2024

Benchmark results on Arm64
BenchmarkDotNet v0.14.0, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Unknown processor
  Job-GBUKGF : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
  Job-NMAOTM : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
Method Toolchain Mean Error Ratio Code Size
ObjGetHashCode Main 2.9551 ns 0.0011 ns 1.00 68 B
ObjGetHashCode PR 3.6410 ns 0.0016 ns 1.23 280 B
ObjGetType Main 2.7466 ns 0.0019 ns 1.00 24 B
ObjGetType PR 1.4229 ns 0.0017 ns 0.52 116 B
HoistGetType Main 33,397.7015 ns 4.2124 ns 1.00 40 B
HoistGetType PR 33,395.3323 ns 7.7174 ns 1.00 132 B
ConstantFoldGetType Main 0.0508 ns 0.0013 ns 1.00 28 B
ConstantFoldGetType PR 0.4529 ns 0.0412 ns 8.91 28 B

BDN_Artifacts.zip

// Returns a Type object which represent this object instance.
[Intrinsic]
[MethodImpl(MethodImplOptions.InternalCall)]
public extern Type GetType();
public unsafe Type GetType()
Copy link
Member

@EgorBo EgorBo Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not inlined, then it becomes an indirect call just like any other managed call (while previously it used to be a direct FCall)

if it's inlined (shouldn't happen today I guess due to small callsite size vs a lot of unknown call inside of - it will scare the inliner) - it might ruin some jit opts where jit treats GetType as a
special intrinsic and relies on it even in late phases.

We can optimize 1) if it becomes an issue on VM side (tell jit that certain managed methods can be direct call)

@EgorBo
Copy link
Member

EgorBo commented Aug 16, 2024

@EgorBot -intel -arm64 -perf --disasm

using System;
using BenchmarkDotNet.Attributes;

public class Bench
{
    object o = new object();

    [Benchmark]
    public Type ObjGetType() => o.GetType();
}

@dotnet dotnet deleted a comment from EgorBot Aug 16, 2024
@dotnet dotnet deleted a comment from EgorBot Aug 16, 2024
@EgorBot
Copy link

EgorBot commented Aug 16, 2024

Benchmark results on Intel
BenchmarkDotNet v0.14.0, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
  Job-PFJNGN : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-SRGWKM : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Method Toolchain Mean Error Ratio Code Size
ObjGetType Main 1.416 ns 0.0003 ns 1.00 16 B
ObjGetType PR 2.048 ns 0.0080 ns 1.45 49 B

BDN_Artifacts.zip

Flame graphs: Main vs PR 🔥
Hot asm: Main vs PR
Hot functions: Main vs PR

For clean perf results, make sure you have just one [Benchmark] in your app.

@EgorBot
Copy link

EgorBot commented Aug 16, 2024

Benchmark results on Arm64
BenchmarkDotNet v0.14.0, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Unknown processor
  Job-PXXUSW : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
  Job-TZYKGG : .NET 9.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
Method Toolchain Mean Error Ratio Code Size
ObjGetType Main 2.747 ns 0.0034 ns 1.00 24 B
ObjGetType PR 1.424 ns 0.0018 ns 0.52 116 B

BDN_Artifacts.zip

Flame graphs: Main vs PR 🔥
Hot asm: Main vs PR
Hot functions: Main vs PR

For clean perf results, make sure you have just one [Benchmark] in your app.

@EgorBo
Copy link
Member

EgorBo commented Aug 16, 2024

I see stable results on EgorBot for GetType

x64

Regressed due to indirect vs direct call (+ a small CQ issue in the managed impl):

##### Main

; Bench.ObjGetType()
       push      rax
       mov       rdi,[rdi+8]
       call      System.Object.GetType() ;;;; direct call!
       nop
       add       rsp,8
       ret
; Total bytes of code 16

##### PR

; Bench.ObjGetType()
       push      rax
       mov       rdi,[rdi+8]
       call      qword ptr [708A18A0D7B8]; System.Object.GetType() ;;;; indirect
       nop
       add       rsp,8
       ret
; Total bytes of code 17

; System.Object.GetType()
       push      rbx
       mov       rbx,rdi
       mov       rdi,[rbx]
       mov       rax,[rdi+20]
       add       rax,10  ;;; should be part of the addressing mode?
       mov       rax,[rax]
       test      rax,rax
       jne       short M01_L00
       call      qword ptr [708A18A0D7D0]; System.Object.<GetType>g__GetTypeWorker|1_0(System.Runtime.CompilerServices.MethodTable*)
M01_L00:
       nop
       pop       rbx
       ret
; Total bytes of code 32

Arm64

Actually improved! (reproducable)

##### Main

; Bench.ObjGetType()
       stp       x29, x30, [sp, #-0x10]!
       mov       x29, sp
       ldr       x0, [x0, #8]
       bl        #0xf64041142658
       ldp       x29, x30, [sp], #0x10
       ret       
; Total bytes of code 24

##### PR

; Bench.ObjGetType()
       stp       x29, x30, [sp, #-0x10]!
       mov       x29, sp
       ldr       x0, [x0, #8]
       movz      x1, #0xcdc8
       movk      x1, #0xdd36, lsl #16
       movk      x1, #0xf947, lsl #32
       ldr       x1, [x1]
       blr       x1; System.Object.GetType()
       ldp       x29, x30, [sp], #0x10
       ret       
; Total bytes of code 40

; System.Object.GetType()
       stp       x29, x30, [sp, #-0x20]!
       str       x19, [sp, #0x18]
       mov       x29, sp
       mov       x19, x0
       ldr       x0, [x19]
       ldr       x1, [x0, #0x20]
       add       x1, x1, #0x10
       ldr       x1, [x1]
       cbnz      x1, M01_L00
       movz      x1, #0xcde0
       movk      x1, #0xdd36, lsl #16
       movk      x1, #0xf947, lsl #32
       ldr       x1, [x1]
       blr       x1; System.Object.<GetType>g__GetTypeWorker|1_0(System.Runtime.CompilerServices.MethodTable*)
       mov       x1, x0
M01_L00:
       mov       x0, x1
       ldr       x19, [sp, #0x18]
       ldp       x29, x30, [sp], #0x20
       ret       
; Total bytes of code 76

I don't have a good explanation why

@AaronRobinsonMSFT AaronRobinsonMSFT merged commit 652a150 into dotnet:main Aug 16, 2024
88 of 90 checks passed
@AaronRobinsonMSFT AaronRobinsonMSFT deleted the hmf_object_methods branch August 16, 2024 14:29
@github-actions github-actions bot locked and limited conversation to collaborators Sep 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants