Codegen: don't use init check for consts that are declared before read #9801

Merged on Oct 6, 2020 (1 commit)

@asterite (Member) commented on Oct 5, 2020

The way constants work in Crystal is:

  • If the constant is a number literal, or something that can be computed at compile-time, like a math operation, then the constant value is assigned to an LLVM global that's marked as constant. Accessing this constant is just accessing the LLVM global.
  • All other constants are lazily initialized: whenever we access them we check if they were initialized (in a thread-safe way) and, if not, we initialize them.
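
To make the two groups concrete, here is a small sketch. The constant names other than RANGE are made up for illustration, and which group a given constant lands in is ultimately the compiler's decision; the comments only restate the rule described above.

```crystal
# First group, per the description above: a number literal and
# compile-time math. These names are hypothetical.
LIMIT = 1024
SUM   = 1 + 2

# Second group: everything else. RANGE is the constant used in the
# benchmark in this PR; NAMES is assumed to belong here as well because
# building an array needs code that runs at program start or on first access.
RANGE = 3..7
NAMES = ["a", "b"]

puts LIMIT + SUM
puts RANGE.includes?(5)
puts NAMES.size
```

Reading LIMIT or SUM should be a plain load, while reading RANGE or NAMES goes through the initialization check before this PR.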

The problem is that this prevents LLVM from inlining the second group of constants, which is a shame. You'd also expect constants to be inlined whenever possible.

The reason these constants are lazily initialized is hoisting: a constant can be read before the line that declares it. This works:

puts A

A = 1

The above is a simple example, but puts A could be some initialization done by a library that uses constants defined in other files.

This PR changes the codegen part to track when a constant is read. When we reach a constant declaration and the constant hasn't been read yet, it's safe to initialize it right away instead of lazily. This makes startup a bit slower, but accessing a constant is much faster (about 3 times) because we no longer pay for the initialization check on every access.
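
The tracking idea can be modeled roughly like this. It is a toy sketch, not the compiler's code: ConstTracker, visit_read and visit_declaration are hypothetical names, and it only covers constants from the second group.

```crystal
# Toy model only: ConstTracker and its methods are hypothetical names,
# not part of the Crystal compiler.
class ConstTracker
  def initialize
    # Names of the constants that have been read so far, in codegen order.
    @read = Set(String).new
  end

  # Called when codegen reaches a read of a constant, e.g. `puts A`.
  def visit_read(name : String) : Nil
    @read << name
  end

  # Called when codegen reaches the declaration of a non-simple constant.
  # If it was already read, the read came first and lazy initialization
  # is still needed; otherwise it can be initialized right away.
  def visit_declaration(name : String) : Symbol
    @read.includes?(name) ? :lazy : :eager
  end
end

tracker = ConstTracker.new
tracker.visit_read("A")          # a read that comes before the declaration
p tracker.visit_declaration("A") # prints :lazy
p tracker.visit_declaration("B") # prints :eager, B was never read before
```

The point of the sketch is that the decision is made once per declaration, so constants that are declared before any read lose the per-access check entirely.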

For instance, a Game Boy Advance emulator used constants in a very tight loop, which resulted in very poor frames per second (FPS), something like 30. With a compiler built from this PR, the FPS doubled for me.

Here's another benchmark:

require "benchmark"

RANGE = 3..7

a = 1
v = ARGV[0].to_i

Benchmark.ips do |x|
  x.report("constant") do
    case v
    when RANGE
      a &+= 1
    end
  end
  x.report("inline") do
    case v
    when 3..7
      a &+= 1
    end
  end
  x.report("manual") do
    if 3 <= v <= 7
      a &+= 1
    end
  end
end

puts a

Before:

 constant 291.75M (  3.43ns) (± 8.16%)  0.0B/op   2.66× slower
   inline 769.21M (  1.30ns) (± 6.62%)  0.0B/op   1.01× slower
   manual 777.14M (  1.29ns) (± 7.07%)  0.0B/op        fastest

After:

 constant 605.09M (  1.65ns) (± 4.55%)  0.0B/op   1.15× slower
   inline 697.44M (  1.43ns) (± 5.34%)  0.0B/op        fastest
   manual 601.71M (  1.66ns) (± 5.20%)  0.0B/op   1.16× slower

We can do a similar optimization for class variables, but I'll send it as a separate PR if this PR is merged.

@asterite asterite marked this pull request as ready for review October 5, 2020 19:36