Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert optimization of constants and class vars #10330

Closed
wants to merge 2 commits into from

Conversation

asterite
Copy link
Member

@asterite asterite commented Jan 29, 2021

As I explain here, it seems these changes are the reason the 0.36.0 compiler is so slow.

Reverts #9801
Reverts #9995

I didn't use git revert because then an optimization to not use an "init" flag for string constants came in, and that's fine.

Fixes https://forum.crystal-lang.org/t/specs-on-crystal-0-36-0-are-taking-a-long-time/2910

@asterite asterite added topic:compiler performance kind:regression Something that used to correctly work but no longer works labels Jan 29, 2021
@asterite asterite added this to the 0.36.1 milestone Jan 29, 2021
@kostya
Copy link
Contributor

kostya commented Jan 29, 2021

I not think this was small optimization, some critical code can be 3.4 times faster:

little bench from real code:

COMBS = UInt8[1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
  0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
  0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
  0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0].to_unsafe

INVERTED = UInt8[10, 9, 8, 11, 6, 5, 4, 11, 2, 1, 0, 7, 14, 13, 12, 15].to_unsafe

@[AlwaysInline]
private def combo_index(v : UInt8, other : UInt8) : UInt8
  (v << 4) | other
end

@[AlwaysInline]
def good?(v : UInt8, other : UInt8) : Bool
  COMBS[combo_index(v, INVERTED[other])] != 0
end

N = (ARGV[0]? || 100000000).to_i

t = Time.local
c = 0_u64
N.times do |i|
  d1 = i & 15
  d2 = rand(16)
  c += 1 if good?(d1.to_u8, d2.to_u8)
end
p c
p Time.local - t

crystal 0.35.1 -> 0.86s
crystal 0.36.0 -> 0.25s

@asterite
Copy link
Member Author

I see. We can always reintroduce this optimization later while also making sure we don't make compile times incredibly slow. But right now the priority is to release 0.36.1 soon.

@j8r
Copy link
Contributor

j8r commented Jan 29, 2021

Couldn't the optimization only be done in release mode?

@asterite
Copy link
Member Author

@j8r Well, macro runs are compiled in release mode, so...

@j8r
Copy link
Contributor

j8r commented Jan 29, 2021

Or maybe like for gcc/clang with -O2, or an other compile-time optimization flag 😄
Not needed if it is possible to have both a fast program and compile fast as you said.

However, not sure it will be always the case in the future...

@bcardiff
Copy link
Member

At this point is reverting to get a reasonable experience. The motivation for the optimization still holds but the current efforts were not enough. Once they are worked out fully they can come back. It's not that bad :-)

@mattrberry
Copy link
Contributor

I agree that the performance improvements here likely aren't worth the compile-time hit. I'd also just like to throw my perspective in. These optimizations had bought me 15%-20% better performance for CryBoy on 0.36.0 vs 0.35.1, which is pretty significant. Even with this rollback, it'd be nice to find some other way to bring these optimizations back at a later point

@asterite
Copy link
Member Author

I still don't know why LLVM takes a lot more time to optimize this.

One thing I can try is to move the initialization of each constant/class_var to a separate function. Maybe like that it's easier for LLVM, I don't know. I have to try it.

But we can always try that later, before or after 1.0. Though if I have time before the next patch release, I'll do it!

@asterite
Copy link
Member Author

Please don't merge this just yet 🙏

@asterite
Copy link
Member Author

I have a separate PR where the optimization remains and the compile times are not affected (I think) 😸

@asterite
Copy link
Member Author

Closed in favor of #10334

@asterite asterite closed this Jan 30, 2021
@asterite asterite deleted the revert/optimize-constants-and-class-vars branch January 30, 2021 12:01
@asterite asterite removed this from the 0.36.1 milestone Jan 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:regression Something that used to correctly work but no longer works performance topic:compiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants