Code, which is never executed, slows down everything by factor 3 #9610
Comments
cc @DrTodd13
I tried it on an AMD Epyc now:
- pardictimpl_slow.py
- pardictimpl_fast.py
- `numba -s` output
I did additional debugging. I found a workaround: https://github.com/TheTesla/py-par-dict/blob/master/pardictimpl_slow_fix.py
What actually happens: if the never-executed code is commented out, the variable […]. What is also a bit weird: just pulling in the […]
I did some profiling, and it seems the slow variant does 308,789 extra allocations, which is the same number of allocations as running the code itself (so you essentially run the function almost twice) and roughly aligns with the timings of the benchmarking you're doing. I also recommend not timing with `time`; use something else like […]
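The comment above is truncated, so the recommended timer is not stated; a common choice for micro-benchmarks is `time.perf_counter` (an assumption on my part, not the commenter's words). A minimal sketch of a timing harness that warms up the function once, so JIT compilation time is excluded, and reports the best of several runs:

```python
import time

def bench(fn, *args, repeat=5):
    # Warm-up call: excludes one-time costs (e.g. JIT compilation)
    # from the measurement.
    fn(*args)
    times = []
    for _ in range(repeat):
        t0 = time.perf_counter()  # monotonic, high-resolution clock
        fn(*args)
        times.append(time.perf_counter() - t0)
    # Report the minimum: least affected by scheduler noise.
    return min(times)

# Trivial stand-in workload; replace with the function under test.
best = bench(sum, range(100_000))
print(f"best of {5}: {best:.6f} s")
```

Taking the minimum rather than the mean is a deliberate choice here: outliers from OS scheduling only ever make a run slower, never faster.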
I implemented an approach for lock-free parallel dictionary writes, but I found a weird performance issue: the total time needed to complete increased with the number of threads. I found out that this issue can be solved by removing code that is never executed in some applications.
This is the version with the never-used code commented in:
https://github.com/TheTesla/py-par-dict/blob/master/pardictimpl_slow.py
This is the version with the never-used code commented out:
https://github.com/TheTesla/py-par-dict/blob/master/pardictimpl_fast.py
Normally, there shouldn't be any difference: both should behave like the fast one.
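To make the shape of the complaint concrete: the claim is that a branch which is never taken at runtime should not cost anything. The sketch below is a hypothetical minimal pattern (not the actual pardictimpl code, and pure Python will not reproduce the Numba slowdown); it only illustrates the structure of a dead branch whose mere presence is reported to change performance under Numba:

```python
def write_entry(table, key, value, use_fallback=False):
    # Hypothetical stand-in for the pattern in the issue: when
    # use_fallback is always False, the branch below is dead code,
    # yet in the report its presence alone slows the compiled function.
    if use_fallback:
        table.setdefault(key, []).append(value)  # never executed
    table[key] = value

d = {}
write_entry(d, "a", 1)
print(d)  # {'a': 1}
```

Semantically, the function with and without the `if use_fallback:` block is identical for the default arguments, which is why both linked variants "should behave like the fast one."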
`numba -s` output: […]