#107 has frontend benchmarks that do not instantiate the entire runtime.
Low hanging fruit:
- Currently, 13% of the time is spent in iostreams creating the CDAG debug labels. Maybe this should happen during graph printing rather than graph generation (there is already a PoC implementation for that).
- malloc/free pairs have a significant impact deep in dependency checking. We can potentially eliminate a lot of convenience allocations (`get_accessed_buffers` et al).
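Both items above could look roughly like the following. This is only a sketch under assumptions: `command_node`, `make_label`, `for_each_accessed_buffer` and `count_accesses_to` are hypothetical names made up for illustration, not the runtime's actual types or the existing PoC. The idea is that the node stores raw fields only, the label string is built when the printer asks for it, and buffer ids are visited in place instead of being copied into a temporary container.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical node type; the names are illustrative, not the runtime's API.
struct command_node {
    std::uint64_t id;
    std::uint64_t task_id;
    std::vector<std::uint64_t> accessed_buffers;

    // Built on demand by the graph printer, so graph generation never touches iostreams.
    std::string make_label() const {
        std::ostringstream os;
        os << "C" << id << " (T" << task_id << ")";
        return os.str();
    }

    // Visits the buffer ids in place instead of returning a freshly allocated container.
    template <typename Fn>
    void for_each_accessed_buffer(Fn&& fn) const {
        for (const auto bid : accessed_buffers) {
            fn(bid);
        }
    }
};

// Example consumer: counts accesses to one buffer without any heap allocation.
inline std::size_t count_accesses_to(const command_node& node, std::uint64_t buffer_id) {
    std::size_t n = 0;
    node.for_each_accessed_buffer([&](std::uint64_t bid) {
        if (bid == buffer_id) ++n;
    });
    return n;
}
```

Whether a callback-style visitor is the right replacement for every convenience accessor would have to be checked against the profile; it only pays off on the hot paths.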
Requires further investigation:
- Use a bump allocator for intrusive graphs to improve locality, e.g. a heuristically sized pool per horizon plus a fallback as needed (maybe also for the dependency vectors?).
- Command / task map rehashing is also a noticeable factor, but memory is abundant, so we can `reserve()` them to avoid stalls.
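To make the two ideas above more concrete, here is a minimal sketch, assuming C++17. `bump_pool`, `graph_node` and `make_node_map` are hypothetical names invented for this example and do not exist in the runtime. The pool hands out memory from one contiguous block (which would be sized heuristically per horizon) and falls back to individual heap allocations once the block is full; the map is reserved up front so it does not rehash while it fills.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical pool: bump-allocates from one block, with a heap fallback.
// All memory is released at once when the pool is destroyed.
class bump_pool {
  public:
    explicit bump_pool(std::size_t capacity)
        : m_block(std::make_unique<std::byte[]>(capacity)), m_capacity(capacity) {}

    template <typename T, typename... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible_v<T>, "the pool never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t), "over-aligned types are not handled");
        return new (allocate(sizeof(T), alignof(T))) T(std::forward<Args>(args)...);
    }

    // Number of allocations that did not fit into the contiguous block.
    std::size_t fallback_count() const { return m_fallback.size(); }

  private:
    void* allocate(std::size_t size, std::size_t align) {
        assert(align != 0 && (align & (align - 1)) == 0); // alignment must be a power of two
        // The block itself is aligned for any fundamental type, so aligning the offset suffices.
        const std::size_t start = (m_offset + align - 1) / align * align;
        if (start <= m_capacity && size <= m_capacity - start) {
            m_offset = start + size;
            return m_block.get() + start;
        }
        // Block exhausted: fall back to a separate heap allocation owned by the pool.
        m_fallback.push_back(std::make_unique<std::byte[]>(size));
        return m_fallback.back().get();
    }

    std::unique_ptr<std::byte[]> m_block;
    std::size_t m_capacity;
    std::size_t m_offset = 0;
    std::vector<std::unique_ptr<std::byte[]>> m_fallback;
};

// Hypothetical intrusive node, standing in for a command or task graph node.
struct graph_node {
    std::uint64_t id;
    graph_node* next;
    graph_node(std::uint64_t id, graph_node* next) : id(id), next(next) {}
};

// Reserving up front means inserting up to expected_nodes entries causes no rehash.
inline std::unordered_map<std::uint64_t, graph_node*> make_node_map(std::size_t expected_nodes) {
    std::unordered_map<std::uint64_t, graph_node*> map;
    map.reserve(expected_nodes);
    return map;
}
```

The sketch deliberately restricts itself to trivially destructible nodes, because a bump pool frees everything at once; nodes that own heap memory (e.g. dependency vectors) would need either explicit destruction or an allocator-aware container on top of the same pool.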
I think this is actually (reasonably) complete at this point. The low hanging fruit has been addressed by various changes and I don't think we are substantially limited by "frontend" performance in real-world benchmarks right now.