Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Stop passing CompAllocator pointers around #15025

Merged
merged 1 commit into from
Jun 30, 2018
Merged

Conversation

mikedn
Copy link

@mikedn mikedn commented Nov 14, 2017

Currently CompAllocator is passed around by pointer even though (when MEASURE_MEM_ALLOC is not defined) it is a pointer sized object and thus it would make more sense to pass around by value. This is problematic because:

  • It tends to require the allocator to be dynamically allocated. It is possible to avoid dynamic allocation (e.g. as in optVnCopyProp) but this needs care to avoid creating objects that depend on the allocator but have a longer lifetime. Now, dynamic allocation wouldn't be so bad in itself but if you do allocate dynamically you need to cache the allocator as well (e.g. LinearScan::getAllocator). In all, it's a bit of hassle to use CompAllocator and I suspect that this is one reason why quite a bit of JIT memory ends up in CMK_Generic/Unknown or other "general purpose" buckets.
  • It introduces extra indirections - jitstd::list contains a pointer to a CompAllocator which contains a pointer to a Compiler which contains a pointer to an ArenaAllocator. Out of 8 instructions required for an allocation 2 are wasted following through indirections.

One problem with passing the existing CompAllocator around by value is that when MEASURE_MEM_ALLOC is enabled the size of the CompAllocator increases and together with the size of any objects that contains an allocator object (e.g. jitstd::vector). This will end up skewing memory statistics somewhat.

The solution is to always make CompAllocator pointer sized. When MEASURE_MEM_ALLOC is not enabled it should contain a pointer to ArenaAllocator. When MEASURE_MEM_ALLOC is enabled it should contain a pointer to an ArenaAllocator wrapper object that holds a pointer to the real arena and the memory kind value. There would be one such wrapper object per memory kind value.

Initially I tried to handle this in a more surgical manner but things got a bit ugly - CompAllocator relied on Compiler::compGetMem and it was compGetMem that took care of tracking memory stats when it's ArenaAllocator that should be doing that. So I ended up moving MemStats to ArenaAllocator.

And then moving mem stats to arena got complicated too due to PooledAllocator. Its existence meant that memory stats had to be reset every time the pooled allocator was reused. But the whole idea of a pooled allocator doesn't really make sense, you can't and don't need to pool the allocator because pretty much it's entire state needs to be reset anyway. What needs to be pooled is just a 64k memory page. So no more PooledAllocator, an ArenaAllocator is always created when a method is compiled and on first allocation request it will attempt to grab the pooled page if available.

Since changes around CompAllocator already affected various allocation sites I used the opportunity to do a bit of cleanup:

  • CompAllocator's allocation interface was a leftover from the days of IAllocator and it's rather clumsy. I changed it to look more like std::allocator interface - a single allocate function that takes a count parameter that indicates how many objects to allocate. Avoids the need for sizeof(X) and cast from void* to X*. It's not exactly like std::allocator, that one is a template and I see no reason to mirror that.
  • Got rid of compGetMem and compGetMemArray. There weren't too many uses of these functions and they're just redundant.
  • Changed SmallHashTable to use CompAllocator like other JIT collection classes do. It was using a special, global wrapper of Compiler::compGetMem.

There's one small change related compGetMemArray that's worth mentioning - the integer overflow check. The old code did numElem > (MAX_MEMORY_PER_ALLOCATION / elemSize) where MAX_MEMORY_PER_ALLOCATION is 512 megabytes, this not only checked for integer overflow but also capped the allocation size to 512MB. The new code uses SIZE_MAX, it's faster on 64bit targets because in many cases the number of elements is unsigned and the C++ compiler eliminates the entire check. But this version does not limit the allocation size, if this is considered useful (e.g. it may stop the JIT before consuming all available memory) then such a check should be added to ArenaAllocator::allocateNewPage, where it has minimal perf impact.

PIN shows a 0.15% improvement in instructions retired: https://1drv.ms/x/s!Av4baJYSo5pjgr1zpvRR2DPBIP_RSA

@erozenfeld
Copy link
Member

@mikedn This is marked [WIP]. Is this ready for review?

@mikedn
Copy link
Author

mikedn commented Apr 13, 2018

AFAIR the main remaining work here was extending the PR description to explain some of these changes. I'll have to do that and figure out if there's anything else...

@mikedn mikedn force-pushed the alloc-val branch 3 times, most recently from 1f15f7a to b4aa832 Compare April 17, 2018 17:13
@mikedn mikedn changed the title [WIP] Stop passing CompAllocator pointers around Stop passing CompAllocator pointers around Apr 17, 2018
@mikedn
Copy link
Author

mikedn commented Apr 17, 2018

This will probably need to be squashed but for review it may be easier to look at the individual commits (though GitHub displays them in a weird order on the conversation page).

The most sensitive change is around allocator pooling. The rest of the PR consists of many trivial changes (e.g. CompAllocator* becomes CompAllocator, compGetMem becomes allocate<T>) and the moving of the mem stats code from Compiler to ArenaAllocator.

I tried to make a diff of old and new mem stats dumps but discovered that in the old version CompMemKind tagged allocations weren't always recorded correctly (e.g. some allocation were made before initializing mem stats) so the diff is messy.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.... @dotnet/jit-contrib anyone else want to take a look?

@mikedn
Copy link
Author

mikedn commented Jun 27, 2018

Not sure what a good place for coding guidelines is today - I'll use this to point out that using the new operator should be preferred - new (getAllocator(CMK_FooBar)) FooBar and not getAllocator(CMK_FooBar).allocate<FooBar>(1).

allocate (formerly compGetMem) should be used only when you specifically don't want the default constructor to run and that should not be a common case. Typical users of allocate are container class such as vector as these need to allocate memory before actually adding elements.

@AndyAyersMS
Copy link
Member

@BruceForstall any advice?

Wonder if we can make the allocate accessible to friend classes only. Or maybe just give it a scarier sounding name.

@echesakov
Copy link

I think it's technically possible to do something like

template<class ReturnType, class CallerType, class = CanUseAllocate<CallerType>>
ReturnType* allocate<FooBar>(int num, CallerType* callerPointer)

which later be used as getAllocator(CMK_FooBar).allocate<FooBar>(1, this).

Then we can restrict what types of callers are allowed to use allocate in "declarative" way

template<class CallerType>
struct CanUseAllocate;

template<>
struct CanUseAllocate<Compiler>
{
};

But I guess this would add too much complexity...

@BruceForstall
Copy link
Member

Re: coding conventions

We still have https://github.com/dotnet/coreclr/blob/master/Documentation/coding-guidelines/clr-jit-coding-conventions.md, although the formatting parts have been obsoleted / overridden by jit-format. The non-formatting parts are still relevant.

@BruceForstall
Copy link
Member

@mikedn Did you run various asm diffs combos on this to verify no diffs?

@AndyAyersMS
Copy link
Member

I have the changes built locally, so I'll run some diffs.

@BruceForstall
Copy link
Member

BruceForstall commented Jun 27, 2018

Ok, I ran both x86 and x64 Checked PMI diffs on frameworks and saw no diffs. (Did this before I saw your message, Andy)

@AndyAyersMS
Copy link
Member

@mikedn want to squash and add something to the coding conventions doc? Then I think we're set to merge.

@mikedn
Copy link
Author

mikedn commented Jun 28, 2018

Wonder if we can make the allocate accessible to friend classes only. Or maybe just give it a scarier sounding name.

I think it's technically possible to do something like...

Yep but it sounds a bit too much in this case. At the end of the day new is typical way to allocate in C++. It's just the long, legacy history of the JIT that may require nudging people in the right direction, hopefully updating coding conventions and PR reviews would be enough.

Ok, I ran both x86 and x64 Checked PMI diffs on frameworks and saw no diffs. (Did this before I saw your message, Andy)

Thanks! (for doing my work while I was sleeping :)

want to squash and add something to the coding conventions doc? Then I think we're set to merge.

Sure, will do that this evening (morning for you suppose)

Side note - I realized that the change from operator new (CompAllocator*) to operator new (CompAllocator) will allow the removal of the jitstd::placement_t hack. Maybe one day we'll be able to use the real STL and then this will be useful (AFAIR STL does not allow placement new operator replacement).

@mikedn mikedn force-pushed the alloc-val branch 3 times, most recently from 9f87094 to bed12e2 Compare June 28, 2018 16:25
@mikedn
Copy link
Author

mikedn commented Jun 28, 2018

Woops, I updated coding conventions but now a CoreFX build failed with an assert in allocator. Looks like a shutdown race...

@AndyAyersMS
Copy link
Member

@dotnet/jit-contrib any last-minute reservations about me merging this... ?

@BruceForstall
Copy link
Member

LGTM

@AndyAyersMS
Copy link
Member

@mikedn maybe you can squash and reword the one surviving commit description? There's only fragments per commit so I didn't feel comfortable just stitching one up from those.

Passing CompAllocator objects by value is advantageous because it no longer needs to be dynamically allocated and cached. CompAllocator instances can now be freely created, copied and stored, which makes adding new CompMemKind values easier.

Together with other cleanup this also improves memory allocation performance by removing some extra levels of indirection that were previously required - jitstd::allocator had a pointer to CompAllocator, CompAllocator had a pointer to Compiler and Compiler finally had a pointer to ArenaAllocator. Without MEASURE_MEM_ALLOC enabled, both jitstd::allocator and CompAllocator now just contain a pointer to ArenaAllocator. When MEASURE_MEM_ALLOC is enabled CompAllocator also contains a pointer but to a MemStatsAllocator object that holds the relevant memory kind. This way CompAllocator is always pointer sized so that enabling MEASURE_MEM_ALLOC does not result in increased memory usage due to objects that store a CompAllocator instance.

In order to implement this, 2 additional signficant changes have been made:
* MemStats has been moved to ArenaAllocator, it's after all the allocator's job to maintain statistics. This also fixes some issues related to memory statistics, such as not tracking the memory allocated by the inlinee compiler (since that one used its own MemStats instance).
* Extract the arena page pooling logic out of the allocator. It doesn't make sense to pool an allocator, it has very little state that can actually be reused and everyting else (including MemStats) needs to be reset on reuse. What really needs to be pooled is just a page of memory.

Since this was touching allocation code the opportunity has been used to perform additional cleanup:
* Remove unnecessary LSRA ListElementAllocator
* Remove compGetMem and compGetMemArray
* Make CompAllocator and HostAllocator more like the std allocator
* Update HashTable to use CompAllocator
* Update ArrayStack to use CompAllocator
* Move CompAllocator & friends to alloc.h
@mikedn
Copy link
Author

mikedn commented Jun 30, 2018

@AndyAyersMS Squashed and reworded. Hopefully CI won't fail again :)

Thanks everyone for review & feedback!

@AndyAyersMS AndyAyersMS merged commit c2baf04 into dotnet:master Jun 30, 2018
@AndyAyersMS
Copy link
Member

Beautiful, thanks !

@mikedn mikedn deleted the alloc-val branch September 28, 2019 19:17
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Passing CompAllocator objects by value is advantageous because it no longer needs to be dynamically allocated and cached. CompAllocator instances can now be freely created, copied and stored, which makes adding new CompMemKind values easier.

Together with other cleanup this also improves memory allocation performance by removing some extra levels of indirection that were previously required - jitstd::allocator had a pointer to CompAllocator, CompAllocator had a pointer to Compiler and Compiler finally had a pointer to ArenaAllocator. Without MEASURE_MEM_ALLOC enabled, both jitstd::allocator and CompAllocator now just contain a pointer to ArenaAllocator. When MEASURE_MEM_ALLOC is enabled CompAllocator also contains a pointer but to a MemStatsAllocator object that holds the relevant memory kind. This way CompAllocator is always pointer sized so that enabling MEASURE_MEM_ALLOC does not result in increased memory usage due to objects that store a CompAllocator instance.

In order to implement this, 2 additional signficant changes have been made:
* MemStats has been moved to ArenaAllocator, it's after all the allocator's job to maintain statistics. This also fixes some issues related to memory statistics, such as not tracking the memory allocated by the inlinee compiler (since that one used its own MemStats instance).
* Extract the arena page pooling logic out of the allocator. It doesn't make sense to pool an allocator, it has very little state that can actually be reused and everyting else (including MemStats) needs to be reset on reuse. What really needs to be pooled is just a page of memory.

Since this was touching allocation code the opportunity has been used to perform additional cleanup:
* Remove unnecessary LSRA ListElementAllocator
* Remove compGetMem and compGetMemArray
* Make CompAllocator and HostAllocator more like the std allocator
* Update HashTable to use CompAllocator
* Update ArrayStack to use CompAllocator
* Move CompAllocator & friends to alloc.h

Commit migrated from dotnet/coreclr@c2baf04
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants