Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for DATAS assert during initialization stage #106752

Merged
merged 1 commit into from
Aug 22, 2024

Conversation

Maoni0
Copy link
Member

@Maoni0 Maoni0 commented Aug 21, 2024

we are hitting this assert in gc_thread_function

assert ((n_heaps <= heap_number) || !gc_t_join.joined());

during the init stage. this is because the main thread (that does the GC init) is calling change_heap_count which acts like h0's thread while h0's thread can be anywhere before waiting on ee_suspend_event. so we could be in this situation -

  1. main thread calls GCHeap::Init and creates h0's GC thread
  2. main thread calls gc_heap::change_heap_count and this will change n_heaps to 1, and after there's another join -
    // join the last time to change the heap count again if needed.
    if (new_n_heaps < old_n_heaps)
    {
        gc_t_join.join (this, gc_join_merge_temp_fl);
        if (gc_t_join.joined ())
        {
            dprintf (9999, ("now changing the join heap count to the smaller one %d", new_n_heaps));
            gc_t_join.update_n_threads (new_n_heaps);

            gc_t_join.restart ();
        }
    }

when the main thread and all other heaps' GC threads have joined, it changes gc_t_join.joined_p to true

  1. h0's thread runs and sees that n_heaps is 1, and gc_t_join.joined_p is true which triggers the assert.

so this only happens during the init stage on h0's thread.

Copy link
Contributor

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

@cshung
Copy link
Member

cshung commented Aug 21, 2024

With this change, #106602 should be fixed as well.

Copy link
Member

@mangod9 mangod9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like only a debug issue, so probably not worth porting to 9, unless CI continues to fail consistently there?

@Maoni0
Copy link
Member Author

Maoni0 commented Aug 21, 2024

Looks like only a debug issue, so probably not worth porting to 9, unless CI continues to fail consistently there?

agreed, the only reason to port this to 9 is if we want to avoid consistent CI failures

@Maoni0 Maoni0 merged commit 10107d3 into dotnet:main Aug 22, 2024
88 of 90 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Sep 21, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants